WorldWideScience

Sample records for sanger sequencing technology

  1. Comparison of the Equine Reference Sequence with Its Sanger Source Data and New Illumina Reads.

    Directory of Open Access Journals (Sweden)

    Jovan Rebolledo-Mendez

    Full Text Available The reference assembly for the domestic horse, EquCab2, published in 2009, was built using approximately 30 million Sanger reads from a Thoroughbred mare named Twilight. Contiguity in the assembly was facilitated using nearly 315 thousand BAC end sequences from Twilight's half brother Bravo. Since then, it has served as the foundation for many genome-wide analyses that include not only the modern horse, but ancient horses and other equid species as well. As data mapped to this reference has accumulated, consistent variation between mapped datasets and the reference, in terms of regions with no read coverage, single nucleotide variants, and small insertions/deletions, has become apparent. In many cases, it is not clear whether these differences are the result of true sequence variation between the research subjects' and Twilight's genome or due to errors in the reference. EquCab2 is regarded as "The Twilight Assembly." The objective of this study was to identify inconsistencies between the EquCab2 assembly and the source Twilight Sanger data used to build it. To that end, the original Sanger and BAC end reads have been mapped back to this equine reference and assessed with the addition of approximately 40X coverage of new Illumina Paired-End sequence data. The resulting mapped datasets identify those regions with low Sanger read coverage, as well as variation in genomic content that is not consistent with either the original Twilight Sanger data or the new genomic sequence data generated from Twilight on the Illumina platform. As the haploid EquCab2 reference assembly was created using Sanger reads derived largely from a single individual, the vast majority of variation detected in a mapped dataset composed of those same Sanger reads should be heterozygous. In contrast, homozygous variations would represent either errors in the reference or contributions from Bravo's BAC end sequences. Our analysis identifies 720,843 homozygous discrepancies
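
    The heterozygous-versus-homozygous reasoning in this abstract can be illustrated with a small sketch. It assumes a site is classified purely by the fraction of mapped reads carrying the non-reference allele; the function name, depth cutoff, and fraction thresholds are hypothetical and are not taken from the study.

```python
def classify_site(ref_reads: int, alt_reads: int,
                  min_depth: int = 10,
                  hom_fraction: float = 0.9,
                  het_fraction: float = 0.25) -> str:
    """Label one site from counts of reference- and alternate-allele reads.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    depth = ref_reads + alt_reads
    if depth < min_depth:
        # Too few reads to say anything about this site.
        return "low_coverage"
    alt_fraction = alt_reads / depth
    if alt_fraction >= hom_fraction:
        # Nearly every read disagrees with the reference: when the reads come
        # from the individual the reference was built from, this points to a
        # likely reference error rather than true variation.
        return "homozygous_alt"
    if alt_fraction >= het_fraction:
        # Roughly half of the reads disagree: expected for real heterozygosity.
        return "heterozygous"
    return "reference"


for counts in [(0, 30), (15, 15), (29, 1), (2, 1)]:
    print(counts, classify_site(*counts))
```

    Real variant callers use genotype likelihoods rather than fixed cutoffs; the fixed thresholds here only make the logic easy to follow.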

  2. Comparing Whole-Genome Sequencing with Sanger Sequencing for spa Typing of Methicillin-Resistant Staphylococcus aureus

    DEFF Research Database (Denmark)

    Bartels, Mette Damkjaer; Petersen, Andreas; Worning, Peder

    2014-01-01

    spa typing of methicillin-resistant Staphylococcus aureus (MRSA) has traditionally been done by PCR amplification and Sanger sequencing of the spa repeat region. At Hvidovre Hospital, Denmark, whole-genome sequencing (WGS) of all MRSA isolates has been performed routinely since January 2013, and ...

  3. Comparison of base composition analysis and Sanger sequencing of mitochondrial DNA for four U.S. population groups.

    Science.gov (United States)

    Kiesler, Kevin M; Coble, Michael D; Hall, Thomas A; Vallone, Peter M

    2014-01-01

    A set of 711 samples from four U.S. population groups was analyzed using a novel mass spectrometry based method for mitochondrial DNA (mtDNA) base composition profiling. Comparison of the mass spectrometry results with Sanger sequencing derived data yielded a concordance rate of 99.97%. Length heteroplasmy was identified in 46% of samples and point heteroplasmy was observed in 6.6% of samples in the combined mass spectral and Sanger data set. Using discrimination capacity as a metric, Sanger sequencing of the full control region had the highest discriminatory power, followed by the mass spectrometry base composition method, which was more discriminating than Sanger sequencing of just the hypervariable regions. This trend is in agreement with the number of nucleotides covered by each of the three assays. Published by Elsevier Ireland Ltd.

  4. Comparing whole-genome sequencing with Sanger sequencing for spa typing of methicillin-resistant Staphylococcus aureus.

    Science.gov (United States)

    Bartels, Mette Damkjær; Petersen, Andreas; Worning, Peder; Nielsen, Jesper Boye; Larner-Svensson, Hanna; Johansen, Helle Krogh; Andersen, Leif Percival; Jarløv, Jens Otto; Boye, Kit; Larsen, Anders Rhod; Westh, Henrik

    2014-12-01

    spa typing of methicillin-resistant Staphylococcus aureus (MRSA) has traditionally been done by PCR amplification and Sanger sequencing of the spa repeat region. At Hvidovre Hospital, Denmark, whole-genome sequencing (WGS) of all MRSA isolates has been performed routinely since January 2013, and an in-house analysis pipeline determines the spa types. Due to national surveillance, all MRSA isolates are sent to Statens Serum Institut, where the spa type is determined by PCR and Sanger sequencing. The purpose of this study was to evaluate the reliability of the spa types obtained by 150-bp paired-end Illumina WGS. MRSA isolates from new MRSA patients in 2013 (n = 699) in the capital region of Denmark were included. We found a 97% agreement between spa types obtained by the two methods. All isolates achieved a spa type by both methods. Nineteen isolates differed in spa types by the two methods, in most cases due to the lack of 24-bp repeats in the whole-genome-sequenced isolates. These related but incorrect spa types should have no consequence in outbreak investigations, since all epidemiologically linked isolates, regardless of spa type, will be included in the single nucleotide polymorphism (SNP) analysis. This will reveal the close relatedness of the spa types. In conclusion, our data show that WGS is a reliable method to determine the spa type of MRSA. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
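
    The reported 97% agreement follows from the two counts given in the abstract (699 isolates, 19 with differing spa types). A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def percent_agreement(total: int, discordant: int) -> float:
    """Percentage of isolates for which two typing methods give the same result."""
    if total <= 0 or not 0 <= discordant <= total:
        raise ValueError("need total > 0 and 0 <= discordant <= total")
    return 100.0 * (total - discordant) / total


# Counts taken from the abstract: 699 isolates, 19 discordant spa types.
agreement = percent_agreement(699, 19)
print(f"{agreement:.1f}% agreement")
```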

  5. Homozygosity mapping and targeted Sanger sequencing reveal genetic defects underlying inherited retinal disease in families from Pakistan.

    Directory of Open Access Journals (Sweden)

    Maleeha Maria

    Full Text Available Homozygosity mapping has facilitated the identification of the genetic causes underlying inherited diseases, particularly in consanguineous families with multiple affected individuals. This knowledge has also resulted in a mutation dataset that can be used in a cost- and time-effective manner to screen frequent population-specific genetic variations associated with diseases such as inherited retinal disease (IRD). We genetically screened 13 families from a cohort of 81 Pakistani IRD families diagnosed with Leber congenital amaurosis (LCA), retinitis pigmentosa (RP), congenital stationary night blindness (CSNB), or cone dystrophy (CD). We employed genome-wide single nucleotide polymorphism (SNP) array analysis to identify homozygous regions shared by affected individuals and performed Sanger sequencing of IRD-associated genes located in the sizeable homozygous regions. In addition, based on population-specific mutation data, we performed targeted Sanger sequencing (TSS) of frequent variants in AIPL1, CEP290, CRB1, GUCY2D, LCA5, RPGRIP1 and TULP1, in probands from 28 LCA families. Homozygosity mapping and Sanger sequencing of IRD-associated genes revealed the underlying mutations in 10 families. TSS revealed causative variants in three families. In these 13 families four novel mutations were identified in CNGA1, CNGB1, GUCY2D, and RPGRIP1. Homozygosity mapping and TSS revealed the underlying genetic cause in 13 IRD families, which is useful for genetic counseling as well as therapeutic interventions that are likely to become available in the near future.

  6. Rapid Sanger sequencing of the 16S rRNA gene for identification of some common pathogens.

    Directory of Open Access Journals (Sweden)

    Linxiang Chen

    Full Text Available Conventional Sanger sequencing remains time-consuming and laborious. In this study, we developed a rapid, improved sequencing protocol of 16S rRNA for pathogen identification by using a new combination of SYBR Green I real-time PCR and Sanger sequencing with FTA® cards. To compare the sequencing quality of this method with conventional Sanger sequencing, 12 strains of three species (4 Pseudomonas aeruginosa, 4 Staphylococcus aureus and 4 Escherichia coli; 1 reference strain and 3 clinical strains of each, previously identified by biochemical tests) were targeted. Additionally, to validate the sequencing results and bacterial identification, an expanded set of 90 clinical strains, 30 from each of the three species, was processed as just described. The results showed that although statistical differences (P<0.05) were found in sequencing quality between the two methods, their identification results were all correct and consistent. The workload, the time consumption and the cost per batch were respectively light versus heavy, 8 h versus 11 h and $420 versus $400. In the 90 clinical strains, all of the Pseudomonas aeruginosa and Staphylococcus aureus strains were correctly identified, but only 26.7% of the Escherichia coli strains were recognized as Escherichia coli, while 33.3% were identified as Shigella sonnei and 40% as Shigella dysenteriae. The protocol described here is a rapid, reliable, stable and convenient method for 16S rRNA sequencing, and can be used for Pseudomonas aeruginosa and Staphylococcus aureus identification, yet it is not completely suitable for discriminating Escherichia coli and Shigella strains.

  7. A comparison of parallel pyrosequencing and Sanger clone-based sequencing and its impact on the characterization of the genetic diversity of HIV-1.

    Directory of Open Access Journals (Sweden)

    Binhua Liang

    Full Text Available BACKGROUND: Pyrosequencing technology has the potential to rapidly sequence HIV-1 viral quasispecies without requiring the traditional approach of cloning. In this study, we investigated the utility of ultra-deep pyrosequencing to characterize genetic diversity of the HIV-1 gag quasispecies and assessed the possible contribution of pyrosequencing technology in studying HIV-1 biology and evolution. METHODOLOGY/PRINCIPAL FINDINGS: HIV-1 gag gene was amplified from 96 patients using nested PCR. The PCR products were cloned and sequenced using capillary based Sanger fluorescent dideoxy termination sequencing. The same PCR products were also directly sequenced using the 454 pyrosequencing technology. The two sequencing methods were evaluated for their ability to characterize quasispecies variation, and to reveal sites under host immune pressure for their putative functional significance. A total of 14,034 variations were identified by 454 pyrosequencing versus 3,632 variations by Sanger clone-based (SCB) sequencing. 11,050 of these variations were detected only by pyrosequencing. These undetected variations were located in the HIV-1 Gag region which is known to contain putative cytotoxic T lymphocyte (CTL) and neutralizing antibody epitopes, and sites related to virus assembly and packaging. Analysis of the positively selected sites derived by the two sequencing methods identified several differences. All of them were located within the CTL epitope regions. CONCLUSIONS/SIGNIFICANCE: Ultra-deep pyrosequencing has proven to be a powerful tool for characterization of HIV-1 genetic diversity with enhanced sensitivity, efficiency, and accuracy. It also improved reliability of downstream evolutionary and functional analysis of HIV-1 quasispecies.

  8. KRAS mutation detection in colorectal cancer by a commercially available gene chip array compares well with Sanger sequencing.

    Science.gov (United States)

    French, Deborah; Smith, Andrew; Powers, Martin P; Wu, Alan H B

    2011-08-17

    Binding of a ligand to the epidermal growth factor receptor (EGFR) stimulates various intracellular signaling pathways resulting in cell cycle progression, proliferation, angiogenesis and apoptosis inhibition. KRAS is involved in signaling pathways including RAF/MAPK and PI3K and mutations in this gene result in constitutive activation of these pathways, independent of EGFR activation. Seven mutations in codons 12 and 13 of KRAS comprise around 95% of the observed human mutations, rendering monoclonal antibodies against EGFR (e.g. cetuximab and panitumumab) useless in treatment of colorectal cancer. KRAS mutation testing by two different methodologies was compared; Sanger sequencing and AutoGenomics INFINITI® assay, on DNA extracted from colorectal cancers. Out of 29 colorectal tumor samples tested, 28 were concordant between the two methodologies for the KRAS mutations that were detected in both assays with the INFINITI® assay detecting a mutation in one sample that was indeterminate by Sanger sequencing and a third methodology; single nucleotide primer extension. This study indicates the utility of the AutoGenomics INFINITI® methodology in a clinical laboratory setting where technical expertise or access to equipment for DNA sequencing does not exist. Copyright © 2011 Elsevier B.V. All rights reserved.

  9. Very high resolution single pass HLA genotyping using amplicon sequencing on the 454 next generation DNA sequencers: Comparison with Sanger sequencing.

    Science.gov (United States)

    Yamamoto, F; Höglund, B; Fernandez-Vina, M; Tyan, D; Rastrou, M; Williams, T; Moonsamy, P; Goodridge, D; Anderson, M; Erlich, H A; Holcomb, C L

    2015-12-01

    Compared to Sanger sequencing, next-generation sequencing offers advantages for high resolution HLA genotyping including increased throughput, lower cost, and reduced genotype ambiguity. Here we describe an enhancement of the Roche 454 GS GType HLA genotyping assay to provide very high resolution (VHR) typing, by the addition of 8 primer pairs to the original 14, to genotype 11 HLA loci. These additional amplicons help resolve common and well-documented alleles and exclude commonly found null alleles in genotype ambiguity strings. Simplification of workflow to reduce the initial preparation effort using early pooling of amplicons or the Fluidigm Access Array™ is also described. Performance of the VHR assay was evaluated on 28 well characterized cell lines using Conexio Assign MPS software which uses genomic, rather than cDNA, reference sequence. Concordance was 98.4%; 1.6% had no genotype assignment. Of concordant calls, 53% were unambiguous. To further assess the assay, 59 clinical samples were genotyped and results compared to unambiguous allele assignments obtained by prior sequence-based typing supplemented with SSO and/or SSP. Concordance was 98.7% with 58.2% as unambiguous calls; 1.3% could not be assigned. Our results show that the amplicon-based VHR assay is robust and can replace current Sanger methodology. Together with software enhancements, it has the potential to provide even higher resolution HLA typing. Copyright © 2015. Published by Elsevier Inc.

  10. Barcoding the food chain: from Sanger to high-throughput sequencing.

    Science.gov (United States)

    Littlefair, Joanne E; Clare, Elizabeth L

    2016-11-01

    Society faces the complex challenge of supporting biodiversity and ecosystem functioning, while ensuring food security by providing safe traceable food through an ever-more-complex global food chain. The increase in human mobility brings the added threat of pests, parasites, and invaders that further complicate our agro-industrial efforts. DNA barcoding technologies allow researchers to identify both individual species, and, when combined with universal primers and high-throughput sequencing techniques, the diversity within mixed samples (metabarcoding). These tools are already being employed to detect market substitutions, trace pests through the forensic evaluation of trace "environmental DNA", and to track parasitic infections in livestock. The potential of DNA barcoding to contribute to increased security of the food chain is clear, but challenges remain in regulation and the need for validation of experimental analysis. Here, we present an overview of the current uses and challenges of applied DNA barcoding in agriculture, from agro-ecosystems within farmland to the kitchen table.

  11. Insights into bacterioplankton community structure from Sundarbans mangrove ecoregion using Sanger and Illumina MiSeq sequencing approaches: A comparative analysis

    Directory of Open Access Journals (Sweden)

    Anwesha Ghosh

    2017-03-01

    Full Text Available Next generation sequencing using platforms such as Illumina MiSeq provides a deeper insight into the structure and function of bacterioplankton communities in coastal ecosystems compared to traditional molecular techniques such as the clone library approach, which incorporates Sanger sequencing. In this study, the structure of bacterioplankton communities was investigated from two stations of the Sundarbans mangrove ecoregion using both Sanger and Illumina MiSeq sequencing approaches. The Illumina MiSeq data is available under the BioProject ID PRJNA35180 and Sanger sequencing data under accession numbers KX014101-KX014140 (Stn1) and KX014372-KX014410 (Stn3). Proteobacteria-, Firmicutes- and Bacteroidetes-like sequences retrieved from both approaches appeared to be abundant in the studied ecosystem. The Illumina MiSeq data (2.1 GB) provided a deeper insight into the structure of bacterioplankton communities and revealed the presence of bacterial phyla such as Actinobacteria, Cyanobacteria, Tenericutes, Verrucomicrobia which were not recovered based on Sanger sequencing. A comparative analysis of bacterioplankton communities from both stations highlighted the presence of genera that appear in both stations and genera that occur exclusively in either station. However, both the Sanger sequencing and Illumina MiSeq data were coherent at broader taxonomic levels. Pseudomonas, Devosia, Hyphomonas and Erythrobacter-like sequences were the abundant bacterial genera found in the studied ecosystem. Both the sequencing methods showed broad coherence although, as expected, the Illumina MiSeq data helped identify rarer bacterioplankton groups and also showed the presence of unassigned OTUs indicating possible presence of novel bacterioplankton from the studied mangrove ecosystem.

  12. Diagnosis of Fanconi Anemia: Mutation Analysis by Multiplex Ligation-Dependent Probe Amplification and PCR-Based Sanger Sequencing

    Science.gov (United States)

    Gille, Johan J. P.; Floor, Karijn; Kerkhoven, Lianne; Ameziane, Najim; Joenje, Hans; de Winter, Johan P.

    2012-01-01

    Fanconi anemia (FA) is a rare inherited disease characterized by developmental defects, short stature, bone marrow failure, and a high risk of malignancies. FA is heterogeneous: 15 genetic subtypes have been distinguished so far. A clinical diagnosis of FA needs to be confirmed by testing cells for sensitivity to cross-linking agents in a chromosomal breakage test. As a second step, DNA testing can be employed to elucidate the genetic subtype of the patient and to identify the familial mutations. This knowledge allows preimplantation genetic diagnosis (PGD) and enables prenatal DNA testing in future pregnancies. Although simultaneous testing of all FA genes by next generation sequencing will be possible in the near future, this technique will not be available immediately for all laboratories. In addition, in populations with strong founder mutations, a limited test using Sanger sequencing and MLPA will be a cost-effective alternative. We describe a strategy and optimized conditions for the screening of FANCA, FANCB, FANCC, FANCE, FANCF, and FANCG and present the results obtained in a cohort of 54 patients referred to our diagnostic service since 2008. In addition, the follow up with respect to genetic counseling and carrier screening in the families is discussed. PMID:22778927

  13. Diagnosis of Fanconi Anemia: Mutation Analysis by Multiplex Ligation-Dependent Probe Amplification and PCR-Based Sanger Sequencing

    Directory of Open Access Journals (Sweden)

    Johan J. P. Gille

    2012-01-01

    Full Text Available Fanconi anemia (FA) is a rare inherited disease characterized by developmental defects, short stature, bone marrow failure, and a high risk of malignancies. FA is heterogeneous: 15 genetic subtypes have been distinguished so far. A clinical diagnosis of FA needs to be confirmed by testing cells for sensitivity to cross-linking agents in a chromosomal breakage test. As a second step, DNA testing can be employed to elucidate the genetic subtype of the patient and to identify the familial mutations. This knowledge allows preimplantation genetic diagnosis (PGD) and enables prenatal DNA testing in future pregnancies. Although simultaneous testing of all FA genes by next generation sequencing will be possible in the near future, this technique will not be available immediately for all laboratories. In addition, in populations with strong founder mutations, a limited test using Sanger sequencing and MLPA will be a cost-effective alternative. We describe a strategy and optimized conditions for the screening of FANCA, FANCB, FANCC, FANCE, FANCF, and FANCG and present the results obtained in a cohort of 54 patients referred to our diagnostic service since 2008. In addition, the follow up with respect to genetic counseling and carrier screening in the families is discussed.

  14. Identification of novel BRCA founder mutations in Middle Eastern breast cancer patients using capture and Sanger sequencing analysis.

    Science.gov (United States)

    Bu, Rong; Siraj, Abdul K; Al-Obaisi, Khadija A S; Beg, Shaham; Al Hazmi, Mohsen; Ajarim, Dahish; Tulbah, Asma; Al-Dayel, Fouad; Al-Kuraya, Khawla S

    2016-09-01

    Ethnic differences of breast cancer genomics have prompted us to investigate the spectra of BRCA1 and BRCA2 mutations in different populations. The prevalence and effect of BRCA 1 and BRCA 2 mutations in Middle Eastern population is not fully explored. To characterize the prevalence of BRCA mutations in Middle Eastern breast cancer patients, BRCA mutation screening was performed in 818 unselected breast cancer patients using Capture and/or Sanger sequencing. 19 short tandem repeat (STR) markers were used for founder mutation analysis. In our study, nine different types of deleterious mutation were identified in 28 (3.4%) cases, 25 (89.3%) cases in BRCA 1 and 3 (10.7%) cases in BRCA 2. Seven recurrent mutations identified accounted for 92.9% (26/28) of all the mutant cases. Haplotype analysis was performed to confirm c.1140 dupG and c.4136_4137delCT mutations as novel putative founder mutation, accounting for 46.4% (13/28) of all BRCA mutant cases and 1.6% (13/818) of all the breast cancer cases, respectively. Moreover, BRCA 1 mutation was significantly associated with BRCA 1 protein expression loss (p = 0.0005). Our finding revealed that a substantial number of BRCA mutations were identified in clinically high risk breast cancer from Middle East region. Identification of the mutation spectrum, prevalence and founder effect in Middle Eastern population facilitates genetic counseling, risk assessment and development of cost-effective screening strategy. © 2016 UICC.
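
    The percentages quoted in this abstract can be re-derived from the raw counts it gives. The short check below only recomputes those figures; the dictionary keys and variable names are ad hoc labels, not terms from the paper.

```python
# Counts as stated in the abstract.
total_patients = 818
mutant_cases = 28

# (numerator, denominator) for each percentage quoted in the abstract.
fractions = {
    "mutation carriers among all patients": (mutant_cases, total_patients),
    "BRCA1 among mutant cases": (25, mutant_cases),
    "BRCA2 among mutant cases": (3, mutant_cases),
    "recurrent mutations among mutant cases": (26, mutant_cases),
    "founder mutations among mutant cases": (13, mutant_cases),
    "founder mutations among all patients": (13, total_patients),
}

# Percentages rounded to one decimal place, as in the abstract.
percentages = {
    label: round(100.0 * numerator / denominator, 1)
    for label, (numerator, denominator) in fractions.items()
}

for label, value in percentages.items():
    print(f"{label}: {value}%")
```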

  15. Comparison of three human papillomavirus DNA detection methods: Next generation sequencing, multiplex-PCR and nested-PCR followed by Sanger based sequencing.

    Science.gov (United States)

    da Fonseca, Allex Jardim; Galvão, Renata Silva; Miranda, Angelica Espinosa; Ferreira, Luiz Carlos de Lima; Chen, Zigui

    2016-05-01

    To compare the diagnostic performance for HPV infection using three laboratory techniques. Ninety-five cervicovaginal samples were randomly selected; each was tested for HPV DNA and genotypes using 3 methods in parallel: Multiplex-PCR, Nested-PCR followed by Sanger sequencing, and next-generation sequencing (NGS) with two assays (NGS-A1, NGS-A2). The study was approved by the Brazilian National IRB (CONEP protocol 16,800). The prevalence of HPV by the NGS assays was higher than that using the Multiplex-PCR (64.2% vs. 45.2%, respectively; P = 0.001) and the Nested-PCR (64.2% vs. 49.5%, respectively; P = 0.003). NGS also showed better performance in detecting high-risk HPV (HR-HPV) and HPV16. There was weak interobserver agreement between the results of Multiplex-PCR and Nested-PCR in relation to NGS for the diagnosis of HPV infection, and a moderate correlation for HR-HPV detection. Both NGS assays showed a strong correlation for detection of HPVs (k = 0.86), HR-HPVs (k = 0.91), HPV16 (k = 0.92) and HPV18 (k = 0.91). NGS is more sensitive than the traditional Sanger sequencing and the Multiplex-PCR to genotype HPVs, with promising ability to detect multiple infections, and may have the potential to establish an alternative method for the diagnosis and genotyping of HPV. © 2015 Wiley Periodicals, Inc.
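
    The k values above are agreement coefficients between paired assay results. Assuming they are Cohen's kappa (the abstract does not name the statistic), a minimal sketch of the calculation looks like this. The example calls are invented for illustration and are not data from the study.

```python
from collections import Counter


def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length sequences of categorical calls."""
    if len(calls_a) != len(calls_b) or len(calls_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of samples where both assays give the same call.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement: product of each assay's marginal frequency per label.
    freq_a = Counter(calls_a)
    freq_b = Counter(calls_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both assays always give the same single label; treat as perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical HPV-positive (1) / HPV-negative (0) calls from two assays.
assay_1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
assay_2 = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
print(f"kappa = {cohens_kappa(assay_1, assay_2):.2f}")
```

    Kappa corrects raw percent agreement for the agreement expected by chance, which is why it is preferred when one outcome is much more common than the other.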

  16. Comprehensive transcriptome assembly of Chickpea (Cicer arietinum L.) using Sanger and next generation sequencing platforms: development and applications.

    Directory of Open Access Journals (Sweden)

    Himabindu Kudapa

    Full Text Available A comprehensive transcriptome assembly of chickpea has been developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads and 139,214 Sanger expressed sequence tags (ESTs) from >17 genotypes. This hybrid transcriptome assembly, referred to as Cicer arietinum Transcriptome Assembly version 2 (CaTA v2, available at http://data.comparative-legumes.org/transcriptomes/cicar/lista_cicar-201201), comprising 46,369 transcript assembly contigs (TACs), has an N50 length of 1,726 bp and a maximum contig size of 15,644 bp. Putative functions were determined for 32,869 (70.8%) of the TACs and gene ontology assignments were determined for 21,471 (46.3%). The new transcriptome assembly was compared with the previously available chickpea transcriptome assemblies as well as to the chickpea genome. Comparative analysis of CaTA v2 against transcriptomes of three legumes - Medicago, soybean and common bean - resulted in 27,771 TACs common to all three legumes, indicating strong conservation of genes across legumes. CaTA v2 was also used for identification of simple sequence repeats (SSRs) and intron spanning regions (ISRs) for developing molecular markers. ISRs were identified by aligning TACs to the Medicago genome, and their putative mapping positions at chromosomal level were identified using transcript map of chickpea. Primer pairs were designed for 4,990 ISRs, each representing a single contig for which predicted positions are inferred and distributed across eight linkage groups. A subset of randomly selected ISRs representing all eight chickpea linkage groups were validated on five chickpea genotypes and showed 20% polymorphism with average polymorphic information content (PIC) of 0.27. In summary, the hybrid transcriptome assembly developed and novel markers identified can be used for a variety of applications such as gene discovery, marker-trait association, diversity analysis etc., to advance genetics research and breeding
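
    The N50 length quoted for the assembly is a standard contiguity statistic: the length of the contig at which the running total, counted from the longest contig downward, first reaches half of the total assembled bases. A minimal sketch with made-up contig lengths (not the CaTA v2 data):

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running_total = 0
    for length in ordered:
        running_total += length
        # The first contig that pushes the running total to half of the
        # assembly size defines N50.
        if running_total >= half_total:
            return length


# Toy example: total is 54 bp, so half is 27 bp; 10 + 9 + 8 reaches 27.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))
```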

  17. Sanger sequencing as a first-line approach for molecular diagnosis of Andersen-Tawil syndrome [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Armando Totomoch-Serra

    2017-06-01

    Full Text Available In 1977, Frederick Sanger developed a new method for DNA sequencing based on the chain termination method, now known as the Sanger sequencing method (SSM). Recently, massive parallel sequencing, better known as next-generation sequencing (NGS), is replacing the SSM for detecting mutations in cardiovascular diseases with a genetic background. The present opinion article emphasizes that “targeted” SSM is still effective as a first-line approach for the molecular diagnosis of some specific conditions, as is the case for Andersen-Tawil syndrome (ATS). ATS is described as a rare multisystemic autosomal dominant channelopathy syndrome caused mainly by a heterozygous mutation in the KCNJ2 gene. KCNJ2 has particular characteristics that make it attractive for “directed” SSM. KCNJ2 has a sequence of 17,510 base pairs (bp), and a short coding region with two exons (exon 1=166 bp and exon 2=5220 bp); half of the mutations are located in the C-terminal cytosolic domain, a mutational hotspot has been described in residue Arg218, and this gene explains the phenotype in 60% of ATS cases that fulfill all the clinical criteria of the disease. In order to increase the diagnosis of ATS we urge cardiologists to search for facial and muscular abnormalities in subjects with frequent ventricular arrhythmias (especially bigeminy) and prominent U waves on the electrocardiogram.

  18. "First generation" automated DNA sequencing technology.

    Science.gov (United States)

    Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

    2011-10-01

    Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.

  19. Screening for duplications, deletions and a common intronic mutation detects 35% of second mutations in patients with USH2A monoallelic mutations on Sanger sequencing.

    Science.gov (United States)

    Steele-Stallard, Heather B; Le Quesne Stabej, Polona; Lenassi, Eva; Luxon, Linda M; Claustres, Mireille; Roux, Anne-Francoise; Webster, Andrew R; Bitner-Glindzicz, Maria

    2013-08-08

    Usher Syndrome is the leading cause of inherited deaf-blindness. It is divided into three subtypes, of which the most common is Usher type 2, and the USH2A gene accounts for 75-80% of cases. Despite recent sequencing strategies, in our cohort a significant proportion of individuals with Usher type 2 have just one heterozygous disease-causing mutation in USH2A, or no convincing disease-causing mutations across nine Usher genes. The purpose of this study was to improve the molecular diagnosis in these families by screening USH2A for duplications, heterozygous deletions and a common pathogenic deep intronic variant USH2A: c.7595-2144A>G. Forty-nine Usher type 2 or atypical Usher families who had missing mutations (mono-allelic USH2A or no mutations following Sanger sequencing of nine Usher genes) were screened for duplications/deletions using the USH2A SALSA MLPA reagent kit (MRC-Holland). Identification of USH2A: c.7595-2144A>G was achieved by Sanger sequencing. Mutations were confirmed by a combination of reverse transcription PCR using RNA extracted from nasal epithelial cells or fibroblasts, and by array comparative genomic hybridisation with sequencing across the genomic breakpoints. Eight mutations were identified in 23 Usher type 2 families (35%) with one previously identified heterozygous disease-causing mutation in USH2A. These consisted of five heterozygous deletions, one duplication, and two heterozygous instances of the pathogenic variant USH2A: c.7595-2144A>G. No variants were found in the 15 Usher type 2 families with no previously identified disease-causing mutations. In 11 atypical families, none of whom had any previously identified convincing disease-causing mutations, the mutation USH2A: c.7595-2144A>G was identified in a heterozygous state in one family. All five deletions and the heterozygous duplication we report here are novel. This is the first time that a duplication in USH2A has been reported as a cause of Usher syndrome. 
We found that 8 of

  20. Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons

    Science.gov (United States)

    Haas, Brian J.; Gevers, Dirk; Earl, Ashlee M.; Feldgarden, Mike; Ward, Doyle V.; Giannoukos, Georgia; Ciulla, Dawn; Tabbaa, Diana; Highlander, Sarah K.; Sodergren, Erica; Methé, Barbara; DeSantis, Todd Z.; Petrosino, Joseph F.; Knight, Rob; Birren, Bruce W.

    2011-01-01

    Bacterial diversity among environmental samples is commonly assessed with PCR-amplified 16S rRNA gene (16S) sequences. Perceived diversity, however, can be influenced by sample preparation, primer selection, and formation of chimeric 16S amplification products. Chimeras are hybrid products between multiple parent sequences that can be falsely interpreted as novel organisms, thus inflating apparent diversity. We developed a new chimera detection tool called Chimera Slayer (CS). CS detects chimeras with greater sensitivity than previous methods, performs well on short sequences such as those produced by the 454 Life Sciences (Roche) Genome Sequencer, and can scale to large data sets. By benchmarking CS performance against sequences derived from a controlled DNA mixture of known organisms and a simulated chimera set, we provide insights into the factors that affect chimera formation such as sequence abundance, the extent of similarity between 16S genes, and PCR conditions. Chimeras were found to reproducibly form among independent amplifications and contributed to false perceptions of sample diversity and the false identification of novel taxa, with less-abundant species exhibiting chimera rates exceeding 70%. Shotgun metagenomic sequences of our mock community appear to be devoid of 16S chimeras, supporting a role for shotgun metagenomics in validating novel organisms discovered in targeted sequence surveys. PMID:21212162
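    The idea of a chimera as a hybrid of two parent sequences can be illustrated with a toy check. This is a minimal sketch and not the Chimera Slayer algorithm: it assumes the query and all references are already aligned to the same length, it uses a single midpoint breakpoint, and the function names and the `min_gain` threshold are hypothetical choices made for illustration only.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length aligned strings."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_parent(segment: str, references: dict, start: int) -> str:
    """Name of the reference whose matching window is most similar to the segment."""
    return max(
        references,
        key=lambda name: identity(segment, references[name][start:start + len(segment)]),
    )


def looks_chimeric(query: str, references: dict, min_gain: float = 0.05) -> bool:
    """Flag the query if a two-parent model explains it better than any single parent."""
    mid = len(query) // 2
    left_parent = best_parent(query[:mid], references, 0)
    right_parent = best_parent(query[mid:], references, mid)
    if left_parent == right_parent:
        return False
    # Best explanation using one reference for the whole query.
    best_single = max(identity(query, ref) for ref in references.values())
    # Hypothetical hybrid built from the two best half-parents.
    hybrid = references[left_parent][:mid] + references[right_parent][mid:]
    return identity(query, hybrid) - best_single >= min_gain


# Made-up sequences for illustration only.
references = {
    "parent_a": "ACGTACGTACGTACGTACGT",
    "parent_b": "TTGGCCAATTGGCCAATTGG",
}
hybrid_query = references["parent_a"][:10] + references["parent_b"][10:]
```

    Real detectors search over many candidate breakpoints and work from alignments against a curated reference set; the fixed midpoint here only keeps the example short.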

  1. Comparison of cobas HCV GT against Versant HCV Genotype 2.0 (LiPA) with confirmation by Sanger sequencing.

    Science.gov (United States)

    Yusrina, Falah; Chua, Cui Wen; Lee, Chun Kiat; Chiu, Lily; Png, Tracy Si-Yu; Khoo, Mui Joo; Yan, Gabriel; Lee, Guan Huei; Yan, Benedict; Lee, Hong Kai

    2018-05-01

    Correct identification of infecting hepatitis C virus (HCV) genotype is helpful for targeted antiviral therapy. Here, we compared the HCV genotyping performance of the cobas HCV GT assay against the Versant HCV Genotype 2.0 (LiPA) assay, using 97 archived serum samples. In the event of discrepant or indeterminate results produced by either assay, the core and NS5B regions were sequenced. Of the 97 samples tested by the cobas, 25 (26%) were deemed indeterminate. Sequencing analyses confirmed 21 (84%) of the 25 samples as genotype 6 viruses with either subtype 6m, 6n, 6v, 6xa, or unknown subtype. Of the 97 samples tested by the LiPA, thirteen (13%) were deemed indeterminate. Seven (7%) were assigned with genotype 1, with unavailable/inconclusive results from the core region of the LiPA. Notably, the 7 samples were later found to be either genotype 3 or 6 by sequencing analyses. Moreover, 1 sample was assigned genotype 4 by the LiPA (cobas: indeterminate) but was later found to be genotype 3 by sequencing analyses, highlighting its limitation in assigning the correct genotype. The cobas showed similar or slightly higher accuracy (100%; 95% CI 94-100%) compared to the LiPA (99%; 95% CI 92-100%). Twenty-six percent of the 97 samples tested by the cobas had indeterminate results, mainly due to its limitation in identifying genotype 6 other than subtypes 6a and 6b. This presents a significant assay limitation in Southeast Asia, where genotype 6 infection is highly prevalent. Copyright © 2018 Elsevier B.V. All rights reserved.

  2. Discovery of novel MHC-class I alleles and haplotypes in Filipino cynomolgus macaques (Macaca fascicularis) by pyrosequencing and Sanger sequencing: Mafa-class I polymorphism.

    Science.gov (United States)

    Shiina, Takashi; Yamada, Yukiho; Aarnink, Alice; Suzuki, Shingo; Masuya, Anri; Ito, Sayaka; Ido, Daisuke; Yamanaka, Hisashi; Iwatani, Chizuru; Tsuchiya, Hideaki; Ishigaki, Hirohito; Itoh, Yasushi; Ogasawara, Kazumasa; Kulski, Jerzy K; Blancher, Antoine

    2015-10-01

    Although the low polymorphism of the major histocompatibility complex (MHC) transplantation genes in the Filipino cynomolgus macaque (Macaca fascicularis) is expected to have important implications in the selection and breeding of animals for medical research, detailed polymorphism information is still lacking for many of the duplicated class I genes. To better elucidate the degree and types of MHC polymorphisms and haplotypes in the Filipino macaque population, we genotyped 127 unrelated animals by the Sanger sequencing method and high-resolution pyrosequencing and identified 112 different alleles, 28 at cynomolgus macaque MHC (Mafa)-A, 54 at Mafa-B, 12 at Mafa-I, 11 at Mafa-E, and seven at Mafa-F alleles, of which 56 were newly described. Of them, the newly discovered Mafa-A8*01:01 lineage allele had low nucleotide similarities (Filipino macaque population would identify these and other high-frequency Mafa-class I haplotypes that could be used as MHC control animals for the benefit of biomedical research.

  3. Highly sensitive KRAS mutation detection from formalin-fixed paraffin-embedded biopsies and circulating tumour cells using wild-type blocking polymerase chain reaction and Sanger sequencing.

    Science.gov (United States)

    Huang, Meggie Mo Chao; Leong, Sai Mun; Chua, Hui Wen; Tucker, Steven; Cheong, Wai Chye; Chiu, Lily; Li, Mo-Huang; Koay, Evelyn Siew-Chuan

    2014-08-01

    Among patients with colorectal cancer (CRC), KRAS mutations were reported to occur in 30-51% of all cases. CRC patients with KRAS mutations were reported to be non-responsive to anti-epidermal growth factor receptor (EGFR) monoclonal antibody (MoAb) treatment in many clinical trials. Hence, accurate detection of KRAS mutations would be critical in guiding the use of anti-EGFR MoAb therapies in CRC. In this study, we carried out a detailed investigation of the efficacy of a wild-type (WT) blocking real-time polymerase chain reaction (PCR), employing WT KRAS locked nucleic acid blockers, and Sanger sequencing, for KRAS mutation detection in rare cells. Analyses were first conducted on cell lines to optimize the assay protocol which was subsequently applied to peripheral blood and tissue samples from patients with CRC. The optimized assay provided a superior sensitivity enabling detection of as few as two cells with mutated KRAS in the background of 10^4 WT cells (0.02%). The feasibility of this assay was further investigated to assess the KRAS status of 45 colorectal tissue samples, which had been tested previously, using a conventional PCR sequencing approach. The analysis showed a mutational discordance between these two methods in 4 of 18 WT cases. Our results present a simple, effective, and robust method for KRAS mutation detection in both paraffin embedded tissues and circulating tumour cells, at single-cell level. The method greatly enhances the detection sensitivity and alleviates the need of exhaustively removing co-enriched contaminating lymphocytes.
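    The 0.02% figure quoted above follows from dividing the two mutant cells by the 10^4 wild-type background cells. The short sketch below reproduces that arithmetic and also shows the slightly smaller value obtained when the mutant cells are counted in the denominator; the function names are hypothetical and are not taken from the study.

```python
def mutant_to_background_percent(mutant_cells: int, wild_type_cells: int) -> float:
    """Mutant cells expressed as a percentage of the wild-type background."""
    if wild_type_cells <= 0:
        raise ValueError("wild_type_cells must be positive")
    return 100.0 * mutant_cells / wild_type_cells


def mutant_fraction_percent(mutant_cells: int, wild_type_cells: int) -> float:
    """Mutant cells expressed as a percentage of all cells (mutant + wild type)."""
    total = mutant_cells + wild_type_cells
    if total <= 0:
        raise ValueError("total cell count must be positive")
    return 100.0 * mutant_cells / total


# Two mutant cells in a background of 10^4 wild-type cells, as in the abstract.
ratio_to_background = mutant_to_background_percent(2, 10**4)
fraction_of_total = mutant_fraction_percent(2, 10**4)
```

    At this dilution the two definitions differ only in the fourth significant digit, so either convention rounds to 0.02%.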

  4. A sweetpotato gene index established by de novo assembly of pyrosequencing and Sanger sequences and mining for gene-based microsatellite markers

    Directory of Open Access Journals (Sweden)

    Solis Julio

    2010-10-01

    Full Text Available Abstract Background Sweetpotato (Ipomoea batatas (L.) Lam.), a hexaploid outcrossing crop, is an important staple and food security crop in developing countries in Africa and Asia. The availability of genomic resources for sweetpotato is in striking contrast to its importance for human nutrition. Previously existing sequence data were restricted to around 22,000 expressed sequence tag (EST) sequences and ~1,500 GenBank sequences. We have used 454 pyrosequencing to augment the available gene sequence information to enhance functional genomics and marker design for this plant species. Results Two quarter 454 pyrosequencing runs used two normalized cDNA collections from stems and leaves from drought-stressed sweetpotato clone Tanzania and yielded 524,209 reads, which were assembled together with 22,094 publicly available expressed sequence tags into 31,685 sets of overlapping DNA segments and 34,733 unassembled sequences. Blastx comparisons with the UniRef100 database allowed annotation of 23,957 contigs and 15,342 singletons resulting in 24,657 putatively unique genes. Further, 27,119 sequences had no match to protein sequences of the UniRef100 database. On the basis of this gene index, we have identified 1,661 gene-based microsatellite sequences, of which 223 were selected for testing and 195 were successfully amplified in a test panel of 6 hexaploid (I. batatas) and 2 diploid (I. trifida) accessions. Conclusions The sweetpotato gene index is a useful source for functionally annotated sweetpotato gene sequences that contains three times more gene sequence information for sweetpotato than previous EST assemblies. A searchable version of the gene index, including a blastn function, is available at http://www.cipotato.org/sweetpotato_gene_index.

  5. Introduction of the hybcell-based compact sequencing technology and comparison to state-of-the-art methodologies for KRAS mutation detection.

    Science.gov (United States)

    Zopf, Agnes; Raim, Roman; Danzer, Martin; Niklas, Norbert; Spilka, Rita; Pröll, Johannes; Gabriel, Christian; Nechansky, Andreas; Roucka, Markus

    2015-03-01

    The detection of KRAS mutations in codons 12 and 13 is critical for anti-EGFR therapy strategies; however, only those methodologies with high sensitivity, specificity, and accuracy as well as the best cost and turnaround balance are suitable for routine daily testing. Here we compared the performance of compact sequencing using the novel hybcell technology with 454 next-generation sequencing (454-NGS), Sanger sequencing, and pyrosequencing, using an evaluation panel of 35 specimens. A total of 32 mutations and 10 wild-type cases were reported using 454-NGS as the reference method. Specificity ranged from 100% for Sanger sequencing to 80% for pyrosequencing. Sanger sequencing and hybcell-based compact sequencing achieved a sensitivity of 96%, whereas pyrosequencing had a sensitivity of 88%. Accuracy was 97% for Sanger sequencing, 85% for pyrosequencing, and 94% for hybcell-based compact sequencing. Quantitative results were obtained for 454-NGS and hybcell-based compact sequencing data, resulting in a significant correlation (r = 0.914). Whereas pyrosequencing and Sanger sequencing were not able to detect multiple mutated cell clones within one tumor specimen, 454-NGS and the hybcell-based compact sequencing detected multiple mutations in two specimens. Our comparison shows that the hybcell-based compact sequencing is a valuable alternative to state-of-the-art methodologies used for detection of clinically relevant point mutations.
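    The sensitivity, specificity, and accuracy figures compared above are all derived from counts of true and false calls against a reference method. A minimal sketch of those standard definitions follows; the counts in the example are made up for illustration and are not the study's data, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Calls of a test method scored against a reference method."""
    tp: int  # mutation reported by both the test and the reference
    fp: int  # mutation reported by the test only
    tn: int  # wild type reported by both
    fn: int  # mutation missed by the test


def sensitivity(c: ConfusionCounts) -> float:
    """Share of reference-positive specimens that the test also calls positive."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of reference-negative specimens that the test also calls negative."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all specimens on which the test agrees with the reference."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = ConfusionCounts(tp=45, fp=2, tn=48, fn=5)
```

    With these illustrative counts the three measures come out at 0.90, 0.96, and 0.93; all of them depend on which method is treated as the reference.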

  6. Allele Re-sequencing Technologies

    DEFF Research Database (Denmark)

    Byrne, Stephen; Farrell, Jacqueline Danielle; Asp, Torben

    2013-01-01

    The development of next-generation sequencing technologies has made sequencing an affordable approach for detection of genetic variations associated with various traits. However, the cost of whole genome re-sequencing still remains too high to be feasible for many plant species with large...... alternative to whole genome re-sequencing to identify causative genetic variations in plants. One challenge, however, will be efficient bioinformatics strategies for data handling and analysis from the increasing amount of sequence information....

  7. Electrostatic Potential Maps and Natural Bond Orbital Analysis: Visualization and Conceptualization of Reactivity in Sanger's Reagent

    Science.gov (United States)

    Mottishaw, Jeffery D.; Erck, Adam R.; Kramer, Jordan H.; Sun, Haoran; Koppang, Miles

    2015-01-01

    Frederick Sanger's early work on protein sequencing through the use of colorimetric labeling combined with liquid chromatography involves an important nucleophilic aromatic substitution (S_NAr) reaction in which the N-terminus of a protein is tagged with Sanger's reagent. Understanding the inherent differences between this S[subscript…

  8. DNA Polymerases Drive DNA Sequencing-by-Synthesis Technologies: Both Past and Present

    Directory of Open Access Journals (Sweden)

    Cheng-Yao Chen

    2014-06-01

    Full Text Available Next-generation sequencing (NGS) technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. E. coli DNA polymerase I proteolytic (Klenow) fragment was originally utilized in Sanger's dideoxy chain terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific, genome sequence information accessible via public databases. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today's standard capillary electrophoresis (CE) and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides respectively. Third-generation, real-time single-molecule sequencing platforms require slightly different enzyme properties. Enterobacterial phage φ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, φ29 enzyme has also been utilized in emerging DNA sequencing technologies including nanopore- and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies.

  9. Transcriptome sequencing of the blind subterranean mole rat, Spalax galili: Utility and potential for the discovery of novel evolutionary patterns

    KAUST Repository

    Malik, Assaf; Korol, Abraham; Hübner, Sariel; Hernandez, Alvaro G.; Thimmapuram, Jyothi; Ali, Shahjahan; Glaser, Fabian; Paz, Arnon; Avivi, Aaron; Band, Mark

    2011-01-01

    sequencing of Spalax galili, a chromosomal type of S. ehrenbergi. cDNA pools from muscle and brain tissues isolated from animals exposed to hypoxic and normoxic conditions were sequenced using Sanger, GS FLX, and GS FLX Titanium technologies. Assembly

  10. The fast changing landscape of sequencing technologies and their impact on microbial genome assemblies and annotation.

    Science.gov (United States)

    Mavromatis, Konstantinos; Land, Miriam L; Brettin, Thomas S; Quest, Daniel J; Copeland, Alex; Clum, Alicia; Goodwin, Lynne; Woyke, Tanja; Lapidus, Alla; Klenk, Hans Peter; Cottingham, Robert W; Kyrpides, Nikos C

    2012-01-01

    The emergence of next generation sequencing (NGS) has provided the means for rapid and high throughput sequencing and data generation at low cost, while concomitantly creating a new set of challenges. The number of available assembled microbial genomes continues to grow rapidly and their quality reflects the quality of the sequencing technology used, but also of the analysis software employed for assembly and annotation. In this work, we have explored the quality of the microbial draft genomes across various sequencing technologies. We have compared the draft and finished assemblies of 133 microbial genomes sequenced at the Department of Energy-Joint Genome Institute and finished at the Los Alamos National Laboratory using a variety of combinations of sequencing technologies, reflecting the transition of the institute from Sanger-based sequencing platforms to NGS platforms. The quality of the public assemblies and of the associated gene annotations was evaluated using various metrics. Results obtained with the different sequencing technologies, as well as their effects on downstream processes, were analyzed. Our results demonstrate that the Illumina HiSeq 2000 sequencing system, the primary sequencing technology currently used for de novo genome sequencing and assembly at JGI, has various advantages in terms of total sequence throughput and cost, but it also introduces challenges for the downstream analyses. In all cases, assembly results, although of high quality on average, need to be viewed critically, and sources of error in them should be considered prior to analysis. These data follow the evolution of microbial sequencing and downstream processing at the JGI from draft genome sequences with large gaps corresponding to missing genes of significant biological role to assemblies with multiple small gaps (Illumina) and finally to assemblies that generate almost complete genomes (Illumina+PacBio).

  11. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr......

  12. MutAid: Sanger and NGS Based Integrated Pipeline for Mutation Identification, Validation and Annotation in Human Molecular Genetics.

    Directory of Open Access Journals (Sweden)

    Ram Vinay Pandey

    Full Text Available Traditional Sanger sequencing as well as Next-Generation Sequencing have been used for the identification of disease causing mutations in human molecular research. The majority of currently available tools are developed for research and explorative purposes and often do not provide a complete, efficient, one-stop solution. As the focus of currently developed tools is mainly on NGS data analysis, no integrative solution for the analysis of Sanger data is provided and consequently a one-stop solution to analyze reads from both sequencing platforms is not available. We have therefore developed a new pipeline called MutAid to analyze and interpret raw sequencing data produced by Sanger or several NGS sequencing platforms. It performs format conversion, base calling, quality trimming, filtering, read mapping, variant calling, variant annotation and analysis of Sanger and NGS data under a single platform. It is capable of analyzing reads from multiple patients in a single run to create a list of potential disease causing base substitutions as well as insertions and deletions. MutAid has been developed for expert and non-expert users and supports four sequencing platforms including Sanger, Illumina, 454 and Ion Torrent. Furthermore, for NGS data analysis, five read mappers including BWA, TMAP, Bowtie, Bowtie2 and GSNAP and four variant callers including GATK-HaplotypeCaller, SAMTOOLS, Freebayes and VarScan2 pipelines are supported. MutAid is freely available at https://sourceforge.net/projects/mutaid.

  13. MutAid: Sanger and NGS Based Integrated Pipeline for Mutation Identification, Validation and Annotation in Human Molecular Genetics.

    Science.gov (United States)

    Pandey, Ram Vinay; Pabinger, Stephan; Kriegner, Albert; Weinhäusel, Andreas

    2016-01-01

    Traditional Sanger sequencing as well as Next-Generation Sequencing have been used for the identification of disease causing mutations in human molecular research. The majority of currently available tools are developed for research and explorative purposes and often do not provide a complete, efficient, one-stop solution. As the focus of currently developed tools is mainly on NGS data analysis, no integrative solution for the analysis of Sanger data is provided and consequently a one-stop solution to analyze reads from both sequencing platforms is not available. We have therefore developed a new pipeline called MutAid to analyze and interpret raw sequencing data produced by Sanger or several NGS sequencing platforms. It performs format conversion, base calling, quality trimming, filtering, read mapping, variant calling, variant annotation and analysis of Sanger and NGS data under a single platform. It is capable of analyzing reads from multiple patients in a single run to create a list of potential disease causing base substitutions as well as insertions and deletions. MutAid has been developed for expert and non-expert users and supports four sequencing platforms including Sanger, Illumina, 454 and Ion Torrent. Furthermore, for NGS data analysis, five read mappers including BWA, TMAP, Bowtie, Bowtie2 and GSNAP and four variant callers including GATK-HaplotypeCaller, SAMTOOLS, Freebayes and VarScan2 pipelines are supported. MutAid is freely available at https://sourceforge.net/projects/mutaid.

  14. [Sequencing technology in gene diagnosis and its application].

    Science.gov (United States)

    Yibin, Guo

    2014-11-01

    The study of gene mutation is one of the hot topics in the life sciences today, and the related detection methods and diagnostic technologies have developed rapidly. Sequencing technology plays an indispensable role in the definite diagnosis and classification of genetic diseases. In this review, we summarize the research progress in sequencing technology, evaluate the advantages and disadvantages of first- to third-generation sequencing technologies, and describe their application in gene diagnosis. We also discuss the outlook for their future development.

  15. Rapid sequencing of the bamboo mitochondrial genome using Illumina technology and parallel episodic evolution of organelle genomes in grasses.

    Science.gov (United States)

    Ma, Peng-Fei; Guo, Zhen-Hua; Li, De-Zhu

    2012-01-01

    Compared to their counterparts in animals, the mitochondrial (mt) genomes of angiosperms exhibit a number of unique features. However, unravelling their evolution is hindered by the small number of completed genomes, essentially all of which were Sanger sequenced. While next-generation sequencing technologies have revolutionized chloroplast genome sequencing, they are just beginning to be applied to angiosperm mt genomes. Chloroplast genomes of grasses (Poaceae) have undergone episodic evolution and the evolutionary rate was suggested to be correlated between chloroplast and mt genomes in Poaceae. It is interesting to investigate whether correlated rate change also occurred in grass mt genomes as expected under lineage effects. A time-calibrated phylogenetic tree is needed to examine rate change. We determined a largely completed mt genome from a bamboo, Ferrocalamus rimosivaginus (Poaceae), through Illumina sequencing of total DNA. With a combination of de novo and reference-guided assembly, 39.5-fold coverage Illumina reads were finally assembled into scaffolds totalling 432,839 bp. The assembled genome contains nearly the same genes as the completed mt genomes in Poaceae. For examining evolutionary rate in grass mt genomes, we reconstructed a phylogenetic tree including 22 taxa based on 31 mt genes. The topology of the well-resolved tree was almost identical to that inferred from the chloroplast genome, with only minor differences. The inconsistency possibly derived from long branch attraction in the mtDNA tree. By calculating absolute substitution rates, we found significant rate change (∼4-fold) in the mt genome before and after the diversification of Poaceae in both synonymous and nonsynonymous terms. Furthermore, the rate change was correlated with that of chloroplast genomes in grasses. Our result demonstrates that Illumina sequencing technology is a rapid and efficient approach to obtaining angiosperm mt genome sequences. The parallel episodic evolution of mt and chloroplast

  16. Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

    Science.gov (United States)

    Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

    2017-07-01

    PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.

  17. Comparison of next generation sequencing technologies for transcriptome characterization

    Directory of Open Access Journals (Sweden)

    Soltis Douglas E

    2009-08-01

    Full Text Available Abstract Background We have developed a simulation approach to help determine the optimal mixture of sequencing methods for most complete and cost effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis. Results The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc (http://fgp.huck.psu.edu/NG_Sims/ngsim.pl), an online webtool, which allows users to explore the results of this study by specifying individualized costs and sequencing characteristics. Conclusion NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. In terms of sequence coverage alone, the NG sequencing is a dramatic advance

  18. Next generation sequencing (NGS) technologies and applications

    Energy Technology Data Exchange (ETDEWEB)

    Vuyisich, Momchilo [Los Alamos National Laboratory

    2012-09-11

    NGS technology overview: (1) NGS library preparation - Nucleic acids extraction, Sample quality control, RNA conversion to cDNA, Addition of sequencing adapters, Quality control of library; (2) Sequencing - Clonal amplification of library fragments (except PacBio), Sequencing by synthesis, Data output (reads and quality); and (3) Data analysis - Read mapping, Genome assembly, Gene expression, Operon structure, sRNA discovery, and Epigenetic analyses.

  19. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian Wu

    2012-11-01

    Full Text Available Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362 Mb, and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of chloroplast genomes.
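    The percentage of a reference genome covered by assembled reads, as reported above, is a breadth-of-coverage measure. The sketch below shows one common way to compute it from aligned intervals; it assumes half-open [start, end) coordinates that lie within the reference, the function names are hypothetical, and the example intervals are invented rather than taken from the study.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open [start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous block: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def breadth_of_coverage(intervals, reference_length):
    """Fraction of reference positions covered by at least one interval."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / reference_length


# Invented alignments on a 1,000 bp reference; positions 700-719 stay uncovered.
aligned = [(0, 400), (350, 700), (720, 1000)]
coverage = breadth_of_coverage(aligned, 1000)
```

    Merging before summing matters because overlapping reads would otherwise be counted twice and could push the fraction above 1.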

  20. Assessment of metagenomic assembly using simulated next generation sequencing data

    DEFF Research Database (Denmark)

    Mende, Daniel R; Waller, Alison S; Sunagawa, Shinichi

    2012-01-01

    with platform-specific (Sanger, pyrosequencing, Illumina) base-error models, and simulated metagenomes of differing community complexities. We first evaluated the effect of rigorous quality control on Illumina data. Although quality filtering removed a large proportion of the data, it greatly improved...... the accuracy and contig lengths of resulting assemblies. We then compared the quality-trimmed Illumina assemblies to those from Sanger and pyrosequencing. For the simple community (10 genomes) all sequencing technologies assembled a similar amount and accurately represented the expected functional composition...... the Sanger reads still represented the overall functional composition reasonably well. We further examined the effect of scaffolding of contigs using paired-end Illumina reads. It dramatically increased contig lengths of the simple community and yielded minor improvements to the more complex communities...

  1. Special Issue: Next Generation DNA Sequencing

    Directory of Open Access Journals (Sweden)

    Paul Richardson

    2010-10-01

    Full Text Available Next Generation Sequencing (NGS) refers to technologies that do not rely on traditional dideoxy-nucleotide (Sanger) sequencing where labeled DNA fragments are physically resolved by electrophoresis. These new technologies rely on different strategies, but essentially all of them make use of real-time data collection of a base level incorporation event across a massive number of reactions (on the order of millions versus 96 for capillary electrophoresis, for instance). The major commercial NGS platforms available to researchers are the 454 Genome Sequencer (Roche), the Illumina (formerly Solexa) Genome Analyzer, the SOLiD system (Applied Biosystems/Life Technologies) and the Heliscope (Helicos Corporation). The techniques and different strategies utilized by these platforms are reviewed in a number of the papers in this special issue. These technologies are enabling new applications that take advantage of the massive data produced by this next generation of sequencing instruments. [...

  2. FDA's Activities Supporting Regulatory Application of "Next Gen" Sequencing Technologies.

    Science.gov (United States)

    Wilson, Carolyn A; Simonyan, Vahan

    2014-01-01

    Applications of next-generation sequencing (NGS) technologies require availability and access to an information technology (IT) infrastructure and bioinformatics tools for large amounts of data storage and analyses. The U.S. Food and Drug Administration (FDA) anticipates that the use of NGS data to support regulatory submissions will continue to increase as the scientific and clinical communities become more familiar with the technologies and identify more ways to apply these advanced methods to support development and evaluation of new biomedical products. FDA laboratories are conducting research on different NGS platforms and developing the IT infrastructure and bioinformatics tools needed to enable regulatory evaluation of the technologies and the data sponsors will submit. A High-performance Integrated Virtual Environment, or HIVE, has been launched, and development and refinement continues as a collaborative effort between the FDA and George Washington University to provide the tools to support these needs. The use of a highly parallelized environment facilitated by use of distributed cloud storage and computation has resulted in a platform that is both rapid and responsive to changing scientific needs. The FDA plans to further develop in-house capacity in this area, while also supporting engagement by the external community, by sponsoring an open, public workshop to discuss NGS technologies and data formats standardization, and to promote the adoption of interoperability protocols in September 2014. Next-generation sequencing (NGS) technologies are enabling breakthroughs in how the biomedical community is developing and evaluating medical products. One example is the potential application of this method to the detection and identification of microbial contaminants in biologic products. In order for the U.S. Food and Drug Administration (FDA) to be able to evaluate the utility of this technology, we need to have the information technology infrastructure and

  3. Technology development for gene discovery and full-length sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Marcelo Bento Soares

    2004-07-19

    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  4. Genome Sequence Databases (Overview): Sequencing and Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.
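
The two consensus error rates quoted in this abstract (about 1 error per 2,000 bp for a draft, about 1 per 10,000 bp for a finished genome) can be placed on the Phred quality scale, Q = -10 * log10(p), which is a common way of stating per-base accuracy. The sketch below does that conversion and estimates the expected number of consensus errors. The 5 Mbp genome size and the function names are illustrative assumptions, not values from the record.

```python
import math


def phred_quality(error_rate: float) -> float:
    """Convert a per-base error probability into a Phred quality score."""
    return -10.0 * math.log10(error_rate)


def expected_errors(genome_size_bp: int, error_rate: float) -> float:
    """Expected number of erroneous consensus bases for a genome of this size."""
    return genome_size_bp * error_rate


# Error rates as stated in the abstract.
DRAFT_ERROR_RATE = 1 / 2000
FINISHED_ERROR_RATE = 1 / 10_000

# Hypothetical genome size, chosen only for illustration.
GENOME_SIZE_BP = 5_000_000

for label, rate in (("draft", DRAFT_ERROR_RATE), ("finished", FINISHED_ERROR_RATE)):
    quality = phred_quality(rate)
    errors = expected_errors(GENOME_SIZE_BP, rate)
    print(f"{label}: Q{quality:.1f}, about {errors:.0f} expected errors in {GENOME_SIZE_BP} bp")
```

Under these assumptions, finishing moves the consensus from roughly Q33 to Q40, a five-fold reduction in expected errors.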

  5. SEQUENCING BATCH REACTOR: A PROMISING TECHNOLOGY IN WASTEWATER TREATMENT

    Directory of Open Access Journals (Sweden)

    A. H. Mahvi

    2008-04-01

    Full Text Available Discharge of domestic and industrial wastewater to surface or groundwater is very dangerous to the environment. Therefore, treatment of any kind of wastewater to produce good-quality effluent is necessary, and choosing an effective treatment system is important. The sequencing batch reactor is a modification of the activated sludge process that has been successfully used to treat municipal and industrial wastewater. The process can be applied to nutrient removal and to the treatment of industrial wastewater with high biochemical oxygen demand, wastewater containing toxic materials such as cyanide, copper, chromium, lead and nickel, food industry effluents, landfill leachates and tannery wastewater. Its advantages include a single-tank configuration, small footprint, easy expandability, simple operation and low capital costs. Much research has been conducted on this treatment technology. The authors have conducted some investigations on a modification of the sequencing batch reactor. Their studies resulted in very high percentage removal of biochemical oxygen demand, chemical oxygen demand, total Kjeldahl nitrogen, total nitrogen, total phosphorus and total suspended solids. This paper reviews some of the published works in addition to the experiences of the authors.

  6. Applications and Case Studies of the Next-Generation Sequencing Technologies in Food, Nutrition and Agriculture.

    Science.gov (United States)

    Next-generation sequencing technologies are able to produce high-throughput short sequence reads in a cost-effective fashion. The emergence of these technologies has not only facilitated genome sequencing but also changed the landscape of life sciences. Here I survey their major applications ranging...

  7. Biomolecule Sequencer: Next-Generation DNA Sequencing Technology for In-Flight Environmental Monitoring, Research, and Beyond

    Science.gov (United States)

    Smith, David J.; Burton, Aaron; Castro-Wallace, Sarah; John, Kristen; Stahl, Sarah E.; Dworkin, Jason Peter; Lupisella, Mark L.

    2016-01-01

    On the International Space Station (ISS), technologies capable of rapid microbial identification and disease diagnostics are not currently available. NASA still relies upon sample return for comprehensive, molecular-based sample characterization. Next-generation DNA sequencing is a powerful approach for identifying microorganisms in air, water, and surfaces onboard spacecraft. The Biomolecule Sequencer payload, manifested to SpaceX-9 and scheduled on the Increment 4748 research plan (June 2016), will assess the functionality of a commercially-available next-generation DNA sequencer in the microgravity environment of ISS. The MinION device from Oxford Nanopore Technologies (Oxford, UK) measures picoamp changes in electrical current dependent on nucleotide sequences of the DNA strand migrating through nanopores in the system. The hardware is exceptionally small (9.5 x 3.2 x 1.6 cm), lightweight (120 grams), and powered only by a USB connection. For the ISS technology demonstration, the Biomolecule Sequencer will be powered by a Microsoft Surface Pro3. Ground-prepared samples containing lambda bacteriophage, Escherichia coli, and mouse genomic DNA will be launched and stored frozen on the ISS until experiment initiation. Immediately prior to sequencing, a crew member will collect and thaw frozen DNA samples, connect the sequencer to the Surface Pro3, inject thawed samples into a MinION flow cell, and initiate sequencing. At the completion of the sequencing run, data will be downlinked for ground analysis. Identical, synchronous ground controls will be used for data comparisons to determine sequencer functionality, run-time sequence, current dynamics, and overall accuracy. We will present our latest results from the ISS flight experiment, the first time DNA has ever been sequenced in space, and discuss the many potential applications of the Biomolecule Sequencer for environmental monitoring, medical diagnostics, higher fidelity and more adaptable Space Biology Human

  8. Application of Next-generation Sequencing in Clinical Molecular Diagnostics

    Directory of Open Access Journals (Sweden)

    Morteza Seifi

    2017-05-01

    Full Text Available ABSTRACT Next-generation sequencing (NGS) is the catch-all term used to describe several different modern sequencing technologies that let us sequence nucleic acids much more rapidly and cheaply than the formerly used Sanger sequencing, and as such have revolutionized the study of molecular biology and genomics with excellent resolution and accuracy. Over the past years, many companies and academic institutions have continued to make technological advances that expand NGS applications from research to the clinic. In this review, the performance and technical features of current NGS platforms are described. Furthermore, advances in applying NGS technologies to clinical molecular diagnostics are emphasized. General advantages and disadvantages of each sequencing system are summarized and compared to guide the selection of NGS platforms for specific research aims.

  9. Clinical Application of Picodroplet Digital PCR Technology for Rapid Detection of EGFR T790M in Next-Generation Sequencing Libraries and DNA from Limited Tumor Samples.

    Science.gov (United States)

    Borsu, Laetitia; Intrieri, Julie; Thampi, Linta; Yu, Helena; Riely, Gregory; Nafa, Khedoudja; Chandramohan, Raghu; Ladanyi, Marc; Arcila, Maria E

    2016-11-01

    Although next-generation sequencing (NGS) is a robust technology for comprehensive assessment of EGFR-mutant lung adenocarcinomas with acquired resistance to tyrosine kinase inhibitors, it may not provide sufficiently rapid and sensitive detection of the EGFR T790M mutation, the most clinically relevant resistance biomarker. Here, we describe a digital PCR (dPCR) assay for rapid T790M detection on aliquots of NGS libraries prepared for comprehensive profiling, fully maximizing broad genomic analysis on limited samples. Tumor DNAs from patients with EGFR-mutant lung adenocarcinomas and acquired resistance to epidermal growth factor receptor inhibitors were prepared for Memorial Sloan-Kettering-Integrated Mutation Profiling of Actionable Cancer Targets sequencing, a hybrid capture-based assay interrogating 410 cancer-related genes. Precapture library aliquots were used for rapid EGFR T790M testing by dPCR, and results were compared with NGS and locked nucleic acid-PCR Sanger sequencing (reference high sensitivity method). Seventy resistance samples showed 99% concordance with the reference high sensitivity method in accuracy studies. Input as low as 2.5 ng provided a sensitivity of 1% and improved further with increasing DNA input. dPCR on libraries required less DNA and showed better performance than direct genomic DNA. dPCR on NGS libraries is a robust and rapid approach to EGFR T790M testing, allowing most economical utilization of limited material for comprehensive assessment. The same assay can also be performed directly on any limited DNA source and cell-free DNA. Copyright © 2016 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  10. Applications of Next-Generation Sequencing Technologies to Diagnostic Virology

    Directory of Open Access Journals (Sweden)

    Giorgio Palù

    2011-11-01

    Full Text Available Novel DNA sequencing techniques, referred to as “next-generation” sequencing (NGS), provide high speed and throughput that can produce an enormous volume of sequences with many possible applications in research and diagnostic settings. In this article, we provide an overview of the many applications of NGS in diagnostic virology. NGS techniques have been used for high-throughput whole viral genome sequencing, such as sequencing of new influenza viruses, for detection of viral genome variability and evolution within the host, such as investigation of human immunodeficiency virus and human hepatitis C virus quasispecies, and monitoring of low-abundance antiviral drug-resistance mutations. NGS techniques have been applied to metagenomics-based strategies for the detection of unexpected disease-associated viruses and for the discovery of novel human viruses, including cancer-related viruses. Finally, the human virome in healthy and disease conditions has been described by NGS-based metagenomics.

  11. A machine learning model to determine the accuracy of variant calls in capture-based next generation sequencing.

    Science.gov (United States)

    van den Akker, Jeroen; Mishne, Gilad; Zimmer, Anjali D; Zhou, Alicia Y

    2018-04-17

    Next generation sequencing (NGS) has become a common technology for clinical genetic tests. The quality of NGS calls varies widely and is influenced by features like reference sequence characteristics, read depth, and mapping accuracy. With recent advances in NGS technology and software tools, the majority of variants called using NGS alone are in fact accurate and reliable. However, a small subset of difficult-to-call variants that still require orthogonal confirmation exists. For this reason, many clinical laboratories confirm NGS results using orthogonal technologies such as Sanger sequencing. Here, we report the development of a deterministic machine-learning-based model to differentiate between these two types of variant calls: those that do not require confirmation using an orthogonal technology (high confidence), and those that require additional quality testing (low confidence). This approach allows reliable NGS-based calling in a clinical setting by identifying the few important variant calls that require orthogonal confirmation. We developed and tested the model using a set of 7179 variants identified by a targeted NGS panel and re-tested by Sanger sequencing. The model incorporated several signals of sequence characteristics and call quality to determine if a variant was identified at high or low confidence. The model was tuned to eliminate false positives, defined as variants that were called by NGS but not confirmed by Sanger sequencing. The model achieved very high accuracy: 99.4% (95% confidence interval: +/- 0.03%). It categorized 92.2% (6622/7179) of the variants as high confidence, and 100% of these were confirmed to be present by Sanger sequencing. Among the variants that were categorized as low confidence, defined as NGS calls of low quality that are likely to be artifacts, 92.1% (513/557) were found to be not present by Sanger sequencing. This work shows that NGS data contains sufficient characteristics for a machine-learning-based model to
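
The percentages in this abstract can be rechecked from the counts it reports. The short sketch below does only that arithmetic. The variable names are illustrative, and the final figure (the share of variants that would still go to Sanger sequencing if only low-confidence calls were confirmed) is a derived illustration rather than a number stated in the record.

```python
# Counts as reported in the abstract.
TOTAL_VARIANTS = 7179
HIGH_CONFIDENCE = 6622
LOW_CONFIDENCE_ABSENT_BY_SANGER = 513


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# The low-confidence count follows from the two counts above.
low_confidence = TOTAL_VARIANTS - HIGH_CONFIDENCE

high_confidence_share = percent(HIGH_CONFIDENCE, TOTAL_VARIANTS)
artifact_share = percent(LOW_CONFIDENCE_ABSENT_BY_SANGER, low_confidence)

# Derived illustration: fraction of calls still needing orthogonal
# confirmation if only low-confidence calls are sent to Sanger.
still_confirmed_share = percent(low_confidence, TOTAL_VARIANTS)

print(f"low-confidence calls: {low_confidence}")
print(f"high-confidence share: {high_confidence_share:.1f}%")
print(f"low-confidence calls absent by Sanger: {artifact_share:.1f}%")
print(f"calls still needing confirmation: {still_confirmed_share:.1f}%")
```

The counts are internally consistent: 7179 - 6622 gives the 557 low-confidence calls used as the denominator in the abstract.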

  12. Next-generation sequencing for genetic testing of familial colorectal cancer syndromes.

    Science.gov (United States)

    Simbolo, Michele; Mafficini, Andrea; Agostini, Marco; Pedrazzani, Corrado; Bedin, Chiara; Urso, Emanuele D; Nitti, Donato; Turri, Giona; Scardoni, Maria; Fassan, Matteo; Scarpa, Aldo

    2015-01-01

    Genetic screening in families with high risk to develop colorectal cancer (CRC) prevents incurable disease and permits personalized therapeutic and follow-up strategies. The advancement of next-generation sequencing (NGS) technologies has revolutionized the throughput of DNA sequencing. A series of 16 probands for either familial adenomatous polyposis (FAP; 8 cases) or hereditary nonpolyposis colorectal cancer (HNPCC; 8 cases) were investigated for intragenic mutations in five CRC familial syndromes-associated genes (APC, MUTYH, MLH1, MSH2, MSH6) applying both a custom multigene Ion AmpliSeq NGS panel and conventional Sanger sequencing. Fourteen pathogenic variants were detected in 13/16 FAP/HNPCC probands (81.3 %); one FAP proband presented two co-existing pathogenic variants, one in APC and one in MUTYH. Thirteen of these 14 pathogenic variants were detected by both NGS and Sanger, while one MSH2 mutation (L280FfsX3) was identified only by Sanger sequencing. This is due to a limitation of the NGS approach in resolving sequences close to or within homopolymeric stretches of DNA. To evaluate the performance of our NGS custom panel we assessed its capability to resolve the DNA sequences corresponding to 2225 pathogenic variants reported in the COSMIC database for APC, MUTYH, MLH1, MSH2, MSH6. Our NGS custom panel resolves the sequences where 2108 (94.7 %) of these variants occur. The remaining 117 mutations reside inside or in close proximity to homopolymer stretches; of these 27 (1.2 %) are imprecisely identified by the software but can be resolved by visual inspection of the region, while the remaining 90 variants (4.0 %) are blind spots. In summary, our custom panel would miss 4 % (90/2225) of pathogenic variants that would need a small set of Sanger sequencing reactions to be solved. The multiplex NGS approach has the advantage of analyzing multiple genes in multiple samples simultaneously, requiring only a reduced number of Sanger sequences to resolve
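
The panel-coverage breakdown in this abstract is a simple partition of the 2225 COSMIC variants, and it can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the record; the dictionary keys and function name are illustrative labels.

```python
# Counts as reported in the abstract.
TOTAL_COSMIC_VARIANTS = 2225

panel_breakdown = {
    "resolved by panel": 2108,
    "imprecise, resolved by visual inspection": 27,
    "blind spots (need Sanger)": 90,
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# The three categories should account for every variant.
assert sum(panel_breakdown.values()) == TOTAL_COSMIC_VARIANTS

# Variants in or near homopolymer stretches are the two non-resolved groups.
near_homopolymer = (
    panel_breakdown["imprecise, resolved by visual inspection"]
    + panel_breakdown["blind spots (need Sanger)"]
)

shares = {
    label: round(percent(count, TOTAL_COSMIC_VARIANTS), 1)
    for label, count in panel_breakdown.items()
}

for label, share in shares.items():
    print(f"{label}: {panel_breakdown[label]} ({share} %)")
print(f"in or near homopolymer stretches: {near_homopolymer}")
```

The three groups sum to 2225 and reproduce the 94.7 %, 1.2 % and 4.0 % figures, with 117 variants lying in or near homopolymer stretches.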

  13. The application of the high throughput sequencing technology in the transposable elements.

    Science.gov (United States)

    Liu, Zhen; Xu, Jian-hong

    2015-09-01

    High throughput sequencing technology has dramatically improved the efficiency of DNA sequencing, and decreased the costs to a great extent. Meanwhile, this technology usually has advantages of better specificity, higher sensitivity and accuracy. Therefore, it has been applied to the research on genetic variations, transcriptomics and epigenomics. Recently, this technology has been widely employed in the studies of transposable elements and has achieved fruitful results. In this review, we summarize the application of high throughput sequencing technology in the fields of transposable elements, including the estimation of transposon content, preference of target sites and distribution, insertion polymorphism and population frequency, identification of rare copies, transposon horizontal transfers as well as transposon tagging. We also briefly introduce the major common sequencing strategies and algorithms, their advantages and disadvantages, and the corresponding solutions. Finally, we envision the developing trends of high throughput sequencing technology, especially the third generation sequencing technology, and its application in transposon studies in the future, hopefully providing a comprehensive understanding and reference for related scientific researchers.

  14. The history and advances of reversible terminators used in new generations of sequencing technology.

    Science.gov (United States)

    Chen, Fei; Dong, Mengxing; Ge, Meng; Zhu, Lingxiang; Ren, Lufeng; Liu, Guocheng; Mu, Rong

    2013-02-01

    DNA sequencing using reversible terminators, as one sequencing-by-synthesis strategy, has garnered a great deal of interest due to its widespread application in second-generation high-throughput DNA sequencing technology. In this review, we describe the history of development, classification, and working mechanism of this technology. We also outline the screening strategies for DNA polymerases that accept reversible terminators as substrates during polymerization; in particular, we introduce the "REAP" method developed by us. At the end of this review, we discuss current limitations of this approach and propose potential solutions to extend its application. Copyright © 2013. Production and hosting by Elsevier Ltd.

  15. Treatment of Laboratory Wastewater by Sequence Batch reactor technology

    International Nuclear Information System (INIS)

    Imtiaz, N.; Butt, M.; Khan, R.A.; Saeed, M.T.; Irfan, M.

    2012-01-01

    These studies were conducted on the characterization and treatment of sewage mixed with waste-water from a research and testing laboratory (PCSIR Laboratories Lahore). In this study, the parameters COD, BOD, TSS, etc. of the influent (untreated waste-water) and effluent (treated waste-water) were characterized using the standard methods of examination for water and waste-water. All of the analyzed waste-water parameters were above the National Environmental Quality Standards (NEQS) set at the national level. Treatment of waste-water was carried out by the conventional sequencing batch reactor (SBR) technique, using aeration and settling in the same treatment reactor at laboratory scale. After treatment, COD was reduced by 90-95 %, BOD by 95-97 % and TSS by 96-99 %, and the reclaimed effluent quality was suitable for gardening purposes. (author)

  16. Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants.

    Directory of Open Access Journals (Sweden)

    Jiang Du

    2009-07-01

    Full Text Available The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbleGen), with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs). SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome.) To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mappability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. Our strategy should facilitate the sequencing of

  17. Next generation sequencing in clinical medicine: Challenges and lessons for pathology and biomedical informatics

    Directory of Open Access Journals (Sweden)

    Rama R Gullapalli

    2012-01-01

    Full Text Available The Human Genome Project (HGP) provided the initial draft of mankind's DNA sequence in 2001. The HGP was produced by 23 collaborating laboratories using Sanger sequencing of mapped regions as well as shotgun sequencing techniques in a process that occupied 13 years at a cost of ~$3 billion. Today, Next Generation Sequencing (NGS) techniques represent the next phase in the evolution of DNA sequencing technology at dramatically reduced cost compared to traditional Sanger sequencing. A single laboratory today can sequence the entire human genome in a few days for a few thousand dollars in reagents and staff time. Routine whole exome or even whole genome sequencing of clinical patients is well within the realm of affordability for many academic institutions across the country. This paper reviews current sequencing technology methods and upcoming advancements in sequencing technology as well as challenges associated with data generation, data manipulation and data storage. Implementation of routine NGS data in cancer genomics is discussed along with potential pitfalls in the interpretation of the NGS data. The overarching importance of bioinformatics in the clinical implementation of NGS is emphasized. [7] We also review the issue of physician education, which is also an important consideration for the successful implementation of NGS in the clinical workplace. NGS technologies represent a golden opportunity for the next generation of pathologists to be at the leading edge of the personalized medicine approaches coming our way. Often under-emphasized issues of data access and control as well as potential ethical implications of whole genome NGS sequencing are also discussed. Despite some challenges, it's hard not to be optimistic about the future of personalized genome sequencing and its potential impact on patient care and the advancement of knowledge of human biology and disease in the near future.

  18. DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

    Science.gov (United States)

    Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

    2012-01-01

    DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.

  19. Application of genotyping by sequencing technology to a variety of crop breeding programs.

    Science.gov (United States)

    Kim, Changsoo; Guo, Hui; Kong, Wenqian; Chandnani, Rahul; Shuang, Lan-Shuan; Paterson, Andrew H

    2016-01-01

    Since the Arabidopsis genome was completed, draft sequences or pseudomolecules have been published for more than 100 plant genomes including green algae, in large part due to advances in sequencing technologies. Advanced DNA sequencing technologies have also conferred new opportunities for high-throughput low-cost crop genotyping, based on single-nucleotide polymorphisms (SNPs). However, a recurring complication in crop genotyping that differs from other taxa is a higher level of DNA sequence duplication, noting that all angiosperms are thought to have polyploidy in their evolutionary history. In the current article, we briefly review current genotyping methods using next-generation sequencing (NGS) technologies. We also explore case studies of genotyping-by-sequencing (GBS) applications to several crops differing in genome size, organization and breeding system (paleopolyploids, neo-allopolyploids, neo-autopolyploids). GBS typically shows good results when it is applied to an inbred diploid species with a well-established reference genome. However, we have also made some progress toward GBS of outcrossing species lacking reference genomes and of polyploid populations, which still need much improvement. Regardless of some limitations, low-cost and multiplexed genotyping offered by GBS will be beneficial to breed superior cultivars in many crop species. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  20. The role of the physician: Eugene Sanger and a standard of care at the Elmira prison camp.

    Science.gov (United States)

    Waggoner, Jesse

    2008-01-01

    The conduct of American military physicians in prisoner of war (POW) camps has been called into question by the abuse scandals at Abu Ghraib and Guantánamo Bay. This essay explores the experiences of the first U.S. military physicians to confront POW patients in large numbers-events that occurred during the American Civil War. While POWs received sub-standard care in camps north and south, the war also saw the issuance of the first document to outline the rights of POWs. This ambivalence toward the proper care and treatment of the POW is evident in the career of Dr. Eugene Sanger, the first Union surgeon at the prison camp in Elmira, New York. Sanger demonstrated both concern about the sanitary condition of the camp and pride in the deaths of POWs as furthering the overall war aims. His cruelty attracted some censure, but Sanger never faced disciplinary action. He was honorably discharged and went on to become the Surgeon General of his home state. This article places his actions at Elmira in the context of medical ethics, Army orders, and Northern opinion in 1864, and it will argue that the lack of Federal response to Eugene Sanger's poor record while serving at the prison set a precedent for inferior medical care of POWs by American military physicians.

  1. A combination of LongSAGE with Solexa sequencing is well suited to explore the depth and the complexity of transcriptome

    Directory of Open Access Journals (Sweden)

    Scoté-Blachon Céline

    2008-09-01

    Full Text Available Abstract Background "Open" transcriptome analysis methods make it possible to study gene expression without a priori knowledge of the transcript sequences. As of now, SAGE (Serial Analysis of Gene Expression), LongSAGE and MPSS (Massively Parallel Signature Sequencing) are the most widely used methods for "open" transcriptome analysis. Both LongSAGE and MPSS rely on the isolation of 21 bp tag sequences from each transcript. In contrast to LongSAGE, the high throughput sequencing method used in MPSS enables the rapid sequencing of very large libraries containing several million tags, allowing deep transcriptome analysis. However, a bias in the complexity of the transcriptome representation obtained by MPSS was recently uncovered. Results In order to make a deep analysis of the mouse hypothalamus transcriptome while avoiding the limitation introduced by MPSS, we combined LongSAGE with the Solexa sequencing technology and obtained a library of more than 11 million tags. We then compared it to a LongSAGE library of mouse hypothalamus sequenced with the Sanger method. Conclusion We found that Solexa sequencing technology combined with LongSAGE is perfectly suited for deep transcriptome analysis. In contrast to MPSS, it gives a complex representation of the transcriptome that is as reliable as a LongSAGE library sequenced by the Sanger method.

  2. Stepwise threshold clustering: a new method for genotyping MHC loci using next-generation sequencing technology.

    Directory of Open Access Journals (Sweden)

    William E Stutz

    Full Text Available Genes of the vertebrate major histocompatibility complex (MHC) are of great interest to biologists because of their important role in immunity and disease, and their extremely high levels of genetic diversity. Next generation sequencing (NGS) technologies are quickly becoming the method of choice for high-throughput genotyping of multi-locus templates like MHC in non-model organisms. Previous approaches to genotyping MHC genes using NGS technologies suffer from two problems: (1) a "gray zone" where low frequency alleles and high frequency artifacts can be difficult to disentangle and (2) a similar sequence problem, where very similar alleles can be difficult to distinguish as two distinct alleles. Here we present a new method for genotyping MHC loci, Stepwise Threshold Clustering (STC), that addresses these problems by taking full advantage of the increase in sequence data provided by NGS technologies. Unlike previous approaches for genotyping MHC with NGS data that attempt to classify individual sequences as alleles or artifacts, STC uses a quasi-Dirichlet clustering algorithm to cluster similar sequences at increasing levels of sequence similarity. By applying frequency and similarity based criteria to clusters rather than individual sequences, STC is able to successfully identify clusters of sequences that correspond to individual or similar alleles present in the genomes of individual samples. Furthermore, STC does not require duplicate runs of all samples, increasing the number of samples that can be genotyped in a given project. We show how the STC method works using a single sample library. We then apply STC to 295 threespine stickleback (Gasterosteus aculeatus) samples from four populations and show that neighboring populations differ significantly in MHC allele pools. We show that STC is a reliable, accurate, efficient, and flexible method for genotyping MHC that will be of use to biologists interested in a variety of downstream applications.

  3. Students' Guided Reinvention of Definition of Limit of a Sequence with Interactive Technology

    Science.gov (United States)

    Flores, Alfinio; Park, Jungeun

    2016-01-01

    In a course emphasizing interactive technology, 19 students, including 18 mathematics education majors, mostly in their first year, reinvented the definition of limit of a sequence while working in small cooperative groups. The class spent four sessions of 75 minutes each on a cyclical process of guided reinvention of the definition of limit of a…

  4. A complete mitochondrial genome sequence from a mesolithic wild aurochs (Bos primigenius).

    LENUS (Irish Health Repository)

    Edwards, Ceiridwen J

    2010-01-01

    BACKGROUND: The derivation of domestic cattle from the extinct wild aurochs (Bos primigenius) has been well-documented by archaeological and genetic studies. Genetic studies point towards the Neolithic Near East as the centre of origin for Bos taurus, with some lines of evidence suggesting possible, albeit rare, genetic contributions from locally domesticated wild aurochsen across Eurasia. Inferences from these investigations have been based largely on the analysis of partial mitochondrial DNA sequences generated from modern animals, with limited sequence data from ancient aurochsen samples. Recent developments in DNA sequencing technologies, however, are affording new opportunities for the examination of genetic material retrieved from extinct species, providing new insight into their evolutionary history. Here we present DNA sequence analysis of the first complete mitochondrial genome (16,338 base pairs) from an archaeologically-verified and exceptionally-well preserved aurochs bone sample. METHODOLOGY: DNA extracts were generated from an aurochs humerus bone sample recovered from a cave site located in Derbyshire, England and radiocarbon-dated to 6,738 ± 68 calibrated years before present. These extracts were prepared for both Sanger and next generation DNA sequencing technologies (Illumina Genome Analyzer). In total, 289.9 megabases (22.48%) of the post-filtered DNA sequences generated using the Illumina Genome Analyzer from this sample mapped with confidence to the bovine genome. A consensus B. primigenius mitochondrial genome sequence was constructed and was analysed alongside all available complete bovine mitochondrial genome sequences. CONCLUSIONS: For all nucleotide positions where both Sanger and Illumina Genome Analyzer sequencing methods gave high-confidence calls, no discrepancies were observed. Sequence analysis reveals evidence of heteroplasmy in this sample and places this mitochondrial genome sequence securely within a previously identified

  5. Transcriptome analysis of carnation (Dianthus caryophyllus L.) based on next-generation sequencing technology.

    Science.gov (United States)

    Tanase, Koji; Nishitani, Chikako; Hirakawa, Hideki; Isobe, Sachiko; Tabata, Satoshi; Ohmiya, Akemi; Onozaki, Takashi

    2012-07-02

    Carnation (Dianthus caryophyllus L.), in the family Caryophyllaceae, can be found in a wide range of colors and is a model system for studies of flower senescence. In addition, it is one of the most important flowers in the global floriculture industry. However, few genomics resources, such as sequences and markers, are available for carnation or other members of the Caryophyllaceae. To increase our understanding of the genetic control of important characters in carnation, we generated an expressed sequence tag (EST) database for a carnation cultivar important in horticulture by high-throughput sequencing using 454 pyrosequencing technology. We constructed a normalized cDNA library and a 3'-UTR library of carnation, obtaining a total of 1,162,126 high-quality reads. These reads were assembled into 300,740 unigenes consisting of 37,844 contigs and 262,896 singlets. The contigs were searched against an Arabidopsis sequence database, and 61.8% (23,380) of them had at least one BLASTX hit. These contigs were also annotated with Gene Ontology (GO) and were found to cover a broad range of GO categories. Furthermore, we identified 17,362 potential simple sequence repeats (SSRs) in 14,291 of the unigenes. We focused on gene discovery in the areas of flower color and ethylene biosynthesis. Transcripts were identified for almost every gene involved in flower chlorophyll and carotenoid metabolism and in anthocyanin biosynthesis. Transcripts were also identified for every step in the ethylene biosynthesis pathway. We present the first large-scale sequence data set for carnation, generated using next-generation sequencing technology. The large EST database generated from these sequences is an informative resource for identifying genes involved in various biological processes in carnation and provides an EST resource for understanding the genetic diversity of this plant.

  6. Transcriptome analysis of carnation (Dianthus caryophyllus L.) based on next-generation sequencing technology

    Directory of Open Access Journals (Sweden)

    Tanase Koji

    2012-07-01

    Full Text Available Abstract Background Carnation (Dianthus caryophyllus L.), in the family Caryophyllaceae, can be found in a wide range of colors and is a model system for studies of flower senescence. In addition, it is one of the most important flowers in the global floriculture industry. However, few genomics resources, such as sequences and markers, are available for carnation or other members of the Caryophyllaceae. To increase our understanding of the genetic control of important characters in carnation, we generated an expressed sequence tag (EST) database for a carnation cultivar important in horticulture by high-throughput sequencing using 454 pyrosequencing technology. Results We constructed a normalized cDNA library and a 3'-UTR library of carnation, obtaining a total of 1,162,126 high-quality reads. These reads were assembled into 300,740 unigenes consisting of 37,844 contigs and 262,896 singlets. The contigs were searched against an Arabidopsis sequence database, and 61.8% (23,380) of them had at least one BLASTX hit. These contigs were also annotated with Gene Ontology (GO) and were found to cover a broad range of GO categories. Furthermore, we identified 17,362 potential simple sequence repeats (SSRs) in 14,291 of the unigenes. We focused on gene discovery in the areas of flower color and ethylene biosynthesis. Transcripts were identified for almost every gene involved in flower chlorophyll and carotenoid metabolism and in anthocyanin biosynthesis. Transcripts were also identified for every step in the ethylene biosynthesis pathway. Conclusions We present the first large-scale sequence data set for carnation, generated using next-generation sequencing technology. The large EST database generated from these sequences is an informative resource for identifying genes involved in various biological processes in carnation and provides an EST resource for understanding the genetic diversity of this plant.

  7. Improved Efficiency and Reliability of NGS Amplicon Sequencing Data Analysis for Genetic Diagnostic Procedures Using AGSA Software

    Directory of Open Access Journals (Sweden)

    Axel Poulet

    2016-01-01

    Full Text Available Screening for BRCA mutations in women with familial risk of breast or ovarian cancer is an ideal situation for high-throughput sequencing, providing large amounts of low cost data. However, 454 (Roche) and Ion Torrent (Thermo Fisher) technologies produce homopolymer-associated indel errors, complicating their use in routine diagnostics. We developed software, named AGSA, which helps to detect false positive mutations in homopolymeric sequences. Seventy-two familial breast cancer cases were analysed in parallel by amplicon 454 pyrosequencing and Sanger dideoxy sequencing for genetic variations of the BRCA genes. All 565 variants detected by dideoxy sequencing were also detected by pyrosequencing. Furthermore, pyrosequencing detected 42 variants that were missed by the Sanger technique. Six amplicons contained homopolymer tracts in the coding sequence that were systematically misread by the software supplied by Roche. Read data plotted as histograms by AGSA software aided the analysis considerably and allowed validation of the majority of homopolymers. As an optimisation, an additional 250 patients were analysed using microfluidic amplification of regions of interest (Access Array, Fluidigm) of the BRCA genes, followed by 454 sequencing and AGSA analysis. AGSA complements a complete line of high-throughput diagnostic sequence analysis, reducing time and costs while increasing reliability, notably for homopolymer tracts.

  8. Comparing microarrays and next-generation sequencing technologies for microbial ecology research.

    Science.gov (United States)

    Roh, Seong Woon; Abell, Guy C J; Kim, Kyoung-Ho; Nam, Young-Do; Bae, Jin-Woo

    2010-06-01

    Recent advances in molecular biology have resulted in the application of DNA microarrays and next-generation sequencing (NGS) technologies to the field of microbial ecology. This review aims to examine the strengths and weaknesses of each of the methodologies, including depth and ease of analysis, throughput and cost-effectiveness. It also intends to highlight the optimal application of each of the individual technologies toward the study of a particular environment and identify potential synergies between the two main technologies, whereby both sample number and coverage can be maximized. We suggest that the efficient use of microarray and NGS technologies will allow researchers to advance the field of microbial ecology, and importantly, improve our understanding of the role of microorganisms in their various environments.

  9. [Application of next-generation semiconductor sequencing technologies in genetic diagnosis of inherited cardiomyopathies].

    Science.gov (United States)

    Zhao, Yue; Zhang, Hong; Xia, Xue-shan

    2015-07-01

    Inherited cardiomyopathy is the most common hereditary cardiac disease. It also causes a significant proportion of sudden cardiac deaths in young adults and athletes. So far, approximately one hundred genes have been reported to be involved in cardiomyopathies through different mechanisms. Therefore, the identification of the genetic basis and disease mechanisms of cardiomyopathies is important for establishing a clinical diagnosis and genetic testing. The next-generation semiconductor sequencing (NGSS) technology platform is a high-throughput sequencer capable of analyzing clinically derived genomes with high productivity, sensitivity and specificity. It was launched in 2010 by Life Technologies (USA), and it is based on a high density semiconductor chip, which is covered with tens of thousands of wells. NGSS has been successfully used in candidate gene mutation screening to identify hereditary disease. In this review, we summarize these genetic variations, the challenges and applications of NGSS in inherited cardiomyopathy, and its value in disease diagnosis, prevention and treatment.

  10. Read length and repeat resolution: exploring prokaryote genomes using next-generation sequencing technologies.

    Directory of Open Access Journals (Sweden)

    Matt J Cahill

    Full Text Available BACKGROUND: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. METHODOLOGY/PRINCIPAL FINDINGS: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. CONCLUSIONS: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length.
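    The dependence of repeat resolution on read length can be illustrated with a toy calculation. The sketch below is not the authors' repeat-assembly model; it only assumes that a sequence of length k occurring at more than one genome position cannot be placed uniquely by a read of that length. The function name `repeated_kmers` and the example sequence are hypothetical.

```python
from collections import Counter

def repeated_kmers(genome, k):
    """Count distinct k-mers that occur at more than one position in genome.

    Toy proxy only: a read of length k that falls entirely inside such a
    k-mer cannot be assigned to a unique location.
    """
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    return sum(1 for n in counts.values() if n > 1)

# Hypothetical mini-genome containing one 4-nt repeat ("ACGT").
toy_genome = "ACGTACGTTT"
short_read_ambiguities = repeated_kmers(toy_genome, 4)
longer_read_ambiguities = repeated_kmers(toy_genome, 5)
```

    In this toy sequence the repeat is ambiguous at k = 4 but no longer at k = 5, which mirrors the general point that longer reads leave fewer unresolvable repeats.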

  11. Read length and repeat resolution: Exploring prokaryote genomes using next-generation sequencing technologies

    KAUST Repository

    Cahill, Matt J.

    2010-07-12

    Background: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. Methodology/Principal Findings: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. Conclusions: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length. 2010 Cahill et al.

  12. Read length and repeat resolution: Exploring prokaryote genomes using next-generation sequencing technologies

    KAUST Repository

    Cahill, Matt J.; Köser, Claudio U.; Ross, Nicholas E.; Archer, John A.C.

    2010-01-01

    Background: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. Methodology/Principal Findings: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. Conclusions: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length. 2010 Cahill et al.

  13. Evaluation of a pooled strategy for high-throughput sequencing of cosmid clones from metagenomic libraries.

    Science.gov (United States)

    Lam, Kathy N; Hall, Michael W; Engel, Katja; Vey, Gregory; Cheng, Jiujun; Neufeld, Josh D; Charles, Trevor C

    2014-01-01

    High-throughput sequencing methods have been instrumental in the growing field of metagenomics, with technological improvements enabling greater throughput at decreased costs. Nonetheless, the economy of high-throughput sequencing cannot be fully leveraged in the subdiscipline of functional metagenomics. In this area of research, environmental DNA is typically cloned to generate large-insert libraries from which individual clones are isolated, based on specific activities of interest. Sequence data are required for complete characterization of such clones, but the sequencing of a large set of clones requires individual barcode-based sample preparation; this can become costly, as the cost of clone barcoding scales linearly with the number of clones processed, and thus sequencing a large number of metagenomic clones often remains cost-prohibitive. We investigated a hybrid Sanger/Illumina pooled sequencing strategy that omits barcoding altogether, and we evaluated this strategy by comparing the pooled sequencing results to reference sequence data obtained from traditional barcode-based sequencing of the same set of clones. Using identity and coverage metrics in our evaluation, we show that pooled sequencing can generate high-quality sequence data, without producing problematic chimeras. Though caveats of a pooled strategy exist and further optimization of the method is required to improve recovery of complete clone sequences and to avoid circumstances that generate unrecoverable clone sequences, our results demonstrate that pooled sequencing represents an effective and low-cost alternative for sequencing large sets of metagenomic clones.

  14. High-throughput sequence alignment using Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Trapnell Cole

    2007-12-01

    Full Text Available Abstract Background The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. Results This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs) in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA) from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. Conclusion MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU.

  15. DSAP: deep-sequencing small RNA analysis pipeline.

    Science.gov (United States)

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
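    Steps (i) and (ii) of the workflow described above can be sketched in a few lines. This is a simplified illustration, not DSAP's implementation: the adaptor sequence, the function names, and the rule of discarding tags made of a single repeated nucleotide are all assumptions chosen for the example. The input follows the tab-delimited "tag, copy number" layout mentioned in the abstract.

```python
from collections import Counter

# Hypothetical 3' adaptor sequence, used only for this illustration.
ADAPTOR = "TCGTATGCCG"

def clean_tag(tag, adaptor=ADAPTOR):
    """Cleanup step: cut the tag at the first adaptor occurrence and
    discard tags that are empty or consist of one repeated nucleotide
    (for example poly-A or poly-N)."""
    idx = tag.find(adaptor)
    if idx != -1:
        tag = tag[:idx]
    if not tag or len(set(tag)) == 1:
        return None
    return tag

def cluster_tags(lines):
    """Clustering step: merge cleaned tags that are identical and sum
    their copy numbers. Each input line is 'TAG<TAB>COUNT'."""
    counts = Counter()
    for line in lines:
        tag, n = line.rstrip("\n").split("\t")
        cleaned = clean_tag(tag.upper())
        if cleaned is not None:
            counts[cleaned] += int(n)
    return counts

example = [
    "ACGTTGCATCGTATGCCG\t5",  # tag followed by the assumed adaptor
    "ACGTTGCA\t3",            # same tag, already clean
    "AAAAAAAA\t10",           # poly-A, discarded
    "NNNN\t2",                # poly-N, discarded
]
clusters = cluster_tags(example)
```

    Here the two copies of the same tag collapse into a single cluster with a combined count, which is the kind of table the later matching steps against Rfam and miRBase would start from.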

  16. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences

    Directory of Open Access Journals (Sweden)

    Alessandra Traini

    2013-01-01

    Full Text Available Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  17. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences.

    Science.gov (United States)

    Traini, Alessandra; Iorizzo, Massimo; Mann, Harpartap; Bradeen, James M; Carputo, Domenico; Frusciante, Luigi; Chiusano, Maria Luisa

    2013-01-01

    Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  18. The utility of Next Generation Sequencing for molecular diagnostics in Rett syndrome.

    Science.gov (United States)

    Vidal, Silvia; Brandi, Núria; Pacheco, Paola; Gerotina, Edgar; Blasco, Laura; Trotta, Jean-Rémi; Derdak, Sophia; Del Mar O'Callaghan, Maria; Garcia-Cazorla, Àngels; Pineda, Mercè; Armstrong, Judith

    2017-09-25

    Rett syndrome (RTT) is an early-onset neurodevelopmental disorder that almost exclusively affects girls and is totally disabling. Three genes have been identified that cause RTT: MECP2, CDKL5 and FOXG1. However, the etiology of some RTT patients still remains unknown. Recently, next generation sequencing (NGS) has promoted genetic diagnoses because of the quickness and affordability of the method. To evaluate the usefulness of NGS in genetic diagnosis, we present the genetic study of RTT-like patients using different techniques based on this technology. We studied 1577 patients with RTT-like clinical diagnoses and reviewed patients who were previously studied and thought to have RTT genes by Sanger sequencing. Genetically, 477 of 1577 patients with a RTT-like suspicion have been diagnosed. Positive results were found in 30% by Sanger sequencing, 23% with a custom panel, 24% with a commercial panel and 32% with whole exome sequencing. A genetic study using NGS allows the study of a larger number of genes associated with RTT-like symptoms simultaneously, providing genetic study of a wider group of patients as well as significantly reducing the response time and cost of the study.

  19. Experience of targeted Usher exome sequencing as a clinical test

    Science.gov (United States)

    Besnard, Thomas; García-García, Gema; Baux, David; Vaché, Christel; Faugère, Valérie; Larrieu, Lise; Léonard, Susana; Millan, Jose M; Malcolm, Sue; Claustres, Mireille; Roux, Anne-Françoise

    2014-01-01

    We show that massively parallel targeted sequencing of 19 genes provides a new and reliable strategy for molecular diagnosis of Usher syndrome (USH) and nonsyndromic deafness, particularly appropriate for these disorders characterized by a high clinical and genetic heterogeneity and a complex structure of several of the genes involved. A series of 71 patients including Usher patients previously screened by Sanger sequencing plus newly referred patients was studied. Ninety-eight percent of the variants previously identified by Sanger sequencing were found by next-generation sequencing (NGS). NGS proved to be efficient as it offers analysis of all relevant genes which is laborious to reach with Sanger sequencing. Among the 13 newly referred Usher patients, both mutations in the same gene were identified in 77% of cases (10 patients) and one candidate pathogenic variant in two additional patients. This work can be considered as pilot for implementing NGS for genetically heterogeneous diseases in clinical service. PMID:24498627

  20. Academic performance in a pharmacotherapeutics course sequence taught synchronously on two campuses using distance education technology.

    Science.gov (United States)

    Steinberg, Michael; Morin, Anna K

    2011-10-10

    To compare the academic performance of campus-based students in a pharmacotherapeutics course with that of students at a distant campus taught via synchronous teleconferencing. Examination scores and final course grades for campus-based and distant students completing the case-based pharmacotherapeutics course sequence over a 5-year period were collected and analyzed. The mean examination scores and final course grades were not significantly different between students on the 2 campuses. The use of synchronous distance education technology to teach students does not affect students' academic performance when used in an active-learning, case-based pharmacotherapeutics course.

  1. A genome-wide analysis of lentivector integration sites using targeted sequence capture and next generation sequencing technology.

    Science.gov (United States)

    Ustek, Duran; Sirma, Sema; Gumus, Ergun; Arikan, Muzaffer; Cakiris, Aris; Abaci, Neslihan; Mathew, Jaicy; Emrence, Zeliha; Azakli, Hulya; Cosan, Fulya; Cakar, Atilla; Parlak, Mahmut; Kursun, Olcay

    2012-10-01

    One application of next-generation sequencing (NGS) is the targeted resequencing of genes of interest, which has not been used in viral integration site analysis of gene therapy applications. Here, we combined a targeted sequence capture array and next generation sequencing to address the whole genome profiling of viral integration sites. Human 293T and K562 cells were transduced with an HIV-1 derived vector. Custom-made DNA probe sets targeting the pLVTHM vector were used to capture lentiviral vector/human genome junctions. The captured DNA was sequenced using the GS FLX platform. Seven thousand four hundred and eighty-four human genome sequences flanking the long terminal repeats (LTR) of pLVTHM fragment sequences matched with an identity of at least 98% and minimum 50 bp criteria in both cells. In total, 203 unique integration sites were identified. The integrations in both cell lines were totally distant from the CpG islands and from the transcription start sites and preferentially located in introns. A comparison between the two cell lines showed that the lentiviral-transduced DNA does not have the same preferred regions in the two different cell lines. Copyright © 2012 Elsevier B.V. All rights reserved.

  2. A practical comparison of de novo genome assembly software tools for next-generation sequencing technologies.

    Directory of Open Access Journals (Sweden)

    Wenyu Zhang

    Full Text Available The advent of next-generation sequencing technologies has been accompanied by the development of many whole-genome sequence assembly methods and software tools, especially for de novo fragment assembly. Because little is known about the applicability and performance of these tools, choosing a suitable assembler is difficult. Here, we provide information on the adaptivity of each program and, above all, compare the performance of eight distinct tools against eight groups of simulated datasets from the Solexa sequencing platform. Considering computational time, maximum random access memory (RAM) occupancy, and assembly accuracy and integrity, our study indicates that string-based assemblers and overlap-layout-consensus (OLC) assemblers are well suited for very short reads and for longer reads of small genomes, respectively. For large datasets of more than a hundred million short reads, De Bruijn graph-based assemblers would be more appropriate. In terms of software implementation, string-based assemblers are superior to graph-based ones, of which SOAPdenovo requires a complex configuration file. Our comparison study will assist researchers in selecting a well-suited assembler and offers essential information for improving existing assemblers or developing novel ones.
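    As background for the De Bruijn graph-based assemblers mentioned above, the following Python sketch shows the basic graph construction only. It is a toy illustration, not the method of any tool compared in the study: the function name `build_de_bruijn` and the example reads are made up, and real assemblers add error correction, graph simplification, and contig extraction.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a De Bruijn graph from reads.

    Each k-mer contributes one edge from its (k-1)-mer prefix to its
    (k-1)-mer suffix. Repeated k-mers produce repeated edges.
    """
    graph = defaultdict(list)
    for read in reads:
        # Reads shorter than k yield an empty range and add nothing.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    toy_reads = ["ACGTC", "CGTCA"]  # two overlapping toy reads
    for node, successors in sorted(build_de_bruijn(toy_reads, k=3).items()):
        print(node, "->", successors)
```

    Because the graph is keyed by k-mers rather than by reads, its size grows with the number of distinct k-mers instead of the number of reads, which is one commonly cited reason this approach suits very large short-read datasets.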

  3. Characterizing ncRNAs in human pathogenic protists using high-throughput sequencing technology

    Directory of Open Access Journals (Sweden)

    Lesley Joan Collins

    2011-12-01

    Full Text Available ncRNAs are key genes in many human diseases including cancer and viral infection, as well as providing critical functions in pathogenic organisms such as fungi, bacteria, viruses and protists. Until now the identification and characterization of ncRNAs associated with disease has been slow or inaccurate requiring many years of testing to understand complicated RNA and protein gene relationships. High-throughput sequencing now offers the opportunity to characterize miRNAs, siRNAs, snoRNAs and long ncRNAs on a genomic scale making it faster and easier to clarify how these ncRNAs contribute to the disease state. However, this technology is still relatively new, and ncRNA discovery is not an application of high priority for streamlined bioinformatics. Here we summarize background concepts and practical approaches for ncRNA analysis using high-throughput sequencing, and how it relates to understanding human disease. As a case study, we focus on the parasitic protists Giardia lamblia and Trichomonas vaginalis, where large evolutionary distance has meant difficulties in comparing ncRNAs with those from model eukaryotes. A combination of biological, computational and sequencing approaches has enabled easier classification of ncRNA classes such as snoRNAs, but has also aided the identification of novel classes. It is hoped that a higher level of understanding of ncRNA expression and interaction may aid in the development of less harsh treatment for protist-based diseases.

  4. Characterizing ncRNAs in Human Pathogenic Protists Using High-Throughput Sequencing Technology

    Science.gov (United States)

    Collins, Lesley Joan

    2011-01-01

    ncRNAs are key genes in many human diseases including cancer and viral infection, as well as providing critical functions in pathogenic organisms such as fungi, bacteria, viruses, and protists. Until now the identification and characterization of ncRNAs associated with disease has been slow or inaccurate requiring many years of testing to understand complicated RNA and protein gene relationships. High-throughput sequencing now offers the opportunity to characterize miRNAs, siRNAs, small nucleolar RNAs (snoRNAs), and long ncRNAs on a genomic scale, making it faster and easier to clarify how these ncRNAs contribute to the disease state. However, this technology is still relatively new, and ncRNA discovery is not an application of high priority for streamlined bioinformatics. Here we summarize background concepts and practical approaches for ncRNA analysis using high-throughput sequencing, and how it relates to understanding human disease. As a case study, we focus on the parasitic protists Giardia lamblia and Trichomonas vaginalis, where large evolutionary distance has meant difficulties in comparing ncRNAs with those from model eukaryotes. A combination of biological, computational, and sequencing approaches has enabled easier classification of ncRNA classes such as snoRNAs, but has also aided the identification of novel classes. It is hoped that a higher level of understanding of ncRNA expression and interaction may aid in the development of less harsh treatment for protist-based diseases. PMID:22303390

  5. SNP discovery in the bovine milk transcriptome using RNA-Seq technology.

    Science.gov (United States)

    Cánovas, Angela; Rincon, Gonzalo; Islas-Trejo, Alma; Wickramasinghe, Saumya; Medrano, Juan F

    2010-12-01

    High-throughput sequencing of RNA (RNA-Seq) was developed primarily to analyze global gene expression in different tissues. However, it also is an efficient way to discover coding SNPs. The objective of this study was to perform a SNP discovery analysis in the milk transcriptome using RNA-Seq. Seven milk samples from Holstein cows were analyzed by sequencing cDNAs using the Illumina Genome Analyzer system. We detected 19,175 genes expressed in milk samples corresponding to approximately 70% of the total number of genes analyzed. The SNP detection analysis revealed 100,734 SNPs in Holstein samples, and a large number of those corresponded to differences between the Holstein breed and the Hereford bovine genome assembly Btau4.0. The number of polymorphic SNPs within Holstein cows was 33,045. The accuracy of RNA-Seq SNP discovery was tested by comparing SNPs detected in a set of 42 candidate genes expressed in milk that had been resequenced earlier using Sanger sequencing technology. Seventy of 86 SNPs were detected using both RNA-Seq and Sanger sequencing technologies. The KASPar Genotyping System was used to validate unique SNPs found by RNA-Seq but not observed by Sanger technology. Our results confirm that analyzing the transcriptome using RNA-Seq technology is an efficient and cost-effective method to identify SNPs in transcribed regions. This study creates guidelines to maximize the accuracy of SNP discovery and prevention of false-positive SNP detection, and provides more than 33,000 SNPs located in coding regions of genes expressed during lactation that can be used to develop genotyping platforms to perform marker-trait association studies in Holstein cattle.

  6. Targeted amplicon sequencing (TAS): a scalable next-gen approach to multilocus, multitaxa phylogenetics.

    Science.gov (United States)

    Bybee, Seth M; Bracken-Grissom, Heather; Haynes, Benjamin D; Hermansen, Russell A; Byers, Robert L; Clement, Mark J; Udall, Joshua A; Wilcox, Edward R; Crandall, Keith A

    2011-01-01

    Next-gen sequencing technologies have revolutionized data collection in genetic studies and advanced genome biology to novel frontiers. However, to date, next-gen technologies have been used principally for whole genome sequencing and transcriptome sequencing. Yet many questions in population genetics and systematics rely on sequencing specific genes of known function or diversity levels. Here, we describe a targeted amplicon sequencing (TAS) approach capitalizing on next-gen capacity to sequence large numbers of targeted gene regions from a large number of samples. Our TAS approach is easily scalable, simple in execution, neither time- nor labor-intensive, relatively inexpensive, and can be applied to a broad diversity of organisms and/or genes. Our TAS approach includes a bioinformatic application, BarcodeCrucher, to take raw next-gen sequence reads, perform quality control checks, and convert the data into FASTA format organized by gene and sample, ready for phylogenetic analyses. We demonstrate our approach by sequencing targeted genes of known phylogenetic utility to estimate a phylogeny for the Pancrustacea. We generated data from 44 taxa using 68 different 10-bp multiplexing identifiers. The overall quality of data produced was robust and was informative for phylogeny estimation. The potential for this method to produce copious amounts of data from a single 454 plate (e.g., 325 taxa for 24 loci) significantly reduces sequencing expenses incurred from traditional Sanger sequencing. We further discuss the advantages and disadvantages of this method, while offering suggestions to enhance the approach.
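    The idea of sorting reads by 10-bp multiplexing identifiers and writing them out as FASTA can be sketched in Python as below. This is not the BarcodeCrucher implementation, whose details are not given here. The function names, the barcode table, and the assumption that the identifier sits at the very start of each read and must match exactly are all illustrative.

```python
def demultiplex(reads, barcode_to_sample, barcode_len=10):
    """Group reads by the sample whose barcode prefixes the read.

    Reads whose first `barcode_len` bases are not in the table are
    dropped. The barcode itself is trimmed from the stored sequence.
    """
    by_sample = {}
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            continue  # unknown or damaged barcode
        by_sample.setdefault(sample, []).append(insert)
    return by_sample


def to_fasta(by_sample):
    """Render the grouped reads as FASTA text, ordered by sample name."""
    lines = []
    for sample in sorted(by_sample):
        for n, seq in enumerate(by_sample[sample], start=1):
            lines.append(f">{sample}_{n}")
            lines.append(seq)
    return "\n".join(lines)


if __name__ == "__main__":
    table = {"AAAAAAAAAA": "taxonA", "CCCCCCCCCC": "taxonB"}  # made-up barcodes
    raw = ["AAAAAAAAAAACGT", "CCCCCCCCCCGGTT", "GGGGGGGGGGTTTT"]
    print(to_fasta(demultiplex(raw, table)))
```

    Exact-prefix matching is the simplest possible rule. A production tool would typically also tolerate sequencing errors in the barcode and apply quality filters before writing output.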

  7. Identification of SNP and SSR Markers in Finger Millet Using Next Generation Sequencing Technologies.

    Science.gov (United States)

    Gimode, Davis; Odeny, Damaris A; de Villiers, Etienne P; Wanyonyi, Solomon; Dida, Mathews M; Mneney, Emmarold E; Muchugi, Alice; Machuka, Jesse; de Villiers, Santie M

    2016-01-01

    Finger millet is an important cereal crop in eastern Africa and southern India with excellent grain storage quality and unique ability to thrive in extreme environmental conditions. Since negligible attention has been paid to improving this crop to date, the current study used Next Generation Sequencing (NGS) technologies to develop both Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) markers. Genomic DNA from cultivated finger millet genotypes KNE755 and KNE796 was sequenced using both Roche 454 and Illumina technologies. Non-organelle sequencing reads were assembled into 207 Mbp representing approximately 13% of the finger millet genome. We identified 10,327 SSRs and 23,285 non-homeologous SNPs and tested 101 of each for polymorphism across a diverse set of wild and cultivated finger millet germplasm. For the 49 polymorphic SSRs, the mean polymorphism information content (PIC) was 0.42, ranging from 0.16 to 0.77. We also validated 92 SNP markers, 80 of which were polymorphic with a mean PIC of 0.29 across 30 wild and 59 cultivated accessions. Seventy-six of the 80 SNPs were polymorphic across 30 wild germplasm with a mean PIC of 0.30 while only 22 of the SNP markers showed polymorphism among the 59 cultivated accessions with an average PIC value of 0.15. Genetic diversity analysis using the polymorphic SNP markers revealed two major clusters; one of wild and another of cultivated accessions. Detailed STRUCTURE analysis confirmed this grouping pattern and further revealed 2 sub-populations within wild E. coracana subsp. africana. Both STRUCTURE and genetic diversity analysis assisted with the correct identification of the new germplasm collections. These polymorphic SSR and SNP markers are a significant addition to the existing 82 published SSRs, especially with regard to the previously reported low polymorphism levels in finger millet. 
Our results also reveal an unexploited finger millet genetic resource that can be included in the regional
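
    The polymorphism information content (PIC) values quoted in this abstract can be made concrete with a short Python sketch. The abstract does not say which formula was used, so the sketch assumes the widely used definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2, where p_i are allele frequencies at one marker. The function name `pic` is illustrative.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    `allele_freqs` must sum to 1. Uses the assumed formula
    1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    correction = sum(
        2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2)
    )
    return 1.0 - homozygosity - correction


if __name__ == "__main__":
    print(pic([0.5, 0.5]))  # biallelic marker with equal frequencies
    print(pic([0.9, 0.1]))  # biallelic marker with a rare allele
```

    Under this definition a biallelic marker cannot exceed a PIC of 0.375, which is reached at equal allele frequencies, whereas multi-allelic markers such as SSRs can score higher.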

  8. Identification of SNP and SSR Markers in Finger Millet Using Next Generation Sequencing Technologies.

    Directory of Open Access Journals (Sweden)

    Davis Gimode

    Full Text Available Finger millet is an important cereal crop in eastern Africa and southern India with excellent grain storage quality and unique ability to thrive in extreme environmental conditions. Since negligible attention has been paid to improving this crop to date, the current study used Next Generation Sequencing (NGS) technologies to develop both Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) markers. Genomic DNA from cultivated finger millet genotypes KNE755 and KNE796 was sequenced using both Roche 454 and Illumina technologies. Non-organelle sequencing reads were assembled into 207 Mbp representing approximately 13% of the finger millet genome. We identified 10,327 SSRs and 23,285 non-homeologous SNPs and tested 101 of each for polymorphism across a diverse set of wild and cultivated finger millet germplasm. For the 49 polymorphic SSRs, the mean polymorphism information content (PIC) was 0.42, ranging from 0.16 to 0.77. We also validated 92 SNP markers, 80 of which were polymorphic with a mean PIC of 0.29 across 30 wild and 59 cultivated accessions. Seventy-six of the 80 SNPs were polymorphic across 30 wild germplasm with a mean PIC of 0.30 while only 22 of the SNP markers showed polymorphism among the 59 cultivated accessions with an average PIC value of 0.15. Genetic diversity analysis using the polymorphic SNP markers revealed two major clusters; one of wild and another of cultivated accessions. Detailed STRUCTURE analysis confirmed this grouping pattern and further revealed 2 sub-populations within wild E. coracana subsp. africana. Both STRUCTURE and genetic diversity analysis assisted with the correct identification of the new germplasm collections. These polymorphic SSR and SNP markers are a significant addition to the existing 82 published SSRs, especially with regard to the previously reported low polymorphism levels in finger millet.
Our results also reveal an unexploited finger millet genetic resource that can be included

  9. The first FDA marketing authorizations of next-generation sequencing technology and tests: challenges, solutions and impact for future assays.

    Science.gov (United States)

    Bijwaard, Karen; Dickey, Jennifer S; Kelm, Kellie; Težak, Živana

    2015-01-01

    The rapid emergence and clinical translation of novel high-throughput sequencing technologies created a need to clarify the regulatory pathway for the evaluation and authorization of these unique technologies. Recently, the US FDA authorized for marketing four next generation sequencing (NGS)-based diagnostic devices which consisted of two heritable disease-specific assays, library preparation reagents and a NGS platform that are intended for human germline targeted sequencing from whole blood. These first authorizations can serve as a case study in how different types of NGS-based technology are reviewed by the FDA. In this manuscript we describe challenges associated with the evaluation of these novel technologies and provide an overview of what was reviewed. Besides making validated NGS-based devices available for in vitro diagnostic use, these first authorizations create a regulatory path for similar future instruments and assays.

  10. Automatic start-up system of nuclear reactor based on sequence control technology

    International Nuclear Information System (INIS)

    Zhang Yao; Zhang Dafa; Peng Huaqing

    2009-01-01

    This paper presents a conceptual design of an automatic start-up system for nuclear reactors based on sequence control, intended to address problems in the start-up process such as long operation time, a low level of automatic control, and a high accident rate. The start-up process and its requirements are first analyzed in detail. Then, the principle, architecture, and key technologies of the automatic start-up system are designed and discussed. With the designed system, automatic start-up of the nuclear reactor can be realized, the workload of the operator can be reduced, and the safety and efficiency of the nuclear power plant during start-up can be improved. (authors)

  11. Memory Efficient Sequence Analysis Using Compressed Data Structures (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    Energy Technology Data Exchange (ETDEWEB)

    Simpson, Jared

    2011-10-13

    Wellcome Trust Sanger Institute's Jared Simpson on Memory efficient sequence analysis using compressed data structures at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  12. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    Science.gov (United States)

    Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.

    2015-01-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030

  13. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    Directory of Open Access Journals (Sweden)

    Nathan D. Olson

    2015-03-01

    Full Text Available This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies.

  14. Refining the Results of a Classical SELEX Experiment by Expanding the Sequence Data Set of an Aptamer Pool Selected for Protein A

    OpenAIRE

    Regina Stoltenburg; Beate Strehlitz

    2018-01-01

    New, as yet undiscovered aptamers for Protein A were identified by applying next generation sequencing (NGS) to a previously selected aptamer pool. This pool was obtained in a classical SELEX (Systematic Evolution of Ligands by EXponential enrichment) experiment using the FluMag-SELEX procedure followed by cloning and Sanger sequencing. PA#2/8 was identified as the only Protein A-binding aptamer from the Sanger sequence pool, and was shown to be able to bind intact cells of Staphylococcus aur...

  15. A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies.

    Science.gov (United States)

    Utturkar, Sagar M; Klingeman, Dawn M; Hurt, Richard A; Brown, Steven D

    2017-01-01

    This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches, and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall, and gaps were explained as a cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly, which included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.

  16. Killer Immunoglobulin-Like Receptor Allele Determination Using Next-Generation Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Bercelin Maniangou

    2017-05-01

    Full Text Available The impact of natural killer (NK) cell alloreactivity on hematopoietic stem cell transplantation (HSCT) outcome is still debated due to the complexity of graft parameters, HLA class I environment, the nature of killer cell immunoglobulin-like receptor (KIR)/KIR ligand genetic combinations studied, and KIR+ NK cell repertoire size. KIR genes are known to be polymorphic in terms of gene content, copy number variation, and number of alleles. These allelic polymorphisms may impact both the phenotype and function of KIR+ NK cells. We, therefore, speculate that polymorphisms may alter donor KIR+ NK cell phenotype/function thus modulating post-HSCT KIR+ NK cell alloreactivity. To investigate KIR allele polymorphisms of all KIR genes, we developed a next-generation sequencing (NGS) technology on a MiSeq platform. To ensure the reliability and specificity of our method, genomic DNA from well-characterized cell lines was used; high-resolution KIR typing results obtained were then compared to those previously reported. Two different bioinformatic pipelines were used allowing the attribution of sequencing reads to specific KIR genes and the assignment of KIR alleles for each KIR gene. Our results demonstrated successful long-range KIR gene amplifications of all reference samples using intergenic KIR primers. The alignment of reads to the human genome reference (hg19) using BiRD pipeline or visualization of data using Profiler software demonstrated that all KIR genes were completely sequenced with a sufficient read depth (mean 317× for all loci) and a high percentage of mapping (mean 93% for all loci). Comparison of the high-resolution KIR typing obtained with published data using exome capture resulted in a reported concordance rate of 95% for centromeric and telomeric KIR genes. Overall, our results suggest that NGS can be used to investigate the broad KIR allelic polymorphism.
Hence, these data improve our knowledge, not only on KIR+ NK cell alloreactivity in

  17. Nanopore sequencing technology and tools for genome assembly: computational analysis of the current state, bottlenecks and future directions.

    Science.gov (United States)

    Senol Cali, Damla; Kim, Jeremie S; Ghose, Saugata; Alkan, Can; Mutlu, Onur

    2018-04-02

    Nanopore sequencing technology has the potential to render other sequencing technologies obsolete with its ability to generate long reads and provide portability. However, high error rates of the technology pose a challenge while generating accurate genome assemblies. The tools used for nanopore sequence analysis are of critical importance, as they should overcome the high error rates of the technology. Our goal in this work is to comprehensively analyze current publicly available tools for nanopore sequence analysis to understand their advantages, disadvantages and performance bottlenecks. It is important to understand where the current tools do not perform well to develop better tools. To this end, we (1) analyze the multiple steps and the associated tools in the genome assembly pipeline using nanopore sequence data, and (2) provide guidelines for determining the appropriate tools for each step. Based on our analyses, we make four key observations: (1) the choice of the tool for basecalling plays a critical role in overcoming the high error rates of nanopore sequencing technology. (2) Read-to-read overlap finding tools, GraphMap and Minimap, perform similarly in terms of accuracy. However, Minimap has a lower memory usage, and it is faster than GraphMap. (3) There is a trade-off between accuracy and performance when deciding on the appropriate tool for the assembly step. The fast but less accurate assembler Miniasm can be used for quick initial assembly, and further polishing can be applied on top of it to increase the accuracy, which leads to faster overall assembly. (4) The state-of-the-art polishing tool, Racon, generates high-quality consensus sequences while providing a significant speedup over another polishing tool, Nanopolish. We analyze various combinations of different tools and expose the trade-offs between accuracy, performance, memory usage and scalability. 
We conclude that our observations can guide researchers and practitioners in making conscious

  18. Next-Generation Sequencing Platforms

    Science.gov (United States)

    Mardis, Elaine R.

    2013-06-01

    Automated DNA sequencing instruments embody an elegant interplay among chemistry, engineering, software, and molecular biology and have built upon Sanger's founding discovery of dideoxynucleotide sequencing to perform once-unfathomable tasks. Combined with innovative physical mapping approaches that helped to establish long-range relationships between cloned stretches of genomic DNA, fluorescent DNA sequencers produced reference genome sequences for model organisms and for the reference human genome. New types of sequencing instruments that permit amazing acceleration of data-collection rates for DNA sequencing have been developed. The ability to generate genome-scale data sets is now transforming the nature of biological inquiry. Here, I provide an historical perspective of the field, focusing on the fundamental developments that predated the advent of next-generation sequencing instruments and providing information about how these instruments work, their application to biological research, and the newest types of sequencers that can extract data from single DNA molecules.

  19. Detecting novel genetic mutations in Chinese Usher syndrome families using next-generation sequencing technology.

    Science.gov (United States)

    Qu, Ling-Hui; Jin, Xin; Xu, Hai-Wei; Li, Shi-Ying; Yin, Zheng-Qin

    2015-02-01

    Usher syndrome (USH) is the most common cause of combined blindness and deafness inherited in an autosomal recessive mode. Molecular diagnosis is of great significance in revealing the molecular pathogenesis and aiding the clinical diagnosis of this disease. However, molecular diagnosis remains a challenge due to high phenotypic and genetic heterogeneity in USH. This study explored an approach for detecting disease-causing genetic mutations in candidate genes in five index cases from unrelated USH families based on targeted next-generation sequencing (NGS) technology. Through systematic data analysis using an established bioinformatics pipeline and segregation analysis, 10 pathogenic mutations in the USH disease genes were identified in the five USH families. Six of these mutations were novel: c.4398G > A and EX38-49del in MYO7A, c.988_989delAT in USH1C, c.15104_15105delCA and c.6875_6876insG in USH2A. All novel variations segregated with the disease phenotypes in their respective families and were absent from ethnically matched control individuals. This study expanded the mutation spectrum of USH and revealed the genotype-phenotype relationships of the novel USH mutations in Chinese patients. Moreover, this study proved that targeted NGS is an accurate and effective method for detecting genetic mutations related to USH. The identification of pathogenic mutations is of great significance for elucidating the underlying pathophysiology of USH.

  20. Integration of microbiological, epidemiological and next generation sequencing technologies data for the managing of nosocomial infections

    Directory of Open Access Journals (Sweden)

    Matteo Brilli

    2018-02-01

    Full Text Available At its core, the work of clinical microbiologists consists in the retrieving of a few bytes of information (species identification; metabolic capacities; staining and antigenic properties; antibiotic resistance profiles, etc.) from pathogenic agents. The development of next generation sequencing technologies (NGS), and the possibility to determine the entire genome for bacterial pathogens, fungi and protozoans will likely introduce a breakthrough in the amount of information generated by clinical microbiology laboratories: from bytes to Megabytes of information, for a single isolate. In parallel, the development of novel informatics tools, designed for the management and analysis of the so-called Big Data, offers the possibility to search for patterns in databases collecting genomic and microbiological information on the pathogens, as well as epidemiological data and information on the clinical parameters of the patients. Nosocomial infections and antibiotic resistance will likely represent major challenges for clinical microbiologists, in the next decades. In this paper, we describe how bacterial genomics based on NGS, integrated with novel informatic tools, could contribute to the control of hospital infections and multi-drug resistant pathogens.

  1. Targeted 'Next-Generation' sequencing in anophthalmia and microphthalmia patients confirms SOX2, OTX2 and FOXE3 mutations

    Directory of Open Access Journals (Sweden)

    Lopez Jimenez Nelson

    2011-12-01

    Full Text Available Background: Anophthalmia/microphthalmia (A/M) is caused by mutations in several different transcription factors, but mutations in each causative gene are relatively rare, emphasizing the need for a testing approach that screens multiple genes simultaneously. We used next-generation sequencing to screen 15 A/M patients for mutations in 9 pathogenic genes to evaluate this technology for screening in A/M. Methods: We used a pooled sequencing design, together with custom single nucleotide polymorphism (SNP) calling software. We verified predicted sequence alterations using Sanger sequencing. Results: We verified three mutations - c.542delC in SOX2, resulting in p.Pro181Argfs*22, p.Glu105X in OTX2 and p.Cys240X in FOXE3. We found several novel sequence alterations and SNPs that were likely to be non-pathogenic - p.Glu42Lys in CRYBA4, p.Val201Met in FOXE3 and p.Asp291Asn in VSX2. Our analysis methodology gave one false positive result comprising a mutation in PAX6 (c.1268A > T, predicting p.X423LeuextX*15) that was not verified by Sanger sequencing. We also failed to detect one 20 base pair (bp) deletion and one 3 bp duplication in SOX2. Conclusions: Our results demonstrated the power of next-generation sequencing with pooled sample groups for the rapid screening of candidate genes for A/M as we were correctly able to identify disease-causing mutations. However, next-generation sequencing was less useful for small, intragenic deletions and duplications. We did not find mutations in 10/15 patients and conclude that there is a need for further gene discovery in A/M.

  2. Targeted 'next-generation' sequencing in anophthalmia and microphthalmia patients confirms SOX2, OTX2 and FOXE3 mutations.

    Science.gov (United States)

    Jimenez, Nelson Lopez; Flannick, Jason; Yahyavi, Mani; Li, Jiang; Bardakjian, Tanya; Tonkin, Leath; Schneider, Adele; Sherr, Elliott H; Slavotinek, Anne M

    2011-12-28

    Anophthalmia/microphthalmia (A/M) is caused by mutations in several different transcription factors, but mutations in each causative gene are relatively rare, emphasizing the need for a testing approach that screens multiple genes simultaneously. We used next-generation sequencing to screen 15 A/M patients for mutations in 9 pathogenic genes to evaluate this technology for screening in A/M. We used a pooled sequencing design, together with custom single nucleotide polymorphism (SNP) calling software. We verified predicted sequence alterations using Sanger sequencing. We verified three mutations - c.542delC in SOX2, resulting in p.Pro181Argfs*22, p.Glu105X in OTX2 and p.Cys240X in FOXE3. We found several novel sequence alterations and SNPs that were likely to be non-pathogenic - p.Glu42Lys in CRYBA4, p.Val201Met in FOXE3 and p.Asp291Asn in VSX2. Our analysis methodology gave one false positive result comprising a mutation in PAX6 (c.1268A > T, predicting p.X423LeuextX*15) that was not verified by Sanger sequencing. We also failed to detect one 20 base pair (bp) deletion and one 3 bp duplication in SOX2. Our results demonstrated the power of next-generation sequencing with pooled sample groups for the rapid screening of candidate genes for A/M as we were correctly able to identify disease-causing mutations. However, next-generation sequencing was less useful for small, intragenic deletions and duplications. We did not find mutations in 10/15 patients and conclude that there is a need for further gene discovery in A/M.

  3. Towards clinical molecular diagnosis of inherited cardiac conditions: a comparison of bench-top genome DNA sequencers.

    Directory of Open Access Journals (Sweden)

    Xinzhong Li

    Full Text Available Molecular genetic testing is recommended for diagnosis of inherited cardiac disease, to guide prognosis and treatment, but access is often limited by cost and availability. Recently introduced high-throughput bench-top DNA sequencing platforms have the potential to overcome these limitations. We evaluated two next-generation sequencing (NGS) platforms for molecular diagnostics. The protein-coding regions of six genes associated with inherited arrhythmia syndromes were amplified from 15 human samples using parallelised multiplex PCR (Access Array, Fluidigm), and sequenced on the MiSeq (Illumina) and Ion Torrent PGM (Life Technologies). Overall, 97.9% of the target was sequenced adequately for variant calling on the MiSeq, and 96.8% on the Ion Torrent PGM. Regions missed tended to be of high GC-content, and most were problematic for both platforms. Variant calling was assessed using 107 variants detected using Sanger sequencing: within adequately sequenced regions, variant calling on both platforms was highly accurate (Sensitivity: MiSeq 100%, PGM 99.1%. Positive predictive value: MiSeq 95.9%, PGM 95.5%). At the time of the study the Ion Torrent PGM had a lower capital cost and individual runs were cheaper and faster. The MiSeq had a higher capacity (requiring fewer runs), with reduced hands-on time and simpler laboratory workflows. Both provide significant cost and time savings over conventional methods, even allowing for adjunct Sanger sequencing to validate findings and sequence exons missed by NGS. MiSeq and Ion Torrent PGM both provide accurate variant detection as part of a PCR-based molecular diagnostic workflow, and provide alternative platforms for molecular diagnosis of inherited cardiac conditions. Though there were performance differences at this throughput, platforms differed primarily in terms of cost, scalability, protocol stability and ease of use.
Compared with current molecular genetic diagnostic tests for inherited cardiac arrhythmias
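    The sensitivity and positive predictive value quoted in the abstract above follow from standard definitions. The sketch below shows the arithmetic only; the counts in the usage lines are hypothetical and are not taken from the study, and the function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of reference (e.g. Sanger-confirmed) variants that were called."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no reference variants to evaluate")
    return true_pos / total


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of called variants that are real."""
    total = true_pos + false_pos
    if total == 0:
        raise ValueError("no variant calls to evaluate")
    return true_pos / total


# Hypothetical counts, for illustration only (not data from the study).
tp, fn, fp = 95, 5, 10
print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"PPV         = {positive_predictive_value(tp, fp):.1%}")
```

    Sensitivity is measured against the reference variant set, whereas positive predictive value is measured against the calls made by the platform, so the two can diverge when a platform produces many false positives.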

  4. Comparison and evaluation of two exome capture kits and sequencing platforms for variant calling.

    Science.gov (United States)

    Zhang, Guoqiang; Wang, Jianfeng; Yang, Jin; Li, Wenjie; Deng, Yutian; Li, Jing; Huang, Jun; Hu, Songnian; Zhang, Bing

    2015-08-05

    To promote the clinical application of next-generation sequencing, it is important to obtain accurate and consistent variants of target genomic regions at low cost. Ion Proton, the latest updated semiconductor-based sequencing instrument from Life Technologies, is designed to provide investigators with an inexpensive platform for human whole exome sequencing that achieves a rapid turnaround time. However, few studies have comprehensively compared and evaluated the accuracy of variant calling between Ion Proton and Illumina sequencing platforms such as HiSeq 2000, which is the most popular sequencing platform for the human genome. The Ion Proton sequencer combined with the Ion TargetSeq Exome Enrichment Kit makes up TargetSeq-Proton, whereas SureSelect-HiSeq is based on the Agilent SureSelect Human All Exon v4 Kit and the HiSeq 2000 sequencer. Here, we sequenced exonic DNA from four human blood samples using both TargetSeq-Proton and SureSelect-HiSeq. We then called variants in the exonic regions that overlapped between the two exome capture kits (33.6 Mb). The rates of shared variant loci called by the two sequencing platforms ranged from 68.0 to 75.3% in the four samples, whereas the concordance of co-detected variant loci reached 99%. Sanger sequencing validation revealed that the validated rate of concordant single nucleotide polymorphisms (SNPs) (91.5%) was higher than that of the SNPs specific to TargetSeq-Proton (60.0%) or specific to SureSelect-HiSeq (88.3%). With regard to 1-bp small insertions and deletions (InDels), the Sanger sequencing validated rates of concordant variants (100.0%) and SureSelect-HiSeq-specific variants (89.6%) were higher than that of TargetSeq-Proton-specific variants (15.8%). In the sequencing of exonic regions, combining the two sequencing strategies (SureSelect-HiSeq and TargetSeq-Proton) increased the variant calling specificity for concordant variant loci and the sensitivity for variant loci called by any one platform. However, for the

  5. A Review on the Applications of Next Generation Sequencing Technologies as Applied to Food-Related Microbiome Studies

    Directory of Open Access Journals (Sweden)

    Yu Cao

    2017-09-01

    Full Text Available The development of next generation sequencing (NGS) techniques has enabled researchers to study and understand the world of microorganisms from broader and deeper perspectives. The contemporary advances in DNA sequencing technologies have not only enabled finer characterization of bacterial genomes but also provided deeper taxonomic identification of complex microbiomes; a microbiome is, in its genomic essence, the combined genetic material of the microorganisms inhabiting an environment, whether the environment be a particular body econiche (e.g., human intestinal contents) or a food manufacturing facility econiche (e.g., floor drain). To date, 16S rDNA sequencing, metagenomics and metatranscriptomics are the three basic sequencing strategies used in the taxonomic identification and characterization of food-related microbiomes. These sequencing strategies have used different NGS platforms for DNA and RNA sequence identification. Traditionally, 16S rDNA sequencing has played a key role in understanding the taxonomic composition of a food-related microbiome. Recently, metagenomic approaches have resulted in improved understanding of a microbiome by providing a species-level/strain-level characterization. Further, metatranscriptomic approaches have contributed to the functional characterization of the complex interactions between different microbial communities within a single microbiome. Many studies have highlighted the use of NGS techniques in investigating the microbiome of fermented foods. However, the utilization of NGS techniques in studying the microbiome of non-fermented foods is limited. This review provides a brief overview of the advances in DNA sequencing chemistries as the technology progressed from first, next and third generations and highlights how NGS provided a deeper understanding of food-related microbiomes with special focus on non-fermented foods.

  6. A Review on the Applications of Next Generation Sequencing Technologies as Applied to Food-Related Microbiome Studies

    Science.gov (United States)

    Cao, Yu; Fanning, Séamus; Proos, Sinéad; Jordan, Kieran; Srikumar, Shabarinath

    2017-01-01

    The development of next generation sequencing (NGS) techniques has enabled researchers to study and understand the world of microorganisms from broader and deeper perspectives. The contemporary advances in DNA sequencing technologies have not only enabled finer characterization of bacterial genomes but also provided deeper taxonomic identification of complex microbiomes; a microbiome is, in its genomic essence, the combined genetic material of the microorganisms inhabiting an environment, whether the environment be a particular body econiche (e.g., human intestinal contents) or a food manufacturing facility econiche (e.g., floor drain). To date, 16S rDNA sequencing, metagenomics and metatranscriptomics are the three basic sequencing strategies used in the taxonomic identification and characterization of food-related microbiomes. These sequencing strategies have used different NGS platforms for DNA and RNA sequence identification. Traditionally, 16S rDNA sequencing has played a key role in understanding the taxonomic composition of a food-related microbiome. Recently, metagenomic approaches have resulted in improved understanding of a microbiome by providing a species-level/strain-level characterization. Further, metatranscriptomic approaches have contributed to the functional characterization of the complex interactions between different microbial communities within a single microbiome. Many studies have highlighted the use of NGS techniques in investigating the microbiome of fermented foods. However, the utilization of NGS techniques in studying the microbiome of non-fermented foods is limited. This review provides a brief overview of the advances in DNA sequencing chemistries as the technology progressed from first, next and third generations and highlights how NGS provided a deeper understanding of food-related microbiomes with special focus on non-fermented foods. PMID:29033905

  7. Ion torrent personal genome machine sequencing for genomic typing of Neisseria meningitidis for rapid determination of multiple layers of typing information.

    Science.gov (United States)

    Vogel, Ulrich; Szczepanowski, Rafael; Claus, Heike; Jünemann, Sebastian; Prior, Karola; Harmsen, Dag

    2012-06-01

    Neisseria meningitidis causes invasive meningococcal disease in infants, toddlers, and adolescents worldwide. DNA sequence-based typing, including multilocus sequence typing, analysis of genetic determinants of antibiotic resistance, and sequence typing of vaccine antigens, has become the standard for molecular epidemiology of the organism. However, PCR of multiple targets and consecutive Sanger sequencing provide logistic constraints to reference laboratories. Taking advantage of the recent development of benchtop next-generation sequencers (NGSs) and of BIGSdb, a database accommodating and analyzing genome sequence data, we therefore explored the feasibility and accuracy of Ion Torrent Personal Genome Machine (PGM) sequencing for genomic typing of meningococci. Three strains from a previous meningococcus serogroup B community outbreak were selected to compare conventional typing results with data generated by semiconductor chip-based sequencing. In addition, sequencing of the meningococcal type strain MC58 provided information about the general performance of the technology. The PGM technology generated sequence information for all target genes addressed. The results were 100% concordant with conventional typing results, with no further editing being necessary. In addition, the amount of typing information, i.e., nucleotides and target genes analyzed, could be substantially increased by the combined use of genome sequencing and BIGSdb compared to conventional methods. In the near future, affordable and fast benchtop NGS machines like the PGM might enable reference laboratories to switch to genomic typing on a routine basis. This will reduce workloads and rapidly provide information for laboratory surveillance, outbreak investigation, assessment of vaccine preventability, and antibiotic resistance gene monitoring.

  8. Complete nucleotide sequence of a novel Hibiscus-infecting Cilevirus from Florida and its relationship with closely associated Cileviruses

    Science.gov (United States)

    The complete nucleotide sequence of a recently discovered Florida (FL) isolate of Hibiscus infecting Cilevirus (HiCV) was determined by Sanger sequencing. The movement- and coat- protein gene sequences of the HiCV-FL isolate are more divergent than other genes of the previously sequenced HiCV-HA (Ha...

  9. Sim3C: simulation of Hi-C and Meta3C proximity ligation sequencing technologies.

    Science.gov (United States)

    DeMaere, Matthew Z; Darling, Aaron E

    2018-02-01

    Chromosome conformation capture (3C) and Hi-C DNA sequencing methods have rapidly advanced our understanding of the spatial organization of genomes and metagenomes. Many variants of these protocols have been developed, each with their own strengths. Currently there is no systematic means for simulating sequence data from this family of sequencing protocols, potentially hindering the advancement of algorithms to exploit this new datatype. We describe a computational simulator that, given simple parameters and reference genome sequences, will simulate Hi-C sequencing on those sequences. The simulator models the basic spatial structure in genomes that is commonly observed in Hi-C and 3C datasets, including the distance-decay relationship in proximity ligation, differences in the frequency of interaction within and across chromosomes, and the structure imposed by cells. A means to model the 3D structure of randomly generated topologically associating domains is provided. The simulator considers several sources of error common to 3C and Hi-C library preparation and sequencing methods, including spurious proximity ligation events and sequencing error. We have introduced the first comprehensive simulator for 3C and Hi-C sequencing protocols. We expect the simulator to have use in testing of Hi-C data analysis algorithms, as well as more general value for experimental design, where questions such as the required depth of sequencing, enzyme choice, and other decisions can be made in advance in order to ensure adequate statistical power with respect to experimental hypothesis testing.

  10. Draft genome sequence of pigeonpea (Cajanus cajan), an orphan legume crop of resource-poor farmers

    DEFF Research Database (Denmark)

    Varshney, Rajeev K.; Chen, Wenbin; Li, Yupeng

    2012-01-01

    Pigeonpea is an important legume food crop grown primarily by smallholder farmers in many semi-arid tropical regions of the world. We used the Illumina next-generation sequencing platform to generate 237.2 Gb of sequence, which along with Sanger-based bacterial artificial chromosome end sequences...

  11. A complete mitochondrial genome sequence from a mesolithic wild aurochs (Bos primigenius).

    Directory of Open Access Journals (Sweden)

    Ceiridwen J Edwards

    Full Text Available BACKGROUND: The derivation of domestic cattle from the extinct wild aurochs (Bos primigenius) has been well-documented by archaeological and genetic studies. Genetic studies point towards the Neolithic Near East as the centre of origin for Bos taurus, with some lines of evidence suggesting possible, albeit rare, genetic contributions from locally domesticated wild aurochsen across Eurasia. Inferences from these investigations have been based largely on the analysis of partial mitochondrial DNA sequences generated from modern animals, with limited sequence data from ancient aurochsen samples. Recent developments in DNA sequencing technologies, however, are affording new opportunities for the examination of genetic material retrieved from extinct species, providing new insight into their evolutionary history. Here we present DNA sequence analysis of the first complete mitochondrial genome (16,338 base pairs) from an archaeologically-verified and exceptionally-well preserved aurochs bone sample. METHODOLOGY: DNA extracts were generated from an aurochs humerus bone sample recovered from a cave site located in Derbyshire, England and radiocarbon-dated to 6,738+/-68 calibrated years before present. These extracts were prepared for both Sanger and next generation DNA sequencing technologies (Illumina Genome Analyzer). In total, 289.9 megabases (22.48%) of the post-filtered DNA sequences generated using the Illumina Genome Analyzer from this sample mapped with confidence to the bovine genome. A consensus B. primigenius mitochondrial genome sequence was constructed and was analysed alongside all available complete bovine mitochondrial genome sequences. CONCLUSIONS: For all nucleotide positions where both Sanger and Illumina Genome Analyzer sequencing methods gave high-confidence calls, no discrepancies were observed. Sequence analysis reveals evidence of heteroplasmy in this sample and places this mitochondrial genome sequence securely within a previously

  12. Is Whole Exome Sequencing an Ethically Disruptive Technology? Perspectives of Pediatric Oncologists and Parents of Pediatric Patients with Solid Tumors

    Science.gov (United States)

    McCullough, Laurence B.; Slashinski, Melody J.; McGuire, Amy L.; Street, Richard L.; Eng, Christine M.; Gibbs, Richard A.; Parsons, D. Williams; Plon, Sharon E.

    2016-01-01

    Background: Some anticipate that physicians and parents will be ill-prepared or unprepared for the clinical introduction of genome sequencing, making it ethically disruptive. Procedure: As part of the Baylor Advancing Sequencing in Childhood Cancer Care (BASIC3) study, we conducted semi-structured interviews with 16 pediatric oncologists and 40 parents of pediatric patients with cancer prior to the return of sequencing results. We elicited expectations and attitudes concerning the impact of sequencing on clinical decision-making, clinical utility, and treatment expectations from both groups. Using accepted methods of qualitative research to analyze interview transcripts, we completed a thematic analysis to provide inductive insights into their views of sequencing. Results: Our major findings reveal that neither pediatric oncologists nor parents anticipate sequencing to be an ethically disruptive technology, because they expect to be prepared to integrate sequencing results into their existing approaches to learning and using new clinical information for care. Pediatric oncologists do not expect sequencing results to be more complex than other diagnostic information and plan simply to incorporate these data into their evidence-based approach to clinical practice, although they were concerned about the impact on parents. For parents, there is an urgency to protect their child's health and in this context they expect genomic information to better prepare them to participate in decisions about their child's care. Conclusion: Our data do not support concern that introducing genome sequencing into childhood cancer care will be ethically disruptive, i.e., leave physicians or parents ill-prepared or unprepared to make responsible decisions about patient care. PMID:26505993

  13. Technology trajectories and the selection of optimal R and D project sequences

    NARCIS (Netherlands)

    van Bommel, Ties; Mahieu, R.J.; Nijssen, E.J.

    2014-01-01

    Given a set of R&D projects drawing on the same underlying technology, a technology trajectory refers to the order in which projects are executed. Due to their technological interdependence, the successful execution of one project can increase a firm's technological capability, and help to

  14. De novo 454 sequencing of barcoded BAC pools for comprehensive gene survey and genome analysis in the complex genome of barley

    Directory of Open Access Journals (Sweden)

    Scholz Uwe

    2009-11-01

    Full Text Available Background: De novo sequencing the entire genome of a large complex plant genome like the one of barley (Hordeum vulgare L.) is a major challenge both in terms of experimental feasibility and costs. The emergence and breathtaking progress of next generation sequencing technologies has put this goal into focus and a clone based strategy combined with the 454/Roche technology is conceivable. Results: To test the feasibility, we sequenced 91 barcoded, pooled, gene containing barley BACs using the GS FLX platform and assembled the sequences under iterative change of parameters. The BAC assemblies were characterized by N50 of ~50 kb (N80 ~31 kb, N90 ~21 kb) and a Q40 of 94%. For ~80% of the clones, the best assemblies consisted of less than 10 contigs at 24-fold mean sequence coverage. Moreover we show that gene containing regions seem to assemble completely and uninterrupted thus making the approach suitable for detecting complete and positionally anchored genes. By comparing the assemblies of four clones to their complete reference sequences generated by the Sanger method, we evaluated the distribution, quality and representativeness of the 454 sequences as well as the consistency and reliability of the assemblies. Conclusion: The described multiplex 454 sequencing of barcoded BACs leads to sequence consensi highly representative for the clones. Assemblies are correct for the majority of contigs. Though the resolution of complex repetitive structures requires additional experimental efforts, our approach paves the way for a clone based strategy of sequencing the barley genome.
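    The N50, N80 and N90 figures in the abstract above are standard assembly contiguity statistics. As a minimal sketch of the usual definition (the contig lengths below are invented for illustration and the function name `nx` is hypothetical), Nx is the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches x% of the total assembly length:

```python
def nx(contig_lengths, percent=50):
    """Return the Nx statistic (N50 by default) for a list of contig lengths."""
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating covered length.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


# Invented contig lengths (arbitrary units), total = 300.
lengths = [80, 70, 50, 40, 30, 20, 10]
print(nx(lengths, 50), nx(lengths, 80), nx(lengths, 90))  # 70 40 30
```

    Because the threshold rises with x, N90 is never larger than N80, which is never larger than N50, matching the ordering of the values reported in the abstract.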

  15. On the optimal trimming of high-throughput mRNA sequence data

    Directory of Open Access Journals (Sweden)

    Matthew D MacManes

    2014-01-01

    Full Text Available The widespread and rapid adoption of high-throughput sequencing technologies has afforded researchers the opportunity to gain a deep understanding of genome level processes that underlie evolutionary change, and perhaps more importantly, the links between genotype and phenotype. In particular, researchers interested in functional biology and adaptation have used these technologies to sequence mRNA transcriptomes of specific tissues, which in turn are often compared to other tissues, or other individuals with different phenotypes. While these techniques are extremely powerful, careful attention to data quality is required. In particular, because high-throughput sequencing is more error-prone than traditional Sanger sequencing, quality trimming of sequence reads should be an important step in all data processing pipelines. While several software packages for quality trimming exist, no general guidelines for the specifics of trimming have been developed. Here, using empirically derived sequence data, I provide general recommendations regarding the optimal strength of trimming, specifically in mRNA-Seq studies. Although very aggressive quality trimming is common, this study suggests that a more gentle trimming, specifically of those nucleotides whose Phred score < 2 or < 5, is optimal for most studies across a wide variety of metrics.
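    To make the idea of gentle quality trimming concrete, here is a minimal sketch that removes only the trailing bases whose Phred score falls below a threshold. It assumes Phred+33 quality encoding and a simple 3'-end trimming rule; the trimming software evaluated in the study may use different rules (for example sliding windows), and the function name and example read are hypothetical.

```python
def trim_3prime(seq, qual, min_phred=2, offset=33):
    """Trim trailing bases whose Phred score is below min_phred.

    seq and qual are the sequence and quality strings of one FASTQ record;
    offset=33 assumes Phred+33 (Sanger/Illumina 1.8+) encoding.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    end = len(seq)
    # Step back from the 3' end while the last base is below the threshold.
    while end > 0 and ord(qual[end - 1]) - offset < min_phred:
        end -= 1
    return seq[:end], qual[:end]


# Invented read: 'I' = Q40, '$' = Q3, '#' = Q2, '!' = Q0.
read, quals = "ACGTACGT", "IIIII$#!"
print(trim_3prime(read, quals, min_phred=2))  # ('ACGTACG', 'IIIII$#')
print(trim_3prime(read, quals, min_phred=5))  # ('ACGTA', 'IIIII')
```

    With the gentler threshold only the single Q0 base is removed, whereas the threshold of 5 removes three bases, which illustrates how quickly read length is lost as trimming becomes more aggressive.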

  16. Transcriptome sequencing of lentil based on second-generation technology permits large-scale unigene assembly and SSR marker discovery

    Directory of Open Access Journals (Sweden)

    Materne Michael

    2011-05-01

    Full Text Available Background: Lentil (Lens culinaris Medik.) is a cool-season grain legume which provides a rich source of protein for human consumption. In terms of genomic resources, lentil is relatively underdeveloped, in comparison to other Fabaceae species, with limited available data. There is hence a significant need to enhance such resources in order to identify novel genes and alleles for molecular breeding to increase crop productivity and quality. Results: Tissue-specific cDNA samples from six distinct lentil genotypes were sequenced using Roche 454 GS-FLX Titanium technology, generating c. 1.38 × 10^6 expressed sequence tags (ESTs). De novo assembly generated a total of 15,354 contigs and 68,715 singletons. The complete unigene set was sequence-analysed against genome drafts of the model legume species Medicago truncatula and Arabidopsis thaliana to identify 12,639 and 7,476 unique matches, respectively. When compared to the genome of Glycine max, a total of 20,419 unique hits were observed corresponding to c. 31% of the known gene space. A total of 25,592 lentil unigenes were subsequently annotated from GenBank. Simple sequence repeat (SSR)-containing ESTs were identified from consensus sequences and a total of 2,393 primer pairs were designed. A subset of 192 EST-SSR markers was screened for validation across a panel of 12 cultivated lentil genotypes and one wild relative species. A total of 166 primer pairs obtained successful amplification, of which 47.5% detected genetic polymorphism. Conclusions: A substantial collection of ESTs has been developed from sequence analysis of lentil genotypes using second-generation technology, permitting unigene definition across a broad range of functional categories. As well as providing resources for functional genomics studies, the unigene set has permitted significant enhancement of the number of publicly-available molecular genetic markers as tools for improvement of this species.

  17. Exome sequencing and genetic testing for MODY.

    Directory of Open Access Journals (Sweden)

    Stefan Johansson

    Full Text Available Genetic testing for monogenic diabetes is important for patient care. Given the extensive genetic and clinical heterogeneity of diabetes, exome sequencing might provide additional diagnostic potential when standard Sanger sequencing-based diagnostics is inconclusive. The aim of the study was to examine the performance of exome sequencing for a molecular diagnosis of MODY in patients who have undergone conventional diagnostic sequencing of candidate genes with negative results. We performed exome enrichment followed by high-throughput sequencing in nine patients with suspected MODY. They were Sanger sequencing-negative for mutations in the HNF1A, HNF4A, GCK, HNF1B and INS genes. We excluded common, non-coding and synonymous gene variants, and performed in-depth analysis on filtered sequence variants in a pre-defined set of 111 genes implicated in glucose metabolism. On average, we obtained 45 X median coverage of the entire targeted exome and found 199 rare coding variants per individual. We identified 0-4 rare non-synonymous and nonsense variants per individual in our a priori list of 111 candidate genes. Three of the variants were considered pathogenic (in ABCC8, HNF4A and PPARG, respectively), thus exome sequencing led to a genetic diagnosis in at least three of the nine patients. Approximately 91% of known heterozygous SNPs in the target exomes were detected, but we also found low coverage in some key diabetes genes using our current exome sequencing approach. Novel variants in the genes ARAP1, GLIS3, MADD, NOTCH2 and WFS1 need further investigation to reveal their possible role in diabetes. Our results demonstrate that exome sequencing can improve molecular diagnostics of MODY when used as a complement to Sanger sequencing. However, improvements will be needed, especially concerning coverage, before the full potential of exome sequencing can be realized.

  18. Use of Metagenomic Shotgun Sequencing Technology To Detect Foodborne Pathogens within the Microbiome of the Beef Production Chain

    OpenAIRE

    Yang, Xiang; Noyes, Noelle R.; Doster, Enrique; Martin, Jennifer N.; Linke, Lyndsey M.; Magnuson, Roberta J.; Yang, Hua; Geornaras, Ifigenia; Woerner, Dale R.; Jones, Kenneth L.; Ruiz, Jaime; Boucher, Christina; Morley, Paul S.; Belk, Keith E.

    2016-01-01

    Foodborne illnesses associated with pathogenic bacteria are a global public health and economic challenge. The diversity of microorganisms (pathogenic and nonpathogenic) that exists within the food and meat industries complicates efforts to understand pathogen ecology. Further, little is known about the interaction of pathogens within the microbiome throughout the meat production chain. Here, a metagenomic approach and shotgun sequencing technology were used as tools to detect pathogenic bact...

  19. Molecular-Sized DNA or RNA Sequencing Machine | NCI Technology Transfer Center | TTC

    Science.gov (United States)

    The National Cancer Institute's Gene Regulation and Chromosome Biology Laboratory is seeking statements of capability or interest from parties interested in collaborative research to co-develop a molecular-sized DNA or RNA sequencing machine.

  20. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach.

    Science.gov (United States)

    Guttikonda, Satish K; Marri, Pradeep; Mammadov, Jafar; Ye, Liang; Soe, Khaing; Richey, Kimberly; Cruse, James; Zhuang, Meibao; Gao, Zhifang; Evans, Clive; Rounsley, Steve; Kumpatla, Siva P

    2016-01-01

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterize transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive and cost- and labor-effective alternative for molecular characterization compared to traditional Southern blot analysis. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions.

  1. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach.

    Directory of Open Access Journals (Sweden)

    Satish K Guttikonda

    Full Text Available Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterize transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive and cost- and labor-effective alternative for molecular characterization compared to traditional Southern blot analysis. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions.

  2. Generation of expressed sequence tags for discovery of genes responsible for floral traits of Chrysanthemum morifolium by next-generation sequencing technology.

    Science.gov (United States)

    Sasaki, Katsutomo; Mitsuda, Nobutaka; Nashima, Kenji; Kishimoto, Kyutaro; Katayose, Yuichi; Kanamori, Hiroyuki; Ohmiya, Akemi

    2017-09-04

    Chrysanthemum morifolium is one of the most economically valuable ornamental plants worldwide. Chrysanthemum is an allohexaploid plant with a large genome that is commercially propagated by vegetative reproduction. New cultivars with different floral traits, such as color, morphology, and scent, have been generated mainly by classical cross-breeding and mutation breeding. However, only limited genetic resources and their genome information are available for the generation of new floral traits. To obtain useful information about molecular bases for floral traits of chrysanthemums, we read expressed sequence tags (ESTs) of chrysanthemums by high-throughput sequencing using the 454 pyrosequencing technology. We constructed normalized cDNA libraries, consisting of full-length, 3'-UTR, and 5'-UTR cDNAs derived from various tissues of chrysanthemums. These libraries produced a total number of 3,772,677 high-quality reads, which were assembled into 213,204 contigs. By comparing the data obtained with those of full genome-sequenced species, we confirmed that our chrysanthemum contig set contained the majority of all expressed genes, which was sufficient for further molecular analysis in chrysanthemums. We confirmed that our chrysanthemum EST set (contigs) contained a number of contigs that encoded transcription factors and enzymes involved in pigment and aroma compound metabolism that was comparable to that of other species. This information can serve as an informative resource for identifying genes involved in various biological processes in chrysanthemums. Moreover, the findings of our study will contribute to a better understanding of the floral characteristics of chrysanthemums including the myriad cultivars at the molecular level.

  3. Sequence based polymorphic (SBP) marker technology for targeted genomic regions: its application in generating a molecular map of the Arabidopsis thaliana genome

    Directory of Open Access Journals (Sweden)

    Sahu Binod B

    2012-01-01

    Full Text Available Background: Molecular markers facilitate both genotype identification, essential for modern animal and plant breeding, and the isolation of genes based on their map positions. Advancements in sequencing technology have made possible the identification of single nucleotide polymorphisms (SNPs) for any genomic regions. Here a sequence based polymorphic (SBP) marker technology for generating molecular markers for targeted genomic regions in Arabidopsis is described. Results: A ~3X genome coverage sequence of the Arabidopsis thaliana ecotype Niederzenz (Nd-0) was obtained by applying Illumina's sequencing by synthesis (Solexa) technology. Comparison of the Nd-0 genome sequence with the assembled Columbia-0 (Col-0) genome sequence identified putative single nucleotide polymorphisms (SNPs) throughout the entire genome. Multiple 75 base pair Nd-0 sequence reads containing SNPs and originating from individual genomic DNA molecules were the basis for developing co-dominant SBP markers. SNPs containing Col-0 sequences, supported by transcript sequences or sequences from multiple BAC clones, were compared to the respective Nd-0 sequences to identify possible restriction endonuclease enzyme site variations. Small amplicons, PCR amplified from both ecotypes, were digested with suitable restriction enzymes and resolved on a gel to reveal the sequence based polymorphisms. By applying this technology, 21 SBP markers for the marker poor regions of the Arabidopsis map representing polymorphisms between Col-0 and Nd-0 ecotypes were generated. Conclusions: The SBP marker technology described here allowed the development of molecular markers for targeted genomic regions of Arabidopsis. It should facilitate isolation of co-dominant molecular markers for targeted genomic regions of any animal or plant species whose genomic sequences have been assembled. This technology will particularly facilitate the development of high density molecular marker maps, essential for

  4. Sequence recombination and conservation of Varroa destructor virus-1 and deformed wing virus in field collected honey bees (Apis mellifera).

    Directory of Open Access Journals (Sweden)

    Hui Wang

    Full Text Available We sequenced small (s) RNAs from field collected honeybees (Apis mellifera) and bumblebees (Bombus pascuorum) using the Illumina technology. The sRNA reads were assembled and resulting contigs were used to search for virus homologues in GenBank. Matches with Varroa destructor virus-1 (VDV1) and Deformed wing virus (DWV) genomic sequences were obtained for A. mellifera but not B. pascuorum. Further analyses suggested that the prevalent virus population was composed of VDV-1 and a chimera of 5'-DWV-VDV1-DWV-3'. The recombination junctions in the chimera genomes were confirmed by using RT-PCR, cDNA cloning and Sanger sequencing. We then focused on conserved short fragments (CSF, size > 25 nt) in the virus genomes by using GenBank sequences and the deep sequencing data obtained in this study. The majority of CSF sites confirmed conservation at both between-species (GenBank sequences) and within-population (dataset of this study) levels. However, conserved nucleotide positions in the GenBank sequences might be variable at the within-population level. High mutation rates (Pi > 10%) were observed at a number of sites using the deep sequencing data, suggesting that sequence conservation might not always be maintained at the population level. Virus-host interactions and strategies for developing RNAi treatments against VDV1/DWV infections are discussed.
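
    The per-site screen mentioned in the abstract above (sites with Pi > 10% in deep sequencing data) can be illustrated with a minimal sketch. It assumes Pi is the usual per-site nucleotide diversity, (n/(n-1)) * (1 - sum of squared allele frequencies), computed from the bases that reads place at one genome position; the abstract does not give the exact definition used, and the function names and example columns are hypothetical.

```python
from collections import Counter


def site_diversity(bases):
    """Per-site nucleotide diversity for the bases observed at one position.

    Assumption: Pi = n/(n-1) * (1 - sum(p_i^2)), where p_i is the frequency
    of each base among the n reads covering the site.
    """
    n = len(bases)
    if n < 2:
        raise ValueError("need at least two reads covering the site")
    counts = Counter(bases)
    homozygosity = sum((c / n) ** 2 for c in counts.values())
    return (n / (n - 1)) * (1.0 - homozygosity)


def variable_sites(columns, threshold=0.10):
    """Return indices of alignment columns whose diversity exceeds threshold."""
    return [i for i, col in enumerate(columns) if site_diversity(col) > threshold]


# Hypothetical pileup columns: each string holds the bases seen at one site.
example_columns = ["AAAAAAAAAA", "AAAAAAAAAG", "AAAAAGGGGG"]
print(variable_sites(example_columns))
```

    A site flagged this way is conserved in neither sense discussed in the abstract, so it would be a poor RNAi target even if database sequences agree at that position.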

  5. Direct comparisons of Illumina vs. Roche 454 sequencing technologies on the same microbial community DNA sample.

    Directory of Open Access Journals (Sweden)

    Chengwei Luo

    Full Text Available Next-generation sequencing (NGS) is commonly used in metagenomic studies of complex microbial communities, but whether or not different NGS platforms recover the same diversity from a sample and whether their assembled sequences are of comparable quality remain unclear. We compared the two most frequently used platforms, the Roche 454 FLX Titanium and the Illumina Genome Analyzer (GA) II, on the same DNA sample obtained from a complex freshwater planktonic community. Despite the substantial differences in read length and sequencing protocols, the platforms provided a comparable view of the community sampled. For instance, derived assemblies overlapped in ~90% of their total sequences and in situ abundances of genes and genotypes (estimated based on sequence coverage) correlated highly between the two platforms (R² > 0.9). Evaluation of base-call error, frameshift frequency, and contig length suggested that Illumina offered equivalent, if not better, assemblies than Roche 454. The results from metagenomic samples were further validated against DNA samples of eighteen isolate genomes, which showed a range of genome sizes and G+C% content. We also provide quantitative estimates of the errors in gene and contig sequences assembled from datasets characterized by different levels of complexity and G+C% content. For instance, we noted that homopolymer-associated, single-base errors affected ~1% of the protein sequences recovered in Illumina contigs of 10× coverage and 50% G+C; this frequency increased to ~3% when non-homopolymer errors were also considered. Collectively, our results should serve as a useful practical guide for choosing proper sampling strategies and data processing protocols for future metagenomic studies.

  6. Direct comparisons of Illumina vs. Roche 454 sequencing technologies on the same microbial community DNA sample.

    Science.gov (United States)

    Luo, Chengwei; Tsementzi, Despina; Kyrpides, Nikos; Read, Timothy; Konstantinidis, Konstantinos T

    2012-01-01

    Next-generation sequencing (NGS) is commonly used in metagenomic studies of complex microbial communities, but whether or not different NGS platforms recover the same diversity from a sample and whether their assembled sequences are of comparable quality remain unclear. We compared the two most frequently used platforms, the Roche 454 FLX Titanium and the Illumina Genome Analyzer (GA) II, on the same DNA sample obtained from a complex freshwater planktonic community. Despite the substantial differences in read length and sequencing protocols, the platforms provided a comparable view of the community sampled. For instance, derived assemblies overlapped in ~90% of their total sequences and in situ abundances of genes and genotypes (estimated based on sequence coverage) correlated highly between the two platforms (R² > 0.9). Evaluation of base-call error, frameshift frequency, and contig length suggested that Illumina offered equivalent, if not better, assemblies than Roche 454. The results from metagenomic samples were further validated against DNA samples of eighteen isolate genomes, which showed a range of genome sizes and G+C% content. We also provide quantitative estimates of the errors in gene and contig sequences assembled from datasets characterized by different levels of complexity and G+C% content. For instance, we noted that homopolymer-associated, single-base errors affected ~1% of the protein sequences recovered in Illumina contigs of 10× coverage and 50% G+C; this frequency increased to ~3% when non-homopolymer errors were also considered. Collectively, our results should serve as a useful practical guide for choosing proper sampling strategies and data processing protocols for future metagenomic studies.
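
    The cross-platform agreement reported above is a squared correlation between coverage-based abundance estimates. The sketch below shows how such a value can be computed; it assumes the statistic is the squared Pearson correlation of per-gene coverage on the two platforms, and the coverage numbers are invented purely for illustration, not taken from the study.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return (cov * cov) / (var_x * var_y)


# Hypothetical mean coverage of the same five genes on each platform.
platform_a_coverage = [12.0, 30.5, 8.2, 45.1, 19.9]
platform_b_coverage = [11.0, 33.0, 7.5, 47.0, 21.2]
print(round(r_squared(platform_a_coverage, platform_b_coverage), 3))
```

    In practice the per-gene coverage would come from mapping each platform's reads to a shared set of assembled genes, so that both lists refer to the same features in the same order.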

  7. ReRep: Computational detection of repetitive sequences in genome survey sequences (GSS)

    Directory of Open Access Journals (Sweden)

    Alves-Ferreira Marcelo

    2008-09-01

    Full Text Available Background: Genome survey sequences (GSS) offer a preliminary global view of a genome since, unlike ESTs, they cover coding as well as non-coding DNA and include repetitive regions of the genome. A more precise estimation of the nature, quantity and variability of repetitive sequences very early in a genome sequencing project is of considerable importance, as such data strongly influence the estimation of genome coverage, library quality and progress in scaffold construction. Also, the elimination of repetitive sequences from the initial assembly process is important to avoid errors and unnecessary complexity. Repetitive sequences are also of interest in a variety of other studies, for instance as molecular markers. Results: We designed and implemented a straightforward pipeline called ReRep, which combines bioinformatics tools for identifying repetitive structures in a GSS dataset. In a case study, we first applied the pipeline to a set of 970 GSSs, sequenced in our laboratory from the human pathogen Leishmania braziliensis, the causative agent of leishmaniasis, an important public health problem in Brazil. We also verified the applicability of ReRep to new sequencing technologies using a set of 454 reads from Escherichia coli. The behaviour of several parameters in the algorithm is evaluated and suggestions are made for tuning of the analysis. Conclusion: The ReRep approach for identification of repetitive elements in GSS datasets proved to be straightforward and efficient. Several potential repetitive sequences were found in a L. braziliensis GSS dataset generated in our laboratory, and further validated by the analysis of a more complete genomic dataset from the EMBL and Sanger Centre databases. ReRep also identified most of the E. coli K12 repeats prior to assembly in an example dataset obtained by automated sequencing using 454 technology. The parameters controlling the algorithm behaved consistently and may be tuned to the properties

  8. Nanopore sequencing technology: a new route for the fast detection of unauthorized GMO.

    Science.gov (United States)

    Fraiture, Marie-Alice; Saltykova, Assia; Hoffman, Stefan; Winand, Raf; Deforce, Dieter; Vanneste, Kevin; De Keersmaecker, Sigrid C J; Roosens, Nancy H C

    2018-05-21

    In order to strengthen the current genetically modified organism (GMO) detection system for unauthorized GMO, we have recently developed a new workflow based on DNA walking to amplify unknown sequences surrounding a known DNA region. This DNA walking is performed on transgenic elements, commonly found in GMO, that were earlier detected by real-time PCR (qPCR) screening. Previously, we have demonstrated the ability of this approach to detect unauthorized GMO via the identification of unique transgene flanking regions and the unnatural associations of elements from the transgenic cassette. In the present study, we investigate the feasibility of integrating the described workflow with MinION Next-Generation Sequencing (NGS). The MinION sequencing platform can provide long read lengths and handle heterogeneous DNA libraries, allowing for rapid and efficient delivery of sequences of interest. In addition, the ability of this NGS platform to characterize unauthorized and unknown GMO without any a priori knowledge has been assessed.

  9. A dated molecular phylogeny of manta and devil rays (Mobulidae) based on mitogenome and nuclear sequences

    NARCIS (Netherlands)

    Poortvliet, Marloes; Olsen, Jeanine; Croll, Donald A.; Bernardi, Giacomo; Newton, Kelly; Kollias, Spyros; O'Sullivan, John; Fernando, Daniel; Stevens, Guy; Galván Magaña, Felipe; Seret, Bernard; Wintner, Sabine; Hoarau, Galice

    Manta and devil rays are an iconic group of globally distributed pelagic filter feeders, yet their evolutionary history remains enigmatic. We employed next generation sequencing of mitogenomes for nine of the 11 recognized species and two outgroups; as well as additional Sanger sequencing of two

  10. Technological sequence of creating components of the training system of the future officers to the management of physical training

    Directory of Open Access Journals (Sweden)

    Olkhovy O.M.

    2012-09-01

    Full Text Available The goal is to determine a constructive sequence for building the components of a system that trains future officers to manage physical training over the course of their subsequent military careers. Based on an analysis of more than thirty documentary and scientific sources, the study defines a structural-logical scheme that links the stages of the optimal management cycle with the technological sequence for constructing the components of this training system. The scheme provides for: defining requirements for the typical professional tasks of leading, organizing and conducting physical training; creating a phased model of cadet training; training in the curriculum discipline "Physical education, special physical training and sport"; and creating a model and defining criteria for the integral evaluation of future officers' readiness to manage physical training.

  11. Analysis of quality raw data of second generation sequencers with Quality Assessment Software.

    Science.gov (United States)

    Ramos, Rommel Tj; Carneiro, Adriana R; Baumbach, Jan; Azevedo, Vasco; Schneider, Maria Pc; Silva, Artur

    2011-04-18

    Second generation technologies have advantages over Sanger; however, they have resulted in new challenges for the genome construction process, especially because of the small size of the reads, despite the high degree of coverage. Independent of the program chosen for the construction process, DNA sequences are superimposed, based on identity, to extend the reads, generating contigs; mismatches indicate a lack of homology and are not included. This process improves our confidence in the sequences that are generated. We developed Quality Assessment Software, with which one can review graphs showing the distribution of quality values from the sequencing reads. This software allows us to adopt more stringent quality standards for sequence data, based on quality-graph analysis and estimated coverage after applying the quality filter, providing acceptable sequence coverage for genome construction from short reads. Quality filtering is a fundamental step in the process of constructing genomes, as it reduces the frequency of incorrect alignments that are caused by measuring errors, which can occur during the construction process due to the size of the reads, provoking misassemblies. Application of quality filters to sequence data, using the software Quality Assessment, along with graphing analyses, provided greater precision in the definition of cutoff parameters, which increased the accuracy of genome construction.
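
    The trade-off described above, a stricter quality cutoff versus the coverage that remains after filtering, can be sketched in a few lines. This is not the Quality Assessment Software itself: it assumes reads arrive as (sequence, quality string) pairs with Phred+33 encoded qualities, filters on mean read quality, and estimates coverage as retained bases divided by genome size. All names and the example reads are hypothetical.

```python
def phred_scores(quality_string, offset=33):
    """Decode an ASCII quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def passes_filter(quality_string, min_mean_quality=20, offset=33):
    """True if the read's mean Phred score reaches the cutoff."""
    scores = phred_scores(quality_string, offset)
    return bool(scores) and sum(scores) / len(scores) >= min_mean_quality


def filter_reads(reads, min_mean_quality=20):
    """Keep only (sequence, quality) pairs whose mean quality passes the cutoff."""
    return [(seq, qual) for seq, qual in reads if passes_filter(qual, min_mean_quality)]


def estimated_coverage(reads, genome_size):
    """Estimated coverage = total retained bases / genome size."""
    return sum(len(seq) for seq, _ in reads) / genome_size


# Hypothetical reads: 'I' encodes Phred 40 and '#' encodes Phred 2.
example_reads = [
    ("ACGTACGTAC", "IIIIIIIIII"),
    ("ACGTACGTAC", "##########"),
    ("ACGTA", "IIII#"),
]
kept_reads = filter_reads(example_reads, min_mean_quality=20)
print(len(kept_reads), estimated_coverage(kept_reads, genome_size=30))
```

    Re-running the last two lines with several cutoffs shows how quickly coverage falls as the threshold rises, which is the kind of comparison the abstract describes making with quality graphs.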

  12. Illumina MiSeq Sequencing for Preliminary Analysis of Microbiome Causing Primary Endodontic Infections in Egypt

    Directory of Open Access Journals (Sweden)

    Sally Ali Tawfik

    2018-01-01

    Full Text Available The use of high throughput next generation technologies has allowed more comprehensive analysis than traditional Sanger sequencing. The specific aim of this study was to investigate the microbial diversity of primary endodontic infections using the Illumina MiSeq sequencing platform in Egyptian patients. Samples were collected from 19 patients in Suez Canal University Hospital (Endodontic Department) using sterile #15 K-files and paper points. DNA was extracted using the Mo Bio power soil DNA isolation extraction kit followed by PCR amplification and agarose gel electrophoresis. The microbiome was characterized on the basis of the V3 and V4 hypervariable regions of the 16S rRNA gene by using paired-end sequencing on an Illumina MiSeq device. MOTHUR software was used for sequence filtration and analysis of sequenced data. A total of 1858 operational taxonomic units at 97% similarity were assigned to 26 phyla, 245 families, and 705 genera. Four main phyla, Firmicutes, Bacteroidetes, Proteobacteria, and Synergistetes, were predominant in all samples. At the genus level, Prevotella, Bacillus, Porphyromonas, Streptococcus, and Bacteroides were the most abundant. Illumina MiSeq platform sequencing can be used to investigate oral microbiome composition of endodontic infections. Elucidating the ecology of endodontic infections is a necessary step in developing effective intracanal antimicrobials.

  13. New Approaches and Technologies to Sequence de novo Plant reference Genomes (2013 DOE JGI Genomics of Energy and Environment 8th Annual User Meeting)

    Energy Technology Data Exchange (ETDEWEB)

    Schmutz, Jeremy

    2013-03-01

    Jeremy Schmutz of the HudsonAlpha Institute for Biotechnology on New approaches and technologies to sequence de novo plant reference genomes at the 8th Annual Genomics of Energy Environment Meeting on March 27, 2013 in Walnut Creek, CA.

  14. Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing

    Directory of Open Access Journals (Sweden)

    Adam D. Hargreaves

    2015-11-01

    Full Text Available Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0–2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5′ and 3′ UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species.

  15. Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing.

    Science.gov (United States)

    Hargreaves, Adam D; Mulley, John F

    2015-01-01

    Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0-2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5' and 3' UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species.

  16. CRISPR-Cas9 technology: applications in genome engineering, development of sequence-specific antimicrobials, and future prospects.

    Science.gov (United States)

    de la Fuente-Núñez, César; Lu, Timothy K

    2017-02-20

    The development of CRISPR-Cas9 technology has revolutionized our ability to edit DNA and to modulate expression levels of genes of interest, thus providing powerful tools to accelerate the precise engineering of a wide range of organisms. In addition, the CRISPR-Cas system can be harnessed to design "precision" antimicrobials that target bacterial pathogens in a DNA sequence-specific manner. This capability will enable killing of drug-resistant microbes by selectively targeting genes involved in antibiotic resistance, biofilm formation and virulence. Here, we review the origins and mechanistic basis of CRISPR-Cas systems, discuss how this technology can be leveraged to provide a range of applications in both eukaryotic and prokaryotic systems, and finish by outlining limitations and future prospects.

  17. The Application of Next Generation Sequencing Technology on Noninvasive Prenatal Test

    DEFF Research Database (Denmark)

    Jiang, Hui

    There are nearly 7000 rare diseases that have been reported in the world. Although most of them occur with a frequency of less than one in 2000, in total about 6% of the population suffers from rare diseases. These rare diseases are often caused by changes in genes, which is currently lack of eff...... diseases and monogenetic diseases in a noninvasive manner. The new approach has great potential to be widely used worldwide as sequencing costs decrease, and therefore to play a major role in preventing rare diseases....

  18. Charcot-Marie-Tooth disease: The development of a diagnostic platform using next generation sequencing

    DEFF Research Database (Denmark)

    Christensen, Rikke; Væth, Signe; Thorsen, Kasper

    , Sanger sequencing of 4 genes has led to a diagnosis in approximately 30% of the patients. Aims: 1) Development of a targeted NGS platform containing 63 genes that currently are found to be associated with CMT. 2) Analysis of the increased diagnostic yield using this platform to analyze 200 CMT samples...... previously analyzed using Sanger sequencing without identification of a disease causing mutation. Materials and Methods: Libraries for 200 patient samples obtained for CMT diagnostics were prepared using Illumina TruSeq and target enrichment using SeqCap EZ Choice Library (Nimblegen). The libraries were...

  19. Complete genome sequencing of the luminescent bacterium, Vibrio qinghaiensis sp. Q67 using PacBio technology

    Science.gov (United States)

    Gong, Liang; Wu, Yu; Jian, Qijie; Yin, Chunxiao; Li, Taotao; Gupta, Vijai Kumar; Duan, Xuewu; Jiang, Yueming

    2018-01-01

    Vibrio qinghaiensis sp.-Q67 (Vqin-Q67) is a freshwater luminescent bacterium that continuously emits blue-green light (485 nm). The bacterium has been widely used for detecting toxic contaminants. Here, we report the complete genome sequence of Vqin-Q67, obtained using third-generation PacBio sequencing technology. Continuous long reads were attained from three PacBio sequencing runs and reads >500 bp with a quality value of >0.75 were merged together into a single dataset. This resultant highly-contiguous de novo assembly has no genome gaps, and comprises two chromosomes with substantial genetic information, including protein-coding genes, non-coding RNA, transposon and gene islands. Our dataset can be useful as a comparative genome for evolution and speciation studies, as well as for the analysis of protein-coding gene families, the pathogenicity of different Vibrio species in fish, the evolution of non-coding RNA and transposon, and the regulation of gene expression in relation to the bioluminescence of Vqin-Q67.

  20. Understanding invasion history and predicting invasive niches using genetic sequencing technology in Australia: case studies from Cucurbitaceae and Boraginaceae.

    Science.gov (United States)

    Shaik, Razia S; Zhu, Xiaocheng; Clements, David R; Weston, Leslie A

    2016-01-01

    Part of the challenge in dealing with invasive plant species is that they seldom represent a uniform, static entity. Often, an accurate understanding of the history of plant introduction and knowledge of the real levels of genetic diversity present in species and populations of importance is lacking. Currently, the role of genetic diversity in promoting the successful establishment of invasive plants is not well defined. Genetic profiling of invasive plants should enhance our understanding of the dynamics of colonization in the invaded range. Recent advances in DNA sequencing technology have greatly facilitated the rapid and complete assessment of plant population genetics. Here, we apply our current understanding of the genetics and ecophysiology of plant invasions to recent work on Australian plant invaders from the Cucurbitaceae and Boraginaceae. The Cucurbitaceae study showed that both prickly paddy melon (Cucumis myriocarpus) and camel melon (Citrullus lanatus) were represented by only a single genotype in Australia, implying that each was probably introduced as a single introduction event. In contrast, a third invasive melon, Citrullus colocynthis, possessed a moderate level of genetic diversity in Australia and was potentially introduced to the continent at least twice. The Boraginaceae study demonstrated the value of comparing two similar congeneric species; one, Echium plantagineum, is highly invasive and genetically diverse, whereas the other, Echium vulgare, exhibits less genetic diversity and occupies a more limited ecological niche. Sequence analysis provided precise identification of invasive plant species, as well as information on genetic diversity and phylogeographic history. Improved sequencing technologies will continue to allow greater resolution of genetic relationships among invasive plant populations, thereby potentially improving our ability to predict the impact of these relationships upon future spread and better manage invaders

  1. Next generation DNA sequencing technology delivers valuable genetic markers for the genomic orphan legume species, Bituminaria bituminosa

    Directory of Open Access Journals (Sweden)

    Pazos-Navarro María

    2011-12-01

    Full Text Available Background: Bituminaria bituminosa is a perennial legume species from the Canary Islands and Mediterranean region that has potential as a drought-tolerant pasture species and as a source of pharmaceutical compounds. Three botanical varieties have previously been identified in this species: albomarginata, bituminosa and crassiuscula. B. bituminosa can be considered a genomic 'orphan' species with very few genomic resources available. New DNA sequencing technologies provide an opportunity to develop high quality molecular markers for such orphan species. Results: 432,306 mRNA molecules were sampled from a leaf transcriptome of a single B. bituminosa plant using Roche 454 pyrosequencing, resulting in an average read length of 345 bp (149.1 Mbp in total). Sequences were assembled into 3,838 isotigs/contigs representing putatively unique gene transcripts. Gene ontology descriptors were identified for 3,419 sequences. Raw sequence reads containing simple sequence repeat (SSR) motifs were identified, and 240 primer pairs flanking these motifs were designed. Of 87 primer pairs developed this way, 75 (86.2%) successfully amplified primarily single fragments by PCR. Fragment analysis using 20 primer pairs in 79 accessions of B. bituminosa detected 130 alleles at 21 SSR loci. Genetic diversity analyses confirmed that variation at these SSR loci accurately reflected known taxonomic relationships in original collections of B. bituminosa and provided additional evidence that a division of the botanical variety bituminosa into two according to geographical origin (Mediterranean region and Canary Islands) may be appropriate. Evidence of cross-pollination was also found between botanical varieties within a B. bituminosa breeding programme. Conclusions: B. bituminosa can no longer be considered a genomic orphan species, having now a large (albeit incomplete) repertoire of expressed gene sequences that can serve as a resource for future genetic studies. This

  2. Second generation sequencing of the mesothelioma tumor genome.

    Directory of Open Access Journals (Sweden)

    Raphael Bueno

    2010-05-01

    Full Text Available The current paradigm for elucidating the molecular etiology of cancers relies on the interrogation of small numbers of genes, which limits the scope of investigation. Emerging second-generation massively parallel DNA sequencing technologies have enabled more precise definition of the cancer genome on a global scale. We examined the genome of a human primary malignant pleural mesothelioma (MPM) tumor and matched normal tissue by using a combination of sequencing-by-synthesis and pyrosequencing methodologies to a 9.6X depth of coverage. Read density analysis uncovered significant aneuploidy and numerous rearrangements. Method-dependent informatics rules, which combined the results of different sequencing platforms, were developed to identify and validate candidate mutations of multiple types. Many more tumor-specific rearrangements than point mutations were uncovered at this depth of sequencing, resulting in novel, large-scale, inter- and intra-chromosomal deletions, inversions, and translocations. Nearly all candidate point mutations appeared to be previously unknown SNPs. Thirty tumor-specific fusions/translocations were independently validated with PCR and Sanger sequencing. Of these, 15 represented disrupted gene-encoding regions, including kinases, transcription factors, and growth factors. One large deletion in DPP10 resulted in altered transcription and expression of DPP10 transcripts in a set of 53 additional MPM tumors correlated with survival. Additionally, three point mutations were observed in the coding regions of NKX6-2, a transcription regulator, and NFRKB, a DNA-binding protein involved in modulating NFKB1. Several regions containing genes such as PCBD2 and DHFR, which are involved in growth factor signaling and nucleotide synthesis, respectively, were selectively amplified in the tumor. Second-generation sequencing uncovered all types of mutations in this MPM tumor, with DNA rearrangements representing the dominant type.

  3. Identification of rare paired box 3 variant in strabismus by whole exome sequencing

    Directory of Open Access Journals (Sweden)

    Hui-Min Gong

    2017-08-01

    Full Text Available AIM: To identify the potentially pathogenic gene variants that contribute to the etiology of strabismus. METHODS: A Chinese pedigree with strabismus was collected and the exomes of two affected individuals were sequenced using next-generation sequencing technology. The resulting variants from exome sequencing were filtered by subsequent bioinformatics methods and the candidate mutation was verified as heterozygous in the affected proposita and her mother by Sanger sequencing. RESULTS: Whole exome sequencing and filtering identified a nonsynonymous mutation, the c.434G-T transition, in paired box 3 (PAX3) in the two affected individuals, which was predicted to be deleterious by more than 4 bioinformatics programs. This altered amino acid residue was located in the conserved PAX domain of PAX3. This gene encodes a member of the PAX family of transcription factors, which play critical roles during fetal development. Mutations in PAX3 were associated with Waardenburg syndrome with strabismus. CONCLUSION: Our results indicate that the c.434G-T mutation (p.R145L) in PAX3 may contribute to strabismus, expanding our understanding of the causally relevant genes for this disorder.

  4. [Molecular and prenatal diagnosis of a family with Fanconi anemia by next generation sequencing].

    Science.gov (United States)

    Gong, Zhuwen; Yu, Yongguo; Zhang, Qigang; Gu, Xuefan

    2015-04-01

    To provide prenatal diagnosis for a pregnant woman who had given birth to a child with Fanconi anemia with combined next-generation sequencing (NGS) and Sanger sequencing. For the affected child, potential mutations of the FANCA gene were analyzed with NGS. Suspected mutation was verified with Sanger sequencing. For prenatal diagnosis, genomic DNA was extracted from cultured fetal amniotic fluid cells and subjected to analysis of the same mutations. A low-frequency frameshifting mutation c.989_995del7 (p.H330LfsX2, inherited from his father) and a truncating mutation c.3971C>T (p.P1324L, inherited from his mother) have been identified in the affected child and considered to be pathogenic. The two mutations were subsequently verified by Sanger sequencing. Upon prenatal diagnosis, the fetus was found to carry two mutations. The combined next-generation sequencing and Sanger sequencing can reduce the time for diagnosis and identify subtypes of Fanconi anemia and the mutational sites, which has enabled reliable prenatal diagnosis of this disease.

  5. Characterization of microflora in Latin-style cheeses by next-generation sequencing technology

    Directory of Open Access Journals (Sweden)

    Lusk Tina S

    2012-11-01

    Full Text Available Abstract. Background: Cheese contamination can occur at numerous stages in the manufacturing process including the use of improperly pasteurized or raw milk. Of concern is the potential contamination by Listeria monocytogenes and other pathogenic bacteria that find the high moisture levels and moderate pH of popular Latin-style cheeses like queso fresco a hospitable environment. In the investigation of a foodborne outbreak, samples typically undergo enrichment in broth for 24 hours followed by selective agar plating to isolate bacterial colonies for confirmatory testing. The broth enrichment step may also enable background microflora to proliferate, which can confound subsequent analysis if not inhibited by effective broth or agar additives. We used 16S rRNA gene sequencing to provide a preliminary survey of bacterial species associated with three brands of Latin-style cheeses after 24-hour broth enrichment. Results: Brand A showed a greater diversity than the other two cheese brands (Brands B and C) at nearly every taxonomic level except phylum. Brand B showed the least diversity and was dominated by a single bacterial taxon, Exiguobacterium, not previously reported in cheese. This genus was also found in Brand C, although Lactococcus was prominent, an expected finding since this bacterium belongs to the group of lactic acid bacteria (LAB) commonly found in fermented foods. Conclusions: The contrasting diversity observed in Latin-style cheese was surprising, demonstrating that despite similarity of cheese type, raw materials and cheese making conditions appear to play a critical role in the microflora composition of the final product. The high bacterial diversity associated with Brand A suggests it may have been prepared with raw materials of high bacterial diversity or influenced by the ecology of the processing environment. Additionally, the presence of Exiguobacterium in high proportions (96%) in Brand B and, to a lesser extent, Brand C (46%), may

  6. A New Targeted CFTR Mutation Panel Based on Next-Generation Sequencing Technology.

    Science.gov (United States)

    Lucarelli, Marco; Porcaro, Luigi; Biffignandi, Alice; Costantino, Lucy; Giannone, Valentina; Alberti, Luisella; Bruno, Sabina Maria; Corbetta, Carlo; Torresani, Erminio; Colombo, Carla; Seia, Manuela

    2017-09-01

    Searching for mutations in the cystic fibrosis transmembrane conductance regulator gene (CFTR) is a key step in the diagnosis of and neonatal and carrier screening for cystic fibrosis (CF), and it has implications for prognosis and personalized therapy. The large number of mutations and genetic and phenotypic variability make this search a complex task. Herein, we developed, validated, and tested a laboratory assay for an extended search for mutations in CFTR using a next-generation sequencing-based method, with a panel of 188 CFTR mutations customized for the Italian population. Overall, 1426 dried blood spots from neonatal screening, 402 genomic DNA samples from various origins, and 1138 genomic DNA samples from patients with CF were analyzed. The assay showed excellent analytical and diagnostic operative characteristics. We identified and experimentally validated 159 (of 188) CFTR mutations. The assay achieved detection rates of 95.0% and 95.6% in two large-scale case series of CF patients from central and northern Italy, respectively. These detection rates are among the highest reported so far with a genetic test for CF based on a mutation panel. This assay appears to be well suited for diagnostics, neonatal and carrier screening, and assisted reproduction, and it represents a considerable advantage in CF genetic counseling. Copyright © 2017 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  7. A safe an easy method for building consensus HIV sequences from 454 massively parallel sequencing data.

    Science.gov (United States)

    Fernández-Caballero Rico, Jose Ángel; Chueca Porcuna, Natalia; Álvarez Estévez, Marta; Mosquera Gutiérrez, María Del Mar; Marcos Maeso, María Ángeles; García, Federico

    2018-02-01

    To show how to generate a consensus sequence from massively parallel sequencing data obtained from routine HIV anti-retroviral resistance studies, one that may be suitable for molecular epidemiology studies. Paired Sanger (Trugene-Siemens) and next-generation sequencing (NGS) (454 GSJunior-Roche) HIV RT and protease sequences from 62 patients were studied. NGS consensus sequences were generated using Mesquite, using 10%, 15%, and 20% thresholds. Molecular evolutionary genetics analysis (MEGA) was used for phylogenetic studies. At a 10% threshold, NGS-Sanger sequences from 17/62 patients were phylogenetically related, with a median bootstrap value of 88% (IQR 83.5-95.5). Association increased to 36/62 sequences, with a median bootstrap value of 94% (IQR 85.5-98), using a 15% threshold. Maximum association was at the 20% threshold, with 61/62 sequences associated, and a median bootstrap value of 99% (IQR 98-100). A safe method is presented to generate consensus sequences from HIV-NGS data at a 20% threshold, which will prove useful for molecular epidemiological studies. Copyright © 2016 Elsevier España, S.L.U. and Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica. All rights reserved.
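
    The effect of the 10%, 15%, and 20% thresholds can be illustrated with a minimal Python sketch. It assumes per-position base frequencies are already available and that every base at or above the threshold is written into the consensus, using an IUPAC ambiguity code when more than one base qualifies. The function name and the toy frequencies are hypothetical, and this is not a description of how Mesquite itself builds a consensus.

```python
# IUPAC nucleotide codes for every non-empty combination of A, C, G, T.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def consensus_at_threshold(position_freqs, threshold):
    """Build a consensus string from per-position base frequencies.

    position_freqs: list of dicts mapping a base to its read fraction.
    threshold: minimum fraction for a base to be kept (e.g. 0.20).
    Positions where no base reaches the threshold are reported as N.
    """
    consensus = []
    for freqs in position_freqs:
        kept = frozenset(b for b, f in freqs.items() if f >= threshold)
        consensus.append(IUPAC.get(kept, "N"))
    return "".join(consensus)


# Hypothetical three-position example with two minority variants.
toy = [
    {"A": 0.88, "G": 0.12},
    {"C": 1.00},
    {"T": 0.83, "C": 0.17},
]
for t in (0.10, 0.15, 0.20):
    print(t, consensus_at_threshold(toy, t))
```

    In this toy case a higher threshold removes the ambiguity codes contributed by minority variants, which is one plausible reason a 20% consensus would sit closer to a Sanger population sequence.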

  8. Can abundance of protists be inferred from sequence data: a case study of foraminifera.

    Directory of Open Access Journals (Sweden)

    Alexandra A-T Weber

    Full Text Available Protists are key players in microbial communities, yet our understanding of their role in ecosystem functioning is seriously impeded by difficulties in identification of protistan species and their quantification. Current microscopy-based methods used for determining the abundance of protists are tedious and often show a low taxonomic resolution. The recent development of next-generation sequencing technologies has offered a very powerful tool for studying the richness of protistan communities. Still, the relationship between abundance of species and number of sequences remains subject to various technical and biological biases. Here, we test the impact of some of these biological biases on sequence abundance of the SSU rRNA gene in foraminifera. First, we quantified the rDNA copy number and rRNA expression level of three species of foraminifera by qPCR. Then, we prepared five mock communities with these species, two in equal proportions and three with one species ten times more abundant. The libraries of rDNA and cDNA of the mock communities were constructed and Sanger sequenced, and the sequence abundance was calculated. The initial species proportions were compared to the raw sequence proportions as well as to the sequence abundance normalized by rDNA copy number and rRNA expression level per species. Our results showed that without normalization, all sequence data differed significantly from the initial proportions. After normalization, the congruence between the number of sequences and number of specimens was much better. We conclude that without normalization, species abundance determination based on sequence data was not possible because of the effect of biological biases. Nevertheless, by taking into account the variation of rDNA copy number and rRNA expression level we were able to infer species abundance, suggesting that our approach can be successful in controlled conditions.
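
    The normalization step described above can be sketched in Python as follows. This is a simplified illustration that divides each species' sequence count by a per-specimen rDNA copy number and then rescales to proportions. The species labels, counts, and copy numbers are invented for the example and are not values from the study, which also normalized by rRNA expression level.

```python
def normalized_proportions(seq_counts, copies_per_specimen):
    """Convert raw sequence counts into estimated specimen proportions.

    seq_counts: dict mapping species to number of sequences observed.
    copies_per_specimen: dict mapping species to rDNA copies per specimen.
    Returns a dict mapping species to its estimated share of specimens.
    """
    # Dividing by copy number estimates how many specimens produced the reads.
    specimens = {
        sp: seq_counts[sp] / copies_per_specimen[sp] for sp in seq_counts
    }
    total = sum(specimens.values())
    return {sp: n / total for sp, n in specimens.items()}


# Hypothetical mock community mixed in equal proportions.
counts = {"species_a": 600, "species_b": 300, "species_c": 100}
copies = {"species_a": 6, "species_b": 3, "species_c": 1}

raw = {sp: c / sum(counts.values()) for sp, c in counts.items()}
norm = normalized_proportions(counts, copies)
print(raw)   # raw proportions are skewed toward the high-copy species
print(norm)  # normalized proportions recover the equal mix
```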

  9. SEED 2: a user-friendly platform for amplicon high-throughput sequencing data analyses.

    Science.gov (United States)

    Vetrovský, Tomáš; Baldrian, Petr; Morais, Daniel; Berger, Bonnie

    2018-02-14

    Modern molecular methods have increased our ability to describe microbial communities. Along with the advances brought by new sequencing technologies, we now require intensive computational resources to make sense of the large numbers of sequences continuously produced. The software tools developed by the scientific community to address this demand, although very useful, require experience with the command-line environment and extensive training, and have steep learning curves, limiting their use. We created SEED 2, a graphical user interface for handling high-throughput amplicon-sequencing data under Windows operating systems. SEED 2 is the only sequence visualizer that empowers users with tools to handle amplicon-sequencing data of microbial community markers. It is suitable for any marker gene sequences obtained through Illumina, IonTorrent or Sanger sequencing. SEED 2 allows the user to process raw sequencing data, identify specific taxa, produce OTU tables, create sequence alignments and construct phylogenetic trees. Standard dual core laptops with 8 GB of RAM can handle ca. 8 million Illumina PE 300 bp sequences (ca. 4 GB of data). SEED 2 was implemented in Object Pascal and uses internal functions and external software for amplicon data processing. SEED 2 is freeware, available at http://www.biomed.cas.cz/mbu/lbwrf/seed/ as a self-contained file, including all the dependencies, and does not require installation. Supplementary data contain a comprehensive list of supported functions. Contact: daniel.morais@biomed.cas.cz. Supplementary data are available at Bioinformatics online. © The Author(s) 2018. Published by Oxford University Press.

  10. Sequence assembly

    DEFF Research Database (Denmark)

    Scheibye-Alsing, Karsten; Hoffmann, S.; Frankel, Annett Maria

    2009-01-01

    Despite the rapidly increasing number of sequenced and re-sequenced genomes, many issues regarding the computational assembly of large-scale sequencing data remain unresolved. Computational assembly is crucial in large genome projects as well as for the evolving high-throughput technologies and...... in genomic DNA, highly expressed genes and alternative transcripts in EST sequences. We summarize existing comparisons of different assemblers and provide detailed descriptions and directions for download of assembly programs at: http://genome.ku.dk/resources/assembly/methods.html....

  11. Investigating the mechanisms of glyphosate resistance in goosegrass (Eleusine indica (L.) Gaertn.) by RNA sequencing technology.

    Science.gov (United States)

    Chen, Jingchao; Huang, Hongjuan; Wei, Shouhui; Huang, Zhaofeng; Wang, Xu; Zhang, Chaoxian

    2017-01-01

    Glyphosate is an important non-selective herbicide that is in common use worldwide. However, evolved glyphosate-resistant (GR) weeds significantly affect crop yields. Unfortunately, the mechanisms underlying resistance in GR weeds, such as goosegrass (Eleusine indica (L.) Gaertn.), an annual weed found worldwide, have not been fully elucidated. In this study, transcriptome analysis was conducted to further assess the potential mechanisms of glyphosate resistance in goosegrass. The RNA sequencing libraries generated 24 597 462 clean reads. De novo assembly analysis produced 48 852 UniGenes with an average length of 847 bp. All UniGenes were annotated using seven databases. Sixteen candidate differentially expressed genes selected by digital gene expression analysis were validated by quantitative real-time PCR (qRT-PCR). Among these UniGenes, the EPSPS and PFK genes were constitutively up-regulated in resistant (R) individuals and showed a higher copy number than that in susceptible (S) individuals. The expressions of four UniGenes relevant to photosynthesis were inhibited by glyphosate in S individuals, and this toxic response was confirmed by gas exchange analysis. Two UniGenes annotated as glutathione transferase (GST) were constitutively up-regulated in R individuals, and were induced by glyphosate both in R and S. In addition, the GST activities in R individuals were higher than in S. Our research confirmed that two UniGenes (PFK, EPSPS) were strongly associated with target resistance, and two GST-annotated UniGenes may play a role in metabolic glyphosate resistance in goosegrass. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.

  12. A next generation semiconductor based sequencing approach for the identification of meat species in DNA mixtures.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Full Text Available The identification of the species of origin of meat and meat products is an important issue to prevent and detect frauds that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine) for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon) as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different pairs of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides, of which 29,109,688 were Q20 (87.43%), in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation of the number of reads per species between different libraries was high for mammalian species (0.97) and lower for avian species (0.70). PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low levels of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.
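
    The reported Q20 share can be checked directly from the three totals quoted in the abstract. The short Python sketch below does that arithmetic and also derives a mean read length, a figure the abstract does not state and which is computed here only as an illustration. The helper names are hypothetical.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Totals quoted in the abstract.
called_nucleotides = 33_294_511
q20_nucleotides = 29_109_688
total_reads = 215_944

q20_share = percent(q20_nucleotides, called_nucleotides)
mean_read_length = called_nucleotides / total_reads  # derived, not reported

print(f"Q20 share: {q20_share:.2f}%")
print(f"Mean called bases per read: {mean_read_length:.1f}")
```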

  13. A high-throughput method to detect RNA profiling by integration of RT-MLPA with next generation sequencing technology.

    Science.gov (United States)

    Wang, Jing; Yang, Xue; Chen, Haofeng; Wang, Xuewei; Wang, Xiangyu; Fang, Yi; Jia, Zhenyu; Gao, Jidong

    2017-07-11

    RNA in formalin-fixed and paraffin-embedded (FFPE) tissues provides a large amount of information indicating disease stages, histological tumor types and grades, as well as clinical outcomes. However, detection of RNA expression levels in FFPE samples is extremely difficult due to poor RNA quality. Here we developed a high-throughput method, Reverse Transcription-Multiple Ligation-dependent Probe Sequencing (RT-MLPSeq), to determine expression levels of multiple transcripts in FFPE samples. By combining the Reverse Transcription-Multiple Ligation-dependent Amplification method and next generation sequencing technology, RT-MLPSeq overcomes the limit of probe length in the multiplex ligation-dependent probe amplification assay and thus could detect expression levels of transcripts without quantitative limitations. We showed that different RT-MLPSeq probes targeting the same transcripts give highly consistent results and that the starting RNA/cDNA input could be as little as 1 ng. RT-MLPSeq also yielded relative RNA levels for 13 selected genes that were consistent with reverse transcription quantitative PCR. Finally, we demonstrated the application of the new RT-MLPSeq method by measuring the mRNA expression levels of 21 genes which can be used for accurate calculation of the breast cancer recurrence score - an index that has been widely used for managing breast cancer patients.

  14. [Research on soil bacteria under the impact of sealed CO2 leakage by high-throughput sequencing technology].

    Science.gov (United States)

    Tian, Di; Ma, Xin; Li, Yu-E; Zha, Liang-Song; Wu, Yang; Zou, Xiao-Xia; Liu, Shuang

    2013-10-01

    Carbon dioxide capture and storage has provided a new option for mitigating global anthropogenic CO2 emission with its unique advantages. However, there is a risk of leakage of the sealed CO2, which poses a serious threat to the ecosystem. It is widely known that soil microorganisms are closely related to soil health, while studies on the impact of sequestered CO2 leakage on soil microorganisms are scarce. In this study, leakage scenarios of sealed CO2 were constructed and the 16S rRNA genes of soil bacteria were sequenced by Illumina high-throughput sequencing technology on the MiSeq platform, and related biological analysis was conducted to explore the changes in soil bacterial abundance, diversity and structure. There were 486,645 reads for 43,017 OTUs from 15 soil samples. The results of the biological analysis showed differences in the abundance, diversity and community structure of the soil bacterial community under different CO2 leakage scenarios: the abundance and diversity of the bacterial community declined with increasing CO2 leakage quantity and leakage time, and some bacterial species became dominant in the community. The increase in Acidobacteria species could therefore serve as a biological indicator of the impact of sealed CO2 leakage on the soil ecosystem.

  15. Shedding light on the Early Pleistocene of TD6 (Gran Dolina, Atapuerca, Spain): The technological sequence and occupational inferences.

    Science.gov (United States)

    Mosquera, Marina; Ollé, Andreu; Rodríguez-Álvarez, Xose Pedro; Carbonell, Eudald

    2018-01-01

    This paper aims to update the information available on the lithic assemblage from the entire sequence of TD6 now that the most recent excavations have been completed, and to explore possible changes in both occupational patterns and technological strategies evidenced in the unit. This is the first study to analyse the entire TD6 sequence, including subunits TD6.3 and TD6.1, which have never been studied, along with the better-known TD6.2 Homo antecessor-bearing subunit. We also present an analysis of several lithic refits found in TD6, as well as certain technical features that may help characterise the hominin occupations. The archaeo-palaeontological record from TD6 consists of 9,452 faunal remains, 443 coprolites, 1,046 lithic pieces, 170 hominin remains and 91 Celtis seeds. The characteristics of this record seem to indicate two main stages of occupation. In the oldest subunit, TD6.3, the lithic assemblage points to the light and limited hominin occupation of the cave, which does, however, grow over the course of the level. In contrast, the lithic assemblages from TD6.2 and TD6.1 are rich and varied, which may reflect Gran Dolina cave's establishment as a landmark in the region. Despite the occupational differences between the lowermost subunit and the rest of the deposit, technologically the TD6 lithic assemblage is extremely homogeneous throughout. In addition, the composition and spatial distribution of the 12 groups of lithic refits found in unit TD6, as well as the in situ nature of the assemblage demonstrate the high degree of preservation at the site. This may help clarify the nature of the Early Pleistocene hominin occupations of TD6, and raise reasonable doubt about the latest interpretations that support the ex situ character of the assemblage as a whole.

  16. Implementation of Targeted Next Generation Sequencing in Clinical Diagnostics

    DEFF Research Database (Denmark)

    Larsen, Martin Jakob; Burton, Mark; Thomassen, Mads

    Accurate mutation detection is essential in clinical genetic diagnostics of monogenic hereditary diseases. Targeted next generation sequencing (NGS) provides a promising and cost-effective alternative to Sanger sequencing and MLPA analysis currently used in most diagnostic laboratories. One...... of mutation positive controls previously characterized by Sanger/MLPA analysis. Agilent SureSelect Target-Enrichment kits were used for capturing a set of genes associated with hereditary breast and ovarian cancer syndrome and a compilation of genes involved in multiple rare single gene disorders......, respectively. For diagnostics, the sequencing coverage is essential, wherefore a minimum coverage of 30x per nucleotide in the coding regions was used as our primary quality criterion. For the majority of the included genes, we obtained adequate gene coverage, in which we were able to detect 100% of the known...

  17. First fungal genome sequence from Africa: A preliminary analysis

    Directory of Open Access Journals (Sweden)

    Rene Sutherland

    2012-01-01

    Full Text Available Some of the most significant breakthroughs in the biological sciences this century will emerge from the development of next generation sequencing technologies. The ease of availability of DNA sequence made possible through these new technologies has given researchers opportunities to study organisms in a manner that was not possible with Sanger sequencing. Scientists will, therefore, need to embrace genomics, as well as develop and nurture the human capacity to sequence genomes and utilise the 'tsunami' of data that emerge from genome sequencing. In response to these challenges, we sequenced the genome of Fusarium circinatum, a fungal pathogen of pine that causes pitch canker, a disease of great concern to the South African forestry industry. The sequencing work was conducted in South Africa, making F. circinatum the first eukaryotic organism for which the complete genome has been sequenced locally. Here we report on the process that was followed to sequence, assemble and perform a preliminary characterisation of the genome. Furthermore, details of the computer annotation and manual curation of this genome are presented. The F. circinatum genome was found to be nearly 44 million bases in size, which is similar to that of four other Fusarium genomes that have been sequenced elsewhere. The genome contains just over 15 000 open reading frames, which is fewer than that of the related species, Fusarium oxysporum, but more than that for Fusarium verticillioides. Amongst the various putative gene clusters identified in F. circinatum, those encoding the secondary metabolites fumosin and fusarin appeared to harbour evidence of gene translocation. It is anticipated that similar comparisons of other loci will provide insights into the genetic basis for pathogenicity of the pitch canker pathogen. Perhaps more importantly, this project has engaged a relatively large group of scientists

  18. Discrimination of the Lactobacillus acidophilus group using sequencing, species-specific PCR and SNaPshot mini-sequencing technology based on the recA gene.

    Science.gov (United States)

    Huang, Chien-Hsun; Chang, Mu-Tzu; Huang, Mu-Chiou; Wang, Li-Tin; Huang, Lina; Lee, Fwu-Ling

    2012-10-01

    To clearly identify specific species and subspecies of the Lactobacillus acidophilus group using phenotypic and genotypic (16S rDNA sequence analysis) techniques alone is difficult. The aim of this study was to use the recA gene for species discrimination in the L. acidophilus group, as well as to develop a species-specific primer and single nucleotide polymorphism primer based on the recA gene sequence for species and subspecies identification. The average sequence similarity for the recA gene among type strains was 80.0%, and most members of the L. acidophilus group could be clearly distinguished. The species-specific primer was designed according to the recA gene sequencing, which was employed for polymerase chain reaction with the template DNA of Lactobacillus strains. A single 231-bp species-specific band was found only in L. delbrueckii. A SNaPshot mini-sequencing assay using recA as a target gene was also developed. The specificity of the mini-sequencing assay was evaluated using 31 strains of L. delbrueckii species and was able to unambiguously discriminate strains belonging to the subspecies L. delbrueckii subsp. bulgaricus. The phylogenetic relationships of most strains in the L. acidophilus group can be resolved using recA gene sequencing, and a novel method to identify the species and subspecies of the L. delbrueckii and L. delbrueckii subsp. bulgaricus was developed by species-specific polymerase chain reaction combined with SNaPshot mini-sequencing. Copyright © 2012 Society of Chemical Industry.

  19. Report on achievements in fiscal 1998 on research and development of the genome infomatics technology in the industrial and scientific technology research and development project. Research and development of the genome infomatics technology; 1998 nendo genome infomatics gijutsu kenkyu kaihatsu seika hokokusho

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-03-01

    This paper describes the achievements in fiscal 1998 on research and development of genome informatics technology. First, plasmid DNA was prepared as the template for the sequencing reaction; primers were prepared based on the base sequence of the terminal regions and used to perform the sequencing reaction for the next step; and the base sequence following the previously determined terminal sequence was then determined. This primer walking process was repeated, and the data obtained for each base sequence piece were merged to determine the base sequence in the complete-length cDNA300 clone. The complete-length cDNA853 for Homo sapiens was analyzed by using the primer walking process. The Sanger sequencing method was used for the reaction. The resultant sequence data were verified to be of a complete-length cDNA containing the actual protein-coding region from the N terminal to the C terminal. A cDNA database was newly constructed. The complete-length cDNA can be retrieved by using as search conditions the organ from which each sequence originates, the expression frequency therein, and a keyword representing the function. (NEDO)

  1. Clinical validation of targeted next-generation sequencing for inherited disorders.

    Science.gov (United States)

    Yohe, Sophia; Hauge, Adam; Bunjer, Kari; Kemmer, Teresa; Bower, Matthew; Schomaker, Matthew; Onsongo, Getiria; Wilson, Jon; Erdmann, Jesse; Zhou, Yi; Deshpande, Archana; Spears, Michael D; Beckman, Kenneth; Silverstein, Kevin A T; Thyagarajan, Bharat

    2015-02-01

    Although next-generation sequencing (NGS) can revolutionize molecular diagnostics, several hurdles remain in the implementation of this technology in clinical laboratories. To validate and implement an NGS panel for genetic diagnosis of more than 100 inherited diseases, such as neurologic conditions, congenital hearing loss and eye disorders, developmental disorders, nonmalignant diseases treated by hematopoietic cell transplantation, familial cancers, connective tissue disorders, metabolic disorders, disorders of sexual development, and cardiac disorders. The diagnostic gene panels ranged from 1 to 54 genes, with most panels containing 10 genes or fewer. We used a liquid hybridization-based, target-enrichment strategy to enrich 10 067 exons in 568 genes, followed by NGS with a HiSeq 2000 sequencing system (Illumina, San Diego, California). We successfully sequenced 97.6% (9825 of 10 067) of the targeted exons to obtain a minimum coverage of 20× at all bases. We demonstrated 100% concordance in detecting 19 pathogenic single-nucleotide variations and 11 pathogenic insertion-deletion mutations ranging in size from 1 to 18 base pairs across 18 samples that were previously characterized by Sanger sequencing. Using 4 pairs of blinded, duplicate samples, we demonstrated a high degree of concordance (>99%) among the blinded, duplicate pairs. We have successfully demonstrated the feasibility of using the NGS platform to multiplex genetic tests for several rare diseases and the use of cloud computing for bioinformatics analysis as a relatively low-cost solution for implementing NGS in clinical laboratories.
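
    The coverage criterion quoted above (an exon counts as sequenced only if every base reaches at least 20×) can be expressed as a short Python sketch. The function names and the toy depth values are hypothetical, and only the 9825 of 10 067 totals come from the abstract.

```python
def exon_passes(depths, min_depth=20):
    """True when every base in the exon reaches the minimum depth."""
    return len(depths) > 0 and min(depths) >= min_depth


def fraction_passing(exon_depths, min_depth=20):
    """Fraction of exons whose lowest per-base depth meets min_depth.

    exon_depths: dict mapping an exon identifier to its per-base depths.
    """
    passed = sum(
        1 for depths in exon_depths.values() if exon_passes(depths, min_depth)
    )
    return passed / len(exon_depths)


# Hypothetical per-base depths for four short exons.
toy_depths = {
    "exon_1": [35, 28, 22],
    "exon_2": [40, 19, 33],  # one base at 19x, so this exon fails
    "exon_3": [20, 20, 21],
    "exon_4": [55, 60, 48],
}
print(fraction_passing(toy_depths))

# Totals quoted in the abstract: 9825 of 10 067 targeted exons.
reported_percent = 100 * 9825 / 10067
print(f"{reported_percent:.1f}%")
```

    Judging an exon by its minimum depth rather than its mean is the stricter reading of "at all bases", since a single low-coverage base is enough to fail the exon.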

  2. The Quest for Rare Variants: Pooled Multiplexed Next Generation Sequencing in Plants

    OpenAIRE

    Fabio Marroni; Sara Pinosio; Michele Morgante

    2012-01-01

    Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, only three research groups working in plant sciences have exploited this potentiality. They showed that pooled NGS can provide results in excellent agreement with those obt...

  3. The Quest for Rare Variants: Pooled Multiplexed Next Generation Sequencing in Plants

    OpenAIRE

    Marroni, Fabio; Pinosio, Sara; Morgante, Michele

    2012-01-01

    Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, few research groups working in plant sciences have exploited this potentiality, showing that pooled NGS provides results in excellent agreement with those obtained by indiv...

  4. Validation of Ion Torrent™ Inherited Disease Panel with the PGM™ Sequencing Platform for Rapid and Comprehensive Mutation Detection

    Directory of Open Access Journals (Sweden)

    Abeer E. Mustafa

    2018-05-01

    Full Text Available Quick and accurate molecular testing is necessary for the better management of many inherited diseases. Recent technological advances in various next generation sequencing (NGS) platforms, such as target panel-based sequencing, have enabled comprehensive, quick, and precise interrogation of many genetic variations. As a result, these technologies have become a valuable tool for gene discovery and for clinical diagnostics. The AmpliSeq Inherited Disease Panel (IDP) consists of 328 genes underlying more than 700 inherited diseases. Here, we aimed to assess the performance of the IDP as a sensitive, rapid, and comprehensive gene panel test. A total of 88 patients with inherited diseases and causal mutations that were previously identified by Sanger sequencing were randomly selected for assessing the performance of the IDP. The IDP successfully detected 93.1% of the mutations in our validation cohort, achieving high overall gene coverage (98%). The sensitivity for detecting single nucleotide variants (SNVs) and short indels was 97.3% and 69.2%, respectively. IDP, when coupled with the Ion Torrent Personal Genome Machine (PGM), delivers comprehensive and rapid sequencing for genes that are responsible for various inherited diseases. Our validation results suggest the suitability of this panel for use as a first-line screening test after applying the necessary clinical validation.

  5. Use of Metagenomic Shotgun Sequencing Technology To Detect Foodborne Pathogens within the Microbiome of the Beef Production Chain.

    Science.gov (United States)

    Yang, Xiang; Noyes, Noelle R; Doster, Enrique; Martin, Jennifer N; Linke, Lyndsey M; Magnuson, Roberta J; Yang, Hua; Geornaras, Ifigenia; Woerner, Dale R; Jones, Kenneth L; Ruiz, Jaime; Boucher, Christina; Morley, Paul S; Belk, Keith E

    2016-04-01

    Foodborne illnesses associated with pathogenic bacteria are a global public health and economic challenge. The diversity of microorganisms (pathogenic and nonpathogenic) that exists within the food and meat industries complicates efforts to understand pathogen ecology. Further, little is known about the interaction of pathogens within the microbiome throughout the meat production chain. Here, a metagenomic approach and shotgun sequencing technology were used as tools to detect pathogenic bacteria in environmental samples collected from the same groups of cattle at different longitudinal processing steps of the beef production chain: cattle entry to feedlot, exit from feedlot, cattle transport trucks, abattoir holding pens, and the end of the fabrication system. The log read counts classified as pathogens per million reads for Salmonella enterica, Listeria monocytogenes, Escherichia coli, Staphylococcus aureus, Clostridium spp. (C. botulinum and C. perfringens), and Campylobacter spp. (C. jejuni, C. coli, and C. fetus) decreased over subsequent processing steps. Furthermore, the normalized read counts for S. enterica, E. coli, and C. botulinum were greater in the final product than at the feedlots, indicating that the proportion of these bacteria increased (the effect on absolute numbers was unknown) within the remaining microbiome. From an ecological perspective, data indicated that shotgun metagenomics can be used to evaluate not only the microbiome but also shifts in pathogen populations during beef production. Nonetheless, there were several challenges in this analysis approach, one of the main ones being the identification of the specific pathogen from which the sequence reads originated, which makes this approach impractical for use in pathogen identification for regulatory and confirmation purposes. Copyright © 2016 Yang et al.
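
    The "log read counts classified as pathogens per million reads" measure mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not give the log base or how zero counts were handled, so the base-10 logarithm, the pseudocount of 1, the function name, and the toy counts are all hypothetical choices rather than the authors' method.

```python
import math
from typing import Dict


def log_reads_per_million(
    pathogen_reads: Dict[str, int],
    total_reads: int,
    pseudocount: float = 1.0,  # assumption: avoids log10(0) for undetected taxa
) -> Dict[str, float]:
    """Scale each taxon's read count to reads per million, then take log10."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {
        taxon: math.log10(count * 1_000_000 / total_reads + pseudocount)
        for taxon, count in pathogen_reads.items()
    }


# Toy sample (hypothetical counts): 2 million classified reads in total.
sample = {"Salmonella enterica": 198, "Escherichia coli": 0}
normalized = log_reads_per_million(sample, total_reads=2_000_000)
```

    Dividing by the total read count makes samples of different sequencing depth comparable, which is also why the abstract notes that such values describe proportions within the microbiome and not absolute numbers.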

  6. In vitro identification and in silico utilization of interspecies sequence similarities using GeneChip® technology

    Directory of Open Access Journals (Sweden)

    Ye Shui Q

    2005-05-01

    Full Text Available Abstract Background: Genomic approaches in large animal models (canine, ovine, etc.) are challenging due to insufficient genomic information for these species and the lack of availability of corresponding microarray platforms. To address this problem, we speculated that conserved interspecies genetic sequences can be experimentally detected by cross-species hybridization. The Affymetrix platform probe redundancy offers flexibility in selecting individual probes with high sequence similarities between related species for gene expression analysis. Results: Gene expression profiles of 40 canine samples were generated using the human HG-U133A GeneChip (U133A). Due to interspecies genetic differences, only 14 ± 2% of canine transcripts were detected by U133A probe sets, whereas profiling of 40 human samples detected 49 ± 6% of human transcripts. However, when these probe sets were deconstructed into individual probes and the performance of each probe was examined, we found that 47% of human probes were able to find their targets in canine tissues and generate a detectable hybridization signal. Therefore, we restricted gene expression analysis to these probes and observed a 60% increase in the number of identified canine transcripts. These results were validated by comparison of transcripts identified by our restricted analysis of cross-species hybridization with transcripts identified by hybridization of total lung canine mRNA to the new Affymetrix Canine GeneChip®. Conclusion: The experimental identification and restriction of gene expression analysis to probes with detectable hybridization signal drastically increases transcript detection of canine-human hybridization, suggesting the possibility of broad utilization of cross-hybridizations of related species using GeneChip technology.

  7. A comparative study of mutation screening of sarcomeric genes (MYBPC3, MYH7, TNNT2) using single gene approach versus targeted gene panel next generation sequencing in a cohort of HCM patients in Egypt

    Directory of Open Access Journals (Sweden)

    Heba Sh. Kassem

    2017-10-01

    Full Text Available Background: NGS enables simultaneous sequencing of large numbers of associated genes in genetically heterogeneous disorders, in a more rapid and cost-effective manner than traditional technologies. However, there have been limited direct comparisons between NGS and more established technologies to assess the sensitivity and false negative rates of this new approach. The scope of the present manuscript is to compare variants detected in MYBPC3, MYH7 and TNNT2 genes using the stepwise dHPLC/Sanger approach versus targeted NGS. Methods: In this study, we have analysed a group of 150 samples of patients from the Bibliotheca Alexandrina-Aswan Heart Centre National HCM program. The genetic testing was simultaneously undertaken by high throughput denaturing high-performance liquid chromatography (dHPLC) followed by Sanger based sequencing and by targeted next generation deep sequencing using a panel of inherited cardiac genes (ICC). The panel included over 100 genes including the 3 sarcomeric genes. Analysis of the sequencing data of the 3 genes was undertaken in a double blinded strategy. Results: NGS analysis detected all pathogenic and likely pathogenic variants identified by dHPLC (50 in total; some samples had double hits). There was a 0% false negative rate for NGS based analysis. Nineteen variants were missed by dHPLC and detected by NGS, thus increasing the diagnostic yield in this co-analysed cohort from 22.0% (33/150) to 31.3% (47/150). Of interest, the mutation spectrum in this Egyptian HCM population revealed a high rate of homozygosity in MYBPC3 and MYH7 genes in comparison to other population studies (6/150, 4%). None of the homozygous samples were detected by dHPLC analysis. Conclusion: NGS provides a useful and rapid tool to allow panoramic screening of several genes simultaneously with a high sensitivity rate amongst genes of known etiologic role allowing high throughput analysis of HCM patients and relevant control series in a less characterised

  8. The diploid genome sequence of an individual human.

    Directory of Open Access Journals (Sweden)

    Samuel Levy

    2007-09-01

    Full Text Available Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
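
    The variant counts quoted in this abstract can be checked with a short Python tally. This is a sketch under one assumption: that the 22% figure is the share of the five enumerated variant classes that are not SNPs. Segmental duplications and copy number variation regions are not given as counts in the abstract, so they are left out, and the dictionary keys are illustrative names.

```python
# Counts as stated in the abstract; keys are illustrative labels.
variant_counts = {
    "SNP": 3_213_401,
    "block_substitution": 53_823,
    "heterozygous_indel": 292_102,
    "homozygous_indel": 559_473,
    "inversion": 90,
}

total_events = sum(variant_counts.values())
non_snp_events = total_events - variant_counts["SNP"]
non_snp_share = non_snp_events / total_events

print(f"total events:   {total_events:,}")
print(f"non-SNP events: {non_snp_events:,}")
print(f"non-SNP share:  {non_snp_share:.1%}")
```

    Under that assumption the tally is consistent with both the "more than 4.1 million DNA variants" and the "22% of all events" statements.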

  9. Description and pilot results from a novel method for evaluating return of incidental findings from next-generation sequencing technologies.

    Science.gov (United States)

    Goddard, Katrina A B; Whitlock, Evelyn P; Berg, Jonathan S; Williams, Marc S; Webber, Elizabeth M; Webster, Jennifer A; Lin, Jennifer S; Schrader, Kasmintan A; Campos-Outcalt, Doug; Offit, Kenneth; Feigelson, Heather Spencer; Hollombe, Celine

    2013-09-01

    The aim of this study was to develop, operationalize, and pilot test a transparent, reproducible, and evidence-informed method to determine when to report incidental findings from next-generation sequencing technologies. Using evidence-based principles, we proposed a three-stage process. Stage I "rules out" incidental findings below a minimal threshold of evidence and is evaluated using inter-rater agreement and comparison with an expert-based approach. Stage II documents criteria for clinical actionability using a standardized approach to allow experts to consistently consider and recommend whether results should be routinely reported (stage III). We used expert opinion to determine the face validity of stages II and III using three case studies. We evaluated the time and effort for stages I and II. For stage I, we assessed 99 conditions and found high inter-rater agreement (89%), and strong agreement with a separate expert-based method. Case studies for familial adenomatous polyposis, hereditary hemochromatosis, and α1-antitrypsin deficiency were all recommended for routine reporting as incidental findings. The method requires definition of clinically actionable incidental findings and provides documentation and pilot testing of a feasible method that is scalable to the whole genome.

  10. First Complete Genomic Sequence of a Rabies Virus from the Republic of Tajikistan Obtained Directly from a Flinders Technology Associates Card

    OpenAIRE

    Goharriz, H.; Marston, D. A.; Sharifzoda, F.; Ellis, R. J.; Horton, D. L.; Khakimov, T.; Whatmore, A.; Khamroev, K.; Makhmadshoev, A. N.; Bazarov, M.; Fooks, A. R.; Banyard, A. C.

    2017-01-01

    A brain homogenate derived from a rabid dog in the district of Tojikobod, Republic of Tajikistan, was applied to a Flinders Technology Associates (FTA) card. A full-genome sequence of rabies virus (RABV) was generated from the FTA card directly without extraction, demonstrating the utility of these cards for readily obtaining genetic data.

  11. First Complete Genomic Sequence of a Rabies Virus from the Republic of Tajikistan Obtained Directly from a Flinders Technology Associates Card.

    Science.gov (United States)

    Goharriz, H; Marston, D A; Sharifzoda, F; Ellis, R J; Horton, D L; Khakimov, T; Whatmore, A; Khamroev, K; Makhmadshoev, A N; Bazarov, M; Fooks, A R; Banyard, A C

    2017-07-06

    A brain homogenate derived from a rabid dog in the district of Tojikobod, Republic of Tajikistan, was applied to a Flinders Technology Associates (FTA) card. A full-genome sequence of rabies virus (RABV) was generated from the FTA card directly without extraction, demonstrating the utility of these cards for readily obtaining genetic data. © Crown copyright 2017.

  12. Next-Generation Sequencing-Based Detection of Germline Copy Number Variations in BRCA1/BRCA2

    DEFF Research Database (Denmark)

    Schmidt, Ane Y; Hansen, Thomas V O; Ahlborn, Lise B

    2017-01-01

    Genetic testing of BRCA1/2 includes screening for single nucleotide variants and small insertions/deletions and for larger copy number variations (CNVs), primarily by Sanger sequencing and multiplex ligation-dependent probe amplification (MLPA). With the advent of next-generation sequencing (NGS)...

  13. Artemis and ACT: viewing, annotating and comparing sequences stored in a relational database

    Science.gov (United States)

    Carver, Tim; Berriman, Matthew; Tivey, Adrian; Patel, Chinmay; Böhme, Ulrike; Barrell, Barclay G.; Parkhill, Julian; Rajandream, Marie-Adèle

    2008-01-01

    Motivation: Artemis and Artemis Comparison Tool (ACT) have become mainstream tools for viewing and annotating sequence data, particularly for microbial genomes. Since its first release, Artemis has been continuously developed and supported with additional functionality for editing and analysing sequences based on feedback from an active user community of laboratory biologists and professional annotators. Nevertheless, its utility has been somewhat restricted by its limitation to reading and writing from flat files. Therefore, a new version of Artemis has been developed, which reads from and writes to a relational database schema, and allows users to annotate more complex, often large and fragmented, genome sequences. Results: Artemis and ACT have now been extended to read and write directly to the Generic Model Organism Database (GMOD, http://www.gmod.org) Chado relational database schema. In addition, a Gene Builder tool has been developed to provide structured forms and tables to edit coordinates of gene models and edit functional annotation, based on standard ontologies, controlled vocabularies and free text. Availability: Artemis and ACT are freely available (under a GPL licence) for download (for MacOSX, UNIX and Windows) at the Wellcome Trust Sanger Institute web sites: http://www.sanger.ac.uk/Software/Artemis/ http://www.sanger.ac.uk/Software/ACT/ Contact: artemis@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:18845581

  14. Genetic mapping and exome sequencing identify variants associated with five novel diseases.

    Directory of Open Access Journals (Sweden)

    Erik G Puffenberger

    Full Text Available The Clinic for Special Children (CSC) has integrated biochemical and molecular methods into a rural pediatric practice serving Old Order Amish and Mennonite (Plain) children. Among the Plain people, we have used single nucleotide polymorphism (SNP) microarrays to genetically map recessive disorders to large autozygous haplotype blocks (mean = 4.4 Mb) that contain many genes (mean = 79). For some, uninformative mapping or large gene lists preclude disease-gene identification by Sanger sequencing. Seven such conditions were selected for exome sequencing at the Broad Institute; all had been previously mapped at the CSC using low density SNP microarrays coupled with autozygosity and linkage analyses. Using between 1 and 5 patient samples per disorder, we identified sequence variants in the known disease-causing genes SLC6A3 and FLVCR1, and present evidence to strongly support the pathogenicity of variants identified in TUBGCP6, BRAT1, SNIP1, CRADD, and HARS. Our results reveal the power of coupling new genotyping technologies to population-specific genetic knowledge and robust clinical data.

  15. Toward an Integrated BAC Library Resource for Genome Sequencing and Analysis; FINAL

    International Nuclear Information System (INIS)

    Simon, M. I.; Kim, U.-J.

    2002-01-01

    We developed a great deal of expertise in building large BAC libraries from a variety of DNA sources including humans, mice, corn, microorganisms, worms, and Arabidopsis. We greatly improved the technology for screening these libraries rapidly and for selecting appropriate BACs and mapping BACs to develop large overlapping contigs. We became involved in supplying BACs and BAC contigs to a variety of sequencing and mapping projects and we began to collaborate with Drs. Adams and Venter at TIGR and with Dr. Leroy Hood and his group at University of Washington to provide BACs for end sequencing and for mapping and sequencing of large fragments of chromosome 16. Together with Dr. Ian Dunham and his co-workers at the Sanger Center we completed the mapping and they completed the sequencing of the first human chromosome, chromosome 22. This was published in Nature in 1999 and our BAC contigs made a major contribution to this sequencing effort. Drs. Shizuya and Ding invented an automated highly accurate BAC mapping technique. We also developed long-term collaborations with Dr. Uli Weier at UCSF in the design of BAC probes for characterization of human tumors and specific chromosome deletions and breakpoints. Finally the contribution of our work to the human genome project has been recognized in the publication both by the international consortium and the NIH of a draft sequence of the human genome in Nature last year. Dr. Shizuya was acknowledged in the authorship of that landmark paper. Dr. Simon was also an author on the Venter/Adams Celera project sequencing the human genome that was published in Science last year

  16. Detection of a divergent variant of grapevine virus F by next-generation sequencing.

    Science.gov (United States)

    Molenaar, Nicholas; Burger, Johan T; Maree, Hans J

    2015-08-01

    The complete genome sequence of a South African isolate of grapevine virus F (GVF) is presented. It was first detected by metagenomic next-generation sequencing of field samples and validated through direct Sanger sequencing. The genome sequence of GVF isolate V5 consists of 7539 nucleotides and contains a poly(A) tail. It has a typical vitivirus genome arrangement that comprises five open reading frames (ORFs), which share only 88.96 % nucleotide sequence identity with the existing complete GVF genome sequence (JX105428).

  17. Is Whole-Exome Sequencing an Ethically Disruptive Technology? Perspectives of Pediatric Oncologists and Parents of Pediatric Patients With Solid Tumors.

    Science.gov (United States)

    McCullough, Laurence B; Slashinski, Melody J; McGuire, Amy L; Street, Richard L; Eng, Christine M; Gibbs, Richard A; Parsons, D William; Plon, Sharon E

    2016-03-01

    It has been anticipated that physicians and parents will be ill prepared or unprepared for the clinical introduction of genome sequencing, making it ethically disruptive. As a part of the Baylor Advancing Sequencing in Childhood Cancer Care study, we conducted semistructured interviews with 16 pediatric oncologists and 40 parents of pediatric patients with cancer prior to the return of sequencing results. We elicited expectations and attitudes concerning the impact of sequencing on clinical decision making, clinical utility, and treatment expectations from both groups. Using accepted methods of qualitative research to analyze interview transcripts, we completed a thematic analysis to provide inductive insights into their views of sequencing. Our major findings reveal that neither pediatric oncologists nor parents anticipate sequencing to be an ethically disruptive technology, because they expect to be prepared to integrate sequencing results into their existing approaches to learning and using new clinical information for care. Pediatric oncologists do not expect sequencing results to be more complex than other diagnostic information and plan simply to incorporate these data into their evidence-based approach to clinical practice, although they were concerned about impact on parents. For parents, there is an urgency to protect their child's health and in this context they expect genomic information to better prepare them to participate in decisions about their child's care. Our data do not support the concern that introducing genome sequencing into childhood cancer care will be ethically disruptive, that is, leave physicians or parents ill prepared or unprepared to make responsible decisions about patient care. © 2015 Wiley Periodicals, Inc.

  18. Identification of Heterozygous Single- and Multi-exon Deletions in IL7R by Whole Exome Sequencing.

    OpenAIRE

    Engelhardt, Karin R; Xu, Yaobo; Grainger, Angela; Germani Batacchi, Mila G C; Swan, David J; Willet, Joseph D P; Abd Hamid, Intan J; Agyeman, Philipp; Barge, Dawn; Bibi, Shahnaz; Jenkins, Lucy; Flood, Terence J; Abinun, Mario; Slatter, Mary A; Gennery, Andrew R

    2017-01-01

    Purpose We aimed to achieve a retrospective molecular diagnosis by applying state-of-the-art genomic sequencing methods to past patients with T-B+NK+ severe combined immunodeficiency (SCID). We included identification of copy number variations (CNVs) by whole exome sequencing (WES) using the CNV calling method ExomeDepth to detect gene alterations for which routine Sanger sequencing analysis is not suitable, such as large heterozygous deletions. Methods Of a total of 12 undiagnosed patients w...

  19. Exome sequencing identifies mutations in ABCD1 and DACH2 in two brothers with a distinct phenotype

    OpenAIRE

    Zhang, Yanliang; Liu, Yanhui; Li, Ya; Duan, Yong; Zhang, Keyun; Wang, Junwang; Dai, Yong

    2014-01-01

    Background We report on two brothers with a distinct syndromic phenotype and explore the potential pathogenic cause. Methods Cytogenetic tests and exome sequencing were performed on the two brothers and their parents. Variants detected by exome sequencing were validated by Sanger sequencing. Results The main phenotype of the two brothers included congenital language disorder, growth retardation, intellectual disability, difficulty in standing and walking, and urinary and fecal incontinence. T...

  20. The objective of this program is to develop innovative DNA detection technologies to achieve fast microbial community assessment. The specific approaches are (1) to develop inexpensive and reliable sequence-proof hybridization DNA detection technology (2) to develop quantitative DNA hybridization technology for microbial community assessment and (3) to study the microbes which have demonstrated the potential to have nuclear waste bioremediation

    International Nuclear Information System (INIS)

    Chen, Chung H.

    2004-01-01

    The objective of this program is to develop innovative DNA detection technologies to achieve fast microbial community assessment. The specific approaches are (1) to develop inexpensive and reliable sequence-proof hybridization DNA detection technology (2) to develop quantitative DNA hybridization technology for microbial community assessment and (3) to study the microbes which have demonstrated the potential to have nuclear waste bioremediation

  1. Real-time DNA barcoding in a rainforest using nanopore sequencing: opportunities for rapid biodiversity assessments and local capacity building.

    Science.gov (United States)

    Pomerantz, Aaron; Peñafiel, Nicolás; Arteaga, Alejandro; Bustamante, Lucas; Pichardo, Frank; Coloma, Luis A; Barrio-Amorós, César L; Salazar-Valenzuela, David; Prost, Stefan

    2018-04-01

    Advancements in portable scientific instruments provide promising avenues to expedite field work in order to understand the diverse array of organisms that inhabit our planet. Here, we tested the feasibility for in situ molecular analyses of endemic fauna using a portable laboratory fitting within a single backpack in one of the world's most imperiled biodiversity hotspots, the Ecuadorian Chocó rainforest. We used portable equipment, including the MinION nanopore sequencer (Oxford Nanopore Technologies) and the miniPCR (miniPCR), to perform DNA extraction, polymerase chain reaction amplification, and real-time DNA barcoding of reptile specimens in the field. We demonstrate that nanopore sequencing can be implemented in a remote tropical forest to quickly and accurately identify species using DNA barcoding, as we generated consensus sequences for species resolution with an accuracy of >99% in less than 24 hours after collecting specimens. The flexibility of our mobile laboratory further allowed us to generate sequence information at the Universidad Tecnológica Indoamérica in Quito for rare, endangered, and undescribed species. This includes the recently rediscovered Jambato toad, which was thought to be extinct for 28 years. Sequences generated on the MinION required as few as 30 reads to achieve high accuracy relative to Sanger sequencing, and with further multiplexing of samples, nanopore sequencing can become a cost-effective approach for rapid and portable DNA barcoding. Overall, we establish how mobile laboratories and nanopore sequencing can help to accelerate species identification in remote areas to aid in conservation efforts and be applied to research facilities in developing countries. This opens up possibilities for biodiversity studies by promoting local research capacity building, teaching nonspecialists and students about the environment, tackling wildlife crime, and promoting conservation via research-focused ecotourism.

  2. Identification and verification of hybridoma-derived monoclonal antibody variable region sequences using recombinant DNA technology and mass spectrometry.

    Science.gov (United States)

    Babrak, Lmar; McGarvey, Jeffery A; Stanker, Larry H; Hnasko, Robert

    2017-10-01

    Antibody engineering requires the identification of antigen binding domains or variable regions (VR) unique to each antibody. It is the VR that define the unique antigen binding properties, and proper sequence identification is essential for functional evaluation and performance of recombinant antibodies (rAb). This determination can be achieved by sequence analysis of immunoglobulin (Ig) transcripts obtained from a monoclonal antibody (MAb) producing hybridoma and subsequent expression of a rAb. However, the polyploid nature of a hybridoma cell often results in the added expression of aberrant immunoglobulin-like transcripts or even production of anomalous antibodies, which can confound production of rAb. An incorrect VR sequence will result in a non-functional rAb, and de novo assembly of Ig primary structure without a sequence map is challenging. To address these problems, we have developed a methodology which combines: 1) selective PCR amplification of VR from both the heavy and light chain IgG from hybridoma, 2) molecular cloning and DNA sequence analysis and 3) tandem mass spectrometry (MS/MS) on enzyme digests obtained from the purified IgG. Peptide analysis proceeds by evaluating coverage of the predicted primary protein sequence provided by the initial DNA maps for the VR. This methodology serves to both identify and verify the primary structure of the MAb VR for production as rAb. Published by Elsevier Ltd.

  3. A Bioinformatic Pipeline for Monitoring of the Mutational Stability of Viral Drug Targets with Deep-Sequencing Technology.

    Science.gov (United States)

    Kravatsky, Yuri; Chechetkin, Vladimir; Fedoseeva, Daria; Gorbacheva, Maria; Kravatskaya, Galina; Kretova, Olga; Tchurikov, Nickolai

    2017-11-23

    The efficient development of antiviral drugs, including efficient antiviral small interfering RNAs (siRNAs), requires continuous monitoring of the strict correspondence between a drug and the related highly variable viral DNA/RNA target(s). Deep sequencing is able to provide an assessment of both the general target conservation and the frequency of particular mutations in the different target sites. The aim of this study was to develop a reliable bioinformatic pipeline for the analysis of millions of short, deep sequencing reads corresponding to selected highly variable viral sequences that are drug target(s). The suggested bioinformatic pipeline combines the available programs and the ad hoc scripts based on an original algorithm of the search for the conserved targets in the deep sequencing data. We also present the statistical criteria for the threshold of reliable mutation detection and for the assessment of variations between corresponding data sets. These criteria are robust against the possible sequencing errors in the reads. As an example, the bioinformatic pipeline is applied to the study of the conservation of RNA interference (RNAi) targets in human immunodeficiency virus 1 (HIV-1) subtype A. The developed pipeline is freely available to download at the website http://virmut.eimb.ru/. Brief comments and comparisons between VirMut and other pipelines are also presented.

  4. A Bioinformatic Pipeline for Monitoring of the Mutational Stability of Viral Drug Targets with Deep-Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Yuri Kravatsky

    2017-11-01

    Full Text Available The efficient development of antiviral drugs, including efficient antiviral small interfering RNAs (siRNAs), requires continuous monitoring of the strict correspondence between a drug and the related highly variable viral DNA/RNA target(s). Deep sequencing is able to provide an assessment of both the general target conservation and the frequency of particular mutations in the different target sites. The aim of this study was to develop a reliable bioinformatic pipeline for the analysis of millions of short, deep sequencing reads corresponding to selected highly variable viral sequences that are drug target(s). The suggested bioinformatic pipeline combines the available programs and the ad hoc scripts based on an original algorithm of the search for the conserved targets in the deep sequencing data. We also present the statistical criteria for the threshold of reliable mutation detection and for the assessment of variations between corresponding data sets. These criteria are robust against the possible sequencing errors in the reads. As an example, the bioinformatic pipeline is applied to the study of the conservation of RNA interference (RNAi) targets in human immunodeficiency virus 1 (HIV-1) subtype A. The developed pipeline is freely available to download at the website http://virmut.eimb.ru/. Brief comments and comparisons between VirMut and other pipelines are also presented.

  5. Development and validation of a 36-gene sequencing assay for hereditary cancer risk assessment

    Directory of Open Access Journals (Sweden)

    Valentina S. Vysotskaia

    2017-02-01

    Full Text Available The past two decades have brought many important advances in our understanding of the hereditary susceptibility to cancer. Numerous studies have provided convincing evidence that identification of germline mutations associated with hereditary cancer syndromes can lead to reductions in morbidity and mortality through targeted risk management options. Additionally, advances in gene sequencing technology now permit the development of multigene hereditary cancer testing panels. Here, we describe the 2016 revision of the Counsyl Inherited Cancer Screen for detecting single-nucleotide variants (SNVs), short insertions and deletions (indels), and copy number variants (CNVs) in 36 genes associated with an elevated risk for breast, ovarian, colorectal, gastric, endometrial, pancreatic, thyroid, prostate, melanoma, and neuroendocrine cancers. To determine test accuracy and reproducibility, we performed a rigorous analytical validation across 341 samples, including 118 cell lines and 223 patient samples. The screen achieved 100% test sensitivity across different mutation types, with high specificity and 100% concordance with conventional Sanger sequencing and multiplex ligation-dependent probe amplification (MLPA). We also demonstrated the screen’s high intra-run and inter-run reproducibility and robust performance on blood and saliva specimens. Furthermore, we showed that pathogenic Alu element insertions can be accurately detected by our test. Overall, the validation in our clinical laboratory demonstrated the analytical performance required for collecting and reporting genetic information related to risk of developing hereditary cancers.

  6. High throughput resistance profiling of Plasmodium falciparum infections based on custom dual indexing and Illumina next generation sequencing-technology

    DEFF Research Database (Denmark)

    Nag, Sidsel; Dalgaard, Marlene Danner; Kofoed, Poul-Erik

    2017-01-01

    Genetic polymorphisms in P. falciparum can be used to indicate the parasite's susceptibility to antimalarial drugs as well as its geographical origin. Both of these factors are key to monitoring development and spread of antimalarial drug resistance. In this study, we combine multiplex PCR, custom...... designed dual indexing and Miseq sequencing for high throughput SNP-profiling of 457 malaria infections from Guinea-Bissau, at the cost of 10 USD per sample. By amplifying and sequencing 15 genetic fragments, we cover 20 resistance-conferring SNPs occurring in pfcrt, pfmdr1, pfdhfr, pfdhps, as well...

  7. Transcriptome sequencing of the blind subterranean mole rat, Spalax galili: Utility and potential for the discovery of novel evolutionary patterns

    KAUST Repository

    Malik, Assaf

    2011-08-12

    The blind subterranean mole rat (Spalax ehrenbergi superspecies) is a model animal for survival under extreme environments due to its ability to live in underground habitats under severe hypoxic stress and darkness. Here we report the transcriptome sequencing of Spalax galili, a chromosomal type of S. ehrenbergi. cDNA pools from muscle and brain tissues isolated from animals exposed to hypoxic and normoxic conditions were sequenced using Sanger, GS FLX, and GS FLX Titanium technologies. Assembly of the sequences yielded over 51,000 isotigs with homology to ~12,000 mouse, rat or human genes. Based on these results, it was possible to detect large numbers of splice variants, SNPs, and novel transcribed regions. In addition, multiple differential expression patterns were detected between tissues and treatments. The results presented here will serve as a valuable resource for future studies aimed at identifying genes and gene regions evolved during the adaptive radiation associated with underground life of the blind mole rat. 2011 Malik et al.

  8. Exome Sequencing Identified a Recessive RDH12 Mutation in a Family with Severe Early-Onset Retinitis Pigmentosa

    Directory of Open Access Journals (Sweden)

    Bo Gong

    2015-01-01

    Full Text Available Retinitis pigmentosa (RP) is the most important hereditary retinal disease caused by progressive degeneration of the photoreceptor cells. This study aimed to identify gene mutations responsible for autosomal recessive retinitis pigmentosa (arRP) in a Chinese family using next-generation sequencing technology. A Chinese family with 7 members, including two individuals affected with severe early-onset RP, was studied. All patients underwent a complete ophthalmic examination. Exome sequencing was performed on a single RP patient (the proband of this family), followed by direct Sanger sequencing of other family members and normal controls to confirm the causal mutations. A homozygous mutation c.437T

  9. Is the extraction by Whatman FTA filter matrix technology and sequencing of large ribosomal subunit D1-D2 region sufficient for identification of clinical fungi?

    Science.gov (United States)

    Kiraz, Nuri; Oz, Yasemin; Aslan, Huseyin; Erturan, Zayre; Ener, Beyza; Akdagli, Sevtap Arikan; Muslumanoglu, Hamza; Cetinkaya, Zafer

    2015-10-01

    Although conventional identification of pathogenic fungi is based on a combination of tests evaluating their morphological and biochemical characteristics, these tests can fail to identify less common species or to differentiate closely related species. In addition, these tests are time consuming and labour-intensive and require experienced personnel. We evaluated the feasibility and sufficiency of DNA extraction by Whatman FTA filter matrix technology and DNA sequencing of the D1-D2 region of the large ribosomal subunit gene for identification of clinical isolates of 21 yeasts and 160 moulds in our clinical mycology laboratory. While the yeast isolates were identified at species level with 100% homology, 102 (63.75%) clinically important mould isolates were identified at species level and 56 (35%) isolates at genus level against fungal sequences existing in DNA databases, and two (1.25%) isolates could not be identified. Consequently, Whatman FTA filter matrix technology was a useful method for extraction of fungal DNA: extremely rapid, practical and successful. Sequence analysis of the D1-D2 region of the large ribosomal subunit gene was found to be largely sufficient for identification to genus level for most clinical fungi. However, identification to species level, and especially discrimination of closely related species, may require additional analysis. © 2015 Blackwell Verlag GmbH.

  10. SNP discovery in the transcriptome of white Pacific shrimp Litopenaeus vannamei by next generation sequencing.

    Directory of Open Access Journals (Sweden)

    Yang Yu

    Full Text Available The application of next generation sequencing technology has greatly facilitated high throughput single nucleotide polymorphism (SNP) discovery and genotyping in genetic research. In the present study, SNPs were discovered based on two transcriptomes of Litopenaeus vannamei (L. vannamei) generated from Illumina sequencing platform HiSeq 2000. One transcriptome of L. vannamei was obtained through sequencing on the RNA from larvae at mysis stage and its reference sequence was de novo assembled. The data from another transcriptome were downloaded from NCBI and the reads of the two transcriptomes were mapped separately to the assembled reference by BWA. SNP calling was performed using SAMtools. A total of 58,717 and 36,277 SNPs with high quality were predicted from the two transcriptomes, respectively. SNP calling was also performed using the reads of two transcriptomes together, and a total of 96,040 SNPs with high quality were predicted. Among these 96,040 SNPs, 5,242 and 29,129 were predicted as non-synonymous and synonymous SNPs respectively. Characterization analysis of the predicted SNPs in L. vannamei showed that the estimated SNP frequency was 0.21% (one SNP per 476 bp) and the estimated ratio for transition to transversion was 2.0. Fifty SNPs were randomly selected for validation by Sanger sequencing after PCR amplification and 76% of SNPs were confirmed, which indicated that the SNPs predicted in this study were reliable. These SNPs will be very useful for genetic study in L. vannamei, especially for the high density linkage map construction and genome-wide association studies.
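    The summary figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted above; the variable names are illustrative, and the rounding of the validated count assumes the reported 76% refers to all fifty selected SNPs.

```python
# Figures quoted in the abstract above.
BP_PER_SNP = 476
TOTAL_SNPS = 96_040
NON_SYNONYMOUS = 5_242
SYNONYMOUS = 29_129
SELECTED_FOR_VALIDATION = 50
CONFIRMED_FRACTION = 0.76

# One SNP per 476 bp expressed as a percentage of positions.
snp_frequency_pct = 100 / BP_PER_SNP

# Number of Sanger-confirmed SNPs implied by "76% of 50".
confirmed_snps = round(CONFIRMED_FRACTION * SELECTED_FOR_VALIDATION)

# Share of all predicted SNPs that received a coding annotation.
annotated_fraction = (NON_SYNONYMOUS + SYNONYMOUS) / TOTAL_SNPS

if __name__ == "__main__":
    print(f"SNP frequency: {snp_frequency_pct:.2f}%")
    print(f"Confirmed by Sanger sequencing: {confirmed_snps} of {SELECTED_FOR_VALIDATION}")
    print(f"Annotated as synonymous or non-synonymous: {annotated_fraction:.1%}")
```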

  11. Identification and verification of hybridoma-derived monoclonal antibody variable region sequences using recombinant DNA technology and mass spectrometry

    Science.gov (United States)

    Antibody engineering requires the identification of antigen binding domains or variable regions (VR) unique to each antibody. It is the VR that define the unique antigen binding properties and proper sequence identification is essential for functional evaluation and performance of recombinant antibo...

  12. Massively parallel sequencing, aCGH, and RNA-Seq technologies provide a comprehensive molecular diagnosis of Fanconi anemia.

    Science.gov (United States)

    Chandrasekharappa, Settara C; Lach, Francis P; Kimble, Danielle C; Kamat, Aparna; Teer, Jamie K; Donovan, Frank X; Flynn, Elizabeth; Sen, Shurjo K; Thongthip, Supawat; Sanborn, Erica; Smogorzewska, Agata; Auerbach, Arleen D; Ostrander, Elaine A

    2013-05-30

    Current methods for detecting mutations in Fanconi anemia (FA)-suspected patients are inefficient and often miss mutations. We have applied recent advances in DNA sequencing and genomic capture to the diagnosis of FA. Specifically, we used custom molecular inversion probes or TruSeq-enrichment oligos to capture and sequence FA and related genes, including introns, from 27 samples from the International Fanconi Anemia Registry at The Rockefeller University. DNA sequencing was complemented with custom array comparative genomic hybridization (aCGH) and RNA sequencing (RNA-seq) analysis. aCGH identified deletions/duplications in 4 different FA genes. RNA-seq analysis revealed lack of allele specific expression associated with a deletion and splicing defects caused by missense, synonymous, and deep-in-intron variants. The combination of TruSeq-targeted capture, aCGH, and RNA-seq enabled us to identify the complementation group and biallelic germline mutations in all 27 families: FANCA (7), FANCB (3), FANCC (3), FANCD1 (1), FANCD2 (3), FANCF (2), FANCG (2), FANCI (1), FANCJ (2), and FANCL (3). FANCC mutations are often the cause of FA in patients of Ashkenazi Jewish (AJ) ancestry, and we identified 2 novel FANCC mutations in 2 patients of AJ ancestry. We describe here a strategy for efficient molecular diagnosis of FA.

  13. Learning with Technology: Video Modeling with Concrete-Representational-Abstract Sequencing for Students with Autism Spectrum Disorder

    Science.gov (United States)

    Yakubova, Gulnoza; Hughes, Elizabeth M.; Shinaberry, Megan

    2016-01-01

    The purpose of this study was to determine the effectiveness of a video modeling intervention with concrete-representational-abstract instructional sequence in teaching mathematics concepts to students with autism spectrum disorder (ASD). A multiple baseline across skills design of single-case experimental methodology was used to determine the…

  14. Model SNP development for complex genomes based on hexaploid oat using high-throughput 454 sequencing technology

    Directory of Open Access Journals (Sweden)

    Chao Shiaoman

    2011-01-01

    Full Text Available Abstract Background: Genetic markers are pivotal to modern genomics research; however, discovery and genotyping of molecular markers in oat has been hindered by the size and complexity of the genome, and by a scarcity of sequence data. The purpose of this study was to generate oat expressed sequence tag (EST) information, develop a bioinformatics pipeline for SNP discovery, and establish a method for rapid, cost-effective, and straightforward genotyping of SNP markers in complex polyploid genomes such as oat. Results: Based on cDNA libraries of four cultivated oat genotypes, approximately 127,000 contigs were assembled from approximately one million Roche 454 sequence reads. Contigs were filtered through a novel bioinformatics pipeline to eliminate ambiguous polymorphism caused by subgenome homology, and 96 in silico SNPs were selected from 9,448 candidate loci for validation using high-resolution melting (HRM) analysis. Of these, 52 (54%) were polymorphic between parents of the Ogle1040 × TAM O-301 (OT) mapping population, with 48 segregating as single Mendelian loci, and 44 being placed on the existing OT linkage map. Ogle and TAM amplicons from 12 primers were sequenced for SNP validation, revealing complex polymorphism in seven amplicons but general sequence conservation within SNP loci. Whole-amplicon interrogation with HRM revealed insertions, deletions, and heterozygotes in secondary oat germplasm pools, generating multiple alleles at some primer targets. To validate marker utility, 36 SNP assays were used to evaluate the genetic diversity of 34 diverse oat genotypes. Dendrogram clusters corresponded generally to known genome composition and genetic ancestry. Conclusions: The high-throughput SNP discovery pipeline presented here is a rapid and effective method for identification of polymorphic SNP alleles in the oat genome. The current-generation HRM system is a simple and highly-informative platform for SNP genotyping. These techniques provide

  15. Multiplexed microsatellite recovery using massively parallel sequencing

    Science.gov (United States)

    Jennings, T.N.; Knaus, B.J.; Mullins, T.D.; Haig, S.M.; Cronn, R.C.

    2011-01-01

    Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5M (USD).
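    The closing cost estimate can be checked directly. The abstract gives only an upper bound per species ("less than $400"), so the sketch below works out what per-species cost the "under $0.5M" total actually implies; the constant and function names are illustrative.

```python
LISTED_ORGANISMS = 1373      # 'threatened' and 'endangered' organisms, from the abstract
BUDGET_USD = 500_000         # the "$0.5M" ceiling quoted in the abstract

def total_cost(per_species_usd):
    """Total cost of preparing and sequencing one library per listed organism."""
    return LISTED_ORGANISMS * per_species_usd

# Per-species cost at which the total reaches the $0.5M ceiling.
break_even_usd = BUDGET_USD / LISTED_ORGANISMS

if __name__ == "__main__":
    print(f"At exactly $400 per species: ${total_cost(400):,}")
    print(f"Break-even per-species cost: ${break_even_usd:.2f}")
```

    At exactly $400 per species the total would be $549,200, so the "under $0.5M" figure holds only if the per-species cost is below roughly $364, which is consistent with the abstract's "less than $400" but narrower.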

  16. Rapid and Accurate Sequencing of Enterovirus Genomes Using MinION Nanopore Sequencer.

    Science.gov (United States)

    Wang, Ji; Ke, Yue Hua; Zhang, Yong; Huang, Ke Qiang; Wang, Lei; Shen, Xin Xin; Dong, Xiao Ping; Xu, Wen Bo; Ma, Xue Jun

    2017-10-01

    Knowledge of an enterovirus genome sequence is very important in epidemiological investigation to identify transmission patterns and ascertain the extent of an outbreak. The MinION sequencer is increasingly used to sequence various viral pathogens in many clinical situations because of its long reads, portability, real-time accessibility of sequenced data, and very low initial costs. However, information is lacking on MinION sequencing of enterovirus genomes. In this proof-of-concept study using Enterovirus 71 (EV71) and Coxsackievirus A16 (CA16) strains as examples, we established an amplicon-based whole genome sequencing method using MinION. We explored the accuracy, minimum sequencing time, discrimination and high-throughput sequencing ability of MinION, and compared its performance with Sanger sequencing. Within the first minute (min) of sequencing, the accuracy of MinION was 98.5% for the single EV71 strain and 94.12%-97.33% for 10 genetically-related CA16 strains. In as little as 14 min, 99% identity was reached for the single EV71 strain, and in 17 min (on average), 99% identity was achieved for 10 CA16 strains in a single run. MinION is suitable for whole genome sequencing of enteroviruses with sufficient accuracy and fine discrimination and has the potential as a fast, reliable and convenient method for routine use. Copyright © 2017 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.
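    The accuracy and identity figures above are comparisons of MinION-derived sequence against a Sanger reference. The abstract does not say how identity was computed, so the following is a minimal sketch under the assumption that the two sequences have already been aligned to equal length; the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_query, aligned_reference):
    """Percentage of alignment columns in which the two sequences agree.

    Both inputs must already be aligned (same length, gaps written as '-').
    A column containing a gap in either sequence counts as a mismatch.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_query:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for q, r in zip(aligned_query, aligned_reference)
        if q == r and q != "-"
    )
    return 100 * matches / len(aligned_query)

if __name__ == "__main__":
    # Toy 10-base example with one substitution, not real EV71 data.
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```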

  17. Design of a High Density SNP Genotyping Assay in the Pig Using SNPs Identified and Characterized by Next Generation Sequencing Technology

    Science.gov (United States)

    Ramos, Antonio M.; Crooijmans, Richard P. M. A.; Affara, Nabeel A.; Amaral, Andreia J.; Archibald, Alan L.; Beever, Jonathan E.; Bendixen, Christian; Churcher, Carol; Clark, Richard; Dehais, Patrick; Hansen, Mark S.; Hedegaard, Jakob; Hu, Zhi-Liang; Kerstens, Hindrik H.; Law, Andy S.; Megens, Hendrik-Jan; Milan, Denis; Nonneman, Danny J.; Rohrer, Gary A.; Rothschild, Max F.; Smith, Tim P. L.; Schnabel, Robert D.; Van Tassell, Curt P.; Taylor, Jeremy F.; Wiedmann, Ralph T.; Schook, Lawrence B.; Groenen, Martien A. M.

    2009-01-01

    Background: The dissection of complex traits of economic importance to the pig industry requires the availability of a significant number of genetic markers, such as single nucleotide polymorphisms (SNPs). This study was conducted to discover several hundreds of thousands of porcine SNPs using next generation sequencing technologies and use these SNPs, as well as others from different public sources, to design a high-density SNP genotyping assay. Methodology/Principal Findings: A total of 19 reduced representation libraries derived from four swine breeds (Duroc, Landrace, Large White, Pietrain) and a Wild Boar population and three restriction enzymes (AluI, HaeIII and MspI) were sequenced using Illumina's Genome Analyzer (GA). The SNP discovery effort resulted in the de novo identification of over 372K SNPs. More than 549K SNPs were used to design the Illumina Porcine 60K+SNP iSelect Beadchip, now commercially available as the PorcineSNP60. A total of 64,232 SNPs were included on the Beadchip. Results from genotyping the 158 individuals used for sequencing showed a high overall SNP call rate (97.5%). Of the 62,621 loci that could be reliably scored, 58,994 were polymorphic yielding a SNP conversion success rate of 94%. The average minor allele frequency (MAF) for all scorable SNPs was 0.274. Conclusions/Significance: Overall, the results of this study indicate the utility of using next generation sequencing technologies to identify large numbers of reliable SNPs. In addition, the validation of the PorcineSNP60 Beadchip demonstrated that the assay is an excellent tool that will likely be used in a variety of future studies in pigs. PMID:19654876
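    The conversion success rate quoted above follows directly from the locus counts. The short sketch below reproduces it from the abstract's numbers; the variable names are illustrative. The fraction of chip SNPs that could be scored is computed as well, although the abstract does not say whether its 97.5% call rate was derived this way.

```python
# Counts quoted in the abstract above.
SNPS_ON_CHIP = 64_232
SCORABLE_LOCI = 62_621
POLYMORPHIC_LOCI = 58_994

# Conversion success: polymorphic loci as a share of reliably scored loci.
conversion_rate = POLYMORPHIC_LOCI / SCORABLE_LOCI

# Share of SNPs on the chip that could be reliably scored at all.
scorable_fraction = SCORABLE_LOCI / SNPS_ON_CHIP

if __name__ == "__main__":
    print(f"Conversion success rate: {conversion_rate:.1%}")
    print(f"Scorable fraction of chip content: {scorable_fraction:.1%}")
```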

  18. Transcriptomic SNP discovery for custom genotyping arrays: impacts of sequence data, SNP calling method and genotyping technology on the probability of validation success.

    Science.gov (United States)

    Humble, Emily; Thorne, Michael A S; Forcada, Jaume; Hoffman, Joseph I

    2016-08-26

    Single nucleotide polymorphism (SNP) discovery is an important goal of many studies. However, the number of 'putative' SNPs discovered from a sequence resource may not provide a reliable indication of the number that will successfully validate with a given genotyping technology. For this it may be necessary to account for factors such as the method used for SNP discovery and the type of sequence data from which it originates, suitability of the SNP flanking sequences for probe design, and genomic context. To explore the relative importance of these and other factors, we used Illumina sequencing to augment an existing Roche 454 transcriptome assembly for the Antarctic fur seal (Arctocephalus gazella). We then mapped the raw Illumina reads to the new hybrid transcriptome using BWA and BOWTIE2 before calling SNPs with GATK. The resulting markers were pooled with two existing sets of SNPs called from the original 454 assembly using NEWBLER and SWAP454. Finally, we explored the extent to which SNPs discovered using these four methods overlapped and predicted the corresponding validation outcomes for both Illumina Infinium iSelect HD and Affymetrix Axiom arrays. Collating markers across all discovery methods resulted in a global list of 34,718 SNPs. However, concordance between the methods was surprisingly poor, with only 51.0 % of SNPs being discovered by more than one method and 13.5 % being called from both the 454 and Illumina datasets. Using a predictive modeling approach, we could also show that SNPs called from the Illumina data were on average more likely to successfully validate, as were SNPs called by more than one method. Above and beyond this pattern, predicted validation outcomes were also consistently better for Affymetrix Axiom arrays. Our results suggest that focusing on SNPs called by more than one method could potentially improve validation outcomes. They also highlight possible differences between alternative genotyping technologies that could be
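    Concordance between discovery methods, as described above, amounts to counting how many SNPs in the pooled list were reported by more than one method. The sketch below illustrates that bookkeeping on made-up data; the method labels echo the tools named in the abstract, but the SNP positions, the function name and the data layout are hypothetical.

```python
from collections import Counter

def concordance(calls_by_method):
    """Return (total SNPs, SNPs found by more than one method, shared fraction).

    `calls_by_method` maps a method label to a set of (contig, position) pairs.
    """
    times_found = Counter(
        snp for snps in calls_by_method.values() for snp in set(snps)
    )
    total = len(times_found)
    shared = sum(1 for n in times_found.values() if n > 1)
    return total, shared, (shared / total if total else 0.0)

# Toy call sets, not data from the study.
toy_calls = {
    "BWA+GATK": {("contig1", 101), ("contig1", 250), ("contig2", 17)},
    "BOWTIE2+GATK": {("contig1", 101), ("contig2", 17), ("contig3", 5)},
    "NEWBLER": {("contig1", 250), ("contig4", 88)},
    "SWAP454": {("contig5", 12)},
}

if __name__ == "__main__":
    total, shared, fraction = concordance(toy_calls)
    print(f"{shared} of {total} SNPs ({fraction:.0%}) were called by more than one method")
```

    Keying each SNP by contig and position keeps the comparison independent of the order in which methods report their calls; a real comparison would also need to reconcile coordinates between assemblies.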

  19. Transcriptome analysis of carnation (Dianthus caryophyllus L.) based on next-generation sequencing technology

    OpenAIRE

    Tanase Koji; Nishitani Chikako; Hirakawa Hideki; Isobe Sachiko; Tabata Satoshi; Ohmiya Akemi; Onozaki Takashi

    2012-01-01

    Abstract Background: Carnation (Dianthus caryophyllus L.), in the family Caryophyllaceae, can be found in a wide range of colors and is a model system for studies of flower senescence. In addition, it is one of the most important flowers in the global floriculture industry. However, few genomics resources, such as sequences and markers are available for carnation or other members of the Caryophyllaceae. To increase our understanding of the genetic control of important characters in carnation, ...

  20. Single-base resolution and long-coverage sequencing based on single-molecule nanomanipulation

    International Nuclear Information System (INIS)

    An Hongjie; Huang Jiehuan; Lue Ming; Li Xueling; Lue Junhong; Li Haikuo; Zhang Yi; Li Minqian; Hu Jun

    2007-01-01

    We show new approaches towards a novel single-molecule sequencing strategy which consists of high-resolution positioning isolation of overlapping DNA fragments with atomic force microscopy (AFM), subsequent single-molecule PCR amplification and conventional Sanger sequencing. In this study, a DNA labelling technique was used to guarantee the accuracy in positioning the target DNA. Single-molecule multiplex PCR was carried out to test the contamination. The results showed that the two overlapping DNA fragments isolated by AFM could be successfully sequenced with high quality and perfect contiguity, indicating that single-base resolution and long-coverage sequencing have been achieved simultaneously

  1. Next-generation phylogeography: a targeted approach for multilocus sequencing of non-model organisms.

    Directory of Open Access Journals (Sweden)

    Jonathan B Puritz

    Full Text Available The field of phylogeography has long since realized the need and utility of incorporating nuclear DNA (nDNA) sequences into analyses. However, the use of nDNA sequence data, at the population level, has been hindered by technical laboratory difficulty, sequencing costs, and problematic analytical methods dealing with genotypic sequence data, especially in non-model organisms. Here, we present a method utilizing the 454 GS-FLX Titanium pyrosequencing platform with the capacity to simultaneously sequence two species of sea star (Meridiastra calcar and Parvulastra exigua) at five different nDNA loci across 16 different populations of 20 individuals each per species. We compare results from 3 populations with traditional Sanger sequencing based methods, and demonstrate that this next-generation sequencing platform is more time and cost effective and more sensitive to rare variants than Sanger based sequencing. A crucial advantage is that the high coverage of clonally amplified sequences simplifies haplotype determination, even in highly polymorphic species. This targeted next-generation approach can greatly increase the use of nDNA sequence loci in phylogeographic and population genetic studies by mitigating many of the time, cost, and analytical issues associated with highly polymorphic, diploid sequence markers.

  2. Learning with Technology: Video Modeling with Concrete-Representational-Abstract Sequencing for Students with Autism Spectrum Disorder.

    Science.gov (United States)

    Yakubova, Gulnoza; Hughes, Elizabeth M; Shinaberry, Megan

    2016-07-01

    The purpose of this study was to determine the effectiveness of a video modeling intervention with concrete-representational-abstract instructional sequence in teaching mathematics concepts to students with autism spectrum disorder (ASD). A multiple baseline across skills design of single-case experimental methodology was used to determine the effectiveness of the intervention on the acquisition and maintenance of addition, subtraction, and number comparison skills for four elementary school students with ASD. Findings supported the effectiveness of the intervention in improving skill acquisition and maintenance at a 3-week follow-up. Implications for practice and future research are discussed.

  3. Ion Torrent sequencing as a tool for mutation discovery in the flax (Linum usitatissimum L.) genome.

    Science.gov (United States)

    Galindo-González, Leonardo; Pinzón-Latorre, David; Bergen, Erik A; Jensen, Dustin C; Deyholos, Michael K

    2015-01-01

    Detection of induced mutations is valuable for inferring gene function and for developing novel germplasm for crop improvement. Many reverse genetics approaches have been developed to identify mutations in genes of interest within a mutagenized population, including some approaches that rely on next-generation sequencing (e.g. exome capture, whole genome resequencing). As an alternative to these genome or exome-scale methods, we sought to develop a scalable and efficient method for detection of induced mutations that could be applied to a small number of target genes, using Ion Torrent technology. We developed this method in flax (Linum usitatissimum), to demonstrate its utility in a crop species. We used an amplicon-based approach in which DNA samples from an ethyl methanesulfonate (EMS)-mutagenized population were pooled and used as template in PCR reactions to amplify a region of each gene of interest. Barcodes were incorporated during PCR, and the pooled amplicons were sequenced using an Ion Torrent PGM. A pilot experiment with known SNPs showed that they could be detected at a frequency > 0.3% within the pools. We then selected eight genes for which we wanted to discover novel mutations, and applied our approach to screen 768 individuals from the EMS population, using either the Ion 314 or Ion 316 chips. Out of 29 potential mutations identified after processing the NGS reads, 16 mutations were confirmed using Sanger sequencing. The methodology presented here demonstrates the utility of Ion Torrent technology in detecting mutation variants in specific genome regions for large populations of a species such as flax. The methodology could be scaled-up to test >100 genes using the higher capacity chips now available from Ion Torrent.
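    The 0.3% detection limit constrains how many individuals can share one pool. The abstract does not give the pool size, so the sketch below only works out the bound under stated assumptions: a diploid genome, a single heterozygous mutant per pool, and equal DNA contribution from every individual. The function names are hypothetical. The Sanger confirmation rate is computed from the counts in the abstract.

```python
import math

DETECTION_THRESHOLD = 0.003   # "> 0.3%" from the abstract
CANDIDATE_MUTATIONS = 29
CONFIRMED_MUTATIONS = 16

def het_allele_fraction(pool_size):
    """Mutant allele fraction when one of `pool_size` diploid individuals
    is heterozygous and all contribute equal amounts of DNA (assumptions)."""
    return 1 / (2 * pool_size)

def max_pool_size(threshold):
    """Largest pool in which a single heterozygote still exceeds `threshold`."""
    return math.ceil(1 / (2 * threshold)) - 1

confirmation_rate = CONFIRMED_MUTATIONS / CANDIDATE_MUTATIONS

if __name__ == "__main__":
    n = max_pool_size(DETECTION_THRESHOLD)
    print(f"Largest pool under these assumptions: {n} individuals")
    print(f"Allele fraction in that pool: {het_allele_fraction(n):.4%}")
    print(f"Sanger confirmation rate: {confirmation_rate:.1%}")
```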

  4. First study on gene expression of cement proteins and potential adhesion-related genes of a membranous-based barnacle as revealed from Next-Generation Sequencing technology

    KAUST Repository

    Lin, Hsiu Chin; Wong, Yue Him; Tsang, Ling Ming; Chu, Ka Hou; Qian, Pei Yuan; Chan, Benny K K

    2013-01-01

    This is the first study applying Next-Generation Sequencing (NGS) technology to survey the kinds, expression location, and pattern of adhesion-related genes in a membranous-based barnacle. A total of 77,528,326 and 59,244,468 raw sequence reads of total RNA were generated from the prosoma and the basis of Tetraclita japonica formosana, respectively. In addition, 55,441 and 67,774 genes were further assembled and analyzed. The combined sequence data from both body parts generates a total of 79,833 genes of which 47.7% were shared. Homologues of barnacle cement proteins - CP-19K, -52K, and -100K - were found and all were dominantly expressed at the basis where the cement gland complex is located. This is the main area where transcripts of cement proteins and other potential adhesion-related genes were detected. The absence of another common barnacle cement protein, CP-20K, in the adult transcriptome suggested a possible life-stage restricted gene function and/or a different mechanism in adhesion between membranous-based and calcareous-based barnacles. © 2013 Taylor & Francis.

  5. First study on gene expression of cement proteins and potential adhesion-related genes of a membranous-based barnacle as revealed from Next-Generation Sequencing technology

    KAUST Repository

    Lin, Hsiu Chin

    2013-12-12

    This is the first study applying Next-Generation Sequencing (NGS) technology to survey the kinds, expression location, and pattern of adhesion-related genes in a membranous-based barnacle. A total of 77,528,326 and 59,244,468 raw sequence reads of total RNA were generated from the prosoma and the basis of Tetraclita japonica formosana, respectively. In addition, 55,441 and 67,774 genes were further assembled and analyzed. The combined sequence data from both body parts generates a total of 79,833 genes of which 47.7% were shared. Homologues of barnacle cement proteins - CP-19K, -52K, and -100K - were found and all were dominantly expressed at the basis where the cement gland complex is located. This is the main area where transcripts of cement proteins and other potential adhesion-related genes were detected. The absence of another common barnacle cement protein, CP-20K, in the adult transcriptome suggested a possible life-stage restricted gene function and/or a different mechanism in adhesion between membranous-based and calcareous-based barnacles. © 2013 Taylor & Francis.

  6. The usefulness of DNA sequencing after extraction by Whatman FTA filter matrix technology and phenotypic tests for differentiation of Candida albicans and Candida dubliniensis.

    Science.gov (United States)

    Kiraz, Nuri; Oz, Yasemin; Aslan, Huseyin; Muslumanoglu, Hamza

    2014-02-01

    Since C. dubliniensis is phenotypically similar to C. albicans, it can be misidentified as C. albicans. We aimed to investigate the prevalence of C. dubliniensis among isolates previously identified as C. albicans in our stocks and to compare the phenotypic methods and DNA sequencing of the D1/D2 region of the ribosomal large subunit (rLSU) gene. A total of 850 isolates were included in this study. Phenotypic identification was performed based on germ tube formation, chlamydospore production, colony colors on chromogenic agar, inability to grow at 45 °C and growth on hypertonic Sabouraud dextrose agar. Eighty isolates compatible with C. dubliniensis by at least one phenotypic test were included in the sequence analysis. Nested PCR amplification of the D1/D2 region of the rLSU gene was performed after fungal DNA extraction by Whatman FTA filter paper technology. The sequencing analysis of PCR products was carried out by an automated capillary gel electrophoresis device. The rate of C. dubliniensis was 2.35 % (n = 20) among isolates previously described as C. albicans. Consequently, none of the phenotypic tests provided satisfactory performance alone in our study, and molecular methods required special equipment and high cost. Thus, at least two phenotypic methods can be used for identification of C. dubliniensis, and molecular methods can be used for confirmation.

  7. Identification and Characterization of Epstein-Barr Virus Genomes in Lung Carcinoma Biopsy Samples by Next-Generation Sequencing Technology.

    Science.gov (United States)

    Wang, Shanshan; Xiong, Hongchao; Yan, Shi; Wu, Nan; Lu, Zheming

    2016-05-18

    Epstein-Barr virus (EBV) has been detected in the tumor cells of several cancers, including some cases of lung carcinoma (LC). However, the genomic characteristics and diversity of EBV strains associated with LC are poorly understood. In this study, we sequenced the EBV genomes isolated from four primary LC tumor biopsy samples, designated LC1 to LC4. Comparative analysis demonstrated that the LC strains were more closely related to the GD1 strain. Compared to the GD1 reference genome, a total of 520 variations, including 498 substitutions, 12 insertions, and 10 deletions, were found. Latent genes were found to harbor the largest number of nonsynonymous mutations. Phylogenetic analysis showed that all LC strains were closely related to Asian EBV strains and distinct from African/American strains. The LC2 genome was distinct from the other three LC genomes, suggesting that at least two parental lineages of EBV may exist among the LC genomes. All LC strains could be classified as China 1 and V-val subtype according to the amino acid sequence of LMP1 and EBNA1, respectively. In conclusion, our results showed the genomic diversity among EBV genomes isolated from LC, which might help to uncover previously unknown variations of pathogenic significance.

  8. Existing and emerging detection technologies for DNA (Deoxyribonucleic Acid) finger printing, sequencing, bio- and analytical chips: a multidisciplinary development unifying molecular biology, chemical and electronics engineering.

    Science.gov (United States)

    Kumar Khanna, Vinod

    2007-01-01

    The current status and research trends of detection techniques for DNA-based analysis such as DNA finger printing, sequencing, biochips and allied fields are examined. An overview of main detectors is presented vis-à-vis these DNA operations. The biochip method is explained, the role of micro- and nanoelectronic technologies in biochip realization is highlighted, various optical and electrical detection principles employed in biochips are indicated, and the operational mechanisms of these detection devices are described. Although a diversity of biochips for diagnostic and therapeutic applications has been demonstrated in research laboratories worldwide, only some of these chips have entered the clinical market, and more chips are awaiting commercialization. The necessity of tagging is eliminated in refractive-index change based devices, but the basic flaw of indirect nature of most detection methodologies can only be overcome by generic and/or reagentless DNA sensors such as the conductance-based approach and the DNA-single electron transistor (DNA-SET) structure. Devices of the electrical detection-based category are expected to pave the pathway for the next-generation DNA chips. The review provides a comprehensive coverage of the detection technologies for DNA finger printing, sequencing and related techniques, encompassing a variety of methods from the primitive art to the state-of-the-art scenario as well as promising methods for the future.

  9. [Whole Genome Sequencing of Human mtDNA Based on Ion Torrent PGM™ Platform].

    Science.gov (United States)

    Cao, Y; Zou, K N; Huang, J P; Ma, K; Ping, Y

    2017-08-01

    To analyze and detect the whole genome sequence of human mitochondrial DNA (mtDNA) by Ion Torrent PGM™ platform and to study the differences of mtDNA sequence in different tissues. Samples were collected from 6 unrelated individuals by forensic postmortem examination, including chest blood, hair, costal cartilage, nail, skeletal muscle and oral epithelium. Amplification of the whole genome sequence of mtDNA was performed with 4 primer pairs. Libraries were constructed with Ion Shear™ Plus Reagents kit and Ion Plus Fragment Library kit. Whole genome sequencing of mtDNA was performed using Ion Torrent PGM™ platform. Sanger sequencing was used to determine the heteroplasmy positions and the mutation positions on HVⅠ region. The whole genome sequence of mtDNA from all samples was amplified successfully. Six unrelated individuals belonged to 6 different haplotypes. Different tissues in one individual had heteroplasmy difference. The heteroplasmy positions and the mutation positions on HVⅠ region were verified by Sanger sequencing. After a consistency check by the Kappa method, it was found that the results of mtDNA sequence had a high consistency in different tissues. The testing method used in the present study for sequencing the whole genome sequence of human mtDNA can detect the heteroplasmy difference in different tissues, which showed good consistency. The results provide guidance for the further applications of mtDNA in forensic science. Copyright© by the Editorial Department of Journal of Forensic Medicine

  10. Genetic diagnosis of Duchenne and Becker muscular dystrophy using next-generation sequencing technology: comprehensive mutational search in a single platform.

    Science.gov (United States)

    Lim, Byung Chan; Lee, Seungbok; Shin, Jong-Yeon; Kim, Jong-Il; Hwang, Hee; Kim, Ki Joong; Hwang, Yong Seung; Seo, Jeong-Sun; Chae, Jong Hee

    2011-11-01

    Duchenne muscular dystrophy or Becker muscular dystrophy might be a suitable candidate disease for application of next-generation sequencing in the genetic diagnosis because the complex mutational spectrum and the large size of the dystrophin gene require two or more analytical methods and have a high cost. The authors tested whether large deletions/duplications or small mutations, such as point mutations or short insertions/deletions of the dystrophin gene, could be predicted accurately in a single platform using next-generation sequencing technology. A custom solution-based target enrichment kit was designed to capture whole genomic regions of the dystrophin gene and other muscular-dystrophy-related genes. A multiplexing strategy, wherein four differently bar-coded samples were captured and sequenced together in a single lane of the Illumina Genome Analyser, was applied. The study subjects were 25 patients: 16 with deficient dystrophin expression without a large deletion/duplication and 9 with a known large deletion/duplication. Nearly 100% of the exonic region of the dystrophin gene was covered by at least eight reads with a mean read depth of 107. Pathogenic small mutations were identified in 15 of the 16 patients without a large deletion/duplication. Using these 16 patients as the standard, the authors' method accurately predicted the deleted or duplicated exons in the 9 patients with known mutations. Inclusion of non-coding regions and paired-end sequence analysis enabled accurate identification by increasing the read depth and providing information about the breakpoint junction. The current method has an advantage for the genetic diagnosis of Duchenne muscular dystrophy and Becker muscular dystrophy wherein a comprehensive mutational search may be feasible using a single platform.
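
    The coverage figures quoted in this abstract (fraction of the target covered by at least eight reads, and mean read depth) can be illustrated with a minimal sketch. The function name, the threshold parameter and the depth values below are assumptions made for illustration; they are not the authors' pipeline or data.

```python
from statistics import mean


def coverage_summary(depths, min_depth=8):
    """Return (fraction of positions with depth >= min_depth, mean depth)."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths), mean(depths)


# Hypothetical per-base read depths for a short exonic stretch (not study data).
example_depths = [120, 95, 8, 7, 110, 130]
fraction, avg_depth = coverage_summary(example_depths)
print(f"covered at >=8 reads: {fraction:.1%}, mean depth: {avg_depth:.1f}")
```

    In practice the per-base depths would come from an alignment file rather than a hand-written list.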

  11. Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-10-24

    Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic diversity
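
    The read frequency cut-off described in this abstract can be sketched as a simple filter on the fraction of reads supporting a variant. This is an illustrative sketch only: the function name, the record layout, the positions and the read counts are hypothetical, and treating the 30 % boundary as inclusive is an assumption that the abstract does not settle.

```python
def passes_cutoff(alt_reads, total_reads, cutoff=0.30):
    """True if the variant-supporting read fraction reaches the cut-off."""
    if total_reads <= 0:
        return False  # no coverage, so no call can be supported
    return alt_reads / total_reads >= cutoff  # inclusive boundary (assumption)


# Hypothetical variant records: genome position, supporting reads, total reads.
variants = [
    {"pos": 101, "alt": 45, "total": 100},  # 45 % of reads
    {"pos": 202, "alt": 12, "total": 80},   # 15 % of reads
    {"pos": 303, "alt": 30, "total": 100},  # exactly 30 % of reads
]

kept = [v["pos"] for v in variants if passes_cutoff(v["alt"], v["total"])]
print(kept)
```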

  12. Supplementary Material for: Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-01-01

    Abstract Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic

  13. The advantages of SMRT sequencing

    OpenAIRE

    Roberts, Richard J; Carneiro, Mauricio O; Schatz, Michael C

    2013-01-01

    Of the current next-generation sequencing technologies, SMRT sequencing is sometimes overlooked. However, attributes such as long reads, modified base detection and high accuracy make SMRT a useful technology and an ideal approach to the complete sequencing of small genomes.

  14. Artemis and ACT: viewing, annotating and comparing sequences stored in a relational database.

    Science.gov (United States)

    Carver, Tim; Berriman, Matthew; Tivey, Adrian; Patel, Chinmay; Böhme, Ulrike; Barrell, Barclay G; Parkhill, Julian; Rajandream, Marie-Adèle

    2008-12-01

    Artemis and Artemis Comparison Tool (ACT) have become mainstream tools for viewing and annotating sequence data, particularly for microbial genomes. Since its first release, Artemis has been continuously developed and supported with additional functionality for editing and analysing sequences based on feedback from an active user community of laboratory biologists and professional annotators. Nevertheless, its utility has been somewhat restricted by its limitation to reading and writing from flat files. Therefore, a new version of Artemis has been developed, which reads from and writes to a relational database schema, and allows users to annotate more complex, often large and fragmented, genome sequences. Artemis and ACT have now been extended to read and write directly to the Generic Model Organism Database (GMOD, http://www.gmod.org) Chado relational database schema. In addition, a Gene Builder tool has been developed to provide structured forms and tables to edit coordinates of gene models and edit functional annotation, based on standard ontologies, controlled vocabularies and free text. Artemis and ACT are freely available (under a GPL licence) for download (for MacOSX, UNIX and Windows) at the Wellcome Trust Sanger Institute web sites: http://www.sanger.ac.uk/Software/Artemis/ http://www.sanger.ac.uk/Software/ACT/

  15. High-Throughput Next-Generation Sequencing of Polioviruses

    Science.gov (United States)

    Montmayeur, Anna M.; Schmidt, Alexander; Zhao, Kun; Magaña, Laura; Iber, Jane; Castro, Christina J.; Chen, Qi; Henderson, Elizabeth; Ramos, Edward; Shaw, Jing; Tatusov, Roman L.; Dybdahl-Sissoko, Naomi; Endegue-Zanga, Marie Claire; Adeniji, Johnson A.; Oberste, M. Steven; Burns, Cara C.

    2016-01-01

    ABSTRACT The poliovirus (PV) is currently targeted for worldwide eradication and containment. Sanger-based sequencing of the viral protein 1 (VP1) capsid region is currently the standard method for PV surveillance. However, the whole-genome sequence is sometimes needed for higher resolution global surveillance. In this study, we optimized whole-genome sequencing protocols for poliovirus isolates and FTA cards using next-generation sequencing (NGS), aiming for high sequence coverage, efficiency, and throughput. We found that DNase treatment of poliovirus RNA followed by random reverse transcription (RT), amplification, and the use of the Nextera XT DNA library preparation kit produced significantly better results than other preparations. The average viral reads per total reads, a measurement of efficiency, was as high as 84.2% ± 15.6%. PV genomes covering >99 to 100% of the reference length were obtained and validated with Sanger sequencing. A total of 52 PV genomes were generated, multiplexing as many as 64 samples in a single Illumina MiSeq run. This high-throughput, sequence-independent NGS approach facilitated the detection of a diverse range of PVs, especially for those in vaccine-derived polioviruses (VDPV), circulating VDPV, or immunodeficiency-related VDPV. In contrast to results from previous studies on other viruses, our results showed that filtration and nuclease treatment did not discernibly increase the sequencing efficiency of PV isolates. However, DNase treatment after nucleic acid extraction to remove host DNA significantly improved the sequencing results. This NGS method has been successfully implemented to generate PV genomes for molecular epidemiology of the most recent PV isolates. Additionally, the ability to obtain full PV genomes from FTA cards will aid in facilitating global poliovirus surveillance. PMID:27927929

  16. Sequence and expression analysis of gaps in human chromosome 20

    DEFF Research Database (Denmark)

    Minocherhomji, Sheroy; Seemann, Stefan; Mang, Yuan

    2012-01-01

    The finished human genome-assemblies comprise several hundred un-sequenced euchromatic gaps, which may be rich in long polypurine/polypyrimidine stretches. Human chromosome 20 (chr 20) currently has three unfinished gaps remaining on its q-arm. All three gaps are within gene-dense regions and/or overlap disease-associated loci, including the DLGAP4 locus. In this study, we sequenced ~99% of all three unfinished gaps on human chr 20, determined their complete genomic sizes and assessed epigenetic profiles using a combination of Sanger sequencing, mate pair paired-end high-throughput sequencing and chromatin, methylation and expression analyses. We found histone 3 trimethylated at Lysine 27 to be distributed across all three gaps in immortalized B-lymphocytes. In one gap, five novel CpG islands were predominantly hypermethylated in genomic DNA from peripheral blood lymphocytes and human cerebellum...

  17. Development and characterization of 26 novel microsatellite loci for the trochid gastropod Gibbula divaricata (Linnaeus, 1758), using Illumina MiSeq next generation sequencing technology

    Directory of Open Access Journals (Sweden)

    Violeta López-Márquez

    2016-03-01

    Full Text Available In the present study we used the high-throughput sequencing technology Illumina MiSeq to develop 26 polymorphic microsatellite loci for the marine snail Gibbula divaricata. Four to 32 alleles were detected per locus across 30 samples analyzed. Observed and expected heterozygosities ranged from 0.130 to 0.933 and from 0.294 to 0.956, respectively. No significant linkage disequilibrium existed. Seven loci deviated from Hardy-Weinberg equilibrium that could not totally be explained by the presence of null alleles. Sympatric distribution with other species of the genus Gibbula, as G. rarilineata and G. varia, lead us to test the cross utility of the developed markers in these two species, which could be useful to test common biogeographic patterns or potential hybridization phenomena, since morphological intermediate specimens were found.
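
    Observed and expected heterozygosity, as reported per locus in this abstract, can be computed from diploid genotypes as sketched below. The sketch assumes the simple expected-heterozygosity formula (one minus the sum of squared allele frequencies) without a small-sample correction; the abstract does not say which estimator was used, and the genotypes (allele sizes in base pairs) are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid sample.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")
    # Observed: share of samples carrying two different alleles.
    observed = sum(1 for a, b in genotypes if a != b) / n
    # Expected: 1 - sum of squared allele frequencies (no sample-size correction).
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_alleles = 2 * n
    expected = 1.0 - sum((c / total_alleles) ** 2 for c in counts.values())
    return observed, expected


# Hypothetical microsatellite genotypes for four samples.
sample_genotypes = [(150, 154), (150, 150), (154, 158), (150, 158)]
h_obs, h_exp = heterozygosity(sample_genotypes)
print(f"Ho = {h_obs:.3f}, He = {h_exp:.3f}")
```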

  18. Secure and robust cloud computing for high-throughput forensic microsatellite sequence analysis and databasing.

    Science.gov (United States)

    Bailey, Sarah F; Scheible, Melissa K; Williams, Christopher; Silva, Deborah S B S; Hoggan, Marina; Eichman, Christopher; Faith, Seth A

    2017-11-01

    Next-generation Sequencing (NGS) is a rapidly evolving technology with demonstrated benefits for forensic genetic applications, and the strategies to analyze and manage the massive NGS datasets are currently in development. Here, the computing, data storage, connectivity, and security resources of the Cloud were evaluated as a model for forensic laboratory systems that produce NGS data. A complete front-to-end Cloud system was developed to upload, process, and interpret raw NGS data using a web browser dashboard. The system was extensible, demonstrating analysis capabilities of autosomal and Y-STRs from a variety of NGS instrumentation (Illumina MiniSeq and MiSeq, and Oxford Nanopore MinION). NGS data for STRs were concordant with standard reference materials previously characterized with capillary electrophoresis and Sanger sequencing. The computing power of the Cloud was implemented with on-demand auto-scaling to allow multiple file analysis in tandem. The system was designed to store resulting data in a relational database, amenable to downstream sample interpretations and databasing applications following the most recent guidelines in nomenclature for sequenced alleles. Lastly, a multi-layered Cloud security architecture was tested and showed that industry standards for securing data and computing resources were readily applied to the NGS system without disadvantageous effects for bioinformatic analysis, connectivity or data storage/retrieval. The results of this study demonstrate the feasibility of using Cloud-based systems for secured NGS data analysis, storage, databasing, and multi-user distributed connectivity. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. Molecular diagnostics for congenital hearing loss including 15 deafness genes using a next generation sequencing platform

    Directory of Open Access Journals (Sweden)

    De Keulenaer Sarah

    2012-05-01

    Full Text Available Abstract Background Hereditary hearing loss (HL) can originate from mutations in one of many genes involved in the complex process of hearing. Identification of the genetic defects in patients is currently labor intensive and expensive. While screening with Sanger sequencing for GJB2 mutations is common, this is not the case for the other known deafness genes (> 60). Next generation sequencing technology (NGS) has the potential to be much more cost efficient. Published methods mainly use hybridization based target enrichment procedures that are time saving and efficient, but lead to loss in sensitivity. In this study we used a semi-automated PCR amplification and NGS in order to combine high sensitivity, speed and cost efficiency. Results In this proof of concept study, we screened 15 autosomal recessive deafness genes in 5 patients with congenital genetic deafness. 646 specific primer pairs for all exons and most of the UTR of the 15 selected genes were designed using primerXL. Using patient specific identifiers, all amplicons were pooled and analyzed using the Roche 454 NGS technology. Three of these patients are members of families in which a region of interest has previously been characterized by linkage studies. In these, we were able to identify two new mutations in CDH23 and OTOF. For another patient, the etiology of deafness was unclear, and no causal mutation was found. In a fifth patient, included as a positive control, we could confirm a known mutation in TMC1. Conclusions We have developed an assay that holds great promise as a tool for screening patients with familial autosomal recessive nonsyndromal hearing loss (ARNSHL). For the first time, an efficient, reliable and cost effective genetic test, based on PCR enrichment, for newborns with undiagnosed deafness is available.

  20. Novel Genetic Variants of Sporadic Atrial Septal Defect (ASD) in a Chinese Population Identified by Whole-Exome Sequencing (WES).

    Science.gov (United States)

    Liu, Yong; Cao, Yu; Li, Yaxiong; Lei, Dongyun; Li, Lin; Hou, Zong Liu; Han, Shen; Meng, Mingyao; Shi, Jianlin; Zhang, Yayong; Wang, Yi; Niu, Zhaoyi; Xie, Yanhua; Xiao, Benshan; Wang, Yuanfei; Li, Xiao; Yang, Lirong; Wang, Wenju; Jiang, Lihong

    2018-03-05

    BACKGROUND Recently, mutations in several genes have been described to be associated with sporadic ASD, but some genetic variants remain to be identified. The aim of this study was to use whole-exome sequencing (WES) combined with bioinformatics analysis to identify novel genetic variants in cases of sporadic congenital ASD, followed by validation by Sanger sequencing. MATERIAL AND METHODS Five Han patients with secundum ASD were recruited, and their tissue samples were analyzed by WES, followed by verification by Sanger sequencing of tissue and blood samples. Further evaluation using blood samples included 452 additional patients with sporadic secundum ASD (212 male and 240 female patients) and 519 healthy subjects (252 male and 267 female subjects) for further verification by a multiplexed MassARRAY system. Bioinformatic analyses were performed to identify novel genetic variants associated with sporadic ASD. RESULTS From five patients with sporadic ASD, a total of 181,762 genomic variants in 33 exon loci, validated by Sanger sequencing, were selected and underwent MassARRAY analysis in 452 patients with ASD and 519 healthy subjects. Three loci with high mutation frequencies, the 138665410 FOXL2 gene variant, the 23862952 MYH6 gene variant, and the 71098693 HYDIN gene variant were found to be significantly associated with sporadic ASD (P...) ... and supported the use of WES and bioinformatics analysis to identify disease-associated mutations.

  1. Authentication of Herbal Supplements Using Next-Generation Sequencing.

    Directory of Open Access Journals (Sweden)

    Natalia V Ivanova

    Full Text Available DNA-based testing has been gaining acceptance as a tool for authentication of a wide range of food products; however, its applicability for testing of herbal supplements remains contentious. We utilized Sanger and Next-Generation Sequencing (NGS) for taxonomic authentication of fifteen herbal supplements representing three different producers from five medicinal plants: Echinacea purpurea, Valeriana officinalis, Ginkgo biloba, Hypericum perforatum and Trigonella foenum-graecum. Experimental design included three modifications of DNA extraction, two lysate dilutions, Internal Amplification Control, and multiple negative controls to exclude background contamination. Ginkgo supplements were also analyzed using HPLC-MS for the presence of active medicinal components. All supplements yielded DNA from multiple species, rendering Sanger sequencing results for rbcL and ITS2 regions either uninterpretable or non-reproducible between the experimental replicates. Overall, DNA from the manufacturer-listed medicinal plants was successfully detected in seven out of eight dry herb form supplements; however, low or poor DNA recovery due to degradation was observed in most plant extracts (none detected by Sanger; three out of seven by NGS). NGS also revealed a diverse community of fungi, known to be associated with live plant material and/or the fermentation process used in the production of plant extracts. HPLC-MS testing demonstrated that Ginkgo supplements with degraded DNA contained ten key medicinal components. Quality control of herbal supplements should utilize a synergetic approach targeting both DNA and bioactive components, especially for standardized extracts with degraded DNA. The NGS workflow developed in this study enables reliable detection of plant and fungal DNA and can be utilized by manufacturers for quality assurance of raw plant materials, contamination control during the production process, and the final product. Interpretation of results should

  2. Authentication of Herbal Supplements Using Next-Generation Sequencing.

    Science.gov (United States)

    Ivanova, Natalia V; Kuzmina, Maria L; Braukmann, Thomas W A; Borisenko, Alex V; Zakharov, Evgeny V

    2016-01-01

    DNA-based testing has been gaining acceptance as a tool for authentication of a wide range of food products; however, its applicability for testing of herbal supplements remains contentious. We utilized Sanger and Next-Generation Sequencing (NGS) for taxonomic authentication of fifteen herbal supplements representing three different producers from five medicinal plants: Echinacea purpurea, Valeriana officinalis, Ginkgo biloba, Hypericum perforatum and Trigonella foenum-graecum. Experimental design included three modifications of DNA extraction, two lysate dilutions, Internal Amplification Control, and multiple negative controls to exclude background contamination. Ginkgo supplements were also analyzed using HPLC-MS for the presence of active medicinal components. All supplements yielded DNA from multiple species, rendering Sanger sequencing results for rbcL and ITS2 regions either uninterpretable or non-reproducible between the experimental replicates. Overall, DNA from the manufacturer-listed medicinal plants was successfully detected in seven out of eight dry herb form supplements; however, low or poor DNA recovery due to degradation was observed in most plant extracts (none detected by Sanger; three out of seven by NGS). NGS also revealed a diverse community of fungi, known to be associated with live plant material and/or the fermentation process used in the production of plant extracts. HPLC-MS testing demonstrated that Ginkgo supplements with degraded DNA contained ten key medicinal components. Quality control of herbal supplements should utilize a synergetic approach targeting both DNA and bioactive components, especially for standardized extracts with degraded DNA. The NGS workflow developed in this study enables reliable detection of plant and fungal DNA and can be utilized by manufacturers for quality assurance of raw plant materials, contamination control during the production process, and the final product. Interpretation of results should involve an

  3. Evaluation and optimisation of indel detection workflows for ion torrent sequencing of the BRCA1 and BRCA2 genes.

    Science.gov (United States)

    Yeo, Zhen Xuan; Wong, Joshua Chee Leong; Rozen, Steven G; Lee, Ann Siew Gek

    2014-06-24

    The Ion Torrent PGM is a popular benchtop sequencer that shows promise in replacing conventional Sanger sequencing as the gold standard for mutation detection. Despite the PGM's reported high accuracy in calling single nucleotide variations, it tends to generate many false positive calls in detecting insertions and deletions (indels), which may hinder its utility for clinical genetic testing. Recently, the proprietary analytical workflow for the Ion Torrent sequencer, Torrent Suite (TS), underwent a series of upgrades. We evaluated three major upgrades of TS by calling indels in the BRCA1 and BRCA2 genes. Our analysis revealed that false negative indels could be generated by TS under both default calling parameters and parameters adjusted for maximum sensitivity. However, indel calling with the same data using the open source variant callers, GATK and SAMtools showed that false negatives could be minimised with the use of appropriate bioinformatics analysis. Furthermore, we identified two variant calling measures, Quality-by-Depth (QD) and VARiation of the Width of gaps and inserts (VARW), which substantially reduced false positive indels, including non-homopolymer associated errors without compromising sensitivity. In our best case scenario that involved the TMAP aligner and SAMtools, we achieved 100% sensitivity, 99.99% specificity and 29% False Discovery Rate (FDR) in indel calling from all 23 samples, which is a good performance for mutation screening using PGM. New versions of TS, BWA and GATK have shown improvements in indel calling sensitivity and specificity over their older counterpart. However, the variant caller of TS exhibits a lower sensitivity than GATK and SAMtools. Our findings demonstrate that although indel calling from PGM sequences may appear to be noisy at first glance, proper computational indel calling analysis is able to maximize both the sensitivity and specificity at the single base level, paving the way for the usage of this technology
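
    The three performance measures quoted in this abstract can look contradictory at first (99.99% specificity alongside a 29% false discovery rate). The sketch below uses the standard confusion-matrix definitions to show how both can hold when true negatives vastly outnumber true positives. The counts are hypothetical and chosen only to reproduce figures of the same order; they are not the study's data.

```python
def sensitivity(tp, fn):
    """Fraction of real variants that were called: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-variant sites correctly left uncalled: TN / (TN + FP)."""
    return tn / (tn + fp)


def false_discovery_rate(fp, tp):
    """Fraction of calls that are wrong: FP / (FP + TP)."""
    return fp / (fp + tp)


# Hypothetical counts: few real indels, a very large number of non-variant sites.
tp, fn, fp, tn = 50, 0, 20, 199_980

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
fdr = false_discovery_rate(fp, tp)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} FDR={fdr:.1%}")
```

    Because specificity is normalised by the number of non-variant sites and FDR by the number of calls, a handful of false positives barely moves the former while dominating the latter.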

  4. Discovery of Escherichia coli CRISPR sequences in an undergraduate laboratory.

    Science.gov (United States)

    Militello, Kevin T; Lazatin, Justine C

    2017-05-01

    Clustered regularly interspaced short palindromic repeats (CRISPRs) represent a novel type of adaptive immune system found in eubacteria and archaebacteria. CRISPRs have recently generated a lot of attention due to their unique ability to catalog foreign nucleic acids, their ability to destroy foreign nucleic acids in a mechanism that shares some similarity to RNA interference, and the ability to utilize reconstituted CRISPR systems for genome editing in numerous organisms. In order to introduce CRISPR biology into an undergraduate upper-level laboratory, a five-week set of exercises was designed to allow students to examine the CRISPR status of uncharacterized Escherichia coli strains and to allow the discovery of new repeats and spacers. Students started the project by isolating genomic DNA from E. coli and amplifying the iap CRISPR locus using the polymerase chain reaction (PCR). The PCR products were analyzed by Sanger DNA sequencing, and the sequences were examined for the presence of CRISPR repeat sequences. The regions between the repeats, the spacers, were extracted and analyzed with BLASTN searches. Overall, CRISPR loci were sequenced from several previously uncharacterized E. coli strains and one E. coli K-12 strain. Sanger DNA sequencing resulted in the discovery of 36 spacer sequences and their corresponding surrounding repeat sequences. Five of the spacers were homologous to foreign (non-E. coli) DNA. Assessment of the laboratory indicates that improvements were made in the ability of students to answer questions relating to the structure and function of CRISPRs. Future directions of the laboratory are presented and discussed. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(3):262-269, 2017. © 2016 The International Union of Biochemistry and Molecular Biology.
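
    The step in which spacers are extracted from between CRISPR repeats can be sketched as follows. This is a deliberately simplified illustration: it assumes every repeat copy matches one known repeat sequence exactly, whereas real repeats can carry mismatches, and the repeat and locus strings are toy sequences, not E. coli data. The function name is hypothetical.

```python
def extract_spacers(sequence, repeat):
    """Return the segments lying between consecutive exact copies of repeat."""
    if not repeat:
        raise ValueError("repeat must not be empty")
    parts = sequence.split(repeat)
    # parts[0] precedes the first repeat and parts[-1] follows the last one,
    # so only the inner segments are spacers.
    return [segment for segment in parts[1:-1] if segment]


# Toy locus: leader + (repeat + spacer) units + final repeat + trailing bases.
toy_repeat = "GGTTC"
toy_locus = "AAA" + toy_repeat + "ACGTACGT" + toy_repeat + "TTGACA" + toy_repeat + "CC"

spacers = extract_spacers(toy_locus, toy_repeat)
print(spacers)
```

    Each extracted spacer could then be submitted to a similarity search such as BLASTN, as described in the abstract.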

  5. De novo transcriptome sequencing of the Octopus vulgaris hemocytes using Illumina RNA-Seq technology: response to the infection by the gastrointestinal parasite Aggregata octopiana.

    Science.gov (United States)

    Castellanos-Martínez, Sheila; Arteta, David; Catarino, Susana; Gestal, Camino

    2014-01-01

    Octopus vulgaris is a highly valuable species of great commercial interest and excellent candidate for aquaculture diversification; however, the octopus' well-being is impaired by pathogens, of which the gastrointestinal coccidian parasite Aggregata octopiana is one of the most important. The knowledge of the molecular mechanisms of the immune response in cephalopods, especially in octopus is scarce. The transcriptome of the hemocytes of O. vulgaris was de novo sequenced using the high-throughput paired-end Illumina technology to identify genes involved in immune defense and to understand the molecular basis of octopus tolerance/resistance to coccidiosis. A bi-directional mRNA library was constructed from hemocytes of two groups of octopus according to the infection by A. octopiana, sick octopus, suffering coccidiosis, and healthy octopus, and reads were de novo assembled together. The differential expression of transcripts was analysed using the general assembly as a reference for mapping the reads from each condition. After sequencing, a total of 75,571,280 high quality reads were obtained from the sick octopus group and 74,731,646 from the healthy group. The general transcriptome of the O. vulgaris hemocytes was assembled in 254,506 contigs. A total of 48,225 contigs were successfully identified, and 538 transcripts exhibited differential expression between groups of infection. The general transcriptome revealed genes involved in pathways like NF-kB, TLR and Complement. Differential expression of TLR-2, PGRP, C1q and PRDX genes due to infection was validated using RT-qPCR. In sick octopuses, only TLR-2 was up-regulated in hemocytes, but all of them were up-regulated in caecum and gills. The transcriptome reported here de novo establishes the first molecular clues to understand how the octopus immune system works and interacts with a highly pathogenic coccidian. 
    The data provided here will contribute to identification of biomarkers for octopus resistance against

  6. Deep Sequencing of Myxilla (Ectyomyxilla) methanophila, an Epibiotic Sponge on Cold-Seep Tubeworms, Reveals Methylotrophic, Thiotrophic, and Putative Hydrocarbon-Degrading Microbial Associations

    KAUST Repository

    Arellano, Shawn M.

    2012-10-11

    The encrusting sponge Myxilla (Ectyomyxilla) methanophila (Poecilosclerida: Myxillidae) is an epibiont on vestimentiferan tubeworms at hydrocarbon seeps on the upper Louisiana slope of the Gulf of Mexico. It has long been suggested that this sponge harbors methylotrophic bacteria due to its low δ13C value and high methanol dehydrogenase activity, yet the full community of microbial associations in M. methanophila remained uncharacterized. In this study, we sequenced 16S rRNA genes representing the microbial community in M. methanophila collected from two hydrocarbon-seep sites (GC234 and Bush Hill) using both Sanger sequencing and next-generation 454 pyrosequencing technologies. Additionally, we compared the microbial community in M. methanophila to that of the biofilm collected from the associated tubeworm. Our results revealed that the microbial diversity in the sponges from both sites was low but the community structure was largely similar, showing a high proportion of methylotrophic bacteria of the genus Methylohalomonas and polycyclic aromatic hydrocarbon (PAH)-degrading bacteria of the genera Cycloclasticus and Neptunomonas. Furthermore, the sponge microbial clone library revealed the dominance of thioautotrophic gammaproteobacterial symbionts in M. methanophila. In contrast, the biofilm communities on the tubeworms were more diverse and dominated by the chemoorganotrophic Moritella at GC234 and methylotrophic Methylomonas and Methylohalomonas at Bush Hill. Overall, our study provides evidence to support previous suggestion that M. methanophila harbors methylotrophic symbionts and also reveals the association of PAH-degrading and thioautotrophic microbes in the sponge. © 2012 Springer Science+Business Media New York.

  7. Deep sequencing of Myxilla (Ectyomyxilla) methanophila, an epibiotic sponge on cold-seep tubeworms, reveals methylotrophic, thiotrophic, and putative hydrocarbon-degrading microbial associations.

    Science.gov (United States)

    Arellano, Shawn M; Lee, On On; Lafi, Feras F; Yang, Jiangke; Wang, Yong; Young, Craig M; Qian, Pei-Yuan

    2013-02-01

    The encrusting sponge Myxilla (Ectyomyxilla) methanophila (Poecilosclerida: Myxillidae) is an epibiont on vestimentiferan tubeworms at hydrocarbon seeps on the upper Louisiana slope of the Gulf of Mexico. It has long been suggested that this sponge harbors methylotrophic bacteria due to its low δ(13)C value and high methanol dehydrogenase activity, yet the full community of microbial associations in M. methanophila remained uncharacterized. In this study, we sequenced 16S rRNA genes representing the microbial community in M. methanophila collected from two hydrocarbon-seep sites (GC234 and Bush Hill) using both Sanger sequencing and next-generation 454 pyrosequencing technologies. Additionally, we compared the microbial community in M. methanophila to that of the biofilm collected from the associated tubeworm. Our results revealed that the microbial diversity in the sponges from both sites was low but the community structure was largely similar, showing a high proportion of methylotrophic bacteria of the genus Methylohalomonas and polycyclic aromatic hydrocarbon (PAH)-degrading bacteria of the genera Cycloclasticus and Neptunomonas. Furthermore, the sponge microbial clone library revealed the dominance of thioautotrophic gammaproteobacterial symbionts in M. methanophila. In contrast, the biofilm communities on the tubeworms were more diverse and dominated by the chemoorganotrophic Moritella at GC234 and methylotrophic Methylomonas and Methylohalomonas at Bush Hill. Overall, our study provides evidence to support previous suggestion that M. methanophila harbors methylotrophic symbionts and also reveals the association of PAH-degrading and thioautotrophic microbes in the sponge.

  8. Statistical method to compare massive parallel sequencing pipelines.

    Science.gov (United States)

    Elsensohn, M H; Leblay, N; Dimassi, S; Campan-Fournier, A; Labalme, A; Roucher-Boulez, F; Sanlaville, D; Lesca, G; Bardel, C; Roy, P

    2017-03-01

    Today, sequencing is frequently carried out by Massive Parallel Sequencing (MPS) that cuts drastically sequencing time and expenses. Nevertheless, Sanger sequencing remains the main validation method to confirm the presence of variants. The analysis of MPS data involves the development of several bioinformatic tools, academic or commercial. We present here a statistical method to compare MPS pipelines and test it in a comparison between an academic (BWA-GATK) and a commercial pipeline (TMAP-NextGENe®), with and without reference to a gold standard (here, Sanger sequencing), on a panel of 41 genes in 43 epileptic patients. This method used the number of variants to fit log-linear models for pairwise agreements between pipelines. To assess the heterogeneity of the margins and the odds ratios of agreement, four log-linear models were used: a full model, a homogeneous-margin model, a model with single odds ratio for all patients, and a model with single intercept. Then a log-linear mixed model was fitted considering the biological variability as a random effect. Among the 390,339 base-pairs sequenced, TMAP-NextGENe® and BWA-GATK found, on average, 2253.49 and 1857.14 variants (single nucleotide variants and indels), respectively. Against the gold standard, the pipelines had similar sensitivities (63.47% vs. 63.42%) and close but significantly different specificities (99.57% vs. 99.65%; p < 0.001). Same-trend results were obtained when only single nucleotide variants were considered (99.98% specificity and 76.81% sensitivity for both pipelines). The method allows thus pipeline comparison and selection. It is generalizable to all types of MPS data and all pipelines.
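
    The sensitivity and specificity figures quoted in this record follow the usual definitions against a gold standard. A minimal sketch is given below; the function name and the counts in the example are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Compute sensitivity and specificity from confusion-matrix counts.

    tp/fn: gold-standard variants that the pipeline did / did not call.
    tn/fp: gold-standard non-variant positions that the pipeline left
           uncalled / called as variant.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=90, fp=10)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```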

  9. Spectrum of benzo[a]pyrene-induced mutations in the Pig-a gene of L5178YTk+/- cells identified with next generation sequencing.

    Science.gov (United States)

    Revollo, Javier; Wang, Yiying; McKinzie, Page; Dad, Azra; Pearce, Mason; Heflich, Robert H; Dobrovolsky, Vasily N

    2017-12-01

    We used Sanger sequencing and next generation sequencing (NGS) for analysis of mutations in the endogenous X-linked Pig-a gene of clonally expanded L5178YTk +/- cells. The clones developed from single cells that were sorted on a flow cytometer based upon the expression pattern of the GPI-anchored marker, CD90, on their surface. CD90-deficient and CD90-proficient cells were sorted from untreated cultures and CD90-deficient cells were sorted from cultures treated with benzo[a]pyrene (B[a]P). Pig-a mutations were identified in all clones developed from CD90-deficient cells; no Pig-a mutations were found in clones of CD90-proficient cells. The spectrum of B[a]P-induced Pig-a mutations was dominated by basepair substitutions, small insertions and deletions at G:C, or at sequences rich in G:C content. We observed high concordance between Pig-a mutations determined by Sanger sequencing and by NGS, but NGS was able to identify mutations in samples that were difficult to analyze by Sanger sequencing (e.g., mixtures of two mutant clones). Overall, the NGS method is a cost and labor efficient high throughput approach for analysis of a large number of mutant clones. Published by Elsevier B.V.

  10. De Novo Assembly of Human Herpes Virus Type 1 (HHV-1) Genome, Mining of Non-Canonical Structures and Detection of Novel Drug-Resistance Mutations Using Short- and Long-Read Next Generation Sequencing Technologies.

    Science.gov (United States)

    Karamitros, Timokratis; Harrison, Ian; Piorkowska, Renata; Katzourakis, Aris; Magiorkinis, Gkikas; Mbisa, Jean Lutamyo

    2016-01-01

    Human herpesvirus type 1 (HHV-1) has a large double-stranded DNA genome of approximately 152 kbp that is structurally complex and GC-rich. This makes the assembly of HHV-1 whole genomes from short-read sequencing data technically challenging. To improve the assembly of HHV-1 genomes we have employed a hybrid genome assembly protocol using data from two sequencing technologies: the short-read Roche 454 and the long-read Oxford Nanopore MinION sequencers. We sequenced 18 HHV-1 cell culture-isolated clinical specimens collected from immunocompromised patients undergoing antiviral therapy. The susceptibility of the samples to several antivirals was determined by plaque reduction assay. Hybrid genome assembly resulted in a decrease in the number of contigs in 6 out of 7 samples and an increase in N(G)50 and N(G)75 of all 7 samples sequenced by both technologies. The approach also enhanced the detection of non-canonical contigs including a rearrangement between the unique (UL) and repeat (T/IRL) sequence regions of one sample that was not detectable by assembly of 454 reads alone. We detected several known and novel resistance-associated mutations in UL23 and UL30 genes. Genome-wide genetic variability ranged from … genomes will be useful in determining genetic determinants of drug resistance, virulence, pathogenesis and viral evolution. The numerous, complex repeat regions of the HHV-1 genome currently remain a barrier towards this goal.
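
    The contiguity statistics mentioned in this record, N(G)50 and N(G)75, can be illustrated with a short sketch. It assumes the common definition (the length of the contig at which the cumulative length of contigs, sorted longest first, reaches the given fraction of the reference size); the function name and the contig lengths are hypothetical.

```python
def ngx(contig_lengths: list[int], genome_size: int, fraction: float = 0.5) -> int:
    """Return the NGx value for an assembly.

    Contigs are visited from longest to shortest; the result is the length
    of the contig at which the running total first reaches
    `fraction * genome_size`. Returns 0 if the assembly never reaches it.
    Passing the assembly's own total length as `genome_size` gives Nx.
    """
    target = fraction * genome_size
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return 0


# Hypothetical contig lengths and reference size, for illustration only.
contigs = [50, 40, 30, 20, 10]
print(ngx(contigs, genome_size=200))                 # NG50 -> 30
print(ngx(contigs, genome_size=200, fraction=0.75))  # NG75 -> 10
print(ngx(contigs, genome_size=sum(contigs)))        # N50  -> 40
```

    Fewer, longer contigs raise these values, which is why an increase in N(G)50 is read as improved assembly contiguity.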

  11. Genetic mapping using the Diversity Arrays Technology (DArT) : application and validation using the whole-genome sequences of Arabidopsis thaliana and the fungal wheat pathogen Mycosphaerella graminicola

    NARCIS (Netherlands)

    Wittenberg, A.H.J.

    2007-01-01

    Diversity Arrays Technology (DArT) is a microarray-based DNA marker technique for genome-wide discovery and genotyping of genetic variation. DArT allows simultaneous scoring of hundreds- to thousands of restriction site based polymorphisms between genotypes and does not require DNA sequence

  12. Technology.

    Science.gov (United States)

    Online-Offline, 1998

    1998-01-01

    Focuses on technology, on advances in such areas as aeronautics, electronics, physics, the space sciences, as well as computers and the attendant progress in medicine, robotics, and artificial intelligence. Describes educational resources for elementary and middle school students, including Web sites, CD-ROMs and software, videotapes, books,…

  13. Targeted exome sequencing reveals novel USH2A mutations in Chinese patients with simplex Usher syndrome.

    Science.gov (United States)

    Shu, Hai-Rong; Bi, Huai; Pan, Yang-Chun; Xu, Hang-Yu; Song, Jian-Xin; Hu, Jie

    2015-09-16

    Usher syndrome (USH) is an autosomal recessive disorder characterized by hearing impairment and vision dysfunction due to retinitis pigmentosa. Phenotypic and genetic heterogeneities of this disease make it impractical to obtain a genetic diagnosis by conventional Sanger sequencing. In this study, we applied a next-generation sequencing approach to detect genetic abnormalities in patients with USH. Two unrelated Chinese families were recruited, consisting of two USH afflicted patients and four unaffected relatives. We selected 199 genes related to inherited retinal diseases as targets for deep exome sequencing. Through systematic data analysis using an established bioinformatics pipeline, all variants that passed filter criteria were validated by Sanger sequencing and co-segregation analysis. A homozygous frameshift mutation (c.4382delA, p.T1462Lfs*2) was revealed in exon20 of gene USH2A in the F1 family. Two compound heterozygous mutations, IVS47 + 1G > A and c.13156A > T (p.I4386F), located in intron 48 and exon 63 respectively, of USH2A, were identified as causative mutations for the F2 family. Of note, the missense mutation c.13156A > T has not been reported so far. In conclusion, targeted exome sequencing precisely and rapidly identified the genetic defects in two Chinese USH families and this technique can be applied as a routine examination for these disorders with significant clinical and genetic heterogeneity.

  14. The Quest for Rare Variants: Pooled Multiplexed Next Generation Sequencing in Plants

    Directory of Open Access Journals (Sweden)

    Fabio Marroni

    2012-06-01

    Full Text Available Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, only three research groups working in plant sciences have exploited this potentiality. They showed that pooled NGS can provide results in excellent agreement with those obtained by individual Sanger sequencing. The aim of this review is to convey to the reader the general ideas underlying the use of pooled NGS for the identification of rare variants. To facilitate a thorough understanding of the possibilities of the method we will explain in detail the variations in study design and discuss their advantages and disadvantages. We will show that information on allele frequency obtained by pooled next generation sequencing can be used to accurately compute basic population genetics indexes such as allele frequency, nucleotide diversity and Tajima’s D. Finally we will discuss applications and future perspectives of the multiplexed NGS approach.

  15. Clinical Use of Next-Generation Sequencing in the Diagnosis of Wilson’s Disease

    Directory of Open Access Journals (Sweden)

    Dániel Németh

    2016-01-01

    Full Text Available Objective. Wilson’s disease is a disorder of copper metabolism which is fatal without treatment. The great number of disease-causing ATP7B gene mutations and the variable clinical presentation of WD may cause a real diagnostic challenge. The emergence of next-generation sequencing provides a time-saving, cost-effective method for full sequencing of the whole ATP7B gene compared to the traditional Sanger sequencing. This is the first report on the clinical use of NGS to examine the ATP7B gene. Materials and Methods. We used Ion Torrent Personal Genome Machine in four heterozygous patients for the identification of the other mutations and also in two patients with no known mutation. One patient with acute on chronic liver failure was a candidate for acute liver transplantation. The results were validated by Sanger sequencing. Results. In each case, the diagnosis of Wilson’s disease was confirmed by identifying the mutations in both alleles within 48 hours. One novel mutation (p.Ala1270Ile) was found beyond the eight other known ones. The rapid detection of the mutations made possible the prompt diagnosis of WD in a patient with acute liver failure. Conclusions. According to our results we found next-generation sequencing a very useful, reliable, time-saving, and cost-effective method for diagnosing Wilson’s disease in selected cases.

  16. Technology

    Directory of Open Access Journals (Sweden)

    Xu Jing

    2016-01-01

    Full Text Available The traditional answer card reading method most commonly uses OMR (Optical Mark Reader), which requires special-purpose cards, is less versatile, and has a high cost. To address these problems, an answer card identification method based on pattern recognition is proposed. A method based on the Line Segment Detector is used to detect tilt in the image, tilted images are corrected by rotation, and the answers on the answer sheet are then located and detected. Pattern recognition technology enables automatic reading with high accuracy and faster detection.

  17. Sequence analysis of the canine mitochondrial DNA control region from shed hair samples in criminal investigations.

    Science.gov (United States)

    Berger, C; Berger, B; Parson, W

    2012-01-01

    In recent years, evidence from domestic dogs has increasingly been analyzed by forensic DNA testing. Especially, canine hairs have proved most suitable and practical due to the high rate of hair transfer occurring between dogs and humans. Starting with the description of a contamination-free sample handling procedure, we give a detailed workflow for sequencing hypervariable segments (HVS) of the mtDNA control region from canine evidence. After the hair material is lysed and the DNA extracted by Phenol/Chloroform, the amplification and sequencing strategy comprises the HVS I and II of the canine control region and is optimized for DNA of medium-to-low quality and quantity. The sequencing procedure is based on the Sanger Big-dye deoxy-terminator method and the separation of the sequencing reaction products is performed on a conventional multicolor fluorescence detection capillary electrophoresis platform. Finally, software-aided base calling and sequence interpretation are addressed exemplarily.

  18. Deep sequencing analysis of HBV genotype shift and correlation with antiviral efficiency during adefovir dipivoxil therapy.

    Directory of Open Access Journals (Sweden)

    Yuwei Wang

    Full Text Available Viral genotype shift in chronic hepatitis B (CHB) patients during antiviral therapy has been reported, but the underlying mechanism remains elusive. 38 CHB patients treated with ADV for one year were selected for studying genotype shift by both deep sequencing and Sanger sequencing method. Sanger sequencing method found that 7.9% patients showed mixed genotype before ADV therapy. In contrast, all 38 patients showed mixed genotype before ADV treatment by deep sequencing. 95.5% mixed genotype rate was also obtained from additional 200 treatment-naïve CHB patients. Of the 13 patients with genotype shift, the fraction of the minor genotype in 5 patients (38%) increased gradually during the course of ADV treatment. Furthermore, responses to ADV and HBeAg seroconversion were associated with the high rate of genotype shift, suggesting drug and immune pressure may be key factors to induce genotype shift. Interestingly, patients with genotype C had a significantly higher rate of genotype shift than genotype B. In genotype shift group, ADV treatment induced a marked enhancement of genotype B ratio accompanied by a reduction of genotype C ratio, suggesting genotype C may be more sensitive to ADV than genotype B. Moreover, patients with dominant genotype C may have a better therapeutic effect. Finally, genotype shift was correlated with clinical improvement in terms of ALT. Our findings provided a rational explanation for genotype shift among ADV-treated CHB patients. The genotype and genotype shift might be associated with antiviral efficiency.

  19. The quest for rare variants: pooled multiplexed next generation sequencing in plants.

    Science.gov (United States)

    Marroni, Fabio; Pinosio, Sara; Morgante, Michele

    2012-01-01

    Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, few research groups working in plant sciences have exploited this potentiality, showing that pooled NGS provides results in excellent agreement with those obtained by individual Sanger sequencing. The aim of this review is to convey to the reader the general ideas underlying the use of pooled NGS for the identification of rare variants. To facilitate a thorough understanding of the possibilities of the method, we will explain in detail the possible experimental and analytical approaches and discuss their advantages and disadvantages. We will show that information on allele frequency obtained by pooled NGS can be used to accurately compute basic population genetics indexes such as allele frequency, nucleotide diversity, and Tajima's D. Finally, we will discuss applications and future perspectives of the multiplexed NGS approach.
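
    The population genetics indexes named in this record can be computed from per-site allele frequencies roughly as sketched below. The sketch uses the standard Tajima (1989) constants and treats `n` as the number of sampled chromosomes; it deliberately ignores the pool-specific corrections (for example for uneven read depth) that real pooled NGS data would need, and the function names and input frequencies are hypothetical.

```python
import math


def nucleotide_diversity(freqs: list[float], n: int) -> float:
    """Sum over segregating sites of the unbiased heterozygosity n/(n-1) * 2p(1-p)."""
    return sum(n / (n - 1) * 2 * p * (1 - p) for p in freqs)


def tajimas_d(freqs: list[float], n: int) -> float:
    """Tajima's D from allele frequencies at segregating sites only.

    freqs: frequency of one allele at each segregating site.
    n:     number of sampled chromosomes (assumed known; n >= 3 is needed
           for the variance constants to be positive).
    """
    s = len(freqs)
    if s == 0:
        raise ValueError("at least one segregating site is required")

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    pi = nucleotide_diversity(freqs, n)   # pairwise-difference estimator
    theta_w = s / a1                      # Watterson's estimator
    return (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical frequencies: an excess of rare variants gives a negative D,
# an excess of intermediate-frequency variants gives a positive D.
print(tajimas_d([0.1] * 5, n=10))
print(tajimas_d([0.5] * 5, n=10))
```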

  20. Refining the Results of a Classical SELEX Experiment by Expanding the Sequence Data Set of an Aptamer Pool Selected for Protein A

    Directory of Open Access Journals (Sweden)

    Regina Stoltenburg

    2018-02-01

    Full Text Available New, as yet undiscovered aptamers for Protein A were identified by applying next generation sequencing (NGS) to a previously selected aptamer pool. This pool was obtained in a classical SELEX (Systematic Evolution of Ligands by EXponential enrichment) experiment using the FluMag-SELEX procedure followed by cloning and Sanger sequencing. PA#2/8 was identified as the only Protein A-binding aptamer from the Sanger sequence pool, and was shown to be able to bind intact cells of Staphylococcus aureus. In this study, we show the extension of the SELEX results by re-sequencing of the same aptamer pool using a medium throughput NGS approach and data analysis. Both data pools were compared. They confirm the selection of a highly complex and heterogeneous oligonucleotide pool and show consistently a high content of orphans as well as a similar relative frequency of certain sequence groups. But in contrast to the Sanger data pool, the NGS pool was clearly dominated by one sequence group containing the known Protein A-binding aptamer PA#2/8 as the most frequent sequence in this group. In addition, we found two new sequence groups in the NGS pool represented by PA-C10 and PA-C8, respectively, which also have high specificity for Protein A. Comparative affinity studies reveal differences between the aptamers and confirm that PA#2/8 remains the most potent sequence within the selected aptamer pool reaching affinities in the low nanomolar range of KD = 20 ± 1 nM.

  1. Refining the Results of a Classical SELEX Experiment by Expanding the Sequence Data Set of an Aptamer Pool Selected for Protein A.

    Science.gov (United States)

    Stoltenburg, Regina; Strehlitz, Beate

    2018-02-24

    New, as yet undiscovered aptamers for Protein A were identified by applying next generation sequencing (NGS) to a previously selected aptamer pool. This pool was obtained in a classical SELEX (Systematic Evolution of Ligands by EXponential enrichment) experiment using the FluMag-SELEX procedure followed by cloning and Sanger sequencing. PA#2/8 was identified as the only Protein A-binding aptamer from the Sanger sequence pool, and was shown to be able to bind intact cells of Staphylococcus aureus. In this study, we show the extension of the SELEX results by re-sequencing of the same aptamer pool using a medium throughput NGS approach and data analysis. Both data pools were compared. They confirm the selection of a highly complex and heterogeneous oligonucleotide pool and show consistently a high content of orphans as well as a similar relative frequency of certain sequence groups. But in contrast to the Sanger data pool, the NGS pool was clearly dominated by one sequence group containing the known Protein A-binding aptamer PA#2/8 as the most frequent sequence in this group. In addition, we found two new sequence groups in the NGS pool represented by PA-C10 and PA-C8, respectively, which also have high specificity for Protein A. Comparative affinity studies reveal differences between the aptamers and confirm that PA#2/8 remains the most potent sequence within the selected aptamer pool reaching affinities in the low nanomolar range of KD = 20 ± 1 nM.

  2. Why barcode? High-throughput multiplex sequencing of mitochondrial genomes for molecular systematics.

    Science.gov (United States)

    Timmermans, M J T N; Dodsworth, S; Culverwell, C L; Bocak, L; Ahrens, D; Littlewood, D T J; Pons, J; Vogler, A P

    2010-11-01

    Mitochondrial genome sequences are important markers for phylogenetics but taxon sampling remains sporadic because of the great effort and cost required to acquire full-length sequences. Here, we demonstrate a simple, cost-effective way to sequence the full complement of protein coding mitochondrial genes from pooled samples using the 454/Roche platform. Multiplexing was achieved without the need for expensive indexing tags ('barcodes'). The method was trialled with a set of long-range polymerase chain reaction (PCR) fragments from 30 species of Coleoptera (beetles) sequenced in a 1/16th sector of a sequencing plate. Long contigs were produced from the pooled sequences with sequencing depths ranging from ∼10 to 100× per contig. Species identity of individual contigs was established via three 'bait' sequences matching disparate parts of the mitochondrial genome obtained by conventional PCR and Sanger sequencing. This proved that assembly of contigs from the sequencing pool was correct. Our study produced sequences for 21 nearly complete and seven partial sets of protein coding mitochondrial genes. Combined with existing sequences for 25 taxa, an improved estimate of basal relationships in Coleoptera was obtained. The procedure could be employed routinely for mitochondrial genome sequencing at the species level, to provide improved species 'barcodes' that currently use the cox1 gene only.

  3. Development of primers for sequencing the NSP1, NSP3, and VP6 genes of the group A porcine rotavirus

    Directory of Open Access Journals (Sweden)

    Fernanda Dornelas Florentino Silva

    2014-02-01

    Full Text Available Rotavirus is the causative pathogen of diarrhea in humans and in several animal species. Eight pairs of primers were developed and used for Sanger sequencing of the coding region of the NSP1, NSP3, and VP6 genes based on the conserved regions of the genome of the group A porcine rotavirus. Three samples previously screened as positive for group A rotaviruses were subjected to gene amplification and sequencing to characterize the pathogen. The information generated from this study is crucial for the understanding of the epidemiology of the disease.

  4. Complete chloroplast genome sequence of a major economic species, Ziziphus jujuba (Rhamnaceae).

    Science.gov (United States)

    Ma, Qiuyue; Li, Shuxian; Bi, Changwei; Hao, Zhaodong; Sun, Congrui; Ye, Ning

    2017-02-01

    Ziziphus jujuba is an important woody plant with high economic and medicinal value. Here, we analyzed and characterized the complete chloroplast (cp) genome of Z. jujuba, the first member of the Rhamnaceae family for which the chloroplast genome sequence has been reported. We also built a web browser for navigating the cp genome of Z. jujuba ( http://bio.njfu.edu.cn/gb2/gbrowse/Ziziphus_jujuba_cp/ ). Sequence analysis showed that this cp genome is 161,466 bp long and has a typical quadripartite structure of large (LSC, 89,120 bp) and small (SSC, 19,348 bp) single-copy regions separated by a pair of inverted repeats (IRs, 26,499 bp). The sequence contained 112 unique genes, including 78 protein-coding genes, 30 transfer RNAs, and four ribosomal RNAs. The genome structure, gene order, GC content, and codon usage are similar to other typical angiosperm cp genomes. A total of 38 tandem repeats, two forward repeats, and three palindromic repeats were detected in the Z. jujuba cp genome. Simple sequence repeat (SSR) analysis revealed that most SSRs were AT-rich. The homopolymer regions in the cp genome of Z. jujuba were verified and manually corrected by Sanger sequencing. One-third of mononucleotide repeats were found to be erroneously sequenced by the 454 pyrosequencing, which resulted in sequences of 1-4 bases shorter than that by the Sanger sequencing. Analyzing the cp genome of Z. jujuba revealed that the IR contraction and expansion events resulted in ycf1 and rps19 pseudogenes. A phylogenetic analysis based on 64 protein-coding genes showed that Z. jujuba was closely related to members of the Elaeagnaceae family, which will be helpful for phylogenetic studies of other Rosales species. The complete cp genome sequence of Z. jujuba will facilitate population, phylogenetic, and cp genetic engineering studies of this economic plant.
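
    The region sizes reported in this record can be checked for internal consistency: in a quadripartite chloroplast genome the total length is the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The short check below uses only figures quoted in the abstract; the variable names are illustrative.

```python
# Region lengths in bp, as quoted in the abstract.
LSC = 89_120          # large single-copy region
SSC = 19_348          # small single-copy region
IR = 26_499           # one inverted repeat; the genome carries two copies
REPORTED_TOTAL = 161_466

total = LSC + SSC + 2 * IR
print(total, total == REPORTED_TOTAL)  # 161466 True

# Gene counts quoted in the abstract: protein-coding + tRNA + rRNA.
unique_genes = 78 + 30 + 4
print(unique_genes)  # 112
```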

  5. Novel mutations and their genotype-phenotype correlations in patients with Noonan syndrome, using next-generation sequencing.

    Science.gov (United States)

    Tafazoli, Alireza; Eshraghi, Peyman; Pantaleoni, Francesca; Vakili, Rahim; Moghaddassian, Morteza; Ghahraman, Martha; Muto, Valentina; Paolacci, Stefano; Golyan, Fatemeh Fardi; Abbaszadegan, Mohammad Reza

    2018-03-01

    Noonan Syndrome (NS) is an autosomal dominant disorder with many variable and heterogeneous conditions. The genetic basis for 20-30% of cases is still unknown. This study evaluates Iranian Noonan patients both clinically and genetically for the first time. Mutational analysis of the PTPN11 gene was performed in 15 Iranian patients, using PCR and Sanger sequencing at phase one. Then, as phase two, Next Generation Sequencing (NGS) in the form of targeted resequencing was utilized for analysis of exons from other related genes. Homology modelling for the newly found mutations was performed as well. The genotype-phenotype correlation was performed according to the molecular findings and clinical features. A previously reported mutation (p.N308D) in some patients and a novel mutation (p.D155N) in one of the patients were identified in phase one. After applying NGS methods, known and new variants were found in four patients in other genes, including: CBL (p. V904I), KRAS (p. L53W), SOS1 (p. I1302V), and SOS1 (p. R552G). Structural studies of two deduced novel mutations in related genes revealed deficiencies in the mutated proteins. Following genotype-phenotype correlation, a new pattern of the presence of intellectual disability in two patients was registered. NS shows strong variable expressivity along with high genetic heterogeneity, especially in distinct populations and ethnic groups. Other, as yet unknown, causative genes may also exist. More comprehensive and new technologies like NGS methods are the best choice for detection of molecular defects in patients for genotype-phenotype correlation and disease management. Copyright © 2017 Medical University of Bialystok. Published by Elsevier B.V. All rights reserved.

  6. Molecular characterization of human T-cell lymphotropic virus type 1 full and partial genomes by Illumina massively parallel sequencing technology.

    Directory of Open Access Journals (Sweden)

    Rodrigo Pessôa

    Full Text Available BACKGROUND: Here, we report on the partial and full-length genomic (FLG) variability of HTLV-1 sequences from 90 well-characterized subjects, including 48 HTLV-1 asymptomatic carriers (ACs), 35 HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) and 7 adult T-cell leukemia/lymphoma (ATLL) patients, using an Illumina paired-end protocol. METHODS: Blood samples were collected from 90 individuals, and DNA was extracted from the PBMCs to measure the proviral load and to amplify the HTLV-1 FLG from two overlapping fragments. The amplified PCR products were subjected to deep sequencing. The sequencing data were assembled, aligned, and mapped against the HTLV-1 genome with sufficient genetic resemblance and utilized for further phylogenetic analysis. RESULTS: A high-throughput sequencing-by-synthesis instrument was used to obtain an average of 3210- and 5200-fold coverage of the partial (n = 14) and FLG (n = 76) data from the HTLV-1 strains, respectively. The results based on the phylogenetic trees of consensus sequences from partial and FLGs revealed that 86 (95.5%) individuals were infected with the transcontinental sub-subtypes of the cosmopolitan subtype (aA) and that 4 individuals (4.5%) were infected with the Japanese sub-subtypes (aB). A comparison of the nucleotide and amino acids of the FLG between the three clinical settings yielded no correlation between the sequenced genotype and clinical outcomes. The evolutionary relationships among the HTLV sequences were inferred from nucleotide sequence, and the results are consistent with the hypothesis that there were multiple introductions of the transcontinental subtype in Brazil. CONCLUSIONS: This study has increased the number of subtype aA full-length genomes from 8 to 81 and HTLV-1 aB from 2 to 5 sequences. The overall data confirmed that the cosmopolitan transcontinental sub-subtypes were the most prevalent in the Brazilian population. It is hoped that this valuable genomic data

  7. Molecular characterization of human T-cell lymphotropic virus type 1 full and partial genomes by Illumina massively parallel sequencing technology.

    Science.gov (United States)

    Pessôa, Rodrigo; Watanabe, Jaqueline Tomoko; Nukui, Youko; Pereira, Juliana; Casseb, Jorge; Kasseb, Jorge; de Oliveira, Augusto César Penalva; Segurado, Aluisio Cotrim; Sanabani, Sabri Saeed

    2014-01-01

    Here, we report on the partial and full-length genomic (FLG) variability of HTLV-1 sequences from 90 well-characterized subjects, including 48 HTLV-1 asymptomatic carriers (ACs), 35 HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) and 7 adult T-cell leukemia/lymphoma (ATLL) patients, using an Illumina paired-end protocol. Blood samples were collected from 90 individuals, and DNA was extracted from the PBMCs to measure the proviral load and to amplify the HTLV-1 FLG from two overlapping fragments. The amplified PCR products were subjected to deep sequencing. The sequencing data were assembled, aligned, and mapped against the HTLV-1 genome with sufficient genetic resemblance and utilized for further phylogenetic analysis. A high-throughput sequencing-by-synthesis instrument was used to obtain an average of 3210- and 5200-fold coverage of the partial (n = 14) and FLG (n = 76) data from the HTLV-1 strains, respectively. The results based on the phylogenetic trees of consensus sequences from partial and FLGs revealed that 86 (95.5%) individuals were infected with the transcontinental sub-subtypes of the cosmopolitan subtype (aA) and that 4 individuals (4.5%) were infected with the Japanese sub-subtypes (aB). A comparison of the nucleotide and amino acids of the FLG between the three clinical settings yielded no correlation between the sequenced genotype and clinical outcomes. The evolutionary relationships among the HTLV sequences were inferred from nucleotide sequence, and the results are consistent with the hypothesis that there were multiple introductions of the transcontinental subtype in Brazil. This study has increased the number of subtype aA full-length genomes from 8 to 81 and HTLV-1 aB from 2 to 5 sequences. The overall data confirmed that the cosmopolitan transcontinental sub-subtypes were the most prevalent in the Brazilian population. It is hoped that this valuable genomic data will add to our current understanding of the

  8. Characterization of promoter sequence of toll-like receptor genes in Vechur cattle

    Directory of Open Access Journals (Sweden)

    R. Lakshmi

    2016-06-01

    Full Text Available Aim: To analyze the promoter sequence of toll-like receptor (TLR) genes in Vechur cattle, an indigenous breed of Kerala, against the sequence of Bos taurus and assess the differences that could be attributed to innate immune responses against bovine mastitis. Materials and Methods: Blood samples were collected from the jugular vein of Vechur cattle maintained at the Vechur cattle conservation center of Kerala Veterinary and Animal Sciences University, using an acid-citrate-dextrose anticoagulant. The genomic DNA was extracted, and polymerase chain reaction was carried out to amplify the promoter region of TLRs. The amplified products of the TLR2, 4, and 9 promoter regions were sequenced by the Sanger enzymatic DNA sequencing technique. Results: The promoter sequence of TLR2 of Vechur cattle showed 98% similarity with the B. taurus sequence present in GenBank and revealed variants for four sequence motifs. The promoter sequence of TLR4 of Vechur cattle showed 99% similarity with the B. taurus sequence but did not reveal significant variants in motif regions. However, two heterozygous loci were observed in the chromatogram. The promoter sequence of the TLR9 gene also showed 99% similarity to the B. taurus sequence and revealed variants for four sequence motifs. Conclusion: The results of this study indicate significant variation in the promoters of the TLR2 and TLR9 genes in the Vechur cattle breed, which may potentially influence the innate immune response against mastitis.

  9. Simultaneous discrimination of species and strains in Lactobacillus rhamnosus using species-specific PCR combined with multiplex mini-sequencing technology.

    Science.gov (United States)

    Huang, Chien-Hsun; Chang, Mu-Tzu; Huang, Lina; Chu, Wen-Shen

    2015-12-01

    This study described the use of species-specific PCR in combination with SNaPshot mini-sequencing to achieve species identification and strain differentiation in Lactobacillus rhamnosus. To develop species-specific PCR and strain subtyping primers, the dnaJ gene was used as a target, and its corresponding sequences were analyzed both in Lb. rhamnosus and in a subset of its phylogenetically closest species. The results indicated that the species-specific primer pair was indeed specific for Lb. rhamnosus, and the mini-sequencing assay was able to unambiguously distinguish Lb. rhamnosus strains into different haplotypes. In conclusion, we have successfully developed a rapid, accurate and cost-effective assay for inter- and intraspecies discrimination of Lb. rhamnosus, which can be applied to achieve efficient quality control of probiotic products. Copyright © 2015 Elsevier Ltd. All rights reserved.

  10. Efficient Generation of Myostatin Knock-Out Sheep Using CRISPR/Cas9 Technology and Microinjection into Zygotes.

    Directory of Open Access Journals (Sweden)

    M Crispo

    Full Text Available While CRISPR/Cas9 technology has proven to be a valuable system to generate gene-targeted modified animals in several species, this tool has been scarcely reported in farm animals. Myostatin is encoded by the MSTN gene and is involved in the inhibition of muscle differentiation and growth. We determined the efficiency of the CRISPR/Cas9 system to edit MSTN in sheep and generate knock-out (KO) animals with the aim to promote muscle development and body growth. We generated CRISPR/Cas9 mRNAs specific for ovine MSTN and microinjected them into the cytoplasm of ovine zygotes. When embryo development of CRISPR/Cas9 microinjected zygotes (n = 216) was compared with buffer injected embryos (n = 183) and non microinjected embryos (n = 173), cleavage rate was lower for both microinjected groups (P<0.05) and neither was affected by CRISPR/Cas9 content in the injected medium. Embryo development to blastocyst was not affected by microinjection and was similar among the experimental groups. From 20 embryos analyzed by Sanger sequencing, ten were mutant (heterozygous or mosaic; 50% efficiency). To obtain live MSTN KO lambs, 53 blastocysts produced after zygote CRISPR/Cas9 microinjection were transferred to 29 recipient females resulting in 65.5% (19/29) of pregnant ewes and 41.5% (22/53) of newborns. From 22 born lambs analyzed by T7EI and Sanger sequencing, ten showed indel mutations at the MSTN gene. Eight showed mutations in both alleles and five of them were homozygous for indels generating out-of-frame mutations that resulted in premature stop codons. Western blot analysis of homozygous KO founders confirmed the absence of myostatin, showing heavier body weight than wild type counterparts. In conclusion, our results demonstrate that the CRISPR/Cas9 system was a very efficient tool to generate gene KO sheep. This technology is quick and easy to perform and less expensive than previous techniques, and can be applied to obtain genetically modified animal models of interest for
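
    The efficiency figures quoted in this abstract can be re-derived from the reported counts. The following is a minimal sketch for checking the arithmetic only; the helper name `percent` and the dictionary keys are illustrative, and the last entry (10 of 22 lambs) is a ratio computed here from the reported counts, not a percentage stated in the abstract.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts taken from the abstract; the labels are illustrative.
rates = {
    "mutant embryos (10 of 20)": percent(10, 20),
    "pregnant ewes (19 of 29)": percent(19, 29),
    "newborns per transferred blastocyst (22 of 53)": percent(22, 53),
    # Derived here, not quoted in the abstract:
    "lambs with indels (10 of 22)": percent(10, 22),
}

for label, value in rates.items():
    print(f"{label}: {value}%")
```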

  11. Genomic sequencing in clinical trials

    OpenAIRE

    Mestan, Karen K; Ilkhanoff, Leonard; Mouli, Samdeep; Lin, Simon

    2011-01-01

    Abstract Human genome sequencing is the process by which the exact order of nucleic acid base pairs in the 24 human chromosomes is determined. Since the completion of the Human Genome Project in 2003, genomic sequencing is rapidly becoming a major part of our translational research efforts to understand and improve human health and disease. This article reviews the current and future directions of clinical research with respect to genomic sequencing, a technology that is just beginning to fin...

  12. Biosensors for DNA sequence detection

    Science.gov (United States)

    Vercoutere, Wenonah; Akeson, Mark

    2002-01-01

    DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.

  13. Identification of genomic insertion and flanking sequence of G2-EPSPS and GAT transgenes in soybean using whole genome sequencing method

    Directory of Open Access Journals (Sweden)

    Bingfu Guo

    2016-07-01

    Full Text Available Molecular characterization of sequences flanking exogenous fragment insertions is essential for safety assessment and labeling of genetically modified organisms (GMO). In this study, the T-DNA insertion sites and flanking sequences were identified in two newly developed transgenic glyphosate-tolerant soybeans, GE-J16 and ZH10-6, based on the whole genome sequencing (WGS) method. About 21 Gb of sequence data (~21× coverage) for each line was generated on the Illumina HiSeq 2500 platform. The junction reads mapped to the boundary of T-DNA and flanking sequences in these two events were identified by comparing all sequencing reads with the soybean reference genome and the sequence of the transgenic vector. The putative insertion loci and flanking sequences were further confirmed by PCR amplification, Sanger sequencing, and co-segregation analysis. All these analyses supported that exogenous T-DNA fragments were integrated in positions Chr19: 50543767-50543792 and Chr17: 7980527-7980541 in these two transgenic lines. Identification of the genomic insertion site of the G2-EPSPS and GAT transgenes will facilitate the use of their glyphosate-tolerant traits in soybean breeding programs. These results also demonstrated that WGS is a cost-effective and rapid method of identifying sites of T-DNA insertions and flanking sequences in soybean.

  14. Graphene nanodevices for DNA sequencing

    NARCIS (Netherlands)

    Heerema, S.J.; Dekker, C.

    2016-01-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with

  15. Identification of a novel mutation in a Chinese family with Nance-Horan syndrome by whole exome sequencing

    Science.gov (United States)

    Hong, Nan; Chen, Yan-hua; Xie, Chen; Xu, Bai-sheng; Huang, Hui; Li, Xin; Yang, Yue-qing; Huang, Ying-ping; Deng, Jian-lian; Qi, Ming; Gu, Yang-shun

    2014-01-01

    Objective: Nance-Horan syndrome (NHS) is a rare X-linked disorder characterized by congenital nuclear cataracts, dental anomalies, and craniofacial dysmorphisms. Mental retardation was present in about 30% of the reported cases. The purpose of this study was to investigate the genetic and clinical features of NHS in a Chinese family. Methods: Whole exome sequencing analysis was performed on DNA from an affected male to scan for candidate mutations on the X-chromosome. Sanger sequencing was used to verify these candidate mutations in the whole family. Clinical and ophthalmological examinations were performed on all members of the family. Results: A combination of exome sequencing and Sanger sequencing revealed a nonsense mutation c.322G>T (E108X) in exon 1 of NHS gene, co-segregating with the disease in the family. The nonsense mutation led to the conversion of glutamic acid to a stop codon (E108X), resulting in truncation of the NHS protein. Multiple sequence alignments showed that codon 108, where the mutation (c.322G>T) occurred, was located within a phylogenetically conserved region. The clinical features in all affected males and female carriers are described in detail. Conclusions: We report a nonsense mutation c.322G>T (E108X) in a Chinese family with NHS. Our findings broaden the spectrum of NHS mutations and provide molecular insight into future NHS clinical genetic diagnosis. PMID:25091991
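
    The link between the nucleotide change c.322G>T and the protein change E108X follows from codon arithmetic. The sketch below is illustrative only: the function names are hypothetical, and because the abstract does not say which glutamic acid codon (GAA or GAG) the reference carries, both are tried.

```python
def codon_of(cds_pos):
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1

def substitute(codon, pos_in_codon, new_base):
    """Return the codon with the base at the 1-based position replaced."""
    return codon[:pos_in_codon - 1] + new_base + codon[pos_in_codon:]

GLU_CODONS = ("GAA", "GAG")          # standard genetic code, glutamic acid
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, stop

codon_number, pos_in_codon = codon_of(322)
mutants = [substitute(c, pos_in_codon, "T") for c in GLU_CODONS]

print(codon_number, pos_in_codon)              # codon 108, first base
print(mutants, all(m in STOP_CODONS for m in mutants))
```

    Position 322 is the first base of codon 108, and a G>T change at the first base of either glutamic acid codon yields a stop codon, which is consistent with the reported E108X.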

  16. Identification of a novel mutation in a Chinese family with Nance-Horan syndrome by whole exome sequencing.

    Science.gov (United States)

    Hong, Nan; Chen, Yan-hua; Xie, Chen; Xu, Bai-sheng; Huang, Hui; Li, Xin; Yang, Yue-qing; Huang, Ying-ping; Deng, Jian-lian; Qi, Ming; Gu, Yang-shun

    2014-08-01

    Nance-Horan syndrome (NHS) is a rare X-linked disorder characterized by congenital nuclear cataracts, dental anomalies, and craniofacial dysmorphisms. Mental retardation was present in about 30% of the reported cases. The purpose of this study was to investigate the genetic and clinical features of NHS in a Chinese family. Whole exome sequencing analysis was performed on DNA from an affected male to scan for candidate mutations on the X-chromosome. Sanger sequencing was used to verify these candidate mutations in the whole family. Clinical and ophthalmological examinations were performed on all members of the family. A combination of exome sequencing and Sanger sequencing revealed a nonsense mutation c.322G>T (E108X) in exon 1 of NHS gene, co-segregating with the disease in the family. The nonsense mutation led to the conversion of glutamic acid to a stop codon (E108X), resulting in truncation of the NHS protein. Multiple sequence alignments showed that codon 108, where the mutation (c.322G>T) occurred, was located within a phylogenetically conserved region. The clinical features in all affected males and female carriers are described in detail. We report a nonsense mutation c.322G>T (E108X) in a Chinese family with NHS. Our findings broaden the spectrum of NHS mutations and provide molecular insight into future NHS clinical genetic diagnosis.

  17. Detection of Emerging Vaccine-Related Polioviruses by Deep Sequencing.

    Science.gov (United States)

    Sahoo, Malaya K; Holubar, Marisa; Huang, ChunHong; Mohamed-Hadley, Alisha; Liu, Yuanyuan; Waggoner, Jesse J; Troy, Stephanie B; Garcia-Garcia, Lourdes; Ferreyra-Reyes, Leticia; Maldonado, Yvonne; Pinsky, Benjamin A

    2017-07-01

    Oral poliovirus vaccine can mutate to regain neurovirulence. To date, evaluation of these mutations has been performed primarily on culture-enriched isolates by using conventional Sanger sequencing. We therefore developed a culture-independent, deep-sequencing method targeting the 5' untranslated region (UTR) and P1 genomic region to characterize vaccine-related poliovirus variants. Error analysis of the deep-sequencing method demonstrated reliable detection of poliovirus mutations at levels of vaccinated, asymptomatic children and their close contacts collected during a prospective cohort study in Veracruz, Mexico, revealed no vaccine-derived polioviruses. This was expected given that the longest duration between sequenced sample collection and the end of the most recent national immunization week was 66 days. However, we identified many low-level variants (Sabin serotypes, as well as vaccine-related viruses with multiple canonical mutations associated with phenotypic reversion present at high levels (>90%). These results suggest that monitoring emerging vaccine-related poliovirus variants by deep sequencing may aid in the poliovirus endgame and efforts to ensure global polio eradication. Copyright © 2017 Sahoo et al.

  18. Whole Genome Sequencing of Enterovirus species C Isolates by High-throughput Sequencing: Development of Generic Primers

    Directory of Open Access Journals (Sweden)

    Maël Bessaud

    2016-08-01

    Full Text Available Enteroviruses are among the most common viruses infecting humans and can cause diverse clinical syndromes ranging from minor febrile illness to severe and potentially fatal diseases. Enterovirus species C (EV-C) consists of more than 20 types, among which the 3 serotypes of polioviruses, the etiological agents of poliomyelitis, are included. Biodiversity and evolution of EV-C genomes are shaped by frequent recombination events. Therefore, identification and characterization of circulating EV-C strains require the sequencing of different genomic regions. A simple method was developed to quickly sequence the entire genome of EV-C isolates. Four overlapping fragments were produced separately by RT-PCR performed with generic primers. The four amplicons were then pooled and purified prior to being sequenced by a high-throughput technique. The method was assessed on a panel of EV-Cs belonging to a wide range of types. It can be used to determine full-length genome sequences through de novo assembly of thousands of reads. It was also able to discriminate reads from closely related viruses in mixtures. By decreasing the workload compared to classical Sanger-based techniques, this method will serve as a precious tool for sequencing large panels of EV-Cs isolated in cell cultures during environmental surveillance or from patients, including vaccine-derived polioviruses.

  19. Quantum-Sequencing: Fast electronic single DNA molecule sequencing

    Science.gov (United States)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free, high-throughput and cost-effective single-molecule sequencing method. Here, we present the first demonstration of a unique "electronic fingerprint" of all nucleotides (A, G, T, C), with single-molecule DNA sequencing, using Quantum-tunneling Sequencing (Q-Seq) at room temperature. We show that the electronic state of the nucleobases shifts depending on the pH, with the most distinct states identified at acidic pH. We also demonstrate identification of single nucleotide modifications (methylation here). Using these unique electronic fingerprints (or tunneling data), we report a partial sequence of the beta lactamase (bla) gene, which encodes resistance to beta-lactam antibiotics, with over 95% success rate. These results highlight the potential of Q-Seq as a robust technique for next-generation sequencing.

  20. Molecular genetics of the Usher syndrome in Lebanon: identification of 11 novel protein truncating mutations by whole exome sequencing.

    Science.gov (United States)

    Reddy, Ramesh; Fahiminiya, Somayyeh; El Zir, Elie; Mansour, Ahmad; Megarbane, Andre; Majewski, Jacek; Slim, Rima

    2014-01-01

    Usher syndrome (USH) is a genetically heterogeneous condition with ten disease-causing genes. The spectrum of genes and mutations causing USH in the Lebanese and Middle Eastern populations has not been described. Consequently, diagnostic approaches designed to screen for previously reported mutations were unlikely to identify the mutations in 11 unrelated families, eight of Lebanese and three of Middle Eastern origins. In addition, six of the ten USH genes consist of more than 20 exons, each, which made mutational analysis by Sanger sequencing of PCR-amplified exons from genomic DNA tedious and costly. The study was aimed at the identification of USH causing genes and mutations in 11 unrelated families with USH type I or II. Whole exome sequencing followed by expanded familial validation by Sanger sequencing. We identified disease-causing mutations in all the analyzed patients in four USH genes, MYO7A, USH2A, GPR98 and CDH23. Eleven of the mutations were novel and protein truncating, including a complex rearrangement in GPR98. Our data highlight the genetic diversity of Usher syndrome in the Lebanese population and the time and cost-effectiveness of whole exome sequencing approach for mutation analysis of genetically heterogeneous conditions caused by large genes.

  1. Molecular genetics of the Usher syndrome in Lebanon: identification of 11 novel protein truncating mutations by whole exome sequencing.

    Directory of Open Access Journals (Sweden)

    Ramesh Reddy

    Full Text Available Usher syndrome (USH) is a genetically heterogeneous condition with ten disease-causing genes. The spectrum of genes and mutations causing USH in the Lebanese and Middle Eastern populations has not been described. Consequently, diagnostic approaches designed to screen for previously reported mutations were unlikely to identify the mutations in 11 unrelated families, eight of Lebanese and three of Middle Eastern origins. In addition, six of the ten USH genes consist of more than 20 exons, each, which made mutational analysis by Sanger sequencing of PCR-amplified exons from genomic DNA tedious and costly. The study was aimed at the identification of USH causing genes and mutations in 11 unrelated families with USH type I or II. Whole exome sequencing followed by expanded familial validation by Sanger sequencing. We identified disease-causing mutations in all the analyzed patients in four USH genes, MYO7A, USH2A, GPR98 and CDH23. Eleven of the mutations were novel and protein truncating, including a complex rearrangement in GPR98. Our data highlight the genetic diversity of Usher syndrome in the Lebanese population and the time and cost-effectiveness of whole exome sequencing approach for mutation analysis of genetically heterogeneous conditions caused by large genes.

  2. Molecular Genetics of the Usher Syndrome in Lebanon: Identification of 11 Novel Protein Truncating Mutations by Whole Exome Sequencing

    Science.gov (United States)

    Reddy, Ramesh; Fahiminiya, Somayyeh; El Zir, Elie; Mansour, Ahmad; Megarbane, Andre; Majewski, Jacek; Slim, Rima

    2014-01-01

    Background Usher syndrome (USH) is a genetically heterogeneous condition with ten disease-causing genes. The spectrum of genes and mutations causing USH in the Lebanese and Middle Eastern populations has not been described. Consequently, diagnostic approaches designed to screen for previously reported mutations were unlikely to identify the mutations in 11 unrelated families, eight of Lebanese and three of Middle Eastern origins. In addition, six of the ten USH genes consist of more than 20 exons, each, which made mutational analysis by Sanger sequencing of PCR-amplified exons from genomic DNA tedious and costly. The study was aimed at the identification of USH causing genes and mutations in 11 unrelated families with USH type I or II. Methods Whole exome sequencing followed by expanded familial validation by Sanger sequencing. Results We identified disease-causing mutations in all the analyzed patients in four USH genes, MYO7A, USH2A, GPR98 and CDH23. Eleven of the mutations were novel and protein truncating, including a complex rearrangement in GPR98. Conclusion Our data highlight the genetic diversity of Usher syndrome in the Lebanese population and the time and cost-effectiveness of whole exome sequencing approach for mutation analysis of genetically heterogeneous conditions caused by large genes. PMID:25211151

  3. FAST: FAST Analysis of Sequences Toolbox

    Directory of Open Access Journals (Sweden)

    Travis J. Lawrence

    2015-05-01

    Full Text Available FAST (FAST Analysis of Sequences Toolbox) provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU’s Not Unix) Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics makes FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format). Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought.
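
    To illustrate the grep-style, record-oriented filtering this abstract describes, here is a minimal Python sketch. It does not use or reproduce the FAST tools or their command-line interface; the function names (`parse_fasta`, `grep_records`, `gc_fraction`) and the example records are assumptions made for illustration.

```python
import re

def parse_fasta(text):
    """Yield (header, sequence) pairs from Multi-FastA text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def grep_records(records, pattern):
    """Keep records whose header matches the regular expression."""
    rx = re.compile(pattern)
    return [(h, s) for h, s in records if rx.search(h)]

def gc_fraction(seq):
    """Fraction of G and C bases, a simple composition statistic."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

# Made-up example records, not real data.
EXAMPLE = """>seq1 Homo sapiens
ACGT
ACGT
>seq2 Bos taurus
GGCC
"""

hits = grep_records(parse_fasta(EXAMPLE), r"Bos")
for header, seq in hits:
    print(header, seq, gc_fraction(seq))
```

    The point of the sketch is the record-level view: a pattern selects whole sequence records rather than individual lines, which is the gap that line-oriented text utilities leave for multi-line FastA data.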

  4. Application of massively parallel sequencing to genetic diagnosis in multiplex families with idiopathic sensorineural hearing impairment.

    Directory of Open Access Journals (Sweden)

    Chen-Chi Wu

    Full Text Available Despite the clinical utility of genetic diagnosis to address idiopathic sensorineural hearing impairment (SNHI), the current strategy for screening mutations via Sanger sequencing suffers from the limitation that only a limited number of DNA fragments associated with common deafness mutations can be genotyped. Consequently, a definitive genetic diagnosis cannot be achieved in many families with discernible family history. To investigate the diagnostic utility of massively parallel sequencing (MPS), we applied the MPS technique to 12 multiplex families with idiopathic SNHI in which common deafness mutations had previously been ruled out. A NimbleGen sequence capture array was designed to target all protein coding sequences (CDSs) and 100 bp of the flanking sequence of 80 common deafness genes. We performed MPS on the Illumina HiSeq2000, and applied BWA, SAMtools, Picard, GATK, Variant Tools, ANNOVAR, and IGV for bioinformatics analyses. Initial data filtering with allele frequencies (0.95 prioritized 5 indels (insertions/deletions) and 36 missense variants in the 12 multiplex families. After further validation by Sanger sequencing, segregation pattern, and evolutionary conservation of amino acid residues, we identified 4 variants in 4 different genes, which might lead to SNHI in 4 families compatible with autosomal dominant inheritance. These included GJB2 p.R75Q, MYO7A p.T381M, KCNQ4 p.S680F, and MYH9 p.E1256K. Among them, KCNQ4 p.S680F and MYH9 p.E1256K were novel. In conclusion, MPS allows genetic diagnosis in multiplex families with idiopathic SNHI by detecting mutations in relatively uncommon deafness genes.

  5. Development of severe accident evaluation technology (level 2 PSA) for sodium-cooled fast reactors. (5) Identification of dominant factors in ex-vessel accident sequences

    International Nuclear Information System (INIS)

    Ohno, Shuji; Seino, Hiroshi; Miyahara, Shinya

    2009-01-01

    The evaluation of accident progression outside of a reactor vessel (ex-vessel) and subsequent transfer behavior of radioactive materials is of great importance from the viewpoint of Level 2 PSA. Hence typical ex-vessel accident sequences in the JAEA Sodium-cooled Fast Reactor are qualitatively discussed in this paper and dominant behaviors or factors in the sequences are investigated through parametric calculations using the CONTAIN/LMR code. Scenarios to be focused on are, 1) sodium vapor leakage from the reactor vessel and 2) sodium-concrete reaction, which are both to be considered in the accident category of LOHRS (loss of heat removal system) and might be followed by an early containment failure due to the thermal effect of sodium combustion and hydrogen burning respectively. The calculated results clarify that the sodium vapor leak rate and the scale of sodium-concrete reaction are the important factors to dominate the ex-vessel accident progression. In addition to the understandings of the dominant factors, the analyzed results also provide the specific information such as pressure loading value to the containment and the timing of pressurization, which is indispensable as technical base in Level 2 PSA for developing event trees and for quantifying the accident consequences. (author)

  6. Construction of an SNP-based high-density linkage map for flax (Linum usitatissimum L.) using specific length amplified fragment sequencing (SLAF-seq) technology.

    Science.gov (United States)

    Yi, Liuxi; Gao, Fengyun; Siqin, Bateer; Zhou, Yu; Li, Qiang; Zhao, Xiaoqing; Jia, Xiaoyun; Zhang, Hui

    2017-01-01

    Flax is an important crop for oil and fiber; however, no high-density genetic maps have been reported for this species. Specific length amplified fragment sequencing (SLAF-seq) is a high-resolution strategy for large scale de novo discovery and genotyping of single nucleotide polymorphisms. In this study, SLAF-seq was employed to develop SNP markers in an F2 population to construct a high-density genetic map for flax. In total, 196.29 million paired-end reads were obtained. The average sequencing depth was 25.08 in the male parent, 32.17 in the female parent, and 9.64 in each F2 progeny. In total, 389,288 polymorphic SLAFs were detected, from which 260,380 polymorphic SNPs were developed. After filtering, 4,638 SNPs were found suitable for genetic map construction. The final genetic map included 4,145 SNP markers on 15 linkage groups and was 2,632.94 cM in length, with an average distance of 0.64 cM between adjacent markers. To our knowledge, this map is the densest SNP-based genetic map for flax. The SNP markers and genetic map reported here will serve as a foundation for the fine mapping of quantitative trait loci (QTLs), map-based gene cloning and marker assisted selection (MAS) for flax.
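
    The reported average marker spacing can be checked from the other figures in the abstract. The sketch below assumes that the number of adjacent-marker intervals equals the number of markers minus the number of linkage groups; the abstract does not state how the authors computed the average, and the function name is hypothetical.

```python
def mean_marker_interval(map_length_cm, n_markers, n_groups):
    """Average distance between adjacent markers, in cM.

    Assumes each linkage group with k markers contributes k - 1 intervals,
    so the total interval count is n_markers - n_groups.
    """
    intervals = n_markers - n_groups
    return map_length_cm / intervals

# Figures from the abstract: 2,632.94 cM, 4,145 markers, 15 linkage groups.
value = mean_marker_interval(2632.94, 4145, 15)
print(round(value, 2))
```

    Dividing by the marker count instead (2,632.94 / 4,145) also rounds to 0.64 cM, so the reported figure is consistent under either convention.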

  7. Detecting authorized and unauthorized genetically modified organisms containing vip3A by real-time PCR and next-generation sequencing.

    Science.gov (United States)

    Liang, Chanjuan; van Dijk, Jeroen P; Scholtens, Ingrid M J; Staats, Martijn; Prins, Theo W; Voorhuijzen, Marleen M; da Silva, Andrea M; Arisi, Ana Carolina Maisonnave; den Dunnen, Johan T; Kok, Esther J

    2014-04-01

    The growing number of biotech crops with novel genetic elements increasingly complicates the detection of genetically modified organisms (GMOs) in food and feed samples using conventional screening methods. Unauthorized GMOs (UGMOs) in food and feed are currently identified through combining GMO element screening with sequencing the DNA flanking these elements. In this study, a specific and sensitive qPCR assay was developed for vip3A element detection based on the vip3Aa20 coding sequences of the recently marketed MIR162 maize and COT102 cotton. Furthermore, SiteFinding-PCR in combination with Sanger, Illumina or Pacific BioSciences (PacBio) sequencing was performed targeting the flanking DNA of the vip3Aa20 element in MIR162. De novo assembly and Basic Local Alignment Search Tool searches were used to mimic UGMO identification. PacBio data resulted in relatively long contigs in the upstream (1,326 nucleotides (nt); 95 % identity) and downstream (1,135 nt; 92 % identity) regions, whereas Illumina data resulted in two smaller contigs of 858 and 1,038 nt with higher sequence identity (>99 % identity). Both approaches outperformed Sanger sequencing, underlining the potential for next-generation sequencing in UGMO identification.

  8. Identification of the first homozygous 1-bp deletion in GDF9 gene leading to primary ovarian insufficiency by using targeted massively parallel sequencing.

    Science.gov (United States)

    França, M M; Funari, M F A; Nishi, M Y; Narcizo, A M; Domenice, S; Costa, E M F; Lerario, A M; Mendonca, B B

    2018-02-01

    Targeted massively parallel sequencing (TMPS) has been used in genetic diagnosis for Mendelian disorders. In the past few years, TMPS has identified new and already described genes associated with the primary ovarian insufficiency (POI) phenotype. Here, we performed targeted gene sequencing to find a genetic diagnosis in idiopathic cases from a Brazilian POI cohort. A custom SureSelect XT DNA target enrichment panel was designed and the sequencing was performed on an Illumina NextSeq sequencer. We identified one homozygous 1-bp deletion variant (c.783delC) in the GDF9 gene in one patient with POI. The variant was confirmed, and its segregation analyzed, by Sanger sequencing. The c.783delC GDF9 variant causes a frameshift that creates a premature termination codon (p.Ser262Hisfs*2). This variant was absent from all public databases consulted (ExAC/gnomAD, NHLBI/EVS and 1000Genomes). Moreover, it was absent in 400 alleles from fertile Brazilian women screened by Sanger sequencing. The patient's mother and her unaffected sister carried the c.783delC variant in a heterozygous state, as expected for autosomal recessive inheritance. Here, TMPS identified the first homozygous 1-bp deletion variant in GDF9. This finding reveals a novel inheritance pattern of pathogenic variants in GDF9 associated with POI, thus improving the genetic diagnosis of this disorder. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  9. Wind turbine sequence modeling based on secondary parametric technology

    Institute of Scientific and Technical Information of China (English)

    高青风; 孙振兴; 滕伟; 柳亦兵

    2012-01-01

    To address the shortcomings of conventional modeling methods when applied to large and complex wind turbines, this paper proposes a highly efficient modeling method. Starting from a fully parametric model of a single wind turbine, it takes the whole sequence of wind turbines as the research object and, following the way component design parameters vary between turbines of different power ratings in the same sequence, applies secondary parametric technology and parameter serialization to build and manage wind turbine models efficiently. A sequence modeling system for wind turbines was also developed by combining this approach with parameter-driven three-dimensional modeling technology. The system can effectively reduce the modeling workload and the design error rate and improve design efficiency, which verifies the correctness and rationality of the modeling method.

  10. DNA Sequencing by Capillary Electrophoresis

    Science.gov (United States)

    Karger, Barry L.; Guttman, Andras

    2009-01-01

    Sequencing of human and other genomes has been at the center of interest in the biomedical field over the past several decades and is now leading toward an era of personalized medicine. During this time, DNA sequencing methods have evolved from labor-intensive slab gel electrophoresis, through automated multicapillary electrophoresis systems using fluorophore labeling with multispectral imaging, to the “next generation” technologies of cyclic array, hybridization-based, nanopore and single-molecule sequencing. Deciphering the genetic blueprint and follow-up confirmatory sequencing of Homo sapiens and other genomes was only possible through the advent of modern sequencing technologies, which resulted from step-by-step advances with contributions from academics, medical personnel and instrument companies. While next generation sequencing is moving ahead at break-neck speed, multicapillary electrophoretic systems played an essential role in the sequencing of the Human Genome, the foundation of the field of genomics. In this perspective, we give an overview of the role of capillary electrophoresis in DNA sequencing, based in part on several of our articles in this journal. PMID:19517496

  11. Next-generation sequencing technology a new tool for killer cell immunoglobulin-like receptor allele typing in hematopoietic stem cell transplantation.

    Science.gov (United States)

    Maniangou, B; Retière, C; Gagne, K

    2018-02-01

    Killer cell Immunoglobulin-like Receptor (KIR) genes are a family of genes located together within the leukocyte receptor cluster on human chromosome 19q13.4. To date, 17 KIR genes have been identified including nine inhibitory genes (2DL1/L2/L3/L4/L5A/L5B, 3DL1/L2/L3), six activating genes (2DS1/S2/S3/S4/S5, 3DS1) and two pseudogenes (2DP1, 3DP1) classified into group A (KIR A) and group B (KIR B) haplotypes. The number and the nature of KIR genes vary between individuals. In addition, these KIR genes are known to be polymorphic at the allelic level (907 alleles described in July 2017). KIR genes encode receptors that are predominantly expressed by Natural Killer (NK) cells. KIR receptors recognize HLA class I molecules and are able to kill residual recipient leukemia cells, and thus reduce the likelihood of relapse. The KIR alleles of the Hematopoietic Stem Cell (HSC) donor need to be known (Alicata et al. Eur J Immunol 2016) because KIR allele polymorphism may affect both the KIR + NK cell phenotype and function (Gagne et al. Eur J Immunol 2013; Bari R, et al. Sci Rep 2016) as well as HSCT outcome (Boudreau et al. JCO 2017). The introduction of Next Generation Sequencing (NGS) has overcome the limitations of conventional DNA sequencing methods, which are known to be time-consuming. Recently, a novel NGS KIR allele typing approach covering all KIR genes was developed by our team in Nantes from 30 reference DNAs (Maniangou et al. Front in Immunol 2017). This NGS KIR allele typing approach is simple, fast, reliable and specific, and showed a concordance rate of 95% for centromeric and telomeric KIR genes in comparison with published high-resolution KIR typing data obtained using exome capture (Norman PJ et al. Am J Hum Genet 2016). This NGS KIR allele typing approach may also be used in reproduction and to better study the involvement of KIR + NK cells in the control of viral infections. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  12. Inhibition of expression in Escherichia coli of a virulence regulator MglB of Francisella tularensis using external guide sequence technology.

    Directory of Open Access Journals (Sweden)

    Gaoping Xiao

    Full Text Available External guide sequences (EGSs) have successfully been used to inhibit expression of target genes at the post-transcriptional level in both prokaryotes and eukaryotes. We previously reported that EGS accessible and cleavable sites in the target RNAs can rapidly be identified by screening random EGS (rEGS) libraries. Here the method of screening rEGS libraries and a partial RNase T1 digestion assay were used to identify sites accessible to EGSs in the mRNA of a global virulence regulator MglB from Francisella tularensis, a Gram-negative pathogenic bacterium. Specific EGSs were subsequently designed and their activities in terms of the cleavage of mglB mRNA by RNase P were tested in vitro and in vivo. EGS73, EGS148, and EGS155 in both stem and M1 EGS constructs induced mglB mRNA cleavage in vitro. Expression of stem EGS73 and EGS155 in Escherichia coli resulted in significant reduction of the mglB mRNA level coded for the F. tularensis mglB gene inserted in those cells.

  13. Sequencing and analysis of the Mediterranean amphioxus (Branchiostoma lanceolatum) transcriptome.

    Directory of Open Access Journals (Sweden)

    Silvan Oulion

    Full Text Available BACKGROUND: The basally divergent phylogenetic position of amphioxus (Cephalochordata), as well as its conserved morphology, development and genetics, make it the best proxy for the chordate ancestor. In particular, studies using the amphioxus model help our understanding of vertebrate evolution and development. Thus, interest in the amphioxus model led to the characterization of both the transcriptome and complete genome sequence of the American species, Branchiostoma floridae. However, recent technical improvements allowing induction of spawning in the laboratory on a daily basis during the breeding season with the Mediterranean species Branchiostoma lanceolatum have encouraged European Evo-Devo researchers to adopt this species as a model even though no genomic or transcriptomic data have been available. To fill this need we used the pyrosequencing method to characterize the B. lanceolatum transcriptome and then compared our results with the published transcriptome of B. floridae. RESULTS: Starting with total RNA from nine different developmental stages of B. lanceolatum, a normalized cDNA library was constructed and sequenced on Roche GS FLX (Titanium mode). Around 1.4 million reads were produced and assembled into 70,530 contigs (average length of 490 bp). Overall 37% of the assembled sequences were annotated by BlastX and their Gene Ontology terms were determined. These results were then compared to genomic and transcriptomic data of B. floridae to assess similarities and specificities of each species. CONCLUSION: We obtained a high-quality amphioxus (B. lanceolatum) reference transcriptome using a high throughput sequencing approach. We found that 83% of the predicted genes in the B. floridae complete genome sequence are also found in the B. lanceolatum transcriptome, while only 41% were found in the B. floridae transcriptome obtained with traditional Sanger based sequencing. Therefore, given the high degree of sequence conservation

  14. Viral metagenomics: Analysis of begomoviruses by illumina high-throughput sequencing

    KAUST Repository

    Idris, Ali

    2014-03-12

    Traditional DNA sequencing methods are inefficient, lack the ability to discern the least abundant viral sequences, and are ineffective for determining the extent of variability in viral populations. Here, populations of single-stranded DNA plant begomoviral genomes and their associated beta- and alpha-satellite molecules (virus-satellite complexes) (genus, Begomovirus; family, Geminiviridae) were enriched from total nucleic acids isolated from symptomatic, field-infected plants, using rolling circle amplification (RCA). Enriched virus-satellite complexes were subjected to Illumina-Next Generation Sequencing (NGS). CASAVA and SeqMan NGen programs were implemented, respectively, for quality control and for de novo and reference-guided contig assembly of viral-satellite sequences. The authenticity of the begomoviral sequences, and the reproducibility of the Illumina-NGS approach for begomoviral deep sequencing projects, were validated by comparing NGS results with those obtained using traditional molecular cloning and Sanger sequencing of viral components and satellite DNAs, also enriched by RCA or amplified by polymerase chain reaction. As the use of NGS approaches, together with advances in software development, makes deep sequence coverage possible at a lower cost, the approach described herein will streamline the exploration of begomovirus diversity and population structure from naturally infected plants, irrespective of viral abundance. This is the first report of the implementation of Illumina-NGS to explore the diversity and identify begomoviral-satellite SNPs directly from plants naturally infected with begomoviruses under field conditions. © 2014 by the authors; licensee MDPI, Basel, Switzerland.

  15. Viral Metagenomics: Analysis of Begomoviruses by Illumina High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Ali Idris

    2014-03-01

    Full Text Available Traditional DNA sequencing methods are inefficient, lack the ability to discern the least abundant viral sequences, and are ineffective for determining the extent of variability in viral populations. Here, populations of single-stranded DNA plant begomoviral genomes and their associated beta- and alpha-satellite molecules (virus-satellite complexes) (genus, Begomovirus; family, Geminiviridae) were enriched from total nucleic acids isolated from symptomatic, field-infected plants, using rolling circle amplification (RCA). Enriched virus-satellite complexes were subjected to Illumina-Next Generation Sequencing (NGS). CASAVA and SeqMan NGen programs were implemented, respectively, for quality control and for de novo and reference-guided contig assembly of viral-satellite sequences. The authenticity of the begomoviral sequences, and the reproducibility of the Illumina-NGS approach for begomoviral deep sequencing projects, were validated by comparing NGS results with those obtained using traditional molecular cloning and Sanger sequencing of viral components and satellite DNAs, also enriched by RCA or amplified by polymerase chain reaction. As the use of NGS approaches, together with advances in software development, makes deep sequence coverage possible at a lower cost, the approach described herein will streamline the exploration of begomovirus diversity and population structure from naturally infected plants, irrespective of viral abundance. This is the first report of the implementation of Illumina-NGS to explore the diversity and identify begomoviral-satellite SNPs directly from plants naturally infected with begomoviruses under field conditions.

  16. Molecular diagnosis of glycogen storage disease and disorders with overlapping clinical symptoms by massive parallel sequencing.

    Science.gov (United States)

    Vega, Ana I; Medrano, Celia; Navarrete, Rosa; Desviat, Lourdes R; Merinero, Begoña; Rodríguez-Pombo, Pilar; Vitoria, Isidro; Ugarte, Magdalena; Pérez-Cerdá, Celia; Pérez, Belen

    2016-10-01

    Glycogen storage disease (GSD) is an umbrella term for a group of genetic disorders that involve the abnormal metabolism of glycogen; to date, 23 types of GSD have been identified. The nonspecific clinical presentation of GSD and the lack of specific biomarkers mean that Sanger sequencing is now widely relied on for making a diagnosis. However, this gene-by-gene sequencing technique is both laborious and costly, which is a consequence of the number of genes to be sequenced and the large size of some genes. This work reports the use of massive parallel sequencing to diagnose patients at our laboratory in Spain using either a customized gene panel (targeted exome sequencing) or the Illumina Clinical-Exome TruSight One Gene Panel (clinical exome sequencing (CES)). Sequence variants were matched against biochemical and clinical hallmarks. Pathogenic mutations were detected in 23 patients. Twenty-two mutations were recognized (mostly loss-of-function mutations), including 11 that were novel in GSD-associated genes. In addition, CES detected five patients with mutations in ALDOB, LIPA, NKX2-5, CPT2, or ANO5. Although these genes are not involved in GSD, they are associated with overlapping phenotypic characteristics such as hepatic, muscular, and cardiac dysfunction. These results show that next-generation sequencing, in combination with the detection of biochemical and clinical hallmarks, provides an accurate, high-throughput means of making genetic diagnoses of GSD and related diseases. Genet Med 18(10), 1037-1043.

  17. A Retrospective Examination of Feline Leukemia Subgroup Characterization: Viral Interference Assays to Deep Sequencing

    Directory of Open Access Journals (Sweden)

    Elliott S. Chiu

    2018-01-01

    Full Text Available Feline leukemia virus (FeLV) was the first feline retrovirus discovered, and is associated with multiple fatal disease syndromes in cats, including lymphoma. The original research conducted on FeLV employed classical virological techniques. As methods have evolved to allow FeLV genetic characterization, investigators have continued to unravel the molecular pathology associated with this fascinating agent. In this review, we discuss how FeLV classification, transmission, and disease-inducing potential have been defined sequentially by viral interference assays, Sanger sequencing, PCR, and next-generation sequencing. In particular, we highlight the influences of endogenous FeLV and host genetics that represent FeLV research opportunities on the near horizon.

  18. A Retrospective Examination of Feline Leukemia Subgroup Characterization: Viral Interference Assays to Deep Sequencing.

    Science.gov (United States)

    Chiu, Elliott S; Hoover, Edward A; VandeWoude, Sue

    2018-01-10

    Feline leukemia virus (FeLV) was the first feline retrovirus discovered, and is associated with multiple fatal disease syndromes in cats, including lymphoma. The original research conducted on FeLV employed classical virological techniques. As methods have evolved to allow FeLV genetic characterization, investigators have continued to unravel the molecular pathology associated with this fascinating agent. In this review, we discuss how FeLV classification, transmission, and disease-inducing potential have been defined sequentially by viral interference assays, Sanger sequencing, PCR, and next-generation sequencing. In particular, we highlight the influences of endogenous FeLV and host genetics that represent FeLV research opportunities on the near horizon.

  19. Exploring the environmental diversity of kinetoplastid flagellates in the high-throughput DNA sequencing era

    Directory of Open Access Journals (Sweden)

    Claudia Masini d’Avila-Levy

    2015-01-01

    Full Text Available The class Kinetoplastea encompasses both free-living and parasitic species from a wide range of hosts. Several representatives of this group are responsible for severe human diseases and for economic losses in agriculture and livestock. While this group encompasses over 30 genera, most of the available information has been derived from the vertebrate pathogenic genera Leishmania and Trypanosoma. Recent studies of the previously neglected groups of Kinetoplastea indicated that the actual diversity is much higher than previously thought. This article discusses the known segment of kinetoplastid diversity and how gene-directed Sanger sequencing and next-generation sequencing methods can help to deepen our knowledge of these interesting protists.

  20. High Density Linkage Map Construction and Mapping of Yield Trait QTLs in Maize (Zea mays) Using the Genotyping-by-Sequencing (GBS) Technology

    Science.gov (United States)

    Su, Chengfu; Wang, Wei; Gong, Shunliang; Zuo, Jinghui; Li, Shujiang; Xu, Shizhong

    2017-01-01

    Increasing grain yield is the ultimate goal for maize breeding. High resolution quantitative trait loci (QTL) mapping can help us understand the molecular basis of phenotypic variation of yield and thus facilitate marker assisted breeding. The aim of this study is to use genotyping-by-sequencing (GBS) for large-scale SNP discovery and simultaneous genotyping of all F2 individuals from a cross between two varieties of maize that are in clear contrast in yield and related traits. A set of 199 F2 progeny derived from the cross of varieties SG-5 and SG-7 were generated and genotyped by GBS. A total of 1,046,524,604 reads with an average of 5,258,918 reads per F2 individual were generated. This number of reads represents an approximately 0.36-fold coverage of the maize reference genome Zea_mays.AGPv3.29 for each F2 individual. A total of 68,882 raw SNPs were discovered in the F2 population, which, after stringent filtering, led to a total of 29,927 high quality SNPs. Comparative analysis using these physically mapped marker loci revealed a higher degree of synteny with the reference genome. The SNP genotype data were utilized to construct an intra-specific genetic linkage map of maize consisting of 3,305 bins on 10 linkage groups spanning 2,236.66 cM at an average distance of 0.68 cM between consecutive markers. From this map, we identified 28 QTLs associated with yield traits (100-kernel weight, ear length, ear diameter, cob diameter, kernel row number, corn grains per row, ear weight, and grain weight per plant) using the composite interval mapping (CIM) method and 29 QTLs using the least absolute shrinkage selection operator (LASSO) method. QTLs identified by the CIM method account for 6.4% to 19.7% of the phenotypic variation. Small intervals of three QTLs (qCGR-1, qKW-2, and qGWP-4) contain several genes, including one gene (GRMZM2G139872) encoding the F-box protein, three genes (GRMZM2G180811, GRMZM5G828139, and GRMZM5G873194) encoding the WD40-repeat protein, and
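
    The per-individual read count in this abstract follows directly from the total read count and the number of F2 progeny, and fold coverage is conventionally estimated as reads times read length divided by genome size. The sketch below illustrates both steps. The abstract gives neither the read length nor the genome size used, so the coverage call uses clearly hypothetical inputs and is not an attempt to reproduce the reported 0.36-fold figure; the function names are also hypothetical.

```python
# Figures quoted in the abstract above.
TOTAL_READS = 1_046_524_604
N_PROGENY = 199


def reads_per_sample(total_reads, n_samples):
    """Average number of reads per sequenced individual."""
    return total_reads / n_samples


def fold_coverage(n_reads, read_length_bp, genome_size_bp):
    """Expected sequencing depth: total sequenced bases / genome size."""
    return n_reads * read_length_bp / genome_size_bp


mean_reads = reads_per_sample(TOTAL_READS, N_PROGENY)
print(f"mean reads per F2 individual: {mean_reads:,.0f}")

# Hypothetical inputs, chosen only to show how the formula is applied.
example = fold_coverage(n_reads=1_000_000,
                        read_length_bp=100,
                        genome_size_bp=2_000_000_000)
print(f"example coverage: {example:.2f}x")
```

    The first value agrees with the 5,258,918 reads per individual stated in the abstract once rounded to the nearest read.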

  1. Maturity onset diabetes of youth (MODY) in Turkish children: sequence analysis of 11 causative genes by next generation sequencing.

    Science.gov (United States)

    Ağladıoğlu, Sebahat Yılmaz; Aycan, Zehra; Çetinkaya, Semra; Baş, Veysel Nijat; Önder, Aşan; Peltek Kendirci, Havva Nur; Doğan, Haldun; Ceylaner, Serdar

    2016-04-01

    Maturity-onset diabetes of the youth (MODY) is a genetically and clinically heterogeneous group of diseases and is often misdiagnosed as type 1 or type 2 diabetes. The aim of this study is to investigate both novel and proven mutations of 11 MODY genes in Turkish children by using targeted next generation sequencing. A panel of 11 MODY genes was screened in 43 children with MODY diagnosed by clinical criteria. Index cases were studied with the Illumina MiSeq, and family screening and confirmation of mutations were done by Sanger sequencing. We identified 28 (65%) point mutations among 43 patients. Eighteen patients have GCK mutations, four have HNF1A, one has HNF4A, one has HNF1B, two have NEUROD1, one has PDX1 gene variations and one patient has both HNF1A and HNF4A heterozygous mutations. This is the first study including molecular studies of 11 MODY genes in Turkish children. GCK is the most frequent type of MODY in our study population. The very high frequency of novel mutations (42%) in our study population supports the view that, in heterogeneous disorders like MODY, sequence analysis provides rapid, cost-effective and accurate genetic diagnosis.
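
    The per-gene counts in this abstract can be tallied to confirm that they add up to the 28 mutation-positive patients and the 65% rate reported. The sketch below is a simple consistency check on the published figures; the dictionary layout and variable names are illustrative assumptions, not part of the study.

```python
# Patients per affected gene, as listed in the abstract above.
patients_by_gene = {
    "GCK": 18,
    "HNF1A": 4,
    "HNF4A": 1,
    "HNF1B": 1,
    "NEUROD1": 2,
    "PDX1": 1,
    "HNF1A + HNF4A": 1,   # one patient carrying variants in both genes
}
COHORT_SIZE = 43

# Total mutation-positive patients and their share of the cohort.
positive = sum(patients_by_gene.values())
detection_rate = positive / COHORT_SIZE

# Share of positive patients attributable to GCK alone.
gck_share = patients_by_gene["GCK"] / positive

print(f"{positive} of {COHORT_SIZE} patients positive ({detection_rate:.0%})")
print(f"GCK accounts for {gck_share:.0%} of positive patients")
```

    The tally matches the abstract (28 patients, 65%) and shows why GCK is described as the most frequent MODY type in this cohort.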

  2. Discrepancy between Hepatitis C Virus Genotypes and NS4-Based Serotypes: Association with Their Subgenomic Sequences

    Directory of Open Access Journals (Sweden)

    Nan Nwe Win

    2017-01-01

    Full Text Available Determination of hepatitis C virus (HCV) genotypes plays an important role in the direct-acting agent era. Discrepancies between HCV genotyping and serotyping assays are occasionally observed. Eighteen samples with discrepant results between genotyping and serotyping methods were analyzed. HCV serotyping and genotyping were based on the HCV nonstructural 4 (NS4) region and 5′-untranslated region (5′-UTR), respectively. HCV core and NS4 regions were chosen to be sequenced and were compared with the genotyping and serotyping results. Deep sequencing was also performed for the corresponding HCV NS4 regions. Seventeen out of 18 discrepant samples could be sequenced by the Sanger method. Both HCV core and NS4 sequences were concordant with that of genotyping in the 5′-UTR in all 17 samples. In cloning analysis of the HCV NS4 region, there were several amino acid variations, but each sequence was much closer to the peptide with the same genotype. Deep sequencing revealed that minor clones with different subgenotypes existed in two of the 17 samples. Genotyping by genome amplification showed high consistency, while several false reactions were detected by serotyping. The deep sequencing method also provides accurate genotyping results and may be useful for analyzing discrepant cases. HCV genotyping should be correctly determined before antiviral treatment.

  3. Quality Control of the Traditional Patent Medicine Yimu Wan Based on SMRT Sequencing and DNA Barcoding

    Science.gov (United States)

    Jia, Jing; Xu, Zhichao; Xin, Tianyi; Shi, Linchun; Song, Jingyuan

    2017-01-01

    Substandard traditional patent medicines may lead to global safety-related issues. Protecting consumers from the health risks associated with the integrity and authenticity of herbal preparations is of great concern. Of particular concern is quality control for traditional patent medicines. Here, we establish an effective approach for verifying the biological composition of traditional patent medicines based on single-molecule real-time (SMRT) sequencing and DNA barcoding. Yimu Wan (YMW), a classical herbal prescription recorded in the Chinese Pharmacopoeia, was chosen to test the method. Two reference YMW samples were used to establish a standard method for analysis, which was then applied to three different batches of commercial YMW samples. A total of 3703 and 4810 circular-consensus sequencing (CCS) reads from two reference and three commercial YMW samples were mapped to the ITS2 and psbA-trnH regions, respectively. Moreover, comparison of intraspecific genetic distances based on SMRT sequencing data with reference data from Sanger sequencing revealed an ITS2 and psbA-trnH intergenic spacer that exhibited high intraspecific divergence, with the sites of variation showing significant differences within species. Using the CCS strategy for SMRT sequencing analysis was adequate to guarantee the accuracy of identification. This study demonstrates the application of SMRT sequencing to detect the biological ingredients of herbal preparations. SMRT sequencing provides an affordable way to monitor the legality and safety of traditional patent medicines. PMID:28620408

  4. Implementing targeted region capture sequencing for the clinical detection of Alagille syndrome: An efficient and cost‑effective method.

    Science.gov (United States)

    Huang, Tianhong; Yang, Guilin; Dang, Xiao; Ao, Feijian; Li, Jiankang; He, Yizhou; Tang, Qiyuan; He, Qing

    2017-11-01

    Alagille syndrome (AGS) is a highly variable, autosomal dominant disease that affects multiple structures including the liver, heart, eyes, bones and face. Targeted region capture sequencing focuses on a panel of known pathogenic genes and provides a rapid, cost‑effective and accurate method for molecular diagnosis. In a Chinese family, this method was used on the proband and Sanger sequencing was applied to validate the candidate mutation. A de novo heterozygous mutation (c.3254_3255insT p.Leu1085PhefsX24) of the jagged 1 gene was identified as the potential disease‑causing gene mutation. In conclusion, the present study suggested that target region capture sequencing is an efficient, reliable and accurate approach for the clinical diagnosis of AGS. Furthermore, these results expand on the understanding of the pathogenesis of AGS.

  5. An analysis of the sequence of the BAD gene among patients with maturity-onset diabetes of the young (MODY).

    Science.gov (United States)

    Antosik, Karolina; Gnyś, Piotr; Jarosz-Chobot, Przemysława; Myśliwiec, Małgorzata; Szadkowska, Agnieszka; Małecki, Maciej; Młynarski, Wojciech; Borowiec, Maciej

    2017-01-01

    Monogenic diabetes is a rare disease caused by single gene mutations. Maturity onset diabetes of the young (MODY) is one of the major forms of monogenic diabetes recognised in the paediatric population. To date, 13 genes have been related to MODY development. The aim of the study was to analyse the sequence of the BCL2-associated agonist of cell death (BAD) gene in patients with clinical suspicion of GCK-MODY, but who were negative for glucokinase (GCK) gene mutations. A group of 122 diabetic patients were recruited from the "Polish Registry for Paediatric and Adolescent Diabetes - nationwide genetic screening for monogenic diabetes" project. The molecular testing was performed by Sanger sequencing. A total of 10 sequence variants of the BAD gene were identified in 122 analysed diabetic patients. Among the analysed patients suspected of MODY, one possible pathogenic variant was identified in one patient; however, further confirmation is required for a certain identification.

  6. Case Report Identification of a novel SLC45A2 mutation in albinism by targeted next-generation sequencing.

    Science.gov (United States)

    Xue, J J; Xue, J F; Xue, H Q; Guo, Y Y; Liu, Y; Ouyang, N

    2016-09-19

    Albinism is a diverse group of hypopigmentary disorders caused by multiple-genetic defects. The genetic diagnosis of patients affected with albinism by Sanger sequencing is often complex, expensive, and time-consuming. In this study, we performed targeted next-generation sequencing to screen for 16 genes in a patient with albinism, and identified 21 genetic variants, including 19 known single nucleotide polymorphisms, one novel missense mutation (c.1456 G>A), and one disease-causing mutation (c.478 G>C). The novel mutation was not observed in 100 controls, and was predicted to be a damaging mutation by SIFT and Polyphen. Thus, we identified a novel mutation in SLC45A2 in a Chinese family, expanding the mutational spectrum of albinism. Our results also demonstrate that targeted next-generation sequencing is an effective genetic test for albinism.

  7. Image sequence analysis

    CERN Document Server

    1981-01-01

    The processing of image sequences has a broad spectrum of important applications including target tracking, robot navigation, bandwidth compression of TV conferencing video signals, studying the motion of biological cells using microcinematography, cloud tracking, and highway traffic monitoring. Image sequence processing involves a large amount of data. However, because of the progress in computer, LSI, and VLSI technologies, we have now reached a stage when many useful processing tasks can be done in a reasonable amount of time. As a result, research and development activities in image sequence analysis have recently been growing at a rapid pace. An IEEE Computer Society Workshop on Computer Analysis of Time-Varying Imagery was held in Philadelphia, April 5-6, 1979. A related special issue of the IEEE Transactions on Pattern Analysis and Machine Intelligence was published in November 1980. The IEEE Computer magazine has also published a special issue on the subject in 1981. The purpose of this book ...

  8. Generic Amplicon Deep Sequencing to Determine Ilarvirus Species Diversity in Australian Prunus.

    Science.gov (United States)

    Kinoti, Wycliff M; Constable, Fiona E; Nancarrow, Narelle; Plummer, Kim M; Rodoni, Brendan

    2017-01-01

    The distribution of Ilarvirus species populations amongst 61 Australian Prunus trees was determined by next generation sequencing (NGS) of amplicons generated using a genus-based generic RT-PCR targeting a conserved region of the Ilarvirus RNA2 component that encodes the RNA dependent RNA polymerase (RdRp) gene. Presence of Ilarvirus sequences in each positive sample was further validated by Sanger sequencing of cloned amplicons of regions of each of RNA1, RNA2 and/or RNA3 that were generated by species specific PCRs and by metagenomic NGS. Prunus necrotic ringspot virus (PNRSV) was the most frequently detected Ilarvirus, occurring in 48 of the 61 Ilarvirus-positive trees and Prune dwarf virus (PDV) and Apple mosaic virus (ApMV) were detected in three trees and one tree, respectively. American plum line pattern virus (APLPV) was detected in three trees and represents the first report of APLPV detection in Australia. Two novel and distinct groups of Ilarvirus-like RNA2 amplicon sequences were also identified in several trees by the generic amplicon NGS approach. The high read depth from the amplicon NGS of the generic PCR products allowed the detection of distinct RNA2 RdRp sequence variant populations of PNRSV, PDV, ApMV, APLPV and the two novel Ilarvirus-like sequences. Mixed infections of ilarviruses were also detected in seven Prunus trees. Sanger sequencing of specific RNA1, RNA2, and/or RNA3 genome segments of each virus and total nucleic acid metagenomics NGS confirmed the presence of PNRSV, PDV, ApMV and APLPV detected by RNA2 generic amplicon NGS. However, the two novel groups of Ilarvirus-like RNA2 amplicon sequences detected by the generic amplicon NGS could not be associated to the presence of sequence from RNA1 or RNA3 genome segments or full Ilarvirus genomes, and their origin is unclear. This work highlights the sensitivity of genus-specific amplicon NGS in detection of virus sequences and their distinct populations in multiple samples, and the

  9. Generic Amplicon Deep Sequencing to Determine Ilarvirus Species Diversity in Australian Prunus

    Directory of Open Access Journals (Sweden)

    Wycliff M. Kinoti

    2017-06-01

    Full Text Available The distribution of Ilarvirus species populations amongst 61 Australian Prunus trees was determined by next generation sequencing (NGS) of amplicons generated using a genus-based generic RT-PCR targeting a conserved region of the Ilarvirus RNA2 component that encodes the RNA dependent RNA polymerase (RdRp) gene. Presence of Ilarvirus sequences in each positive sample was further validated by Sanger sequencing of cloned amplicons of regions of each of RNA1, RNA2 and/or RNA3 that were generated by species specific PCRs and by metagenomic NGS. Prunus necrotic ringspot virus (PNRSV) was the most frequently detected Ilarvirus, occurring in 48 of the 61 Ilarvirus-positive trees and Prune dwarf virus (PDV) and Apple mosaic virus (ApMV) were detected in three trees and one tree, respectively. American plum line pattern virus (APLPV) was detected in three trees and represents the first report of APLPV detection in Australia. Two novel and distinct groups of Ilarvirus-like RNA2 amplicon sequences were also identified in several trees by the generic amplicon NGS approach. The high read depth from the amplicon NGS of the generic PCR products allowed the detection of distinct RNA2 RdRp sequence variant populations of PNRSV, PDV, ApMV, APLPV and the two novel Ilarvirus-like sequences. Mixed infections of ilarviruses were also detected in seven Prunus trees. Sanger sequencing of specific RNA1, RNA2, and/or RNA3 genome segments of each virus and total nucleic acid metagenomics NGS confirmed the presence of PNRSV, PDV, ApMV and APLPV detected by RNA2 generic amplicon NGS. However, the two novel groups of Ilarvirus-like RNA2 amplicon sequences detected by the generic amplicon NGS could not be associated to the presence of sequence from RNA1 or RNA3 genome segments or full Ilarvirus genomes, and their origin is unclear. This work highlights the sensitivity of genus-specific amplicon NGS in detection of virus sequences and their distinct populations in multiple samples

  10. A novel mutation in PRPF31, causative of autosomal dominant retinitis pigmentosa, using the BGISEQ-500 sequencer

    Directory of Open Access Journals (Sweden)

    Yu Zheng

    2018-01-01

    Full Text Available AIM: To study the genes responsible for retinitis pigmentosa. METHODS: A total of 15 Chinese families with retinitis pigmentosa, containing 94 sporadically afflicted cases, were recruited. The targeted sequences were captured using the Target_Eye_365_V3 chip and sequenced using the BGISEQ-500 sequencer, according to the manufacturer’s instructions. Data were aligned to UCSC Genome Browser build hg19, using the Burrows-Wheeler Aligner MEM algorithm. Local realignment was performed with the Genome Analysis Toolkit (GATK v.3.3.0) IndelRealigner, and variants were called with the Genome Analysis Toolkit HaplotypeCaller, without any use of imputation. Variants were filtered against a panel derived from 1000 Genomes Project, 1000G_ASN, ESP6500, ExAC and dbSNP138. In all members of Family ONE and Family TWO with available DNA samples, the genetic variant was validated using Sanger sequencing. RESULTS: A novel, pathogenic variant of retinitis pigmentosa, c.357_358delAA (p.Ser119SerfsX5), was identified in PRPF31 in 2 of 15 autosomal-dominant retinitis pigmentosa (ADRP) families, as well as in one sporadic case. Sanger sequencing was performed upon probands, as well as upon other family members. This novel, pathogenic genotype co-segregated with retinitis pigmentosa phenotype in these two families. CONCLUSION: ADRP is a subtype of retinitis pigmentosa, defined by its genotype, which accounts for 20%-40% of the retinitis pigmentosa patients. Our study thus expands the spectrum of PRPF31 mutations known to occur in ADRP, and provides further demonstration of the applicability of the BGISEQ-500 sequencer for genomics research.

  11. A novel mutation in PRPF31, causative of autosomal dominant retinitis pigmentosa, using the BGISEQ-500 sequencer

    Science.gov (United States)

    Zheng, Yu; Wang, Hai-Lin; Li, Jian-Kang; Xu, Li; Tellier, Laurent; Li, Xiao-Lin; Huang, Xiao-Yan; Li, Wei; Niu, Tong-Tong; Yang, Huan-Ming; Zhang, Jian-Guo; Liu, Dong-Ning

    2018-01-01

    AIM: To study the genes responsible for retinitis pigmentosa. METHODS: A total of 15 Chinese families with retinitis pigmentosa, containing 94 sporadically afflicted cases, were recruited. The targeted sequences were captured using the Target_Eye_365_V3 chip and sequenced using the BGISEQ-500 sequencer, according to the manufacturer's instructions. Data were aligned to UCSC Genome Browser build hg19, using the Burrows-Wheeler Aligner MEM algorithm. Local realignment was performed with the Genome Analysis Toolkit (GATK v.3.3.0) IndelRealigner, and variants were called with the Genome Analysis Toolkit HaplotypeCaller, without any use of imputation. Variants were filtered against a panel derived from 1000 Genomes Project, 1000G_ASN, ESP6500, ExAC and dbSNP138. In all members of Family ONE and Family TWO with available DNA samples, the genetic variant was validated using Sanger sequencing. RESULTS: A novel, pathogenic variant of retinitis pigmentosa, c.357_358delAA (p.Ser119SerfsX5), was identified in PRPF31 in 2 of 15 autosomal-dominant retinitis pigmentosa (ADRP) families, as well as in one sporadic case. Sanger sequencing was performed upon probands, as well as upon other family members. This novel, pathogenic genotype co-segregated with retinitis pigmentosa phenotype in these two families. CONCLUSION: ADRP is a subtype of retinitis pigmentosa, defined by its genotype, which accounts for 20%-40% of the retinitis pigmentosa patients. Our study thus expands the spectrum of PRPF31 mutations known to occur in ADRP, and provides further demonstration of the applicability of the BGISEQ-500 sequencer for genomics research. PMID:29375987

  12. Whole-exome sequencing identifies novel compound heterozygous mutations in USH2A in Spanish patients with autosomal recessive retinitis pigmentosa.

    Science.gov (United States)

    Méndez-Vidal, Cristina; González-Del Pozo, María; Vela-Boza, Alicia; Santoyo-López, Javier; López-Domingo, Francisco J; Vázquez-Marouschek, Carmen; Dopazo, Joaquin; Borrego, Salud; Antiñolo, Guillermo

    2013-01-01

    Retinitis pigmentosa (RP) is an inherited retinal dystrophy characterized by extreme genetic and clinical heterogeneity. Thus, the diagnosis is not always easily performed due to phenotypic and genetic overlap. Current clinical practices have focused on the systematic evaluation of a set of known genes for each phenotype, but this approach may fail in patients with inaccurate diagnosis or infrequent genetic cause. In the present study, we investigated the genetic cause of autosomal recessive RP (arRP) in a Spanish family in which the causal mutation has not yet been identified with primer extension technology and resequencing. We designed a whole-exome sequencing (WES)-based approach using NimbleGen SeqCap EZ Exome V3 sample preparation kit and the SOLiD 5500xl next-generation sequencing platform. We sequenced the exomes of both unaffected parents and two affected siblings. Exome analysis resulted in the identification of 43,204 variants in the index patient. All variants passing filter criteria were validated with Sanger sequencing to confirm familial segregation and absence in the control population. In silico prediction tools were used to determine mutational impact on protein function and the structure of the identified variants. Novel Usher syndrome type 2A (USH2A) compound heterozygous mutations, c.4325T>C (p.F1442S) and c.15188T>G (p.L5063R), located in exons 20 and 70, respectively, were identified as probable causative mutations for RP in this family. Family segregation of the variants showed the presence of both mutations in all affected members and in two siblings who were apparently asymptomatic at the time of family ascertainment. Clinical reassessment confirmed the diagnosis of RP in these patients. Using WES, we identified two heterozygous novel mutations in USH2A as the most likely disease-causing variants in a Spanish family diagnosed with arRP in which the cause of the disease had not yet been identified with commonly used techniques. Our data

  13. EZH2 and CD79B mutational status over time in B-cell non-Hodgkin lymphomas detected by high-throughput sequencing using minimal samples

    Science.gov (United States)

    Saieg, Mauro Ajaj; Geddie, William R; Boerner, Scott L; Bailey, Denis; Crump, Michael; da Cunha Santos, Gilda

    2013-01-01

    BACKGROUND: Numerous genomic abnormalities in B-cell non-Hodgkin lymphomas (NHLs) have been revealed by novel high-throughput technologies, including recurrent mutations in EZH2 (enhancer of zeste homolog 2) and CD79B (B cell antigen receptor complex-associated protein beta chain) genes. This study sought to determine the evolution of the mutational status of EZH2 and CD79B over time in different samples from the same patient in a cohort of B-cell NHLs, through use of a customized multiplex mutation assay. METHODS: DNA that was extracted from cytological material stored on FTA cards as well as from additional specimens, including archived frozen and formalin-fixed histological specimens, archived stained smears, and cytospin preparations, were submitted to a multiplex mutation assay specifically designed for the detection of point mutations involving EZH2 and CD79B, using MassARRAY spectrometry followed by Sanger sequencing. RESULTS: All 121 samples from 80 B-cell NHL cases were successfully analyzed. Mutations in EZH2 (Y646) and CD79B (Y196) were detected in 13.2% and 8% of the samples, respectively, almost exclusively in follicular lymphomas and diffuse large B-cell lymphomas. In one-third of the positive cases, a wild type was detected in a different sample from the same patient during follow-up. CONCLUSIONS: Testing multiple minimal tissue samples using a high-throughput multiplex platform exponentially increases tissue availability for molecular analysis and might facilitate future studies of tumor progression and the related molecular events. Mutational status of EZH2 and CD79B may vary in B-cell NHL samples over time and support the concept that individualized therapy should be based on molecular findings at the time of treatment, rather than on results obtained from previous specimens. Cancer (Cancer Cytopathol) 2013;121:377–386. © 2013 American Cancer Society. PMID:23361872

  14. Novel ZEB2-BCL11B Fusion Gene Identified by RNA-Sequencing in Acute Myeloid Leukemia with t(2;14)(q22;q32).

    Directory of Open Access Journals (Sweden)

    Synne Torkildsen

    Full Text Available RNA-sequencing of a case of acute myeloid leukemia with the bone marrow karyotype 46,XY,t(2;14)(q22;q32)[5]/47,XY,idem,+?4,del(6)(q13q21)[cp6]/46,XY[4] showed that the t(2;14) generated a ZEB2-BCL11B chimera in which exon 2 of ZEB2 (nucleotide 595 in the sequence with accession number NM_014795.3) was fused to exon 2 of BCL11B (nucleotide 554 in the sequence with accession number NM_022898.2). RT-PCR together with Sanger sequencing verified the presence of the above-mentioned fusion transcript. All functional domains of BCL11B are retained in the chimeric protein. Abnormal expression of BCL11B coding regions subjected to control by the ZEB2 promoter seems to be the leukemogenic mechanism behind the translocation.

  15. Detection of genomic variation by selection of a 9 mb DNA region and high throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Sergey I Nikolaev

    Full Text Available Detection of the rare polymorphisms and causative mutations of genetic diseases in a targeted genomic area has become a major goal in order to understand genomic and phenotypic variability. We have interrogated repeat-masked regions of 8.9 Mb on human chromosomes 21 (7.8 Mb) and 7 (1.1 Mb) from an individual from the International HapMap Project (NA12872). We have optimized a method of genomic selection for high throughput sequencing. Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. 83% of SNPs in the targeted region had at least 4-fold sequence coverage and 54% at least 15-fold. When assaying HapMap SNPs in NA12872, our sequence genotypes are 91.3% concordant in regions with coverage ≥ 4-fold, and 97.9% concordant in regions with coverage ≥ 15-fold. About 81% of the SNPs recovered with both thresholds are listed in dbSNP. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15-20 fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.

  16. Machine Learned Replacement of N-Labels for Basecalled Sequences in DNA Barcoding.

    Science.gov (United States)

    Ma, Eddie Y T; Ratnasingham, Sujeevan; Kremer, Stefan C

    2018-01-01

    This study presents a machine learning method that increases the number of identified bases in Sanger Sequencing. The system post-processes a KB basecalled chromatogram. It selects a recoverable subset of N-labels in the KB-called chromatogram to replace with basecalls (A,C,G,T). An N-label correction is defined given an additional read of the same sequence, and a human finished sequence. Corrections are added to the dataset when an alignment determines the additional read and human agree on the identity of the N-label. KB must also rate the replacement with quality value of in the additional read. Corrections are only available during system training. Developing the system, nearly 850,000 N-labels are obtained from Barcode of Life Datasystems, the premier database of genetic markers called DNA Barcodes. Increasing the number of correct bases improves reference sequence reliability, increases sequence identification accuracy, and assures analysis correctness. Keeping with barcoding standards, our system maintains an error rate of percent. Our system only applies corrections when it estimates low rate of error. Tested on this data, our automation selects and recovers: 79 percent of N-labels from COI (animal barcode); 80 percent from matK and rbcL (plant barcodes); and 58 percent from non-protein-coding sequences (across eukaryotes).

  17. Non-coding RNA sequence variations in human chronic lymphocytic leukemia and colorectal cancer.

    Science.gov (United States)

    Wojcik, Sylwia E; Rossi, Simona; Shimizu, Masayoshi; Nicoloso, Milena S; Cimmino, Amelia; Alder, Hansjuerg; Herlea, Vlad; Rassenti, Laura Z; Rai, Kanti R; Kipps, Thomas J; Keating, Michael J; Croce, Carlo M; Calin, George A

    2010-02-01

    Cancer is a genetic disease in which the interplay between alterations in protein-coding genes and non-coding RNAs (ncRNAs) plays a fundamental role. In recent years, the full coding component of the human genome was sequenced in various cancers, whereas such attempts related to ncRNAs are still fragmentary. We screened genomic DNAs for sequence variations in 148 microRNAs (miRNAs) and ultraconserved regions (UCRs) loci in patients with chronic lymphocytic leukemia (CLL) or colorectal cancer (CRC) by Sanger technique and further tried to elucidate the functional consequences of some of these variations. We found sequence variations in miRNAs in both sporadic and familial CLL cases, mutations of UCRs in CLLs and CRCs and, in certain instances, detected functional effects of these variations. Furthermore, by integrating our data with previously published data on miRNA sequence variations, we have created a catalog of DNA sequence variations in miRNAs/ultraconserved genes in human cancers. These findings argue that ncRNAs are targeted by both germ line and somatic mutations as well as by single-nucleotide polymorphisms with functional significance for human tumorigenesis. Sequence variations in ncRNA loci are frequent and some have functional and biological significance. Such information can be exploited to further investigate on a genome-wide scale the frequency of genetic variations in ncRNAs and their functional meaning, as well as for the development of new diagnostic and prognostic markers for leukemias and carcinomas.

  18. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae).

    Directory of Open Access Journals (Sweden)

    Isabel A S Bonatelli

    Full Text Available Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms.

  19. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae).

    Science.gov (United States)

    Bonatelli, Isabel A S; Carstens, Bryan C; Moraes, Evandro M

    2015-01-01

    Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms.
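
    As a rough illustration of the microsatellite identification step mentioned above, the sketch below scans a DNA string for di- and trinucleotide repeats using a regular expression with a backreference. It is not the authors' pipeline: the function name `find_ssrs`, the threshold of five repeat units, and the toy sequence are all assumptions made for this example.

```python
import re

def find_ssrs(seq, motif_lengths=(2, 3), min_repeats=5):
    """Return (0-based start, motif, repeat count) for simple tandem repeats.

    Hypothetical helper: the motif lengths and the minimum repeat count are
    illustrative defaults, not values taken from the study.
    """
    seq = seq.upper()
    hits = []
    for k in motif_lengths:
        # A k-base motif followed by at least (min_repeats - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as TTTTTT
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Toy sequence (made up): an (AT)6 repeat, an (AAG)5 repeat, and a poly-T run.
toy = "GGC" + "AT" * 6 + "CCGT" + "AAG" * 5 + "T" * 12
for start, motif, count in find_ssrs(toy):
    print(start, motif, count)
```

    On the toy sequence this reports the (AT)6 and (AAG)5 repeats and ignores the poly-T run. A real workflow would also need flanking sequence for primer design, which this sketch does not attempt.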

  20. Exome sequencing identifies SUCO mutations in mesial temporal lobe epilepsy.

    Science.gov (United States)

    Sha, Zhiqiang; Sha, Longze; Li, Wenting; Dou, Wanchen; Shen, Yan; Wu, Liwen; Xu, Qi

    2015-03-30

    Mesial temporal lobe epilepsy (mTLE) is the main type and most common medically intractable form of epilepsy. Severity of disease-based stratified samples may help identify new disease-associated mutant genes. We analyzed mRNA expression profiles from patient hippocampal tissue. Three of the seven patients had severe mTLE with generalized-onset convulsions and consciousness loss that occurred over many years. We found that compared with other groups, patients with severe mTLE were classified into a distinct group. Whole-exome sequencing and Sanger sequencing validation in all seven patients identified three novel SUN domain-containing ossification factor (SUCO) mutations in severely affected patients. Furthermore, SUCO knock down significantly reduced dendritic length in vitro. Our results indicate that mTLE defects may affect neuronal development, and suggest that neurons have abnormal development due to lack of SUCO, which may be a generalized-onset epilepsy-related gene. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  1. Genetic sequences derived from suppression subtractive ...

    African Journals Online (AJOL)

    2008-06-17

    Jun 17, 2008 ... their possible roles in Xanthomonas albilineans ... Technology, P. O. Box 1334, Durban 4000, Republic of South Africa. Accepted 4 ... Clones selected were sequenced (using a Perkin Elmer ABI PRISM Dye terminator cycle.

  2. Shotgun protein sequencing.

    Energy Technology Data Exchange (ETDEWEB)

    Faulon, Jean-Loup Michel; Heffelfinger, Grant S.

    2009-06-01

    A novel experimental and computational technique based on multiple enzymatic digestion of a protein or protein mixture that reconstructs protein sequences from sequences of overlapping peptides is described in this SAND report. This approach, analogous to shotgun sequencing of DNA, is to be used to sequence alternative spliced proteins, to identify post-translational modifications, and to sequence genetically engineered proteins.
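
    The idea of reconstructing a sequence from overlapping peptides can be sketched with a simple greedy overlap merge. This is only an illustration of the general principle, not the technique described in the report: the function names, the minimum overlap of three residues, and the peptide strings (two imagined digests of one made-up protein) are all assumptions.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that is a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(peptides, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap.

    Hypothetical helper for illustration; greedy merging can fail on
    repetitive sequences and is not claimed to be the report's algorithm.
    """
    # Drop peptides fully contained in another peptide, then fix the order.
    frags = sorted(p for p in set(peptides)
                   if not any(p != q and p in q for q in peptides))
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave the fragments separate
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags

# Two made-up digests of the same made-up protein, cut at different sites.
digest_a = ["MKTAYIAK", "QRQISFVK", "SHFSRQ"]
digest_b = ["MKTAY", "IAKQRQIS", "FVKSHFSRQ"]
print(greedy_assemble(digest_a + digest_b))
```

    Because the two digests cut at different positions, peptides from one digest bridge the cut sites of the other, which is what allows the full sequence to be recovered.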

  3. Clinical applications of sequencing take center stage

    OpenAIRE

    Glusman, Gustavo

    2013-01-01

    A report on the Advances in Genome Biology and Technology (AGBT) meeting, Marco Island, Florida, USA, February 20-23, 2013. This year's Advances in Genome Biology and Technology (AGBT) meeting reflected the current state of 'next generation' sequencing (NGS) technologies: significantly reduced competition and innovation, and a strong focus on standardization and application. Announcements of technological breakthroughs - a hallmark of previous AGBT meetings - were markedly absent, but existin...

  4. Massively parallel sequencing of forensic STRs

    DEFF Research Database (Denmark)

    Parson, Walther; Ballard, David; Budowle, Bruce

    2016-01-01

    The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that...

  5. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    Directory of Open Access Journals (Sweden)

    Marta Brozynska

    Full Text Available Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  6. Prevalence of Hepatitis C Virus Subgenotypes 1a and 1b in Japanese Patients: Ultra-Deep Sequencing Analysis of HCV NS5B Genotype-Specific Region

    Science.gov (United States)

    Wu, Shuang; Kanda, Tatsuo; Nakamoto, Shingo; Jiang, Xia; Miyamura, Tatsuo; Nakatani, Sueli M.; Ono, Suzane Kioko; Takahashi-Nakaguchi, Azusa; Gonoi, Tohru; Yokosuka, Osamu

    2013-01-01

    Background: Hepatitis C virus (HCV) subgenotypes 1a and 1b have different impacts on the treatment response to peginterferon plus ribavirin with direct-acting antivirals (DAAs) against patients infected with HCV genotype 1, as the emergence rates of resistance mutations are different between these two subgenotypes. In Japan, almost all of HCV genotype 1 belongs to subgenotype 1b. Methods and Findings: To determine HCV subgenotype 1a or 1b in Japanese patients infected with HCV genotype 1, real-time PCR-based method and Sanger method were used for the HCV NS5B region. HCV subgenotypes were determined in 90% by real-time PCR-based method. We also analyzed the specific probe regions for HCV subgenotypes 1a and 1b using ultra-deep sequencing, and uncovered mutations that could not be revealed using direct-sequencing by Sanger method. We estimated the prevalence of HCV subgenotype 1a as 1.2-2.5% of HCV genotype 1 patients in Japan. Conclusions: Although real-time PCR-based HCV subgenotyping method seems fair for differentiating HCV subgenotypes 1a and 1b, it may not be sufficient for clinical practice. Ultra-deep sequencing is useful for revealing the resistant strain(s) of HCV before DAA treatment as well as mixed infection with different genotypes or subgenotypes of HCV. PMID:24069214

  7. Adaptive Basis Selection for Exponential Family Smoothing Splines with Application in Joint Modeling of Multiple Sequencing Samples

    OpenAIRE

    Ma, Ping; Zhang, Nan; Huang, Jianhua Z.; Zhong, Wenxuan

    2017-01-01

    Second-generation sequencing technologies have replaced array-based technologies and become the default method for genomics and epigenomics analysis. Second-generation sequencing technologies sequence tens of millions of DNA/cDNA fragments in parallel. After the resulting sequences (short reads) are mapped to the genome, one gets a sequence of short read counts along the genome. Effective extraction of signals in these short read counts is the key to the success of sequencing technologies. No...

  8. Multimodal sequence learning.

    Science.gov (United States)

    Kemény, Ferenc; Meier, Beat

    2016-02-01

    While sequence learning research models complex phenomena, previous studies have mostly focused on unimodal sequences. The goal of the current experiment is to put implicit sequence learning into a multimodal context: to test whether it can operate across different modalities. We used the Task Sequence Learning paradigm to test whether sequence learning varies across modalities, and whether participants are able to learn multimodal sequences. Our results show that implicit sequence learning is very similar regardless of the source modality. However, the presence of correlated task and response sequences was required for learning to take place. The experiment provides new evidence for implicit sequence learning of abstract conceptual representations. In general, the results suggest that correlated sequences are necessary for implicit sequence learning to occur. Moreover, they show that elements from different modalities can be automatically integrated into one unitary multimodal sequence. Copyright © 2015 Elsevier B.V. All rights reserved.

  9. Sequence Read Archive (SRA)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Sequence Read Archive (SRA) stores raw sequencing data from the next generation of sequencing platforms including Roche 454 GS System®, Illumina Genome...

  10. Targeted next generation sequencing for molecular diagnosis of Usher syndrome.

    Science.gov (United States)

    Aparisi, María J; Aller, Elena; Fuster-García, Carla; García-García, Gema; Rodrigo, Regina; Vázquez-Manrique, Rafael P; Blanco-Kelly, Fiona; Ayuso, Carmen; Roux, Anne-Françoise; Jaijo, Teresa; Millán, José M

    2014-11-18

    Usher syndrome is an autosomal recessive disease that associates sensorineural hearing loss, retinitis pigmentosa and, in some cases, vestibular dysfunction. It is clinically and genetically heterogeneous. To date, 10 genes have been associated with the disease, making its molecular diagnosis based on Sanger sequencing, expensive and time-consuming. Consequently, the aim of the present study was to develop a molecular diagnostics method for Usher syndrome, based on targeted next generation sequencing. A custom HaloPlex panel for Illumina platforms was designed to capture all exons of the 10 known causative Usher syndrome genes (MYO7A, USH1C, CDH23, PCDH15, USH1G, CIB2, USH2A, GPR98, DFNB31 and CLRN1), the two Usher syndrome-related genes (HARS and PDZD7) and the two candidate genes VEZT and MYO15A. A cohort of 44 patients suffering from Usher syndrome was selected for this study. This cohort was divided into two groups: a test group of 11 patients with known mutations and another group of 33 patients with unknown mutations. Forty USH patients were successfully sequenced, 8 USH patients from the test group and 32 patients from the group composed of USH patients without genetic diagnosis. We were able to detect biallelic mutations in one USH gene in 22 out of 32 USH patients (68.75%) and to identify 79.7% of the expected mutated alleles. Fifty-three different mutations were detected. These mutations included 21 missense, 8 nonsense, 9 frameshifts, 9 intronic mutations and 6 large rearrangements. Targeted next generation sequencing allowed us to detect both point mutations and large rearrangements in a single experiment, minimizing the economic cost of the study, increasing the detection ratio of the genetic cause of the disease and improving the genetic diagnosis of Usher syndrome patients.
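
    The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only re-derives numbers stated in the text; the one inference, that the "expected mutated alleles" are two per patient across the 32 patients without prior genetic diagnosis, is an assumption and is flagged as such in the code.

```python
# Counts quoted in the abstract.
sequenced = {"test group": 8, "no prior genetic diagnosis": 32}
biallelic_solved = 22
mutation_classes = {
    "missense": 21,
    "nonsense": 8,
    "frameshift": 9,
    "intronic": 9,
    "large rearrangement": 6,
}

total_sequenced = sum(sequenced.values())            # 40 patients
undiagnosed = sequenced["no prior genetic diagnosis"]
detection_rate = 100 * biallelic_solved / undiagnosed  # 68.75 %
total_mutations = sum(mutation_classes.values())     # 53 different mutations

# Assumption (not stated in the abstract): two expected mutated alleles
# per patient in the undiagnosed group, i.e. 64 alleles in total.
expected_alleles = 2 * undiagnosed
alleles_found = round(0.797 * expected_alleles)      # implied count: 51

print(total_sequenced, detection_rate, total_mutations, alleles_found)
```

    Under that assumption, 51 of 64 alleles is 79.7% to one decimal place, which is consistent with the reported value.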

  11. A second generation framework for the analysis of microsatellites in expressed sequence tags and the development of EST-SSR markers for a conifer, Cryptomeria japonica

    Directory of Open Access Journals (Sweden)

    Ueno Saneyoshi

    2012-04-01

    Full Text Available Abstract Background Microsatellites or simple sequence repeats (SSRs) in expressed sequence tags (ESTs) are useful resources for genome analysis because of their abundance, functionality and polymorphism. The advent of commercial second generation sequencing machines has led to new strategies for developing EST-SSR markers, necessitating the development of a bioinformatic framework that can keep pace with the increasing quality and quantity of sequence data produced. We describe an open scheme for analyzing ESTs and developing EST-SSR markers from reads collected by Sanger sequencing and pyrosequencing of sugi (Cryptomeria japonica). Results We collected 141,097 sequence reads by Sanger sequencing and 1,333,444 by pyrosequencing. After trimming contaminant and low quality sequences, 118,319 Sanger and 1,201,150 pyrosequencing reads were passed to the MIRA assembler, generating 81,284 contigs that were analysed for SSRs. 4,059 SSRs were found in 3,694 (4.54%) contigs, giving an SSR frequency lower than that in seven other plant species with gene indices (5.4–21.9%). The average GC content of the SSR-containing contigs was 41.55%, compared to 40.23% for all contigs. Tri-SSRs were the most common SSRs; the most common motif was AT, which was found in 655 (46.3%) di-SSRs, followed by the AAG motif, found in 342 (25.9%) tri-SSRs. Most (72.8%) tri-SSRs were in coding regions, but 55.6% of the di-SSRs were in non-coding regions; the AT motif was most abundant in 3′ untranslated regions. Gene ontology (GO) annotations showed that six GO terms were significantly overrepresented within SSR-containing contigs. Forty-four EST-SSR markers were developed from 192 primer pairs using two pipelines: read2Marker and the newly-developed CMiB, which combines several open tools. Markers resulting from both pipelines showed no differences in PCR success rate and polymorphisms, but PCR success and polymorphism were significantly affected by the expected PCR product size

  12. A second generation framework for the analysis of microsatellites in expressed sequence tags and the development of EST-SSR markers for a conifer, Cryptomeria japonica

    Science.gov (United States)

    2012-01-01

    Background Microsatellites or simple sequence repeats (SSRs) in expressed sequence tags (ESTs) are useful resources for genome analysis because of their abundance, functionality and polymorphism. The advent of commercial second generation sequencing machines has led to new strategies for developing EST-SSR markers, necessitating the development of a bioinformatic framework that can keep pace with the increasing quality and quantity of sequence data produced. We describe an open scheme for analyzing ESTs and developing EST-SSR markers from reads collected by Sanger sequencing and pyrosequencing of sugi (Cryptomeria japonica). Results We collected 141,097 sequence reads by Sanger sequencing and 1,333,444 by pyrosequencing. After trimming contaminant and low quality sequences, 118,319 Sanger and 1,201,150 pyrosequencing reads were passed to the MIRA assembler, generating 81,284 contigs that were analysed for SSRs. 4,059 SSRs were found in 3,694 (4.54%) contigs, giving an SSR frequency lower than that in seven other plant species with gene indices (5.4–21.9%). The average GC content of the SSR-containing contigs was 41.55%, compared to 40.23% for all contigs. Tri-SSRs were the most common SSRs; the most common motif was AT, which was found in 655 (46.3%) di-SSRs, followed by the AAG motif, found in 342 (25.9%) tri-SSRs. Most (72.8%) tri-SSRs were in coding regions, but 55.6% of the di-SSRs were in non-coding regions; the AT motif was most abundant in 3′ untranslated regions. Gene ontology (GO) annotations showed that six GO terms were significantly overrepresented within SSR-containing contigs. Forty-four EST-SSR markers were developed from 192 primer pairs using two pipelines: read2Marker and the newly-developed CMiB, which combines several open tools. Markers resulting from both pipelines showed no differences in PCR success rate and polymorphisms, but PCR success and polymorphism were significantly affected by the expected PCR product size and number of SSR
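
    As a rough illustration of what scanning contigs for di- and tri-SSRs and computing GC content involves, the following minimal Python sketch may help. It is not the read2Marker or CMiB pipeline from the record above; the function names and the repeat thresholds (6 repeats for di-SSRs, 5 for tri-SSRs) are assumptions chosen only for the example.

```python
import re

# Hypothetical minimum repeat counts per motif length; real pipelines use their own settings.
DEFAULT_MIN_REPEATS = {2: 6, 3: 5}

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect di-/tri-nucleotide repeats."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # Group 2 is the motif; it must be followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((m.start(), motif, len(m.group(1)) // unit_len))
    return sorted(hits)

def gc_content(seq):
    """Percentage of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

contig = "GG" + "AT" * 6 + "CC" + "AAG" * 5 + "T"
ssrs = find_ssrs(contig)
```

    A regular expression with a back-reference keeps the sketch short; it only detects perfect repeats, so interrupted or compound SSRs would need a dedicated tool.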

  13. Navigating the tip of the genomic iceberg: Next-generation sequencing for plant systematics.

    Science.gov (United States)

    Straub, Shannon C K; Parks, Matthew; Weitemier, Kevin; Fishbein, Mark; Cronn, Richard C; Liston, Aaron

    2012-02-01

    Just as Sanger sequencing did more than 20 years ago, next-generation sequencing (NGS) is poised to revolutionize plant systematics. By combining multiplexing approaches with NGS throughput, systematists may no longer need to choose between more taxa or more characters. Here we describe a genome skimming (shallow sequencing) approach for plant systematics. Through simulations, we evaluated optimal sequencing depth and performance of single-end and paired-end short read sequences for assembly of nuclear ribosomal DNA (rDNA) and plastomes and addressed the effect of divergence on reference-guided plastome assembly. We also used simulations to identify potential phylogenetic markers from low-copy nuclear loci at different sequencing depths. We demonstrated the utility of genome skimming through phylogenetic analysis of the Sonoran Desert clade (SDC) of Asclepias (Apocynaceae). Paired-end reads performed better than single-end reads. Minimum sequencing depths for high quality rDNA and plastome assemblies were 40× and 30×, respectively. Divergence from the reference significantly affected plastome assembly, but relatively similar references are available for most seed plants. Deeper rDNA sequencing is necessary to characterize intragenomic polymorphism. The low-copy fraction of the nuclear genome was readily surveyed, even at low sequencing depths. Nearly 160000 bp of sequence from three organelles provided evidence of phylogenetic incongruence in the SDC. Adoption of NGS will facilitate progress in plant systematics, as whole plastome and rDNA cistrons, partial mitochondrial genomes, and low-copy nuclear markers can now be efficiently obtained for molecular phylogenetics studies.

  14. Next Generation DNA Sequencing and the Future of Genomic Medicine

    OpenAIRE

    Anderson, Matthew W.; Schrijver, Iris

    2010-01-01

    In the years since the first complete human genome sequence was reported, there has been a rapid development of technologies to facilitate high-throughput sequence analysis of DNA (termed “next-generation” sequencing). These novel approaches to DNA sequencing offer the promise of complete genomic analysis at a cost feasible for routine clinical diagnostics. However, the ability to more thoroughly interrogate genomic sequence raises a number of important issues with regard to result interpreta...

  15. Identification of two novel pathogenic compound heterozygous MYO7A mutations in Usher syndrome by whole exome sequencing.

    Science.gov (United States)

    Jia, Ying; Li, Xiaoge; Yang, Dong; Xu, Yi; Guo, Ying; Li, Xin

    2018-01-01

    The current study aims to identify the pathogenic sites in a core pedigree of Usher syndrome (USH). A core pedigree of USH was analyzed by whole exome sequencing (WES). Mutations were verified by polymerase chain reaction (PCR) amplification and Sanger sequencing. Two pathogenic variations (c.849+2T>C and c.5994G>A) in MYO7A were successfully identified and individually separated from parents. One variant (c.849+2T>C) was a nonsense mutation, causing premature termination of the protein, and the other (c.5994G>A), located near an exon boundary, could cause aberrant splicing. This study provides a meaningful exploration for identification of clinical core genetic pedigrees. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. A rapid screening with direct sequencing from blood samples for the diagnosis of Leigh syndrome

    Directory of Open Access Journals (Sweden)

    Hiroko Shimbo

    2014-01-01

    Full Text Available Large numbers of genes are responsible for Leigh syndrome (LS), making genetic confirmation of LS difficult. We screened our patients with LS using a limited set of 21 primers encompassing the frequently reported gene for the respiratory chain complexes I (ND1–ND6, and ND4L), IV (SURF1), and V (ATP6) and the pyruvate dehydrogenase E1α-subunit. Of 18 LS patients, we identified mutations in 11 patients, including 7 in mDNA (two with ATP6), 4 in nuclear (three with SURF1). Overall, we identified mutations in 61% of LS patients (11/18 individuals) in this cohort. Sanger sequencing with our limited set of primers allowed us a rapid genetic confirmation of more than half of the LS patients and it appears to be efficient as a primary genetic screening in this cohort.

  17. A massive parallel sequencing workflow for diagnostic genetic testing of mismatch repair genes

    Science.gov (United States)

    Hansen, Maren F; Neckmann, Ulrike; Lavik, Liss A S; Vold, Trine; Gilde, Bodil; Toft, Ragnhild K; Sjursen, Wenche

    2014-01-01

    The purpose of this study was to develop a massive parallel sequencing (MPS) workflow for diagnostic analysis of mismatch repair (MMR) genes using the GS Junior system (Roche). A pathogenic variant in one of four MMR genes (MLH1, PMS2, MSH6, and MSH2) is the cause of Lynch Syndrome (LS), which mainly predisposes to colorectal cancer. We used an amplicon-based sequencing method allowing specific and preferential amplification of the MMR genes including PMS2, of which several pseudogenes exist. The amplicons were pooled at different ratios to obtain coverage uniformity and maximize the throughput of a single GS Junior run. In total, 60 previously identified and distinct variants (substitutions and indels) were sequenced by MPS and successfully detected. The heterozygote detection range was from 19% to 63% and dependent on sequence context and coverage. We were able to distinguish between false-positive and true-positive calls in homopolymeric regions by cross-sample comparison and evaluation of flow signal distributions. In addition, we filtered variants according to a predefined status, which facilitated variant annotation. Our study shows that implementation of MPS in routine diagnostics of LS can accelerate sample throughput and reduce costs without compromising sensitivity, compared to Sanger sequencing. PMID:24689082

  18. Hardware Accelerated Sequence Alignment with Traceback

    Directory of Open Access Journals (Sweden)

    Scott Lloyd

    2009-01-01

    in a timely manner. Known methods to accelerate alignment on reconfigurable hardware only address sequence comparison, limit the sequence length, or exhibit memory and I/O bottlenecks. A space-efficient, global sequence alignment algorithm and architecture is presented that accelerates the forward scan and traceback in hardware without memory and I/O limitations. With 256 processing elements in FPGA technology, a performance gain over 300 times that of a desktop computer is demonstrated on sequence lengths of 16000. For greater performance, the architecture is scalable to more processing elements.
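
    For readers unfamiliar with the two phases named in this record (forward scan and traceback), here is a minimal software sketch of global alignment in Python. It uses the textbook full-matrix Needleman-Wunsch recurrence with linear gap costs, not the space-efficient FPGA architecture described above; the scoring values and the function name are assumptions for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment with a full score matrix; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Forward scan: fill the (n+1) x (m+1) score matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Traceback: walk from the bottom-right cell back to the origin.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))

result = global_align("ACGT", "AGT")
```

    The full matrix needs memory proportional to the product of the two sequence lengths, which is the kind of limitation the record says its architecture avoids.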

  19. Comparison of two Next Generation sequencing platforms for full genome sequencing of Classical Swine Fever Virus

    DEFF Research Database (Denmark)

    Fahnøe, Ulrik; Pedersen, Anders Gorm; Höper, Dirk

    2013-01-01

    to the consensus sequence. Additionally, we got an average sequence depth for the genome of 4000 for the Iontorrent PGM and 400 for the FLX platform making the mapping suitable for single nucleotide variant (SNV) detection. The analysis revealed a single non-silent SNV A10665G leading to the amino acid change D......Next Generation Sequencing (NGS) is becoming more adopted into viral research and will be the preferred technology in the years to come. We have recently sequenced several strains of Classical Swine Fever Virus (CSFV) by NGS on both Genome Sequencer FLX (GS FLX) and Iontorrent PGM platforms...

  20. Identification of a novel LMF1 nonsense mutation responsible for severe hypertriglyceridemia by targeted next-generation sequencing.

    Science.gov (United States)

    Cefalù, Angelo B; Spina, Rossella; Noto, Davide; Ingrassia, Valeria; Valenti, Vincenza; Giammanco, Antonina; Fayer, Francesca; Misiano, Gabriella; Cocorullo, Gianfranco; Scrimali, Chiara; Palesano, Ornella; Altieri, Grazia I; Ganci, Antonina; Barbagallo, Carlo M; Averna, Maurizio R

    Severe hypertriglyceridemia (HTG) may result from mutations in genes affecting the intravascular lipolysis of triglyceride (TG)-rich lipoproteins. The aim of this study was to develop a targeted next-generation sequencing panel for the molecular diagnosis of disorders characterized by severe HTG. We developed a targeted customized panel for next-generation sequencing on the Ion Torrent Personal Genome Machine to capture the coding exons and intron/exon boundaries of 18 genes affecting the main pathways of TG synthesis and metabolism. We sequenced 11 samples of patients with severe HTG (TG>885 mg/dL-10 mmol/L): 4 positive controls in whom pathogenic mutations had previously been identified by Sanger sequencing and 7 patients in whom the molecular defect was still unknown. The customized panel was accurate, and it confirmed the genetic variants previously identified in all positive controls with primary severe HTG. Only 1 of the 7 patients with HTG was found to carry a homozygous pathogenic mutation, the third novel mutation of the LMF1 gene (c.1380C>G-p.Y460X). The clinical and molecular familial cascade screening allowed the identification of 2 additional affected siblings and 7 heterozygous carriers of the mutation. We showed that our targeted resequencing approach for genetic diagnosis of severe HTG appears to be accurate, less time consuming, and more economical compared with traditional Sanger resequencing. The identification of pathogenic mutations in candidate genes remains challenging and clinical resequencing should be mainly intended for patients with strong clinical criteria for monogenic severe HTG. Copyright © 2017 National Lipid Association. Published by Elsevier Inc. All rights reserved.

  1. Identification of Novel Variants in LTBP2 and PXDN Using Whole-Exome Sequencing in Developmental and Congenital Glaucoma.

    Directory of Open Access Journals (Sweden)

    Shazia Micheal

    Full Text Available Primary congenital glaucoma (PCG) is the most common form of glaucoma in children. PCG occurs due to developmental defects in the trabecular meshwork and anterior chamber of the eye. The purpose of this study is to identify the causative genetic variants in three families with developmental and primary congenital glaucoma (PCG) with a recessive inheritance pattern. DNA samples were obtained from consanguineous families of Pakistani ancestry. The CYP1B1 gene was sequenced in the affected probands by conventional Sanger DNA sequencing. Whole exome sequencing (WES) was performed in DNA samples of four individuals belonging to three different CYP1B1-negative families. Variants identified by WES were validated by Sanger sequencing. WES identified potentially causative novel mutations in the latent transforming growth factor beta binding protein 2 (LTBP2) gene in two PCG families. In the first family a novel missense mutation (c.4934G>A; p.Arg1645Glu) co-segregates with the disease phenotype, and in the second family a novel frameshift mutation (c.4031_4032insA; p.Asp1345Glyfs*6) was identified. In a third family with developmental glaucoma a novel mutation (c.3496G>A; p.Gly1166Arg) was identified in the PXDN gene, which segregates with the disease. We identified three novel mutations in glaucoma families using WES; two in the LTBP2 gene and one in the PXDN gene. The results will not only enhance our current understanding of the genetic basis of glaucoma, but may also contribute to a better understanding of the diverse phenotypic consequences caused by mutations in these genes.

  2. Genome-wide linkage, exome sequencing and functional analyses identify ABCB6 as the pathogenic gene of dyschromatosis universalis hereditaria.

    Directory of Open Access Journals (Sweden)

    Hong Liu

    Full Text Available As a genetic disorder of abnormal pigmentation, the molecular basis of dyschromatosis universalis hereditaria (DUH) had remained unclear until recently when ABCB6 was reported as a causative gene of DUH. We performed genome-wide linkage scan using Illumina Human 660W-Quad BeadChip and exome sequencing analyses using Agilent SureSelect Human All Exon Kits in a multiplex Chinese DUH family to identify the pathogenic mutations and verified the candidate mutations using Sanger sequencing. Quantitative RT-PCR and immunohistochemistry were performed to verify the expression of the pathogenic gene. Zebrafish was also used to confirm the functional role of ABCB6 in melanocytes and pigmentation. Genome-wide linkage (assuming autosomal dominant inheritance mode) and exome sequencing analyses identified ABCB6 as the disease candidate gene by discovering a coding mutation (c.1358C>T; p.Ala453Val) that co-segregates with the disease phenotype. Further mutation analysis of ABCB6 in four other DUH families and two sporadic cases by Sanger sequencing confirmed the mutation (c.1358C>T; p.Ala453Val) and discovered a second, co-segregating coding mutation (c.964A>C; p.Ser322Lys) in one of the four families. Both mutations were heterozygous in DUH patients and not present in the 1000 Genome Project and dbSNP database as well as 1,516 unrelated Chinese healthy controls. Expression analysis in human skin and mutagenesis interrogation in zebrafish confirmed the functional role of ABCB6 in melanocytes and pigmentation. Given the involvement of ABCB6 mutations in coloboma, we performed ophthalmological examination of the DUH carriers of ABCB6 mutations and found ocular abnormalities in them. Our study has advanced our understanding of DUH pathogenesis and revealed the shared pathological mechanism between pigmentary DUH and ocular coloboma.

  3. MetaGaAP: A Novel Pipeline to Estimate Community Composition and Abundance from Non-Model Sequence Data

    Directory of Open Access Journals (Sweden)

    Christopher Noune

    2017-02-01

    Full Text Available Next generation sequencing and bioinformatic approaches are increasingly used to quantify microorganisms within populations by analysis of ‘meta-barcode’ data. This approach relies on comparison of amplicon sequences of ‘barcode’ regions from a population with public-domain databases of reference sequences. However, for many organisms relevant ‘barcode’ regions may not have been identified and large databases of reference sequences may not be available. A workflow and software pipeline, ‘MetaGaAP,’ was developed to identify and quantify genotypes through four steps: shotgun sequencing and identification of polymorphisms in a metapopulation to identify custom ‘barcode’ regions of less than 30 polymorphisms within the span of a single ‘read’, amplification and sequencing of the ‘barcode’, generation of a custom database of polymorphisms, and quantitation of the relative abundance of genotypes. The pipeline and workflow were validated in a ‘wild type’ Alphabaculovirus isolate, Helicoverpa armigera single nucleopolyhedrovirus (HaSNPV-AC53), and a tissue-culture derived strain (HaSNPV-AC53-T2). The approach was validated by comparison of polymorphisms in amplicons and shotgun data, and by comparison of predicted dominant and co-dominant genotypes with Sanger sequences. The computational power required to generate and search the database effectively limits the number of polymorphisms that can be included in a barcode to 30 or less. The approach can be used in quantitative analysis of the ecology and pathology of non-model organisms.

  4. The Pinus taeda genome is characterized by diverse and highly diverged repetitive sequences

    Directory of Open Access Journals (Sweden)

    Yandell Mark

    2010-07-01

    Full Text Available Abstract Background In today's age of genomic discovery, no attempt has been made to comprehensively sequence a gymnosperm genome. The largest genus in the coniferous family Pinaceae is Pinus, whose 110-120 species have extremely large genomes (c. 20-40 Gb, 2N = 24). The size and complexity of these genomes have prompted much speculation as to the feasibility of completing a conifer genome sequence. Conifer genomes are reputed to be highly repetitive, but there is little information available on the nature and identity of repetitive units in gymnosperms. The pines have extensive genetic resources, with approximately 329000 ESTs from eleven species and genetic maps in eight species, including a dense genetic map of the twelve linkage groups in Pinus taeda. Results We present here the Sanger sequence and annotation of ten P. taeda BAC clones and Genome Analyzer II whole genome shotgun (WGS) sequences representing 7.5% of the genome. Computational annotation of ten BACs predicts three putative protein-coding genes and at least fifteen likely pseudogenes in nearly one megabase of sequence. We found three conifer-specific LTR retroelements in the BACs, and tentatively identified at least 15 others based on evidence from the distantly related angiosperms. Alignment of WGS sequences to the BACs indicates that 80% of BAC sequences have similar copies (≥ 75% nucleotide identity) elsewhere in the genome, but only 23% have identical copies (99% identity). The three most common repetitive elements in the genome were identified and, when combined, represent less than 5% of the genome. Conclusions This study indicates that the majority of repeats in the P. taeda genome are 'novel' and will therefore require additional BAC or genomic sequencing for accurate characterization. The pine genome contains a very large number of diverged and probably defunct repetitive elements. This study also provides new evidence that sequencing a pine genome using a WGS approach is

  5. Genome Sequences of Oryza Species

    KAUST Repository

    Kumagai, Masahiko; Tanaka, Tsuyoshi; Ohyanagi, Hajime; Hsing, Yue-Ie C.; Itoh, Takeshi

    2018-01-01

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  6. Genome Sequences of Oryza Species

    KAUST Repository

    Kumagai, Masahiko

    2018-02-14

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  7. Sequence Factorization with Multiple References.

    Directory of Open Access Journals (Sweden)

    Sebastian Wandelt

    Full Text Available The success of high-throughput sequencing has led to an increasing number of projects which sequence large populations of a species. Storage and analysis of sequence data is a key challenge in these projects, because of the sheer size of the datasets. Compression is one simple technology to deal with this challenge. Referential factorization and compression schemes, which store only the differences between input sequence and a reference sequence, gained lots of interest in this field. Highly-similar sequences, e.g., Human genomes, can be compressed with a compression ratio of 1,000:1 and more, up to two orders of magnitude better than with standard compression techniques. Recently, it was shown that the compression against multiple references from the same species can boost the compression ratio up to 4,000:1. However, a detailed analysis of using multiple references is lacking, e.g., for main memory consumption and optimality. In this paper, we describe one key technique for the referential compression against multiple references: The factorization of sequences. Based on the notion of an optimal factorization, we propose optimization heuristics and identify parameter settings which greatly influence (1) the size of the factorization, (2) the time for factorization, and (3) the required amount of main memory. We evaluate a total of 30 setups with a varying number of references on data from three different species. Our results show a wide range of factorization sizes (optimal to an overhead of up to 300%), factorization speed (0.01 MB/s to more than 600 MB/s), and main memory usage (few dozen MB to dozens of GB). Based on our evaluation, we identify the best configurations for common use cases. Our evaluation shows that multi-reference factorization is much better than single-reference factorization.
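
    The idea of storing only the differences against a reference can be sketched in a few lines of Python. The code below is a naive greedy factorization against a single reference, written for illustration only: it is not the authors' algorithm, it does not attempt an optimal factorization or multiple references, and the names `factorize`, `reconstruct`, and `min_match` are assumptions.

```python
def factorize(seq, ref, min_match=4):
    """Greedy factorization: (ref_start, length) for long matches, single characters otherwise."""
    factors = []
    i = 0
    while i < len(seq):
        best_pos, best_len = -1, 0
        # Naive search for the longest reference match starting at seq[i] (quadratic time).
        for start in range(len(ref)):
            length = 0
            while (i + length < len(seq) and start + length < len(ref)
                   and seq[i + length] == ref[start + length]):
                length += 1
            if length > best_len:
                best_pos, best_len = start, length
        if best_len >= min_match:
            factors.append((best_pos, best_len))  # reference factor
            i += best_len
        else:
            factors.append(seq[i])  # literal character not worth referencing
            i += 1
    return factors

def reconstruct(factors, ref):
    """Rebuild the original sequence from its factors and the reference."""
    parts = []
    for f in factors:
        if isinstance(f, tuple):
            pos, length = f
            parts.append(ref[pos:pos + length])
        else:
            parts.append(f)
    return "".join(parts)

ref = "ACGTACGTTTGACCA"
seq = "ACGTACGATTGACCA"  # differs from ref by one substitution
factors = factorize(seq, ref)
```

    Here a 15-base sequence with one substitution reduces to two reference factors and one literal. Practical tools replace the quadratic search with an index over the reference, which is where the time and memory trade-offs discussed in the record arise.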

  8. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Full Text Available Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  9. Sequencing Intractable DNA to Close Microbial Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Hurt, Jr., Richard Ashley [ORNL; Brown, Steven D [ORNL; Podar, Mircea [ORNL; Palumbo, Anthony Vito [ORNL; Elias, Dwayne A [ORNL

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled intractable resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  10. Nonparametric combinatorial sequence models.

    Science.gov (United States)

    Wauthier, Fabian L; Jordan, Michael I; Jojic, Nebojsa

    2011-11-01

    This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This article presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three biological sequence families which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution over sequence representations induced by the prior. By integrating out the posterior, our method compares favorably to leading binding predictors.

  11. FRESCO: Referential compression of highly similar sequences.

    Science.gov (United States)

    Wandelt, Sebastian; Leser, Ulf

    2013-01-01

    In many applications, sets of similar texts or sequences are of high importance. Prominent examples are revision histories of documents or genomic sequences. Modern high-throughput sequencing technologies are able to generate DNA sequences at an ever-increasing rate. In parallel to the decreasing experimental time and cost necessary to produce DNA sequences, computational requirements for analysis and storage of the sequences are steeply increasing. Compression is a key technology to deal with this challenge. Recently, referential compression schemes, storing only the differences between a to-be-compressed input and a known reference sequence, gained a lot of interest in this field. In this paper, we propose a general open-source framework to compress large amounts of biological sequence data called Framework for REferential Sequence COmpression (FRESCO). Our basic compression algorithm is shown to be one to two orders of magnitude faster than comparable related work, while achieving similar compression ratios. We also propose several techniques to further increase compression ratios, while still retaining the advantage in speed: 1) selecting a good reference sequence; and 2) rewriting a reference sequence to allow for better compression. In addition, we propose a new way of further boosting the compression ratios by applying referential compression to already referentially compressed files (second-order compression). This technique allows for compression ratios way beyond state of the art, for instance, 4,000:1 and higher for human genomes. We evaluate our algorithms on a large data set from three different species (more than 1,000 genomes, more than 3 TB) and on a collection of versions of Wikipedia pages. Our results show that real-time compression of highly similar sequences at high compression ratios is possible on modern hardware.

  12. UGT1A1 (TA)n genotyping in sickle-cell disease: high resolution melting (HRM) curve analysis or direct sequencing, what is the best way?

    Science.gov (United States)

    Thomas, Vincent; Mazard, Blandine; Garcia, Caroline; Lacan, Philippe; Gagnieu, Marie-Claude; Joly, Philippe

    2013-09-23

    In 2010, Minucci et al. proposed a rapid, simple and cost-effective HRM method on the LightCycler 480® apparatus (Roche) for determining the 6/6, 6/7 and 7/7 genotypes of the (TA)n UGT1A1 promoter polymorphism. However, they did not study the n=5 and n=8 alleles, which can be quite frequent in sickle-cell disease patients. The aim of our study was to test this HRM protocol on all 10 possible (TA)n UGT1A1 genotypes (i.e. 5/5, 5/6, 5/7, 5/8, 6/6, 6/7, 6/8, 7/7, 7/8 and 8/8) using our cohort of SCD patients. All genotypes could be unambiguously identified except 6/7 and 6/8, which give a similar HRM profile. Differentiating those two genotypes requires either direct Sanger sequencing or a second PCR protocol followed by migration on a 3% agarose gel. For (TA)n UGT1A1 promoter genotyping of African patients, each lab must therefore choose between (i) direct Sanger sequencing of all patients and (ii) the HRM protocol for all patients followed by a complementary analysis to differentiate the 6/7 and 6/8 genotypes. © 2013. Published by Elsevier B.V. All rights reserved.
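
    The count of 10 genotypes follows from choosing unordered pairs, with repetition, from the four alleles (TA)5 to (TA)8. The short sketch below enumerates them and separates those the abstract reports as resolved by HRM alone from the two that need a follow-up test. The function and constant names are illustrative assumptions, not part of the published protocol.

```python
from itertools import combinations_with_replacement

ALLELES = (5, 6, 7, 8)                # (TA)n repeat counts considered in the study
AMBIGUOUS_BY_HRM = {"6/7", "6/8"}     # reported as giving a similar HRM profile


def all_genotypes(alleles=ALLELES):
    """Return every unordered allele pair as an 'a/b' string."""
    return [f"{a}/{b}" for a, b in combinations_with_replacement(alleles, 2)]


genotypes = all_genotypes()
resolved_by_hrm = [g for g in genotypes if g not in AMBIGUOUS_BY_HRM]
needs_followup = [g for g in genotypes if g in AMBIGUOUS_BY_HRM]
```

    With four alleles this gives 4 + 6 = 10 genotypes (four homozygous, six heterozygous), eight of which HRM alone distinguishes according to the abstract.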

  13. Whole-Exome Sequencing Reveals Clinically Relevant Variants in Family Affected with Autism Spectrum Disorder

    Directory of Open Access Journals (Sweden)

    Jiaxiu Zhou

    2016-10-01

    Full Text Available Chromosomal microarray (CMA) has been suggested as a first tier clinical diagnostic test for ASD. High-throughput sequencing (HTS) has associated hundreds of genes with ASD. Whole Exome Sequencing (WES) was used in combination with CMA to identify clinically-relevant ASD variants. In prior work, a trio-based (father, mother, and proband) WGS (Whole Genome Sequencing) was used to reveal clinically-relevant de novo, or inherited, rare variants in half (16 / 32) of the ASD families in which all probands had normal, or VOUS (Variant of Uncertain Clinical Significance), CMA results. In this study, after CMA screening for chromosomal structural abnormalities in a proband affected with ASD, WES was performed on the patient and parents. Some rare de novo, and inherited, variants were detected using trio-based bioinformatics analysis. ASD variants were ranked by SFARI Gene score, HPO (human phenotype ontology), protein function damage, and manual PubMed searches. Sanger sequencing was used to validate some candidate variants in family members. A de novo homozygous mutation in SPG11 (p.C209F) and two inherited, compound-heterozygote mutations in SCN9A (p.Q10R and p.R1893H) and BEST1 (p.A135V and p.A297V) were confirmed. Heterozygous mutations in TSC1 (p.S487C) and SHANK2 (p.Arg569His) inherited from the mother were also confirmed.

  14. Implementation of Cloud based next generation sequencing data analysis in a clinical laboratory.

    Science.gov (United States)

    Onsongo, Getiria; Erdmann, Jesse; Spears, Michael D; Chilton, John; Beckman, Kenneth B; Hauge, Adam; Yohe, Sophia; Schomaker, Matthew; Bower, Matthew; Silverstein, Kevin A T; Thyagarajan, Bharat

    2014-05-23

    The introduction of next generation sequencing (NGS) has revolutionized molecular diagnostics, though several challenges still limit the widespread adoption of NGS testing into clinical practice. One such difficulty is the development of a robust bioinformatics pipeline that can handle the volume of data generated by high-throughput sequencing in a cost-effective manner. Analysis of sequencing data typically requires a substantial level of computing power that is often cost-prohibitive for most clinical diagnostics laboratories. To address this challenge, our institution has developed a Galaxy-based data analysis pipeline which relies on a web-based, cloud-computing infrastructure to process NGS data and identify genetic variants. It provides the additional flexibility needed to control storage costs, resulting in a pipeline that is cost-effective on a per-sample basis. It does not require the usage of EBS disk to run a sample. We demonstrate the validation and feasibility of implementing this bioinformatics pipeline in a molecular diagnostics laboratory. Four samples were analyzed in duplicate pairs and showed 100% concordance in mutations identified. This pipeline is currently being used in the clinic, and all identified pathogenic variants are confirmed using Sanger sequencing, further validating the software.

  15. Barcoding lichen-forming fungi using 454 pyrosequencing is challenged by artifactual and biological sequence variation.

    Science.gov (United States)

    Mark, Kristiina; Cornejo, Carolina; Keller, Christine; Flück, Daniela; Scheidegger, Christoph

    2016-09-01

    Although lichens (lichen-forming fungi) play an important role in the ecological integrity of many vulnerable landscapes, only a minority of lichen-forming fungi have been barcoded out of the currently accepted ∼18 000 species. Regular Sanger sequencing can be problematic when analyzing lichens since saprophytic, endophytic, and parasitic fungi live intimately admixed, resulting in low-quality sequencing reads. Here, high-throughput, long-read 454 pyrosequencing in a GS FLX+ System was tested to barcode the fungal partner of 100 epiphytic lichen species from Switzerland using fungal-specific primers when amplifying the full internal transcribed spacer region (ITS). The present study shows the potential of DNA barcoding using pyrosequencing, in that the expected lichen fungus was successfully sequenced for all samples except one. Alignment solutions such as BLAST were found to be largely adequate for the generated long reads. In addition, the NCBI nucleotide database, currently the most complete database for lichen-forming fungi, can be used as a reference database when identifying common species, since the majority of analyzed lichens were identified correctly to the species or at least to the genus level. However, several issues were encountered, including a high sequencing error rate, multiple ITS versions in a genome (incomplete concerted evolution), and in some samples the presence of mixed lichen-forming fungi (possible lichen chimeras).

  16. Application of Massively Parallel Sequencing in the Clinical Diagnostic Testing of Inherited Cardiac Conditions

    Directory of Open Access Journals (Sweden)

    Ivone U. S. Leong

    2014-06-01

    Full Text Available Sudden cardiac death in people between the ages of 1–40 years is a devastating event and is frequently caused by several heritable cardiac disorders. These disorders include cardiac ion channelopathies, such as long QT syndrome, catecholaminergic polymorphic ventricular tachycardia and Brugada syndrome, and cardiomyopathies, such as hypertrophic cardiomyopathy and arrhythmogenic right ventricular cardiomyopathy. Through careful molecular genetic evaluation of DNA from sudden death victims, the causative gene mutation can be uncovered, and the rest of the family can be screened and preventative measures implemented in at-risk individuals. The current screening approach in most diagnostic laboratories uses Sanger-based sequencing; however, this method is time consuming and labour intensive. The development of massively parallel sequencing has made it possible to produce millions of sequence reads simultaneously and is potentially an ideal approach to screen for mutations in genes that are associated with sudden cardiac death. This approach offers mutation screening at reduced cost and turnaround time. Here, we will review the current commercially available enrichment kits, massively parallel sequencing (MPS) platforms, downstream data analysis and its application to sudden cardiac death in a diagnostic environment.

  17. Deep sequencing of uveal melanoma identifies a recurrent mutation in PLCB4

    DEFF Research Database (Denmark)

    Johansson, Peter; Aoude, Lauren G; Wadt, Karin

    2016-01-01

    Next generation sequencing of uveal melanoma (UM) samples has identified a number of recurrent oncogenic or loss-of-function mutations in key driver genes including: GNAQ, GNA11, EIF1AX, SF3B1 and BAP1. To search for additional driver mutations in this tumor type we carried out whole-genome or whole-exome sequencing of 28 tumors or primary cell lines. These samples have a low mutation burden, with a mean of 10.6 protein changing mutations per sample (range 0 to 53). As expected for these sun-shielded melanomas the mutation spectrum was not consistent with an ultraviolet radiation signature; instead, a BRCA mutation signature predominated. In addition to mutations in the known UM driver genes, we found a recurrent mutation in PLCB4 (c.G1888T, p.D630Y, NM_000933), which was validated using Sanger sequencing. The identical mutation was also found in published UM sequence data (1 of 56 tumors...

  18. The contribution of next generation sequencing to epilepsy genetics

    DEFF Research Database (Denmark)

    Møller, Rikke S.; Dahl, Hans A.; Helbig, Ingo

    2015-01-01

    During the last decade, next generation sequencing technologies such as targeted gene panels, whole exome sequencing and whole genome sequencing have led to an explosion of gene identifications in monogenic epilepsies including both familial epilepsies and severe epilepsies, often referred to as ...

  19. From Genome Sequence to Taxonomy - A Skeptic’s View

    DEFF Research Database (Denmark)

    Özen, Asli Ismihan; Vesth, Tammi Camilla; Ussery, David

    2012-01-01

    The relative ease of sequencing bacterial genomes has resulted in thousands of sequenced bacterial genomes available in the public databases. This same technology now allows for using the entire genome sequence as an identifier for an organism. There are many methods available which attempt to us...

  20. Deep-sequencing protocols influence the results obtained in small-RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Joern Toedling

    Full Text Available Second-generation sequencing is a powerful method for identifying and quantifying small-RNA components of cells. However, little attention has been paid to the effects of the choice of sequencing platform and library preparation protocol on the results obtained. We present a thorough comparison of small-RNA sequencing libraries generated from the same embryonic stem cell lines, using different sequencing platforms, which represent the three major second-generation sequencing technologies, and protocols. We have analysed and compared the expression of microRNAs, as well as populations of small RNAs derived from repetitive elements. Despite the fact that different libraries display a good correlation between sequencing platforms, qualitative and quantitative variations in the results were found, depending on the protocol used. Thus, when comparing libraries from different biological samples, it is strongly recommended to use the same sequencing platform and protocol in order to ensure the biological relevance of the comparisons.

  1. Whole-exome sequencing identifies novel MPL and JAK2 mutations in triple-negative myeloproliferative neoplasms.

    Science.gov (United States)

    Milosevic Feenstra, Jelena D; Nivarthi, Harini; Gisslinger, Heinz; Leroy, Emilie; Rumi, Elisa; Chachoua, Ilyas; Bagienski, Klaudia; Kubesova, Blanka; Pietra, Daniela; Gisslinger, Bettina; Milanesi, Chiara; Jäger, Roland; Chen, Doris; Berg, Tiina; Schalling, Martin; Schuster, Michael; Bock, Christoph; Constantinescu, Stefan N; Cazzola, Mario; Kralovics, Robert

    2016-01-21

    Essential thrombocythemia (ET) and primary myelofibrosis (PMF) are chronic diseases characterized by clonal hematopoiesis and hyperproliferation of terminally differentiated myeloid cells. The disease is driven by somatic mutations in exon 9 of CALR or exon 10 of MPL or JAK2-V617F in >90% of the cases, whereas the remaining cases are termed "triple negative." We aimed to identify the disease-causing mutations in the triple-negative cases of ET and PMF by applying whole-exome sequencing (WES) on paired tumor and control samples from 8 patients. We found evidence of clonal hematopoiesis in 5 of 8 studied cases based on clonality analysis and presence of somatic genetic aberrations. WES identified somatic mutations in 3 of 8 cases. We did not detect any novel recurrent somatic mutations. In 3 patients with clonal hematopoiesis analyzed by WES, we identified a somatic MPL-S204P, a germline MPL-V285E mutation, and a germline JAK2-G571S variant. We performed Sanger sequencing of the entire coding region of MPL in 62, and of JAK2 in 49 additional triple-negative cases of ET or PMF. New somatic (T119I, S204F, E230G, Y591D) and 1 germline (R321W) MPL mutation were detected. All of the identified MPL mutations were gain-of-function when analyzed in functional assays. JAK2 variants were identified in 5 of 57 triple-negative cases analyzed by WES and Sanger sequencing combined. We could demonstrate that JAK2-V625F and JAK2-F556V are gain-of-function mutations. Our results suggest that triple-negative cases of ET and PMF do not represent a homogenous disease entity. Cases with polyclonal hematopoiesis might represent hereditary disorders. © 2016 by The American Society of Hematology.

  2. Long sequence correlation coprocessor

    Science.gov (United States)

    Gage, Douglas W.

    1994-09-01

    A long sequence correlation coprocessor (LSCC) accelerates the bitwise correlation of arbitrarily long digital sequences by calculating in parallel the correlation score for 16, for example, adjacent bit alignments between two binary sequences. The LSCC integrated circuit is incorporated into a computer system with memory storage buffers and a separate general purpose computer processor which serves as its controller. Each of the LSCC's set of sequential counters simultaneously tallies a separate correlation coefficient. During each LSCC clock cycle, computer enable logic associated with each counter compares one bit of a first sequence with one bit of a second sequence to increment the counter if the bits are the same. A shift register assures that the same bit of the first sequence is simultaneously compared to different bits of the second sequence to simultaneously calculate the correlation coefficient by the different counters to represent different alignments of the two sequences.
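
    The counting scheme in this record can be mimicked in software. The sketch below is only a behavioural model under stated assumptions: the function name, the list-of-bits representation, and the handling of sequence ends are illustrative and say nothing about the actual circuit. For each position it compares one bit of the first sequence with several adjacent bits of the second and increments one counter per alignment when the bits agree.

```python
def correlation_scores(first, second, n_alignments=16):
    """Return one match count per alignment offset.

    counts[k] is the number of positions i where first[i] == second[i + k].
    Only positions for which all n_alignments offsets stay inside `second`
    are used, so every counter sees the same number of comparisons.
    """
    counts = [0] * n_alignments
    usable = len(second) - n_alignments + 1
    for i in range(min(len(first), usable)):
        bit = first[i]                      # same bit of the first sequence...
        for k in range(n_alignments):       # ...compared at every offset
            if bit == second[i + k]:
                counts[k] += 1
    return counts


first = [1, 0, 1, 1]
second = [0, 1, 0, 1, 1, 0]
scores = correlation_scores(first, second, n_alignments=3)
best_offset = max(range(len(scores)), key=scores.__getitem__)
```

    In hardware the inner loop runs in parallel, one counter per offset; here it is sequential, which changes the speed but not the resulting counts.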

  3. Roles of repetitive sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bell, G.I.

    1991-12-31

    The DNA of higher eukaryotes contains many repetitive sequences. The study of repetitive sequences is important, not only because many have important biological function, but also because they provide information on genome organization, evolution and dynamics. In this paper, I will first discuss some generic effects that repetitive sequences will have upon genome dynamics and evolution. In particular, it will be shown that repetitive sequences foster recombination among, and turnover of, the elements of a genome. I will then consider some examples of repetitive sequences, notably minisatellite sequences and telomere sequences as examples of tandem repeats, without and with respectively known function, and Alu sequences as an example of interspersed repeats. Some other examples will also be considered in less detail.

  4. Anomaly Detection in Sequences

    Data.gov (United States)

    National Aeronautics and Space Administration — We present a set of novel algorithms which we call sequenceMiner, that detect and characterize anomalies in large sets of high-dimensional symbol sequences that...

  5. DNA sequencing conference, 2

    Energy Technology Data Exchange (ETDEWEB)

    Cook-Deegan, R.M. [Georgetown Univ., Kennedy Inst. of Ethics, Washington, DC (United States); Venter, J.C. [National Inst. of Neurological Disorders and Strokes, Bethesda, MD (United States); Gilbert, W. [Harvard Univ., Cambridge, MA (United States); Mulligan, J. [Stanford Univ., CA (United States); Mansfield, B.K. [Oak Ridge National Lab., TN (United States)

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  6. sequenceMiner algorithm

    Data.gov (United States)

    National Aeronautics and Space Administration — Detecting and describing anomalies in large repositories of discrete symbol sequences. sequenceMiner has been open-sourced! Download the file below to try it out....

  7. BAC end sequencing of Pacific white shrimp Litopenaeus vannamei: a glimpse into the genome of Penaeid shrimp

    Science.gov (United States)

    Zhao, Cui; Zhang, Xiaojun; Liu, Chengzhang; Huan, Pin; Li, Fuhua; Xiang, Jianhai; Huang, Chao

    2012-05-01

    Little is known about the genome of Pacific white shrimp (Litopenaeus vannamei). To address this, we conducted BAC (bacterial artificial chromosome) end sequencing of L. vannamei. We selected and sequenced 7 812 BAC clones from the BAC library LvHE from the two ends of the inserts by Sanger sequencing. After trimming and quality filtering, 11 279 BAC end sequences (BESs) including 4 609 paired-end BESs were obtained. The total length of the BESs was 4 340 753 bp, representing 0.18% of the L. vannamei haploid genome. The lengths of the BESs ranged from 100 bp to 660 bp with an average length of 385 bp. Analysis of the BESs indicated that the L. vannamei genome is AT-rich and that the primary repeat patterns were simple sequence repeats (SSRs) and low complexity sequences. Dinucleotide and hexanucleotide repeats were the most common SSR types in the BESs. The most abundant transposable element was gypsy, which may contribute to the generation of the large genome size of L. vannamei. We successfully annotated 4 519 BESs by BLAST searching, including genes involved in immunity and sex determination. Our results provide an important resource for functional gene studies, map construction and integration, and complete genome assembly for this species.
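
    The figures quoted in this record can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers from the abstract. The variable names are illustrative, and the implied genome size is a rough back-calculation from the rounded 0.18% figure, not a value reported by the authors.

```python
# Figures quoted in the abstract
clones_sequenced = 7_812
bes_retained = 11_279
total_bes_length_bp = 4_340_753
genome_fraction = 0.0018          # 0.18 %, as reported (rounded)

# Mean read length after trimming and filtering
mean_bes_length = total_bes_length_bp / bes_retained

# Each clone is sequenced from both ends, so two reads are attempted per clone
attempted_reads = clones_sequenced * 2
retention_rate = bes_retained / attempted_reads

# Genome size implied by "total length = 0.18 % of the haploid genome"
implied_genome_size_bp = total_bes_length_bp / genome_fraction

print(f"mean BES length: {mean_bes_length:.1f} bp")
print(f"reads retained:  {retention_rate:.1%} of {attempted_reads}")
print(f"implied genome:  {implied_genome_size_bp / 1e9:.2f} Gb")
```

    The mean works out to roughly 385 bp, consistent with the stated average, and the implied haploid genome size is on the order of 2.4 Gb.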

  8. Applying Next Generation Sequencing to Skeletal Development and Disease

    OpenAIRE

    Bowen, Margot Elizabeth

    2013-01-01

    Next Generation Sequencing (NGS) technologies have dramatically increased the throughput and lowered the cost of DNA sequencing. In this thesis, I apply these technologies to unresolved questions in skeletal development and disease. Firstly, I use targeted re-sequencing of genomic DNA to identify the genetic cause of the cartilage tumor syndrome, metachondromatosis (MC). I show that the majority of MC patients carry heterozygous loss-of-function mutations in the PTPN11 gene, which encodes a p...

  9. Enhanced Dynamic Algorithm of Genome Sequence Alignments

    OpenAIRE

    Arabi E. keshk

    2014-01-01

    The merging of biology and computer science has created a new field called computational biology, or bioinformatics, that explores the capacity of computers to gain knowledge from biological data. Computational biology is rooted in life sciences as well as computers, information sciences, and technologies. The main problem in computational biology is sequence alignment, a way of arranging the sequences of DNA, RNA or protein to identify the region of similarity and relationship between se...
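
    For readers unfamiliar with dynamic-programming alignment, the sketch below shows the textbook Needleman-Wunsch global alignment score. It is not the enhanced algorithm this record refers to, whose details are cut off in the listing. The scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions chosen for illustration.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell takes the best of: align two symbols, or open a gap in a or b.
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = score[i - 1][j] + gap
            gap_in_a = score[i][j - 1] + gap
            score[i][j] = max(diagonal, gap_in_b, gap_in_a)

    return score[-1][-1]


identical = needleman_wunsch_score("ACGT", "ACGT")
one_deletion = needleman_wunsch_score("ACGT", "AGT")
```

    The table has (len(a) + 1) x (len(b) + 1) cells, so time and memory grow with the product of the sequence lengths, which is why faster variants matter for genome-scale inputs.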

  10. Complete Genome Sequence of Ikoma Lyssavirus

    OpenAIRE

    Marston, Denise A.; Ellis, Richard J.; Horton, Daniel L.; Kuzmin, Ivan V.; Wise, Emma L.; McElhinney, Lorraine M.; Banyard, Ashley C.; Ngeleja, Chanasa; Keyyu, Julius; Cleaveland, Sarah; Lembo, Tiziana; Rupprecht, Charles E.; Fooks, Anthony R.

    2012-01-01

    Lyssaviruses (family Rhabdoviridae) constitute one of the most important groups of viral zoonoses globally. All lyssaviruses cause the disease rabies, an acute progressive encephalitis for which, once symptoms occur, there is no effective cure. Currently available vaccines are highly protective against the predominantly circulating lyssavirus species. Using next-generation sequencing technologies, we have obtained the whole-genome sequence for a novel lyssavirus, Ikoma lyssavirus (IKOV), isol...

  11. Enrichment of target sequences for next-generation sequencing applications in research and diagnostics.

    Science.gov (United States)

    Altmüller, Janine; Budde, Birgit S; Nürnberg, Peter

    2014-02-01

    Targeted re-sequencing such as gene panel sequencing (GPS) has become very popular in medical genetics, both for research projects and in diagnostic settings. The technical principles of the different enrichment methods have been reviewed several times before; however, new enrichment products are constantly entering the market, and researchers are often puzzled by the need to make long-term commitments, both to the enrichment product and to the sequencing technology. This review summarizes important considerations for the experimental design and provides helpful recommendations for choosing the best sequencing strategy for various research projects and diagnostic applications.

  12. Quantitative phenotyping via deep barcode sequencing.

    Science.gov (United States)

    Smith, Andrew M; Heisler, Lawrence E; Mellor, Joseph; Kaper, Fiona; Thompson, Michael J; Chee, Mark; Roth, Frederick P; Giaever, Guri; Nislow, Corey

    2009-10-01

    Next-generation DNA sequencing technologies have revolutionized diverse genomics applications, including de novo genome sequencing, SNP detection, chromatin immunoprecipitation, and transcriptome analysis. Here we apply deep sequencing to genome-scale fitness profiling to evaluate yeast strain collections in parallel. This method, Barcode analysis by Sequencing, or "Bar-seq," outperforms the current benchmark barcode microarray assay in terms of both dynamic range and throughput. When applied to a complex chemogenomic assay, Bar-seq quantitatively identifies drug targets, with performance superior to the benchmark microarray assay. We also show that Bar-seq is well-suited for a multiplex format. We completely re-sequenced and re-annotated the yeast deletion collection using deep sequencing, found that approximately 20% of the barcodes and common priming sequences varied from expectation, and used this revised list of barcode sequences to improve data quality. Together, this new assay and analysis routine provide a deep-sequencing-based toolkit for identifying gene-environment interactions on a genome-wide scale.
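
    The core of barcode-based fitness profiling is counting how often each strain's barcode is read in a control and in a treated pool, then comparing the two. The sketch below illustrates that idea only. It is not the published Bar-seq analysis routine: the barcode strings, strain names, exact-match lookup, and pseudocount are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def count_barcodes(reads, barcode_to_strain):
    """Tally reads per strain, ignoring reads whose barcode is not in the table."""
    counts = Counter()
    for read in reads:
        strain = barcode_to_strain.get(read)
        if strain is not None:
            counts[strain] += 1
    return counts


def log2_ratios(control, treated, pseudocount=1):
    """log2(treated / control) per strain; the pseudocount avoids division by zero."""
    strains = set(control) | set(treated)
    return {
        s: math.log2((treated[s] + pseudocount) / (control[s] + pseudocount))
        for s in strains
    }


# Hypothetical barcode table and reads
table = {"AAAC": "strainA", "GGTT": "strainB"}
control_reads = ["AAAC", "GGTT", "AAAC", "GGTT", "AAAC", "GGTT", "NNNN"]
treated_reads = ["AAAC", "AAAC", "AAAC"]

control = count_barcodes(control_reads, table)
treated = count_barcodes(treated_reads, table)
ratios = log2_ratios(control, treated)
```

    A strongly negative ratio, as for the hypothetical `strainB` here, marks a strain that dropped out under treatment. A barcode table corrected by re-sequencing, as the abstract describes, would mean fewer reads are discarded as unrecognized.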

  13. Snake Genome Sequencing: Results and Future Prospects.

    Science.gov (United States)

    Kerkkamp, Harald M I; Kini, R Manjunatha; Pospelov, Alexey S; Vonk, Freek J; Henkel, Christiaan V; Richardson, Michael K

    2016-12-01

    Snake genome sequencing is in its infancy, very much behind the progress made in sequencing the genomes of humans, model organisms and pathogens relevant to biomedical research, and agricultural species. We provide here an overview of some of the snake genome projects in progress, and discuss the biological findings, with special emphasis on toxinology, from the small number of draft snake genomes already published. We discuss the future of snake genomics, pointing out that new sequencing technologies will help overcome the problem of repetitive sequences in assembling snake genomes. Genome sequences are also likely to be valuable in examining the clustering of toxin genes on the chromosomes, in designing recombinant antivenoms and in studying the epigenetic regulation of toxin gene expression.

  14. Snake Genome Sequencing: Results and Future Prospects

    Directory of Open Access Journals (Sweden)

    Harald M. I. Kerkkamp

    2016-12-01

    Full Text Available Snake genome sequencing is in its infancy—very much behind the progress made in sequencing the genomes of humans, model organisms and pathogens relevant to biomedical research, and agricultural species. We provide here an overview of some of the snake genome projects in progress, and discuss the biological findings, with special emphasis on toxinology, from the small number of draft snake genomes already published. We discuss the future of snake genomics, pointing out that new sequencing technologies will help overcome the problem of repetitive sequences in assembling snake genomes. Genome sequences are also likely to be valuable in examining the clustering of toxin genes on the chromosomes, in designing recombinant antivenoms and in studying the epigenetic regulation of toxin gene expression.

  15. Identification, variation and transcription of pneumococcal repeat sequences

    Science.gov (United States)

    2011-01-01

    Background: Small interspersed repeats are commonly found in many bacterial chromosomes. Two families of repeats (BOX and RUP) have previously been identified in the genome of Streptococcus pneumoniae, a nasopharyngeal commensal and respiratory pathogen of humans. However, little is known about the role they play in pneumococcal genetics. Results: Analysis of the genome of S. pneumoniae ATCC 700669 revealed the presence of a third repeat family, which we have named SPRITE. All three repeats are present at a reduced density in the genome of the closely related species S. mitis. However, they are almost entirely absent from all other streptococci, although a set of elements related to the pneumococcal BOX repeat was identified in the zoonotic pathogen S. suis. In conjunction with information regarding their distribution within the pneumococcal chromosome, this suggests that it is unlikely that these repeats are specialised sequences performing a particular role for the host, but rather that they constitute parasitic elements. However, comparing insertion sites between pneumococcal sequences indicates that they appear to transpose at a much lower rate than IS elements. Some large BOX elements in S. pneumoniae were found to encode open reading frames on both strands of the genome, whilst another was found to form a composite RNA structure with two T box riboswitches. In multiple cases, such BOX elements were demonstrated as being expressed using directional RNA-seq and RT-PCR. Conclusions: BOX, RUP and SPRITE repeats appear to have proliferated extensively throughout the pneumococcal chromosome during the species' past, but novel insertions are currently occurring at a relatively slow rate. Through their extensive secondary structures, they seem likely to affect the expression of genes with which they are co-transcribed. Software for annotation of these repeats is freely available from ftp://ftp.sanger.ac.uk/pub/pathogens/strep_repeats/. PMID: 21333003

  16. Targeted exon sequencing in Usher syndrome type I.

    Science.gov (United States)

    Bujakowska, Kinga M; Consugar, Mark; Place, Emily; Harper, Shyana; Lena, Jaclyn; Taub, Daniel G; White, Joseph; Navarro-Gomez, Daniel; Weigel DiFranco, Carol; Farkas, Michael H; Gai, Xiaowu; Berson, Eliot L; Pierce, Eric A

    2014-12-02

    Patients with Usher syndrome type I (USH1) have retinitis pigmentosa, profound congenital hearing loss, and vestibular ataxia. This syndrome is currently thought to be associated with at least six genes, which are encoded by over 180 exons. Here, we present the use of state-of-the-art techniques in the molecular diagnosis of a cohort of 47 USH1 probands. The cohort was studied with selective exon capture and next-generation sequencing of currently known inherited retinal degeneration genes, comparative genomic hybridization, and Sanger sequencing of new USH1 exons identified by human retinal transcriptome analysis. With this approach, we were able to genetically solve 14 of the 47 probands by confirming the biallelic inheritance of mutations. We detected two likely pathogenic variants in an additional 19 patients, for whom family members were not available for cosegregation analysis to confirm biallelic inheritance. Ten patients, in addition to primary disease-causing mutations, carried rare likely pathogenic USH1 alleles or variants in other genes associated with deaf-blindness, which may influence disease phenotype. Twenty-one of the identified mutations were novel among the 33 definite or likely solved patients. Here, we also present a clinical description of the studied cohort at their initial visits. We found a remarkable genetic heterogeneity in the studied USH1 cohort with multiplicity of mutations, of which many were novel. No obvious influence of genotype on phenotype was found, possibly due to small sample sizes of the genotypes under study. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.

  17. Complete genome sequence of a novel Plum pox virus strain W isolate determined by 454 pyrosequencing.

    Science.gov (United States)

    Sheveleva, Anna; Kudryavtseva, Anna; Speranskaya, Anna; Belenikin, Maxim; Melnikova, Natalia; Chirkov, Sergei

    2013-10-01

    The near-complete (99.7 %) genome sequence of a novel Russian Plum pox virus (PPV) isolate Pk, belonging to the strain Winona (W), has been determined by 454 pyrosequencing with the exception of the thirty-one 5'-terminal nucleotides. This region was amplified using 5'RACE kit and sequenced by the Sanger method. Genomic RNA released from immunocaptured PPV particles was employed for generation of cDNA library using TransPlex Whole transcriptome amplification kit (WTA2, Sigma-Aldrich). The entire Pk genome has identity level of 92.8-94.5 % when compared to the complete nucleotide sequences of other PPV-W isolates (W3174, LV-141pl, LV-145bt, and UKR 44189), confirming a high degree of variability within the PPV-W strain. The isolates Pk and LV-141pl are most closely related. The Pk has been found in a wild plum (Prunus domestica) in a new region of Russia indicating widespread dissemination of the PPV-W strain in the European part of the former USSR.

  18. Whole genome sequencing reveals a de novo SHANK3 mutation in familial autism spectrum disorder.

    Directory of Open Access Journals (Sweden)

    Sergio I Nemirovsky

    Full Text Available Clinical genomics promises to be especially suitable for the study of etiologically heterogeneous conditions such as Autism Spectrum Disorder (ASD). Here we present three siblings with ASD where we evaluated the usefulness of Whole Genome Sequencing (WGS) for the diagnostic approach to ASD. We identified a family segregating ASD in three siblings with an unidentified cause. We performed WGS in the three probands and used a state-of-the-art comprehensive bioinformatic analysis pipeline and prioritized the identified variants located in genes likely to be related to ASD. We validated the finding by Sanger sequencing in the probands and their parents. Three male siblings presented a syndrome characterized by severe intellectual disability, absence of language, autism spectrum symptoms and epilepsy with negative family history for mental retardation, language disorders, ASD or other psychiatric disorders. We found germline mosaicism for a heterozygous deletion of a cytosine in exon 21 of the SHANK3 gene, resulting in a missense sequence of 5 codons followed by a premature stop codon (NM_033517:c.3259_3259delC, p.Ser1088Profs*6). We reported an infrequent form of familial ASD where WGS proved useful in the clinic. We identified a mutation in SHANK3 that underscores its relevance in Autism Spectrum Disorder.

  19. Whole exome sequencing identifies novel mutation in eight Chinese children with isolated tetralogy of Fallot.

    Science.gov (United States)

    Liu, Lin; Wang, Hong-Dan; Cui, Cun-Ying; Qin, Yun-Yun; Fan, Tai-Bing; Peng, Bang-Tian; Zhang, Lian-Zhong; Wang, Cheng-Zeng

    2017-12-05

    Tetralogy of Fallot is the most common cyanotic congenital heart disease. However, its pathogenesis remains to be clarified. The purpose of this study was to identify the genetic variants in Tetralogy of Fallot by whole exome sequencing. Whole exome sequencing was performed among eight small families with Tetralogy of Fallot. Differential single nucleotide polymorphisms and small InDels were found by alignment within families and between families and then were verified by Sanger sequencing. Tetralogy of Fallot-related genes were determined by analysis using Gene Ontology/pathway, Online Mendelian Inheritance in Man, PubMed and other databases. A total of sixteen differential single nucleotide polymorphism loci and eight differential small InDels were discovered. The sixteen differential single nucleotide polymorphism loci were located on Chr 1, 2, 4, 5, 11, 12, 15, 22 and X. Among the sixteen single nucleotide polymorphism loci, six have not been reported. The eight differential small InDels were located on Chr 2, 4, 9, 12, 17, 19 and X; of these eight, two have not been reported. Analysis using Gene Ontology/pathway, Online Mendelian Inheritance in Man, PubMed and other databases revealed that PEX5, NACA, ATXN2, CELA1, PCDHB4 and CTBP1 were associated with Tetralogy of Fallot. Our findings identify PEX5, NACA, ATXN2, CELA1, PCDHB4 and CTBP1 mutations as underlying genetic causes of isolated tetralogy of Fallot.

  20. Identification of Five Novel Variants in Chinese Oculocutaneous Albinism by Targeted Next-Generation Sequencing.

    Science.gov (United States)

    Qiu, Biyuan; Ma, Tao; Peng, Chunyan; Zheng, Xiaoqin; Yang, Jiyun

    2018-04-01

    The diagnosis of oculocutaneous albinism (OCA) is established using clinical signs and symptoms. OCA is, however, a highly genetically heterogeneous disease with mutations identified in at least nineteen unique genes, many of which produce overlapping phenotypic traits. Thus, differentiating genetic OCA subtypes for diagnosis and genetic counseling is challenging based on clinical presentation alone, and would benefit from a comprehensive molecular diagnostic. This study aimed to develop and validate a more comprehensive, targeted, next-generation-sequencing-based diagnostic for the identification of OCA-causing variants. The genomic DNA samples from 28 OCA probands were analyzed by targeted next-generation sequencing (NGS), and the candidate variants were confirmed through Sanger sequencing. We observed mutations in the TYR, OCA2, and SLC45A2 genes in 25/28 (89%) patients with OCA. We identified 38 pathogenic variants among these three genes, including 5 novel variants: c.1970G>T (p.Gly657Val), c.1669A>C (p.Thr557Pro), c.2339-2A>C, and c.1349C>G (p.Thr450Arg) in OCA2; c.459_470delTTTTGCTGCCGA (p.Ala155_Phe158del) in SLC45A2. Our findings expand the mutational spectrum of OCA in the Chinese population, and the assay we developed should be broadly useful as a molecular diagnostic, and as an aid for genetic counseling for OCA patients.

  1. mtDNA sequence diversity of Hazara ethnic group from Pakistan.

    Science.gov (United States)

    Rakha, Allah; Fatima; Peng, Min-Sheng; Adan, Atif; Bi, Rui; Yasmin, Memona; Yao, Yong-Gang

    2017-09-01

    The present study was undertaken to investigate mitochondrial DNA (mtDNA) control region sequences of Hazaras from Pakistan, so as to generate an mtDNA reference database for forensic casework in Pakistan and to analyze the phylogenetic relationship of this particular ethnic group with geographically proximal populations. Complete mtDNA control region (nt 16024-576) sequences were generated through Sanger sequencing for 319 Hazara individuals from Quetta, Baluchistan. The population sample set showed a total of 189 distinct haplotypes, belonging mainly to West Eurasian (51.72%), East & Southeast Asian (29.78%) and South Asian (18.50%) haplogroups. Compared with other populations from Pakistan, the Hazara population had a relatively high haplotype diversity (0.9945) and a lower random match probability (0.0085). The dataset has been incorporated into the EMPOP database under accession number EMP00680. The data herein comprise the largest, and likely most thoroughly examined, control region mtDNA dataset from Hazaras of Pakistan. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. Phylogenetic and functional analysis of metagenome sequence from high-temperature archaeal habitats demonstrate linkages between metabolic potential and geochemistry

    Directory of Open Access Journals (Sweden)

    William P. Inskeep

    2013-05-01

    Full Text Available Geothermal habitats in Yellowstone National Park (YNP) provide an unparalleled opportunity to understand the environmental factors that control the distribution of archaea in thermal habitats. Here we describe, analyze and synthesize metagenomic and geochemical data collected from seven high-temperature sites that contain microbial communities dominated by archaea relative to bacteria. The specific objectives of the study were to use metagenome sequencing to determine the structure and functional capacity of thermophilic archaeal-dominated microbial communities across a pH range from 2.5 to 6.4 and to discuss specific examples where the metabolic potential correlated with measured environmental parameters and geochemical processes occurring in situ. Random shotgun metagenome sequence (~40-45 Mbase Sanger sequencing per site) was obtained from environmental DNA extracted from high-temperature sediments and/or microbial mats and subjected to numerous phylogenetic and functional analyses. Analysis of individual sequences (e.g., MEGAN and G+C content) and assemblies from each habitat type revealed the presence of dominant archaeal populations in all environments, 10 of whose genomes were largely reconstructed from the sequence data. Analysis of protein family occurrence, particularly of those involved in energy conservation, electron transport and autotrophic metabolism, revealed significant differences in metabolic strategies across sites consistent with differences in major geochemical attributes (e.g., sulfide, oxygen, pH). These observations provide an ecological basis for understanding the distribution of indigenous archaeal lineages across high temperature systems of YNP.

  3. Identification of two novel SH3PXD2B gene mutations in Frank-Ter Haar syndrome by exome sequencing: Case report and review of the literature.

    Science.gov (United States)

    Zrhidri, Abdelali; Jaouad, Imane Cherkaoui; Lyahyai, Jaber; Raymond, Laure; Egéa, Grégory; Taoudi, Mohamed; El Mouatassim, Said; Sefiani, Abdelaziz

    2017-09-10

    Frank-Ter Haar syndrome (FTHS) is an autosomal-recessive disorder characterized by skeletal, cardiovascular, and eye abnormalities, such as increased intraocular pressure, prominent eyes, and hypertelorism. The most common underlying genetic defect in Frank-Ter Haar syndrome appears to be mutations in the SH3PXD2B gene on chromosome 5q35.1. Until now, only six mutations in the SH3PXD2B gene have been identified. Genetic heterogeneity of FTHS was suggested in previous studies. FTHS was suspected clinically in a 2-year-old girl born to healthy, non-consanguineous Moroccan parents. The patient had been referred to a medical genetics outpatient clinic for dysmorphic facial features. Whole Exome Sequencing (WES) was performed in the patient and her parents, and Sanger sequencing was carried out to confirm the results. We report the first description of a Moroccan FTHS patient with two novel compound heterozygous mutations, c.806G>A; p.Trp269* (maternal allele) and c.892delC; p.Asp299Thrfs*44 (paternal allele), in the SH3PXD2B gene. Sanger sequencing confirmed these mutations in the affected girl and demonstrated that each parent carries one of them in the heterozygous state. Our results confirm the clinical diagnosis of FTHS in this reported family and contribute to expanding the mutational spectrum of this rare disease. Our study also shows that exome sequencing is a powerful and cost-effective tool for the diagnosis of a supposedly genetically heterogeneous disorder such as FTHS. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. Enhanced throughput for infrared automated DNA sequencing

    Science.gov (United States)

    Middendorf, Lyle R.; Gartside, Bill O.; Humphrey, Pat G.; Roemer, Stephen C.; Sorensen, David R.; Steffens, David L.; Sutter, Scott L.

    1995-04-01

    Several enhancements have been developed and applied to infrared automated DNA sequencing resulting in significantly higher throughput. A 41 cm sequencing gel (31 cm well-to-read distance) combines high resolution of DNA sequencing fragments with optimized run times yielding two runs per day of 500 bases per sample. A 66 cm sequencing gel (56 cm well-to-read distance) produces sequence read lengths of up to 1000 bases for ds and ss templates using either T7 polymerase or cycle-sequencing protocols. Using a multichannel syringe to load 64 lanes allows 16 samples (compatible with 96-well format) to be visualized for each run. The 41 cm gel configuration allows 16,000 bases per day (16 samples × 500 bases/sample × 2 ten-hour runs/day) to be sequenced with the advantages of infrared technology. Enhancements to internal labeling techniques using an infrared-labeled dATP molecule (Boehringer Mannheim GmbH, Penzberg, Germany; Sequenase (U.S. Biochemical)) have also been made. The inclusion of glycerol in the sequencing reactions yields greatly improved results for some primer and template combinations. The inclusion of α-thio-dNTPs in the labeling reaction increases signal intensity two- to three-fold.
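
    The daily throughput figure quoted above follows from simple multiplication, which the short Python sketch below reproduces. The function and variable names are illustrative only and do not come from the abstract; the lanes-per-sample value is just the ratio of the two numbers given (64 lanes, 16 samples).

```python
def bases_per_day(samples_per_run: int, bases_per_sample: int, runs_per_day: int) -> int:
    """Total bases read per day for a gel configuration."""
    return samples_per_run * bases_per_sample * runs_per_day


# Figures quoted in the abstract for the 41 cm gel configuration.
LANES_PER_GEL = 64
SAMPLES_PER_RUN = 16

lanes_per_sample = LANES_PER_GEL // SAMPLES_PER_RUN   # 4 lanes for each sample
daily_bases = bases_per_day(SAMPLES_PER_RUN, 500, 2)  # 16 x 500 x 2

print(lanes_per_sample, daily_bases)
```

    With the quoted inputs this gives 4 lanes per sample and 16,000 bases per day, matching the abstract.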

  5. Protecting genomic sequence anonymity with generalization lattices.

    Science.gov (United States)

    Malin, B A

    2005-01-01

    Current genomic privacy technologies assume the identity of genomic sequence data is protected if personal information, such as demographics, is obscured, removed, or encrypted. While demographic features can directly compromise an individual's identity, recent research demonstrates such protections are insufficient because sequence data itself is susceptible to re-identification. To counteract this problem, we introduce an algorithm for anonymizing a collection of person-specific DNA sequences. The technique is termed DNA lattice anonymization (DNALA), and is based upon the formal privacy protection schema of k-anonymity. Under this model, it is impossible to observe or learn features that distinguish one genetic sequence from k-1 other entries in a collection. To maximize information retained in protected sequences, we incorporate a concept generalization lattice to learn the distance between two residues in a single nucleotide region. The lattice provides the most similar generalized concept for two residues (e.g. adenine and guanine are both purines). The method is tested and evaluated with several publicly available human population datasets ranging in size from 30 to 400 sequences. Our findings imply the anonymization schema is feasible for the protection of sequence privacy. The DNALA method is the first computational disclosure control technique for general DNA sequences. Given the computational nature of the method, guarantees of anonymity can be formally proven. There is room for improvement and validation, though this research provides the groundwork from which future researchers can construct genomics anonymization schemas tailored to specific data-sharing scenarios.
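
    The idea of a generalization lattice over residues can be illustrated with a minimal Python sketch. It is not the DNALA algorithm: it assumes the standard IUPAC nucleotide ambiguity codes as the lattice (the abstract does not specify the lattice used), and the function names and the cost measure are hypothetical choices made for illustration.

```python
# Assumed lattice: IUPAC ambiguity codes, ordered by set inclusion.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}
EXPAND = {code: bases for bases, code in IUPAC.items()}


def generalize(x: str, y: str) -> str:
    """Most specific code covering both residues (their join in the lattice)."""
    return IUPAC[EXPAND[x] | EXPAND[y]]


def generalize_sequences(s: str, t: str) -> str:
    """Position-wise generalization of two aligned, equal-length sequences."""
    if len(s) != len(t):
        raise ValueError("sequences must have equal length")
    return "".join(generalize(a, b) for a, b in zip(s, t))


def generalization_cost(s: str, t: str) -> int:
    """Hypothetical distance: number of extra bases admitted across all positions."""
    return sum(len(EXPAND[c]) - 1 for c in generalize_sequences(s, t))


print(generalize("A", "G"))                  # adenine and guanine generalize to R (purine)
print(generalize_sequences("ACGT", "GCTT"))
```

    Two sequences that generalize to the same string become indistinguishable from each other, which is the property k-anonymity requires for groups of at least k records; a lower cost means less information is lost in the process.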

  6. Light whole genome sequence for SNP discovery across domestic cat breeds

    Directory of Open Access Journals (Sweden)

    Driscoll Carlos

    2010-06-01

    Full Text Available Abstract Background: The domestic cat has offered enormous genomic potential in the veterinary description of over 250 hereditary disease models as well as the occurrence of several deadly feline viruses (feline leukemia virus, FeLV; feline coronavirus, FECV; feline immunodeficiency virus, FIV) that are homologues to human scourges (cancer, SARS, and AIDS, respectively). However, to realize this biomedical potential, a high-density single nucleotide polymorphism (SNP) map is required in order to accomplish disease and phenotype association discovery. Description: To remedy this, we generated 3,178,297 paired fosmid-end Sanger sequence reads from seven cats, and combined these data with the publicly available 2X cat whole genome sequence. All sequence reads were assembled together to form a 3X whole genome assembly allowing the discovery of over three million SNPs. To reduce potential false positive SNPs due to the low coverage assembly, a low upper limit was placed on sequence coverage and a high lower limit on the quality of the discrepant bases at a potential variant site. In all, domestic cats of different breeds (female Abyssinian, female American shorthair, male Cornish Rex, female European Burmese, female Persian, female Siamese, and male Ragdoll) and a female African wildcat were sequenced lightly. We report a total of 964 k common SNPs suitable for a domestic cat SNP genotyping array and an additional 900 k SNPs detected between African wildcat and domestic cat breeds. An empirical sampling of 94 discovered SNPs was tested in the sequenced cats, resulting in a SNP validation rate of 99%. Conclusions: These data provide a large collection of mapped feline SNPs across the cat genome that will allow for the development of SNP genotyping platforms for mapping feline diseases.

  7. Risk of Breast Cancer with CXCR4-using HIV Defined by V3-Loop Sequencing

    Science.gov (United States)

    Goedert, James J.; Swenson, Luke C.; Napolitano, Laura A.; Haddad, Mojgan; Anastos, Kathryn; Minkoff, Howard; Young, Mary; Levine, Alexandra; Adeyemi, Oluwatoyin; Seaberg, Eric C.; Aouizerat, Bradley; Rabkin, Charles S.; Harrigan, P. Richard; Hessol, Nancy A.

    2014-01-01

    Objective: Evaluate the risk of female breast cancer associated with HIV-CXCR4 (X4) tropism as determined by various genotypic measures. Methods: A breast cancer case-control study, with pairwise comparisons of tropism determination methods, was conducted. From the Women's Interagency HIV Study repository, one stored plasma specimen was selected from 25 HIV-infected cases near the breast cancer diagnosis date and 75 HIV-infected control women matched for age and calendar date. HIV gp120-V3 sequences were derived by Sanger population sequencing (PS) and 454-pyro deep sequencing (DS). Sequencing-based HIV-X4 tropism was defined using the geno2pheno algorithm, with both high-stringency DS [false-positive rate (FPR) 3.5 and 2% X4 cutoff] and lower-stringency DS (FPR 5.75, 15% X4 cutoff). Concordance of tropism results by PS, DS, and previously performed phenotyping was assessed with kappa (κ) statistics. Case-control comparisons used exact P-values and conditional logistic regression. Results: In 74 women (19 cases, 55 controls) with complete results, prevalence of HIV-X4 by PS was 5% in cases vs 29% in controls (P=0.06, odds ratio 0.14, confidence interval 0.003-1.03). Smaller case-control prevalence differences were found with high-stringency DS (21% vs 36%, P=0.32), lower-stringency DS (16% vs 35%, P=0.18), and phenotyping (11% vs 31%, P=0.10). HIV-X4-tropism concordance was best between PS and lower-stringency DS (93%, κ=0.83). Other pairwise concordances were 82%-92% (κ=0.56-0.81). Concordance was similar among cases and controls. Conclusions: HIV-X4 defined by population sequencing (PS) had good agreement with lower-stringency deep sequencing and was significantly associated with lower odds of breast cancer. PMID:25321183
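
    The kappa (κ) statistic used above to compare tropism calls corrects raw percent agreement for the agreement expected by chance. The Python sketch below computes Cohen's kappa for two lists of calls; the ten example calls are invented for illustration and are not data from the study, and the function name is an assumption.

```python
from collections import Counter


def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length lists of categorical calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of specimens where both methods agree.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement: product of the marginal frequencies, summed over categories.
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both methods always give the same single category
    return (observed - expected) / (1.0 - expected)


# Hypothetical tropism calls for ten specimens (not data from the study).
population_seq = ["X4", "X4", "X4", "R5", "R5", "R5", "R5", "R5", "R5", "R5"]
deep_seq       = ["X4", "X4", "R5", "R5", "R5", "R5", "R5", "R5", "R5", "R5"]

kappa = cohens_kappa(population_seq, deep_seq)
print(round(kappa, 2))
```

    In this made-up example the two methods agree on 9 of 10 specimens (90%), while chance agreement is 0.62, so kappa is (0.90 - 0.62) / (1 - 0.62), roughly 0.74. This shows why a concordance percentage and its κ value can differ noticeably, as they do in the abstract.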

  8. Shirky and Sanger, or the costs of crowdsourcing

    Directory of Open Access Journals (Sweden)

    Mathieu O'Neil

    2010-03-01

    Full Text Available Online knowledge production sites do not rely on isolated experts but on collaborative processes, on the wisdom of the group or “crowd”. Some authors have argued that it is possible to combine traditional or credentialled expertise with collective production; others believe that traditional expertise's focus on correctness has been superseded by the affordances of digital networking, such as re-use and verifiability. This paper examines the costs of two kinds of “crowdsourced” encyclopedic projects: Citizendium, based on the work of credentialled and identified experts, faces a recruitment deficit; in contrast Wikipedia has proved wildly popular, but anti-credentialism and anonymity result in uncertainty, irresponsibility, the development of cliques and the growing importance of pseudo-legal competencies for conflict resolution. Finally the paper reflects on the wider social implications of focusing on what experts are rather than on what they are for.

  9. Table 1 Oligonucleotide primers used for SNP verification by Sanger ...

    Indian Academy of Sciences (India)

    charissa

    1 Ao W, Aldous S, Woodruf E, Hicke B, Rea L, Kreiswirth B, Jenison R. Rapid detection of rpoB gene mutations conferring rifampin resistance in Mycobacterium tuberculosis. J Clin Microbiol. 2012; 50: 2433-2440. 2 Bakuła Z, Napiórkowska A, Bielecki J et al. Mutations in the embB gene and their association with ethambutol ...

  10. selecting suitable drainage pattern to minimize flooding in sangere

    African Journals Online (AJOL)

    Mr Takana

    heights obtained from the ground survey using Total Station. ILWIS 3.3 ... drainage pattern to minimize the effect of flood hazard using the ... as linkages between upland and downstream areas;. Bhaskar .... A case study of urban city. Pp151.

  11. De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis.

    Science.gov (United States)

    Nowrousian, Minou; Stajich, Jason E; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D; Pöggeler, Stefanie; Read, Nick D; Seiler, Stephan; Smith, Kristina M; Zickler, Denise; Kück, Ulrich; Freitag, Michael

    2010-04-08

    Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for
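
    The N50 values quoted in this abstract (117 kb, rising to 498 kb) summarize assembly contiguity: N50 is the length of the contig at which the cumulative length of contigs, taken from longest to shortest, first reaches half of the total assembly length. A minimal Python sketch of that definition follows; the contig lengths in the example are made up and are not from the S. macrospora assembly.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:  # reached half of the total assembly length
            return length


# Hypothetical contig lengths in kb (total 290, half 145).
example = [80, 70, 50, 40, 30, 20]
print(n50(example))
```

    For the example list the two longest contigs (80 + 70 = 150) already exceed half of the 290 kb total, so the N50 is 70.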

  12. De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis.

    Directory of Open Access Journals (Sweden)

    Minou Nowrousian

    2010-04-01

    Full Text Available Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data

  13. Sequences for Student Investigation

    Science.gov (United States)

    Barton, Jeffrey; Feil, David; Lartigue, David; Mullins, Bernadette

    2004-01-01

    We describe two classes of sequences that give rise to accessible problems for undergraduate research. These problems may be understood with virtually no prerequisites and are well suited for computer-aided investigation. The first sequence is a variation of one introduced by Stephen Wolfram in connection with his study of cellular automata. The…

  14. Sequence History Update Tool

    Science.gov (United States)

    Khanampompan, Teerapat; Gladden, Roy; Fisher, Forest; DelGuercio, Chris

    2008-01-01

    The Sequence History Update Tool performs Web-based sequence statistics archiving for Mars Reconnaissance Orbiter (MRO). Using a single UNIX command, the software takes advantage of sequencing conventions to automatically extract the needed statistics from multiple files. This information is then used to populate a PHP database, which is then seamlessly formatted into a dynamic Web page. This tool replaces a previous tedious and error-prone process of manually editing HTML code to construct a Web-based table. Because the tool manages all of the statistics gathering and file delivery to and from multiple data sources spread across multiple servers, there is also a considerable time and effort savings. With the use of The Sequence History Update Tool what previously took minutes is now done in less than 30 seconds, and now provides a more accurate archival record of the sequence commanding for MRO.

  15. Genome sequencing for obstetricians & gynaecologists | Kent ...

    African Journals Online (AJOL)

    The medical profession has been waiting for a decade to be invigorated by the sequencing of the human genome, arguably the greatest scientific project ever. The technology has been spectacular but the results of the project have yielded more unexpected results than definitive answers – many about the very nature of our ...

  16. Oxford Nanopore MinION Sequencing and Genome Assembly

    Directory of Open Access Journals (Sweden)

    Hengyun Lu

    2016-10-01

    Full Text Available The revolution of genome sequencing is continuing after the successful second-generation sequencing (SGS) technology. The third-generation sequencing (TGS) technology, led by Pacific Biosciences (PacBio), is progressing rapidly, moving from a technology once only capable of providing data for small genome analysis, or for performing targeted screening, to one that promises high quality de novo assembly and structural variation detection for human-sized genomes. In 2014, the MinION, the first commercial sequencer using nanopore technology, was released by Oxford Nanopore Technologies (ONT). MinION identifies DNA bases by measuring the changes in electrical conductivity generated as DNA strands pass through a biological pore. Its portability, affordability, and speed in data production make it suitable for real-time applications, and the release of the long-read sequencer MinION has thus generated much excitement and interest in the genomics community. While de novo genome assemblies can be cheaply produced from SGS data, assembly continuity is often relatively poor, due to the limited ability of short reads to handle long repeats. Assembly quality can be greatly improved by using TGS long reads, since repetitive regions can be more easily spanned by longer reads, despite their higher error rates at the base level. The potential of nanopore sequencing has been demonstrated by various studies in genome surveillance at locations where rapid and reliable sequencing is needed, but where resources are limited.

  17. JVM: Java Visual Mapping tool for next generation sequencing read.

    Science.gov (United States)

    Yang, Ye; Liu, Juan

    2015-01-01

    We developed a program, JVM (Java Visual Mapping), for mapping next generation sequencing reads to a reference sequence. The program is implemented in Java and is designed to deal with millions of short reads generated using the Illumina sequencing technology. It employs a seed index strategy and octal encoding operations for sequence alignments. JVM is useful for DNA-Seq and RNA-Seq when dealing with single-end resequencing. JVM is a desktop application, which supports read datasets from 1 MB to 10 GB.
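
    The abstract does not describe how JVM implements its seed index or octal encoding, so the following Python sketch is only a generic illustration of the two ideas under stated assumptions: each base is packed into one octal digit (3 bits, with a code reserved for N), fixed-length seeds from the reference are indexed by their packed value, and a read is placed by looking up its first seed and counting mismatches. All names, the seed length and the mismatch threshold are hypothetical, and the sketch is written in Python rather than Java.

```python
from collections import defaultdict

# Assumed 3-bit (one octal digit) code per base; not taken from the JVM paper.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3, "N": 4}


def encode_seed(seed: str) -> int:
    """Pack a seed into an integer, one octal digit per base."""
    value = 0
    for base in seed:
        value = (value << 3) | CODE[base]
    return value


def build_seed_index(reference: str, k: int):
    """Map each encoded k-mer of the reference to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[encode_seed(reference[pos:pos + k])].append(pos)
    return index


def map_read(read: str, reference: str, index, k: int, max_mismatches: int = 1):
    """Seed-and-extend: look up the read's first k bases, then verify without gaps."""
    if len(read) < k:
        raise ValueError("read is shorter than the seed length")
    hits = []
    for pos in index.get(encode_seed(read[:k]), []):
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # read would run off the end of the reference
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches:
            hits.append((pos, mismatches))
    return hits


reference = "ACGTACGTTAGC"
index = build_seed_index(reference, k=4)
print(oct(encode_seed("ACGT")), map_read("ACGTTA", reference, index, k=4))
```

    In this toy case the seed ACGT occurs at reference positions 0 and 4; only position 4 survives verification, with zero mismatches. A practical mapper would also seed from other offsets in the read and from the reverse strand.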

  18. The International Nucleotide Sequence Database Collaboration.

    Science.gov (United States)

    Cochrane, Guy; Karsch-Mizrachi, Ilene; Nakamura, Yasukazu

    2011-01-01

    Under the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org), globally comprehensive public domain nucleotide sequence is captured, preserved and presented. The partners of this long-standing collaboration work closely together to provide data formats and conventions that enable consistent data submission to their databases and support regular data exchange around the globe. Clearly defined policy and governance in relation to free access to data and relationships with journal publishers have positioned INSDC databases as a key provider of the scientific record and a core foundation for the global bioinformatics data infrastructure. While growth in sequence data volumes comes no longer as a surprise to INSDC partners, the uptake of next-generation sequencing technology by mainstream science that we have witnessed in recent years brings a step-change to growth, necessarily making a clear mark on INSDC strategy. In this article, we introduce the INSDC, outline data growth patterns and comment on the challenges of increased growth.

  19. Application of the Just-in-Time and Just-in-Sequence logistics technologies at Robert Bosch spol. s r.o.

    OpenAIRE

    ŠTEFKOVÁ, Iveta

    2009-01-01

    This work is mainly oriented toward defining the theory and application of the Just in Time (JIT) and Just in Sequence (JIS) principles and related methods at Robert Bosch Ltd. The partial goals are to define the theoretical basis of JIT and JIS from Czech and foreign literature and to analyze the particular methods in detail. Each method is evaluated with its pros and cons. Further observations concern the Kanban, Heijunka, MRP I, and MRP II methods, which are closely related to JIT and JIS. This work precisely d...

  20. HIV Sequence Compendium 2015

    Energy Technology Data Exchange (ETDEWEB)

    Foley, Brian Thomas [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Leitner, Thomas Kenneth [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Apetrei, Cristian [Univ. of Pittsburgh, PA (United States); Hahn, Beatrice [Univ. of Pennsylvania, Philadelphia, PA (United States); Mizrachi, Ilene [National Center for Biotechnology Information, Bethesda, MD (United States); Mullins, James [Univ. of Washington, Seattle, WA (United States); Rambaut, Andrew [Univ. of Edinburgh, Scotland (United Kingdom); Wolinsky, Steven [Northwestern Univ., Evanston, IL (United States); Korber, Bette Tina Marie [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2015-10-05

    This compendium is an annual printed summary of the data contained in the HIV sequence database. We try to present a judicious selection of the data in such a way that it is of maximum utility to HIV researchers. Each of the alignments attempts to display the genetic variability within the different species, groups and subtypes of the virus. This compendium contains sequences published before January 1, 2015. Hence, though it is published in 2015 and called the 2015 Compendium, its contents correspond to the 2014 curated alignments on our website. The number of sequences in the HIV database is still increasing. In total, at the end of 2014, there were 624,121 sequences in the HIV Sequence Database, an increase of 7% since the previous year. This is the first year that the number of new sequences added to the database has decreased compared to the previous year. The number of near complete genomes (>7000 nucleotides) increased to 5834 by end of 2014. However, as in previous years, the compendium alignments contain only a fraction of these. A more complete version of all alignments is available on our website, http://www.hiv.lanl.gov/content/sequence/NEWALIGN/align.html. As always, we are open to complaints and suggestions for improvement. Inquiries and comments regarding the compendium should be addressed to seq-info@lanl.gov.

  1. Mapping sequences by parts

    Directory of Open Access Journals (Sweden)

    Guziolowski Carito

    2007-09-01

    Full Text Available Abstract Background: We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partitioning the first sequence into N parts and placing them, possibly complementary reversed, over the second sequence in order to maximize the sum of their gapless alignment scores. Results: We introduce an algorithm computing an optimal N-map with time complexity O(|s| × |t| × N) using O(|s| × |t| × N) memory space. Among all the numbers of parts taken in a reasonable range, we select the value N for which the optimal N-map has the most significant score. To evaluate this significance, we study the empirical distributions of the scores of optimal N-maps and show that they can be approximated by normal distributions with a reasonable accuracy. We test the functionality of the approach over random sequences on which we apply artificial evolutionary events. Practical Application: The method is illustrated with four case studies of pairs of sequences involving non-standard evolutionary events.
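
    The building block of an N-map, placing one part over the target sequence in either orientation and scoring it without gaps, can be sketched in a few lines of Python. This is not the authors' dynamic-programming algorithm and does not partition s into N parts; it only scores a single part by brute force. The +1/-1 match/mismatch scores and all function names are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return seq.translate(COMPLEMENT)[::-1]


def gapless_score(part: str, target: str, offset: int, match: int = 1, mismatch: int = -1) -> int:
    """Score of laying `part` over `target` starting at `offset`, with no gaps."""
    window = target[offset:offset + len(part)]
    return sum(match if a == b else mismatch for a, b in zip(part, window))


def best_placement(part: str, target: str):
    """Best (score, offset, orientation) over all offsets and both orientations."""
    best = None
    for orientation, seq in (("forward", part), ("reverse", reverse_complement(part))):
        for offset in range(len(target) - len(seq) + 1):
            score = gapless_score(seq, target, offset)
            if best is None or score > best[0]:
                best = (score, offset, orientation)
    return best  # None if the part is longer than the target


# TCCGT does not match the target directly, but its reverse complement ACGGA does.
print(best_placement("TCCGT", "TTTTACGGA"))
```

    Here the best forward placement scores only 1, while the reverse-complemented part matches perfectly at offset 4 with a score of 5. An N-map would choose such placements jointly for all N parts so that the sum of their scores is maximal.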

  2. Targeted next-generation sequencing makes new molecular diagnoses and expands genotype-phenotype relationship in Ehlers-Danlos syndrome.

    Science.gov (United States)

    Weerakkody, Ruwan A; Vandrovcova, Jana; Kanonidou, Christina; Mueller, Michael; Gampawar, Piyush; Ibrahim, Yousef; Norsworthy, Penny; Biggs, Jennifer; Abdullah, Abdulshakur; Ross, David; Black, Holly A; Ferguson, David; Cheshire, Nicholas J; Kazkaz, Hanadi; Grahame, Rodney; Ghali, Neeti; Vandersteen, Anthony; Pope, F Michael; Aitman, Timothy J

    2016-11-01

    Ehlers-Danlos syndrome (EDS) comprises a group of overlapping hereditary disorders of connective tissue with significant morbidity and mortality, including major vascular complications. We sought to identify the diagnostic utility of a next-generation sequencing (NGS) panel in a mixed EDS cohort. We developed and applied PCR-based NGS assays for targeted, unbiased sequencing of 12 collagen and aortopathy genes to a cohort of 177 unrelated EDS patients. Variants were scored blind to previous genetic testing and then compared with results of previous Sanger sequencing. Twenty-eight pathogenic variants in COL5A1/2, COL3A1, FBN1, and COL1A1 and four likely pathogenic variants in COL1A1, TGFBR1/2, and SMAD3 were identified by the NGS assays. These included all previously detected single-nucleotide and other short pathogenic variants in these genes, and seven newly detected pathogenic or likely pathogenic variants leading to clinically significant diagnostic revisions. Twenty-two variants of uncertain significance were identified, seven of which were in aortopathy genes and required clinical follow-up. Unbiased NGS-based sequencing made new molecular diagnoses outside the expected EDS genotype-phenotype relationship and identified previously undetected clinically actionable variants in aortopathy susceptibility genes. These data may be of value in guiding future clinical pathways for genetic diagnosis in EDS. Genet Med 18(11), 1119-1127.

  3. Whole-exome sequencing identifies USH2A mutations in a pseudo-dominant Usher syndrome family.

    Science.gov (United States)

    Zheng, Sui-Lian; Zhang, Hong-Liang; Lin, Zhen-Lang; Kang, Qian-Yan

    2015-10-01

    Usher syndrome (USH) is an autosomal recessive (AR) multi-sensory degenerative disorder leading to deaf-blindness. USH is clinically subdivided into three subclasses, and 10 genes have been identified thus far. Clinical and genetic heterogeneities in USH make a precise diagnosis difficult. A dominant‑like USH family in successive generations was identified, and the present study aimed to determine the genetic predisposition of this family. Whole‑exome sequencing was performed in two affected patients and an unaffected relative. Systematic data were analyzed by bioinformatic analysis to remove the candidate mutations via step‑wise filtering. Direct Sanger sequencing and co‑segregation analysis were performed in the pedigree. One novel and two known mutations in the USH2A gene were identified, and were further confirmed by direct sequencing and co‑segregation analysis. The affected mother carried compound mutations in the USH2A gene, while the unaffected father carried a heterozygous mutation. The present study demonstrates that whole‑exome sequencing is a robust approach for the molecular diagnosis of disorders with high levels of genetic heterogeneity.

  4. Leveraging long read sequencing from a single individual to provide a comprehensive resource for benchmarking variant calling methods.

    Science.gov (United States)

    Mu, John C; Tootoonchi Afshar, Pegah; Mohiyuddin, Marghoob; Chen, Xi; Li, Jian; Bani Asadi, Narges; Gerstein, Mark B; Wong, Wing H; Lam, Hugo Y K

    2015-09-28

    A high-confidence, comprehensive human variant set is critical in assessing accuracy of sequencing algorithms, which are crucial in precision medicine based on high-throughput sequencing. Although recent works have attempted to provide such a resource, they still do not encompass all major types of variants including structural variants (SVs). Thus, we leveraged the massive high-quality Sanger sequences from the HuRef genome to construct by far the most comprehensive gold set of a single individual, which was cross validated with deep Illumina sequencing, population datasets, and well-established algorithms. It was a necessary effort to completely reanalyze the HuRef genome as its previously published variants were mostly reported five years ago, suffering from compatibility, organization, and accuracy issues that prevent their direct use in benchmarking. Our extensive analysis and validation resulted in a gold set with high specificity and sensitivity. In contrast to the current gold sets of the NA12878 or HS1011 genomes, our gold set is the first that includes small variants, deletion SVs and insertion SVs up to a hundred thousand base-pairs. We demonstrate the utility of our HuRef gold set to benchmark several published SV detection tools.

  5. On site DNA barcoding by nanopore sequencing.

    Directory of Open Access Journals (Sweden)

    Michele Menegon

    Full Text Available Biodiversity research is becoming increasingly dependent on genomics, which allows the unprecedented digitization and understanding of the planet's biological heritage. The use of genetic markers, i.e. DNA barcoding, has proved to be a powerful tool in species identification. However, full exploitation of this approach is hampered by the high sequencing costs and the absence of equipped facilities in biodiversity-rich countries. In the present work, we developed a portable sequencing laboratory based on the portable DNA sequencer from Oxford Nanopore Technologies, the MinION. Complementary laboratory equipment and reagents were selected to be used in remote and tough environmental conditions. The performance of the MinION sequencer and the portable laboratory was tested for DNA barcoding in a mimicking tropical environment, as well as in a remote rainforest of Tanzania lacking electricity. Despite the relatively high sequencing error-rate of the MinION, the development of a suitable pipeline for data analysis allowed the accurate identification of different species of vertebrates including amphibians, reptiles and mammals. In situ sequencing of a wild frog allowed us to rapidly identify the species captured, thus confirming that effective DNA barcoding in the field is possible. These results open new perspectives for real-time-on-site DNA sequencing thus potentially increasing opportunities for the understanding of biodiversity in areas lacking conventional laboratory facilities.

  6. Sequencing of BAC pools by different next generation sequencing platforms and strategies

    Directory of Open Access Journals (Sweden)

    Scholz Uwe

    2011-10-01

    Full Text Available Abstract Background: Next generation sequencing of BACs is a viable option for deciphering the sequence of even large and highly repetitive genomes. In order to optimize this strategy, we examined the influence of read length on the quality of Roche/454 sequence assemblies, to what extent Illumina/Solexa mate pairs (MPs) improve the assemblies by scaffolding and whether barcoding of BACs is dispensable. Results: Sequencing four BACs with both FLX and Titanium technologies revealed similar sequencing accuracy, but showed that the longer Titanium reads produce considerably fewer misassemblies and gaps. The 454 assemblies of 96 barcoded BACs were improved by scaffolding 79% of the total contig length with MPs from a non-barcoded library. Assembly of the unmasked 454 sequences without separation by barcodes revealed chimeric contig formation to be a major problem, encompassing 47% of the total contig length. Masking the sequences reduced this fraction to 24%. Conclusion: Optimal BAC pool sequencing should be based on the longest available reads, with barcoding essential for a comprehensive assessment of both repetitive and non-repetitive sequence information. When interest is restricted to non-repetitive regions and repeats are masked prior to assembly, barcoding is non-essential. In any case, the assemblies can be improved considerably by scaffolding with non-barcoded BAC pool MPs.

  7. The Colliding Beams Sequencer

    International Nuclear Information System (INIS)

    Johnson, D.E.; Johnson, R.P.

    1989-01-01

    The Colliding Beam Sequencer (CBS) is a computer program used to operate the pbar-p Collider by synchronizing the applications programs and simulating the activities of the accelerator operators during filling and storage. The Sequencer acts as a meta-program, running otherwise stand-alone applications programs, to do the set-up, beam transfers, acceleration, low beta turn on, and diagnostics for the transfers and storage. The Sequencer and its operational performance will be described along with its special features, which include a periodic scheduler and command logger. 14 refs., 3 figs.

  8. Phylogenetic Trees From Sequences

    Science.gov (United States)

    Ryvkin, Paul; Wang, Li-San

    In this chapter, we review important concepts and approaches for phylogeny reconstruction from sequence data. We first cover some basic definitions and properties of phylogenetics, and briefly explain how scientists model sequence evolution and measure sequence divergence. We then discuss three major approaches for phylogenetic reconstruction: distance-based phylogenetic reconstruction, maximum parsimony, and maximum likelihood. In the third part of the chapter, we review how multiple phylogenies are compared by consensus methods and how to assess confidence using bootstrapping. At the end of the chapter are two sections that list popular software packages and additional reading.
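
    As a small illustration of measuring sequence divergence for distance-based reconstruction, the sketch below computes the observed proportion of differing sites (p-distance) between two aligned sequences and applies the standard Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3). The chapter's own choice of models is not stated in the abstract, so the use of Jukes-Cantor here is an assumption, and the function names `p_distance` and `jukes_cantor` are hypothetical.

```python
import math


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion p.

    The correction is undefined once p reaches 3/4 (saturation), so
    infinity is returned in that case.
    """
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

    A matrix of such pairwise distances is the usual input to distance-based tree-building methods; the corrected distance is always at least as large as the observed proportion because it accounts for multiple substitutions at the same site.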

  9. High-resolution analysis of the 5'-end transcriptome using a next generation DNA sequencer.

    Directory of Open Access Journals (Sweden)

    Shin-ichi Hashimoto

    Full Text Available Massively parallel, tag-based sequencing systems, such as the SOLiD system, hold the promise of revolutionizing the study of whole genome gene expression due to the number of data points that can be generated in a simple and cost-effective manner. We describe the development of a 5'-end transcriptome workflow for the SOLiD system and demonstrate the advantages in sensitivity and dynamic range offered by this tag-based application over traditional approaches for the study of whole genome gene expression. 5'-end transcriptome analysis was used to study whole genome gene expression within a colon cancer cell line, HT-29, treated with the DNA methyltransferase inhibitor, 5-aza-2'-deoxycytidine (5Aza). More than 20 million 25-base 5'-end tags were obtained from untreated and 5Aza-treated cells and matched to sequences within the human genome. Seventy-three percent of the mapped unique tags were associated with RefSeq cDNA sequences, corresponding to approximately 14,000 different protein-coding genes in this single cell type. The level of expression of these genes ranged from 0.02 to 4,704 transcripts per cell. The sensitivity of a single sequence run of the SOLiD platform was 100-1,000 fold greater than that observed from 5'-end SAGE data generated from the analysis of 70,000 tags obtained by Sanger sequencing. The high-resolution 5'-end gene expression profiling presented in this study will not only provide novel insight into the transcriptional machinery but should also serve as a basis for a better understanding of cell biology.

  10. Hybrid sequencing approach applied to human fecal metagenomic clone libraries revealed clones with potential biotechnological applications.

    Science.gov (United States)

    Džunková, Mária; D'Auria, Giuseppe; Pérez-Villarroya, David; Moya, Andrés

    2012-01-01

    Natural environments represent an incredible source of microbial genetic diversity. Discovery of novel biomolecules involves biotechnological methods that often require the design and implementation of biochemical assays to screen clone libraries. However, when an assay is applied to thousands of clones, one may eventually end up with very few positive clones which, in most of the cases, have to be "domesticated" for downstream characterization and application, and this makes screening both laborious and expensive. The negative clones, which are not considered by the selected assay, may also have biotechnological potential; however, unfortunately they would remain unexplored. Knowledge of the clone sequences provides important clues about potential biotechnological application of the clones in the library; however, the sequencing of clones one-by-one would be very time-consuming and expensive. In this study, we characterized the first metagenomic clone library from the feces of a healthy human volunteer, using a method based on 454 pyrosequencing coupled with a clone-by-clone Sanger end-sequencing. Instead of whole individual clone sequencing, we sequenced 358 clones in a pool. The medium-large insert (7-15 kb) cloning strategy allowed us to assemble these clones correctly, and to assign the clone ends to maintain the link between the position of a living clone in the library and the annotated contig from the 454 assembly. Finally, we found several open reading frames (ORFs) with previously described potential medical application. The proposed approach allows planning ad-hoc biochemical assays for the clones of interest, and the appropriate sub-cloning strategy for gene expression in suitable vectors/hosts.

  11. Hybrid sequencing approach applied to human fecal metagenomic clone libraries revealed clones with potential biotechnological applications.

    Directory of Open Access Journals (Sweden)

    Mária Džunková

    Full Text Available Natural environments represent an incredible source of microbial genetic diversity. Discovery of novel biomolecules involves biotechnological methods that often require the design and implementation of biochemical assays to screen clone libraries. However, when an assay is applied to thousands of clones, one may eventually end up with very few positive clones which, in most of the cases, have to be "domesticated" for downstream characterization and application, and this makes screening both laborious and expensive. The negative clones, which are not considered by the selected assay, may also have biotechnological potential; however, unfortunately they would remain unexplored. Knowledge of the clone sequences provides important clues about potential biotechnological application of the clones in the library; however, the sequencing of clones one-by-one would be very time-consuming and expensive. In this study, we characterized the first metagenomic clone library from the feces of a healthy human volunteer, using a method based on 454 pyrosequencing coupled with a clone-by-clone Sanger end-sequencing. Instead of whole individual clone sequencing, we sequenced 358 clones in a pool. The medium-large insert (7-15 kb) cloning strategy allowed us to assemble these clones correctly, and to assign the clone ends to maintain the link between the position of a living clone in the library and the annotated contig from the 454 assembly. Finally, we found several open reading frames (ORFs) with previously described potential medical application. The proposed approach allows planning ad-hoc biochemical assays for the clones of interest, and the appropriate sub-cloning strategy for gene expression in suitable vectors/hosts.

  12. Analysis of high-throughput sequencing and annotation strategies for phage genomes.

    Directory of Open Access Journals (Sweden)

    Matthew R Henn

    Full Text Available BACKGROUND: Bacterial viruses (phages) play a critical role in shaping microbial populations as they influence both host mortality and horizontal gene transfer. As such, they have a significant impact on local and global ecosystem function and human health. Despite their importance, little is known about the genomic diversity harbored in phages, as methods to capture complete phage genomes have been hampered by the lack of knowledge about the target genomes, and difficulties in generating sufficient quantities of genomic DNA for sequencing. Of the approximately 550 phage genomes currently available in the public domain, fewer than 5% are marine phage. METHODOLOGY/PRINCIPAL FINDINGS: To advance the study of phage biology through comparative genomic approaches we used marine cyanophage as a model system. We compared DNA preparation methodologies (DNA extraction directly from either phage lysates or CsCl purified phage particles), and sequencing strategies that utilize either Sanger sequencing of a linker amplification shotgun library (LASL) or of a whole genome shotgun library (WGSL), or 454 pyrosequencing methods. We demonstrate that genomic DNA sample preparation directly from a phage lysate, combined with 454 pyrosequencing, is best suited for phage genome sequencing at scale, as this method is capable of capturing complete continuous genomes with high accuracy. In addition, we describe an automated annotation informatics pipeline that delivers high-quality annotation and yields few false positives and negatives in ORF calling. CONCLUSIONS/SIGNIFICANCE: These DNA preparation, sequencing and annotation strategies enable a high-throughput approach to the burgeoning field of phage genomics.

  13. Next-generation sequencing using a pre-designed gene panel for the molecular diagnosis of congenital disorders in pediatric patients.

    Science.gov (United States)

    Lim, Eileen C P; Brett, Maggie; Lai, Angeline H M; Lee, Siew-Peng; Tan, Ee-Shien; Jamuar, Saumya S; Ng, Ivy S L; Tan, Ene-Choo

    2015-12-14

    Next-generation sequencing (NGS) has revolutionized genetic research and offers enormous potential for clinical application. Sequencing the exome has the advantage of casting the net wide for all known coding regions while targeted gene panel sequencing provides enhanced sequencing depths and can be designed to avoid incidental findings in adult-onset conditions. A HaloPlex panel consisting of 180 genes within commonly altered chromosomal regions is available for use on both the Ion Personal Genome Machine (PGM) and MiSeq platforms to screen for causative mutations in these genes. We used this Haloplex ICCG panel for targeted sequencing of 15 patients with clinical presentations indicative of an abnormality in one of the 180 genes. Sequencing runs were done using the Ion 318 Chips on the Ion Torrent PGM. Variants were filtered for known polymorphisms and analysis was done to identify possible disease-causing variants before validation by Sanger sequencing. When possible, segregation of variants with phenotype in family members was performed to ascertain the pathogenicity of the variant. More than 97% of the target bases were covered at >20×. There was an average of 9.6 novel variants per patient. Pathogenic mutations were identified in five genes for six patients, with two novel variants. There were another five likely pathogenic variants, some of which were unreported novel variants. In a cohort of 15 patients, we were able to identify a likely genetic etiology in six patients (40%). Another five patients had candidate variants for which further evaluation and segregation analysis are ongoing. Our results indicate that the HaloPlex ICCG panel is useful as a rapid, high-throughput and cost-effective screening tool for 170 of the 180 genes. There is low coverage for some regions in several genes which might have to be supplemented by Sanger sequencing. However, comparing the cost, ease of analysis, and shorter turnaround time, it is a good alternative to exome

  14. Whole-Exome Sequencing Identifies One De Novo Variant in the FGD6 Gene in a Thai Family with Autism Spectrum Disorder

    Directory of Open Access Journals (Sweden)

    Chuphong Thongnak

    2018-01-01

    Full Text Available Autism spectrum disorder (ASD) has a strong genetic basis, although the genetics of autism is complex and remains unclear. Genetic testing such as microarray or sequencing has been widely used to identify autism markers, but it is unsuccessful in several cases. The objective of this study is to identify causative variants of autism in two Thai families by using a whole-exome sequencing technique. Whole-exome sequencing was performed with autism-affected children from two unrelated families. Each sample was sequenced on the SOLiD 5500xl Genetic Analyzer system, followed by a combined bioinformatics pipeline including annotation and filtering to identify candidate variants. Candidate variants were validated, and the segregation study with other family members was performed using Sanger sequencing. This study identified a possible causative variant for ASD, c.2951G>A, in the FGD6 gene. We demonstrated the potential for identifying genetic variants associated with ASD using whole-exome sequencing and a bioinformatics filtering procedure. These techniques could be useful in identifying possible causative ASD variants, especially in cases in which variants cannot be identified by other techniques.

  15. Screening for single nucleotide variants, small indels and exon deletions with a next-generation sequencing based gene panel approach for Usher syndrome.

    Science.gov (United States)

    Krawitz, Peter M; Schiska, Daniela; Krüger, Ulrike; Appelt, Sandra; Heinrich, Verena; Parkhomchuk, Dmitri; Timmermann, Bernd; Millan, Jose M; Robinson, Peter N; Mundlos, Stefan; Hecht, Jochen; Gross, Manfred

    2014-09-01

    Usher syndrome is an autosomal recessive disorder characterized both by deafness and blindness. For the three clinical subtypes of Usher syndrome causal mutations in altogether 12 genes and a modifier gene have been identified. Due to the genetic heterogeneity of Usher syndrome, the molecular analysis is predestined for a comprehensive and parallelized analysis of all known genes by next-generation sequencing (NGS) approaches. We describe here the targeted enrichment and deep sequencing for exons of Usher genes and compare the costs and workload of this approach with those of Sanger sequencing. We also present a bioinformatics analysis pipeline that allows us to detect single-nucleotide variants, short insertions and deletions, as well as copy number variations of one or more exons on the same sequence data. Additionally, we present a flexible in silico gene panel for the analysis of sequence variants, in which newly identified genes can easily be included. We applied this approach to a cohort of 44 Usher patients and detected biallelic pathogenic mutations in 35 individuals and monoallelic mutations in eight individuals of our cohort. Thirty-nine of the sequence variants, including two heterozygous deletions comprising several exons of USH2A, have not been reported so far. Our NGS-based approach allowed us to assess single-nucleotide variants, small indels, and whole exon deletions in a single test. The described diagnostic approach is fast and cost-effective with a high molecular diagnostic yield.

  16. Gomphid DNA sequence data

    Data.gov (United States)

    U.S. Environmental Protection Agency — DNA sequence data for several genetic loci. This dataset is not publicly accessible because: It's already publicly available on GenBank. It can be accessed through...

  17. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown...... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because...... they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  18. Exome sequencing identifies CTSK mutations in patients originally diagnosed as intermediate osteopetrosis

    Science.gov (United States)

    Pangrazio, Alessandra; Puddu, Alessandro; Oppo, Manuela; Valentini, Maria; Zammataro, Luca; Vellodi, Ashok; Gener, Blanca; Llano-Rivas, Isabel; Raza, Jamal; Atta, Irum; Vezzoni, Paolo; Superti-Furga, Andrea; Villa, Anna; Sobacchi, Cristina

    2014-01-01

    Autosomal Recessive Osteopetrosis is a genetic disorder characterized by increased bone density due to lack of resorption by the osteoclasts. Genetic studies have widely unraveled the molecular basis of the most severe forms, while cases of intermediate severity are more difficult to characterize, probably because of a large heterogeneity. Here, we describe the use of exome sequencing in the molecular diagnosis of 2 siblings initially thought to be affected by “intermediate osteopetrosis”, which identified a homozygous mutation in the CTSK gene. Prompted by this finding, we tested by Sanger sequencing 25 additional patients addressed to us for recessive osteopetrosis and found CTSK mutations in 4 of them. In retrospect, their clinical and radiographic features were found to be compatible with, but not typical for, Pycnodysostosis. We sought to identify modifier genes that might have played a role in the clinical manifestation of the disease in these patients, but our results were not informative. In conclusion, we underline the difficulties of differential diagnosis in some patients whose clinical appearance does not fit the classical malignant or benign picture and recommend that CTSK gene be included in the molecular diagnosis of high bone density conditions. PMID:24269275

  19. Genomic Characterization for Parasitic Weeds of the Genus Striga by Sample Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Matt C. Estep

    2012-03-01

    Full Text Available Generation of ∼2200 Sanger sequence reads or ∼10,000 454 reads for seven Striga Lour. DNA samples (five species) allowed identification of the highly repetitive DNA content in these genomes. The 14 most abundant repeats in these species were identified and partially assembled. Annotation indicated that they represent nine long terminal repeat (LTR) retrotransposon families, three tandem satellite repeats, one long interspersed element (LINE) retroelement, and one DNA transposon. All of these repeats are most closely related to repetitive elements in other closely related plants and are not products of horizontal transfer from their host species. These repeats were differentially abundant in each species, with the LTR retrotransposons and satellite repeats most responsible for variation in genome size. Each species had some repetitive elements that were more abundant and some less abundant than the other species examined, indicating that no single element or any unilateral growth or decrease trend in genome behavior was responsible for variation in genome size and composition. Genome sizes were determined by flow sorting, and the values of 615 Mb [S. asiatica (L.) Kuntze], 1330 Mb [S. gesnerioides (Willd.) Vatke], 1425 Mb [S. hermonthica (Delile) Benth.] and 2460 Mb (S. forbesii Benth.) suggest a ploidy series, a prediction supported by repetitive DNA sequence analysis. Phylogenetic analysis using six chloroplast loci indicated the ancestral relationships of the five most agriculturally important species, with the unexpected result that the one parasite of dicotyledonous plants (S. gesnerioides) was found to be more closely related to some of the grass parasites than many of the grass parasites are to each other.

  20. An atypical case of Noonan syndrome with mutation diagnosed by targeted exome sequencing

    Directory of Open Access Journals (Sweden)

    Jinsup Kim

    2017-09-01

    Full Text Available Noonan syndrome (NS) is a genetic disorder with autosomal dominant inheritance and is characterized by a distinctive facial appearance, short stature, chest deformity, and congenital heart disease. In individuals with NS, germline mutations have been identified in several genes involved in the RAS/mitogen-activated protein kinase signal transduction pathway. Because of its clinical and genetic heterogeneity, the conventional diagnostic protocol with Sanger sequencing requires a multistep approach. Therefore, molecular genetic diagnosis using targeted exome sequencing (TES) is considered a less expensive and faster method, particularly for patients who do not fulfill the clinical diagnostic criteria of NS. In this case, the patient showed short stature, dysmorphic facial features suggestive of NS, feeding intolerance, cryptorchidism, and intellectual disability in early childhood. At the age of 16, the patient still showed extreme short stature with delayed puberty and characteristic facial features suggestive of NS. Although the patient had no cardiac problems or chest wall deformities, which are commonly present in NS and are major concerns for patients and clinicians, the patient showed several other characteristic clinical features of NS. Considering the possibility of a genetic disorder, including NS, a molecular genetic study with TES was performed. With TES analysis, we detected a pathogenic variant of c.458A>T in KRAS in this patient with atypical NS phenotype and provided appropriate clinical management and genetic counseling. The application of TES enables accurate molecular diagnosis of patients with nonspecific or atypical features in genetic diseases with several responsible genes, such as NS.

  1. Exome sequencing identifies CTSK mutations in patients originally diagnosed as intermediate osteopetrosis.

    Science.gov (United States)

    Pangrazio, Alessandra; Puddu, Alessandro; Oppo, Manuela; Valentini, Maria; Zammataro, Luca; Vellodi, Ashok; Gener, Blanca; Llano-Rivas, Isabel; Raza, Jamal; Atta, Irum; Vezzoni, Paolo; Superti-Furga, Andrea; Villa, Anna; Sobacchi, Cristina

    2014-02-01

    Autosomal Recessive Osteopetrosis is a genetic disorder characterized by increased bone density due to lack of resorption by the osteoclasts. Genetic studies have widely unraveled the molecular basis of the most severe forms, while cases of intermediate severity are more difficult to characterize, probably because of a large heterogeneity. Here, we describe the use of exome sequencing in the molecular diagnosis of 2 siblings initially thought to be affected by "intermediate osteopetrosis", which identified a homozygous mutation in the CTSK gene. Prompted by this finding, we tested by Sanger sequencing 25 additional patients addressed to us for recessive osteopetrosis and found CTSK mutations in 4 of them. In retrospect, their clinical and radiographic features were found to be compatible with, but not typical for, Pycnodysostosis. We sought to identify modifier genes that might have played a role in the clinical manifestation of the disease in these patients, but our results were not informative. In conclusion, we underline the difficulties of differential diagnosis in some patients whose clinical appearance does not fit the classical malignant or benign picture and recommend that CTSK gene be included in the molecular diagnosis of high bone density conditions. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.

  2. Whole-exome sequencing reveals a rare interferon gamma receptor 1 mutation associated with myasthenia gravis.

    Science.gov (United States)

    Qi, Guoyan; Liu, Peng; Gu, Shanshan; Yang, Hongxia; Dong, Huimin; Xue, Yinping

    2018-04-01

    Our study aimed to explore the underlying genetic basis of myasthenia gravis. We collected a Chinese pedigree with myasthenia gravis, and whole-exome sequencing was performed on the two affected siblings and their parents. The candidate pathogenic gene was identified by bioinformatics filtering and further verified by Sanger sequencing. The homozygous mutation c.G40A (p.V14M) in interferon gamma receptor 1 was identified. Moreover, the mutation was also detected in 3 of 44 sporadic myasthenia gravis patients. The p.V14M substitution in interferon gamma receptor 1 may affect the signal peptide function and the translocation on the cell membrane, which could disrupt the binding of the ligand of interferon gamma and antibody production, contributing to myasthenia gravis susceptibility. We discovered that a rare variant, c.G40A in interferon gamma receptor 1, potentially contributes to myasthenia gravis pathogenesis. Further functional studies are needed to confirm the effect of interferon gamma receptor 1 on the myasthenia gravis phenotype.

  3. Deep Sequencing Insights in Therapeutic shRNA Processing and siRNA Target Cleavage Precision.

    Science.gov (United States)

    Denise, Hubert; Moschos, Sterghios A; Sidders, Benjamin; Burden, Frances; Perkins, Hannah; Carter, Nikki; Stroud, Tim; Kennedy, Michael; Fancy, Sally-Ann; Lapthorn, Cris; Lavender, Helen; Kinloch, Ross; Suhy, David; Corbau, Romu

    2014-02-04

    TT-034 (PF-05095808) is a recombinant adeno-associated virus serotype 8 (AAV8) agent expressing three short hairpin RNA (shRNA) pro-drugs that target the hepatitis C virus (HCV) RNA genome. The cytosolic enzyme Dicer cleaves each shRNA into multiple, potentially active small interfering RNA (siRNA) drugs. Using next-generation sequencing (NGS) to identify and characterize active shRNA maturation products, we observed that each TT-034-encoded shRNA could be processed into as many as 95 separate siRNA strands. Few of these appeared active as determined by Sanger 5' RNA Ligase-Mediated Rapid Amplification of cDNA Ends (5-RACE) and through synthetic shRNA and siRNA analogue studies. Moreover, NGS scrutiny applied on 5-RACE products (RACE-seq) suggested that synthetic siRNAs could direct cleavage in not one, but up to five separate positions on targeted RNA, in a sequence-dependent manner. These data support an on-target mechanism of action for TT-034 without cytotoxicity and question the accepted precision of substrate processing by the key RNA interference (RNAi) enzymes Dicer and siRNA-induced silencing complex (siRISC). Molecular Therapy-Nucleic Acids (2014) 3, e145; doi:10.1038/mtna.2013.73; published online 4 February 2014.

  4. Whole-exome sequencing revealed two novel mutations in Usher syndrome.

    Science.gov (United States)

    Koparir, Asuman; Karatas, Omer Faruk; Atayoglu, Ali Timucin; Yuksel, Bayram; Sagiroglu, Mahmut Samil; Seven, Mehmet; Ulucan, Hakan; Yuksel, Adnan; Ozen, Mustafa

    2015-06-01

    Usher syndrome is a clinically and genetically heterogeneous autosomal recessive inherited disorder accompanied by hearing loss and retinitis pigmentosa (RP). Since the associated genes are various and quite large, we utilized whole-exome sequencing (WES) as a diagnostic tool to identify the molecular basis of Usher syndrome. DNA from a 12-year-old male diagnosed with Usher syndrome was analyzed by WES. Mutations detected were confirmed by Sanger sequencing. The pathogenicity of these mutations was determined by in silico analysis. A maternally inherited deleterious frameshift mutation, c.14439_14454del in exon 66 and a paternally inherited non-sense c.10830G>A stop-gain SNV in exon 55 of USH2A were found as two novel compound heterozygous mutations. Both of these mutations disrupt the C terminal of USH2A protein. As a result, WES revealed two novel compound heterozygous mutations in a Turkish USH2A patient. This approach gave us an opportunity to have an appropriate diagnosis and provide genetic counseling to the family within a reasonable time. Copyright © 2015 Elsevier B.V. All rights reserved.

  5. Whole-genome sequencing identifies recurrent somatic NOTCH2 mutations in splenic marginal zone lymphoma.

    Science.gov (United States)

    Kiel, Mark J; Velusamy, Thirunavukkarasu; Betz, Bryan L; Zhao, Lili; Weigelin, Helmut G; Chiang, Mark Y; Huebner-Chan, David R; Bailey, Nathanael G; Yang, David T; Bhagat, Govind; Miranda, Roberto N; Bahler, David W; Medeiros, L Jeffrey; Lim, Megan S; Elenitoba-Johnson, Kojo S J

    2012-08-27

    Splenic marginal zone lymphoma (SMZL), the most common primary lymphoma of spleen, is poorly understood at the genetic level. In this study, using whole-genome DNA sequencing (WGS) and confirmation by Sanger sequencing, we observed mutations identified in several genes not previously known to be recurrently altered in SMZL. In particular, we identified recurrent somatic gain-of-function mutations in NOTCH2, a gene encoding a protein required for marginal zone B cell development, in 25 of 99 (∼25%) cases of SMZL and in 1 of 19 (∼5%) cases of nonsplenic MZLs. These mutations clustered near the C-terminal proline/glutamate/serine/threonine (PEST)-rich domain, resulting in protein truncation or, rarely, were nonsynonymous substitutions affecting the extracellular heterodimerization domain (HD). NOTCH2 mutations were not present in other B cell lymphomas and leukemias, such as chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL; n = 15), mantle cell lymphoma (MCL; n = 15), low-grade follicular lymphoma (FL; n = 44), hairy cell leukemia (HCL; n = 15), and reactive lymphoid hyperplasia (n = 14). NOTCH2 mutations were associated with adverse clinical outcomes (relapse, histological transformation, and/or death) among SMZL patients (P = 0.002). These results suggest that NOTCH2 mutations play a role in the pathogenesis and progression of SMZL and are associated with a poor prognosis.

  6. Dynamic Sequence Assignment.

    Science.gov (United States)

    1983-12-01

    D-136 548. DYNAMIC SEQUENCE ASSIGNMENT (U). Advanced Information and Decision Systems, Mountain View, CA. C. A. O'Reilly et al. Unclassified. Dec 83. AI/DS... Advanced Information & Decision Systems, Mountain View, CA 94040 ... Unclassified. SECURITY CLASSIFICATION OF THIS PAGE. REPORT... reviews some important heuristic algorithms developed for faster solution of the sequence assignment problem. 3.1. DYNAMIC PROGRAMMING FORMULATION FOR

  7. HIV Sequence Compendium 2010

    Energy Technology Data Exchange (ETDEWEB)

    Kuiken, Carla [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Foley, Brian [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Leitner, Thomas [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Apetrei, Christian [Univ. of Pittsburgh, PA (United States); Hahn, Beatrice [Univ. of Alabama, Tuscaloosa, AL (United States); Mizrachi, Ilene [National Center for Biotechnology Information, Bethesda, MD (United States); Mullins, James [Univ. of Washington, Seattle, WA (United States); Rambaut, Andrew [Univ. of Edinburgh, Scotland (United Kingdom); Wolinsky, Steven [Northwestern Univ., Evanston, IL (United States); Korber, Bette [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2010-12-31

    This compendium is an annual printed summary of the data contained in the HIV sequence database. In these compendia we try to present a judicious selection of the data in such a way that it is of maximum utility to HIV researchers. Each of the alignments attempts to display the genetic variability within the different species, groups and subtypes of the virus. This compendium contains sequences published before January 1, 2010. Hence, though it is called the 2010 Compendium, its contents correspond to the 2009 curated alignments on our website. The number of sequences in the HIV database is still increasing exponentially. In total, at the time of printing, there were 339,306 sequences in the HIV Sequence Database, an increase of 45% since last year. The number of near complete genomes (>7000 nucleotides) increased to 2576 by end of 2009, reflecting a smaller increase than in previous years. However, as in previous years, the compendium alignments contain only a small fraction of these. Included in the alignments are a small number of sequences representing each of the subtypes and the more prevalent circulating recombinant forms (CRFs) such as 01 and 02, as well as a few outgroup sequences (group O and N and SIV-CPZ). Of the rarer CRFs we included one representative each. A more complete version of all alignments is available on our website, http://www.hiv.lanl.gov/content/sequence/NEWALIGN/align.html. Reprints are available from our website in the form of both HTML and PDF files. As always, we are open to complaints and suggestions for improvement. Inquiries and comments regarding the compendium should be addressed to seq-info@lanl.gov.

  8. General LTE Sequence

    OpenAIRE

    Billal, Masum

    2015-01-01

    In this paper, we characterize sequences that maintain the property described in the Lifting the Exponent Lemma. The Lifting the Exponent Lemma is a very powerful tool in olympiad number theory and has recently become very popular. We generalize it to all sequences that maintain a similar property, i.e., if p^{\alpha} || a_k and p^{\beta} || n, then p^{\alpha+\beta} || a_{nk}.
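
    The divisibility property stated in this abstract can be checked numerically for a concrete sequence. The sketch below is illustrative only and is not taken from the paper: the function names and the test sequence a_k = 4^k - 1 with p = 3 are assumptions chosen because the classical Lifting the Exponent Lemma applies to it (3 divides 4 - 1, and 3 divides neither 4 nor 1).

```python
def p_adic_valuation(p: int, n: int) -> int:
    """Return the exponent of the prime p in the non-zero integer n."""
    if n == 0:
        raise ValueError("valuation of 0 is undefined here")
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def has_lte_property(a, p: int, max_k: int = 12, max_n: int = 12) -> bool:
    """Check, on a finite range only, that p^alpha || a(k) and p^beta || n
    imply p^(alpha+beta) || a(n*k). This is a numerical spot check, not a proof."""
    for k in range(1, max_k + 1):
        alpha = p_adic_valuation(p, a(k))
        for n in range(1, max_n + 1):
            beta = p_adic_valuation(p, n)
            if p_adic_valuation(p, a(n * k)) != alpha + beta:
                return False
    return True


def example_sequence(k: int) -> int:
    """Hypothetical test sequence a_k = 4^k - 1 (an assumption for illustration)."""
    return 4 ** k - 1


print(has_lte_property(example_sequence, 3))        # sequence covered by classical LTE
print(has_lte_property(lambda k: 2 ** k - 1, 3))    # fails: a_1 = 1 but a_2 = 3
```

    The second call shows a sequence that violates the property for p = 3, since 3 does not divide a_1 = 1 or n = 2, yet 3 divides a_2 = 3.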

  9. Pairwise Sequence Alignment Library

    Energy Technology Data Exchange (ETDEWEB)

    2015-05-20

    Vector extensions, such as SSE, have been part of the x86 CPU since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. The trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based on striped data layouts. Therefore, a novel SIMD implementation of a parallel scan-based sequence alignment algorithm that can better exploit wider SIMD units was implemented as part of the Parallel Sequence Alignment Library (parasail). Parasail features: Reference implementations of all known vectorized sequence alignment approaches. Implementations of Smith Waterman (SW), semi-global (SG), and Needleman Wunsch (NW) sequence alignment algorithms. Implementations across all modern CPU instruction sets including AVX2 and KNC. Language interfaces for C/C++ and Python.
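
    The data dependencies mentioned in this abstract are easiest to see in a scalar version of Smith-Waterman local alignment. The following is a minimal sketch, not parasail's API and not its vectorized implementation: the function name, the linear gap penalty, and the scoring values are all assumptions made for illustration.

```python
def smith_waterman_score(query: str, target: str,
                         match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score using a linear gap penalty.

    Each cell depends on its diagonal, upper, and left neighbours; these
    dependencies are what make the recurrence hard to vectorize directly.
    """
    prev = [0] * (len(target) + 1)   # previous row of the score matrix
    best = 0
    for q in query:
        curr = [0]                   # first column is always 0 for local alignment
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            up = prev[j] + gap       # gap in the target
            left = curr[j - 1] + gap # gap in the query
            cell = max(0, diag, up, left)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

    Only two rows are kept in memory because the score alone is needed; recovering the alignment itself would require storing the full matrix or traceback pointers.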

  10. Deciphering the distance to antibiotic resistance for the pneumococcus using genome sequencing data

    NARCIS (Netherlands)

    Mobegi, Fredrick M; Cremers, Amelieke J H; de Jonge, Marien I; Bentley, Stephen D; van Hijum, Sacha A F T; Zomer, Aldert|info:eu-repo/dai/nl/304642754

    2017-01-01

    Advances in genome sequencing technologies and genome-wide association studies (GWAS) have provided unprecedented insights into the molecular basis of microbial phenotypes and enabled the identification of the underlying genetic variants in real populations. However, utilization of genome sequencing

  11. Levenshtein error-correcting barcodes for multiplexed DNA sequencing

    NARCIS (Netherlands)

    Buschmann, Tilo; Bystrykh, Leonid V.

    2013-01-01

    Background: High-throughput sequencing technologies are improving in quality, capacity and costs, providing versatile applications in DNA and RNA research. For small genomes or fraction of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called

  12. Identification of Meconopsis species by a DNA barcode sequence ...

    African Journals Online (AJOL)

    Deoxyribonucleic acid (DNA) barcoding is a novel technology that uses a standard DNA sequence to facilitate species identification. Species identification is necessary for the authentication of traditional plant based medicines. Although a consensus has not been agreed regarding which DNA sequences can be used as ...

  13. Scalable Kernel Methods and Algorithms for General Sequence Analysis

    Science.gov (United States)

    Kuksa, Pavel

    2011-01-01

    Analysis of large-scale sequential data has become an important task in machine learning and pattern recognition, inspired in part by numerous scientific and technological applications such as the document and text classification or the analysis of biological sequences. However, current computational methods for sequence comparison still lack…

  14. Genome sequence of Stachybotrys chartarum Strain 51-11

    Science.gov (United States)

    Stachybotrys chartarum strain 51-11 genome was sequenced by shotgun sequencing utilizing Illumina Hiseq 2000 and PacBio long read technology. Since Stachybotrys chartarum has been implicated in health impacts within water-damaged buildings, any information extracted from the geno...

  15. What can next generation sequencing do for you? Next generation sequencing as a valuable tool in plant research

    OpenAIRE

    Bräutigam, Andrea; Gowik, Udo

    2010-01-01

    Next generation sequencing (NGS) technologies have opened fascinating opportunities for the analysis of plants with and without a sequenced genome on a genomic scale. During the last few years, NGS methods have become widely available and cost effective. They can be applied to a wide variety of biological questions, from the sequencing of complete eukaryotic genomes and transcriptomes, to the genome-scale analysis of DNA-protein interactions. In this review, we focus on the use of NGS for pla...

  16. Complete genome sequence of Ikoma lyssavirus.

    Science.gov (United States)

    Marston, Denise A; Ellis, Richard J; Horton, Daniel L; Kuzmin, Ivan V; Wise, Emma L; McElhinney, Lorraine M; Banyard, Ashley C; Ngeleja, Chanasa; Keyyu, Julius; Cleaveland, Sarah; Lembo, Tiziana; Rupprecht, Charles E; Fooks, Anthony R

    2012-09-01

    Lyssaviruses (family Rhabdoviridae) constitute one of the most important groups of viral zoonoses globally. All lyssaviruses cause the disease rabies, an acute progressive encephalitis for which, once symptoms occur, there is no effective cure. Currently available vaccines are highly protective against the predominantly circulating lyssavirus species. Using next-generation sequencing technologies, we have obtained the whole-genome sequence for a novel lyssavirus, Ikoma lyssavirus (IKOV), isolated from an African civet in Tanzania displaying clinical signs of rabies. Genetically, this virus is the most divergent within the genus Lyssavirus. Characterization of the genome will help to improve our understanding of lyssavirus diversity and enable investigation into vaccine-induced immunity and protection.

  17. Adaptive Processing for Sequence Alignment

    KAUST Repository

    Zidan, Mohammed A.; Bonny, Talal; Salama, Khaled N.

    2012-01-01

    Disclosed are various embodiments for adaptive processing for sequence alignment. In one embodiment, among others, a method includes obtaining a query sequence and a plurality of database sequences. A first portion of the plurality of database sequences is distributed to a central processing unit (CPU) and a second portion of the plurality of database sequences is distributed to a graphical processing unit (GPU) based upon a predetermined splitting ratio associated with the plurality of database sequences, where the database sequences of the first portion are shorter than the database sequences of the second portion. A first alignment score for the query sequence is determined with the CPU based upon the first portion of the plurality of database sequences and a second alignment score for the query sequence is determined with the GPU based upon the second portion of the plurality of database sequences.
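
    The partitioning step described above can be sketched as follows. This is a hedged illustration, not the disclosed method: the function name is hypothetical, and interpreting the splitting ratio as the fraction of sequences (by count) sent to the CPU is an assumption, since the abstract does not say how the ratio is defined or chosen.

```python
def split_database(sequences, cpu_ratio: float):
    """Split database sequences into a CPU portion and a GPU portion.

    Sequences are ordered by length so that the CPU portion holds the
    shorter sequences and the GPU portion the longer ones. `cpu_ratio`
    is assumed here to be the fraction of sequences assigned to the CPU.
    """
    if not 0.0 <= cpu_ratio <= 1.0:
        raise ValueError("cpu_ratio must be between 0 and 1")
    ordered = sorted(sequences, key=len)
    cut = int(len(ordered) * cpu_ratio)
    return ordered[:cut], ordered[cut:]


database = ["ACGTACGT", "AC", "ACGTAC", "ACG"]
cpu_part, gpu_part = split_database(database, cpu_ratio=0.5)
print(cpu_part)  # shorter sequences, scored on the CPU
print(gpu_part)  # longer sequences, scored on the GPU
```

    Each portion would then be aligned against the query on its own device, yielding the first and second alignment scores the abstract refers to.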

  18. Adaptive Processing for Sequence Alignment

    KAUST Repository

    Zidan, Mohammed A.

    2012-01-26

    Disclosed are various embodiments for adaptive processing for sequence alignment. In one embodiment, among others, a method includes obtaining a query sequence and a plurality of database sequences. A first portion of the plurality of database sequences is distributed to a central processing unit (CPU) and a second portion of the plurality of database sequences is distributed to a graphical processing unit (GPU) based upon a predetermined splitting ratio associated with the plurality of database sequences, where the database sequences of the first portion are shorter than the database sequences of the second portion. A first alignment score for the query sequence is determined with the CPU based upon the first portion of the plurality of database sequences and a second alignment score for the query sequence is determined with the GPU based upon the second portion of the plurality of database sequences.

  19. SeqCompress: an algorithm for biological sequence compression.

    Science.gov (United States)

    Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz; Bajwa, Hassan

    2014-10-01

    The growth of Next Generation Sequencing technologies presents significant research challenges, specifically the design of bioinformatics tools that handle massive amounts of data efficiently. The storage cost of biological sequence data has become a noticeable proportion of the total cost of generation and analysis. In particular, the increase in DNA sequencing rate is significantly outstripping the rate of increase in disk storage capacity, and may eventually exceed available storage capacity. It is essential to develop algorithms that handle large data sets via better memory management. This article presents a DNA sequence compression algorithm, SeqCompress, that copes with the space complexity of biological sequences. The algorithm is based on lossless data compression and uses a statistical model as well as arithmetic coding to compress DNA sequences. The proposed algorithm is compared with recent specialized compression tools for biological sequences. Experimental results show that the proposed algorithm achieves better compression gain than other existing algorithms. Copyright © 2014 Elsevier Inc. All rights reserved.
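
    To give a sense of what a statistical model combined with arithmetic coding can gain, the sketch below estimates the size an ideal arithmetic coder would approach under a simple order-0 (single-symbol frequency) model. It is not the SeqCompress algorithm, whose model is not described in this abstract; the function name and the order-0 model are assumptions for illustration.

```python
import math
from collections import Counter


def order0_bits(sequence: str) -> float:
    """Return the ideal coded size in bits under an order-0 frequency model.

    An arithmetic coder driven by symbol probabilities p(s) approaches
    the sum of -log2 p(s) over all symbols in the sequence.
    """
    counts = Counter(sequence)
    total = len(sequence)
    return sum(-c * math.log2(c / total) for c in counts.values())


uniform = "ACGT" * 4       # all four bases equally frequent
skewed = "AAAAAAAC"        # strongly biased composition
print(order0_bits(uniform), 2 * len(uniform))  # matches the plain 2-bits-per-base encoding
print(order0_bits(skewed), 2 * len(skewed))    # well below 2 bits per base
```

    A uniform base composition gains nothing over a fixed 2-bit encoding under this model, which is why practical DNA compressors rely on richer context models.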

  20. Next-Generation Mitogenomics: A Comparison of Approaches Applied to Caecilian Amphibian Phylogeny

    OpenAIRE

    Maddock, Simon T.; Briscoe, Andrew G.; Wilkinson, Mark; Waeschenbach, Andrea; San Mauro, Diego; Day, Julia J.; Littlewood, D. Tim J.; Foster, Peter G.; Nussbaum, Ronald A.; Gower, David J.

    2016-01-01

    Mitochondrial genome (mitogenome) sequences are being generated with increasing speed due to the advances of next-generation sequencing (NGS) technology and associated analytical tools. However, detailed comparisons to explore the utility of alternative NGS approaches applied to the same taxa have not been undertaken. We compared a ‘traditional’ Sanger sequencing method with two NGS approaches (shotgun sequencing and non-indexed, multiplex amplicon sequencing) on four different sequencing pla...

  1. Accurate molecular diagnosis of phenylketonuria and tetrahydrobiopterin-deficient hyperphenylalaninemias using high-throughput targeted sequencing

    Science.gov (United States)

    Trujillano, Daniel; Perez, Belén; González, Justo; Tornador, Cristian; Navarrete, Rosa; Escaramis, Georgia; Ossowski, Stephan; Armengol, Lluís; Cornejo, Verónica; Desviat, Lourdes R; Ugarte, Magdalena; Estivill, Xavier

    2014-01-01

    Genetic diagnostics of phenylketonuria (PKU) and tetrahydrobiopterin (BH4) deficient hyperphenylalaninemia (BH4DH) rely on methods that scan for known mutations or on laborious molecular tools that use Sanger sequencing. We have implemented a novel and much more efficient strategy based on high-throughput multiplex-targeted resequencing of four genes (PAH, GCH1, PTS, and QDPR) that, when affected by loss-of-function mutations, cause PKU and BH4DH. We have validated this approach in a cohort of 95 samples with the previously known PAH, GCH1, PTS, and QDPR mutations and one control sample. Pooled barcoded DNA libraries were enriched using a custom NimbleGen SeqCap EZ Choice array and sequenced using a HiSeq2000 sequencer. The combination of several robust bioinformatics tools allowed us to detect all known pathogenic mutations (point mutations, short insertions/deletions, and large genomic rearrangements) in the 95 samples, without detecting spurious calls in these genes in the control sample. We then used the same capture assay in a discovery cohort of 11 uncharacterized HPA patients using a MiSeq sequencer. In addition, we report the precise characterization of the breakpoints of four genomic rearrangements in PAH, including a novel deletion of 899 bp in intron 3. Our study is a proof-of-principle that high-throughput-targeted resequencing is ready to substitute classical molecular methods to perform differential genetic diagnosis of hyperphenylalaninemias, allowing the establishment of specifically tailored treatments a few days after birth. PMID:23942198

  2. Exome sequencing of index patients with retinal dystrophies as a tool for molecular diagnosis.

    Directory of Open Access Journals (Sweden)

    Marta Corton

    Full Text Available Retinal dystrophies (RD) are a group of hereditary diseases that lead to debilitating visual impairment and are usually transmitted as a Mendelian trait. Pathogenic mutations can occur in any of the 100 or more disease genes identified so far, making molecular diagnosis a rather laborious process. In this work we explored the use of whole exome sequencing (WES) as a tool for identification of RD mutations, with the aim of assessing its applicability in a diagnostic context. We ascertained 12 Spanish families with seemingly recessive RD. All of the index patients underwent mutational pre-screening by chip-based sequence hybridization and were negative for known RD mutations. With the exception of one pedigree, to simulate a standard diagnostic scenario we processed by WES only the DNA from the index patient of each family, followed by in silico data analysis. We successfully identified causative mutations in patients from 10 different families, which were later verified by Sanger sequencing and co-segregation analyses. Specifically, we detected pathogenic DNA variants (∼50% novel mutations) in the genes RP1, USH2A, CNGB3, NMNAT1, CHM, and ABCA4, responsible for retinitis pigmentosa, Usher syndrome, achromatopsia, Leber congenital amaurosis, choroideremia, or recessive Stargardt/cone-rod dystrophy cases. Despite the absence of genetic information from other family members that could help exclude nonpathogenic DNA variants, we could detect causative mutations in a variety of genes known to represent a wide spectrum of clinical phenotypes in 83% of the patients analyzed. Considering the constant drop in costs for human exome sequencing and the relative simplicity of the analyses made, this technique could represent a valuable tool for molecular diagnostics or genetic research, even in cases for which no genotypes from family members are available.

  3. Analysis of Litopenaeus vannamei transcriptome using the next-generation DNA sequencing technique.

    Directory of Open Access Journals (Sweden)

    Chaozheng Li

    Full Text Available BACKGROUND: Pacific white shrimp (Litopenaeus vannamei), the major species of farmed shrimp in the world, has attracted extensive study, which requires more and more genome background knowledge. The currently available transcriptome data of L. vannamei are insufficient for research requirements, and have not been adequately assembled and annotated. METHODOLOGY/PRINCIPAL FINDINGS: This is the first study that used a next-generation high-throughput DNA sequencing technique, the Solexa/Illumina GA II method, to analyze the transcriptome from whole bodies of L. vannamei larvae. More than 2.4 Gb of raw data were generated, and 109,169 unigenes with a mean length of 396 bp were assembled using the SOAPdenovo software. 73,505 unigenes (>200 bp) with good quality sequences were selected and subjected to annotation analysis, among which 37.80% can be matched in the NCBI Nr database, 37.3% matched in Swissprot, and 44.1% matched in TrEMBL. Using the BLAST and Blast2GO software, 11,153 unigenes were classified into 25 Clusters of Orthologous Groups of proteins (COG) categories, 8171 unigenes were assigned into 51 Gene Ontology (GO) functional groups, and 18,154 unigenes were divided into 220 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. To preliminarily verify part of the results of assembly and annotation, 12 assembled unigenes that are homologous to many embryo development-related genes were chosen and subjected to RT-PCR for electrophoresis and Sanger sequencing analyses, and to real-time PCR for expression profile analyses during embryo development. CONCLUSIONS/SIGNIFICANCE: The L. vannamei transcriptome analyzed using the next-generation sequencing technique enriches the information on L. vannamei genes, which will facilitate our understanding of the genome background of crustaceans and promote studies on L. vannamei.

  4. Combined Targeted DNA Sequencing in Non-Small Cell Lung Cancer (NSCLC) Using UNCseq and NGScopy, and RNA Sequencing Using UNCqeR for the Detection of Genetic Aberrations in NSCLC.

    Directory of Open Access Journals (Sweden)

    Xiaobei Zhao

    Full Text Available The recent FDA approval of the MiSeqDx platform provides a unique opportunity to develop targeted next generation sequencing (NGS) panels for human disease, including cancer. We have developed a scalable, targeted panel-based assay termed UNCseq, which involves an NGS panel of over 200 cancer-associated genes and a standardized downstream bioinformatics pipeline for detection of single nucleotide variations (SNV) as well as small insertions and deletions (indel). In addition, we developed a novel algorithm, NGScopy, designed for samples with sparse sequencing coverage to detect large-scale copy number variations (CNV), similar to human SNP Array 6.0, as well as small-scale intragenic CNV. Overall, we applied this assay to 100 snap-frozen lung cancer specimens lacking same-patient germline DNA (07-0120 tissue cohort) and validated our results against Sanger sequencing, SNP Array, and our recently published integrated DNA-seq/RNA-seq assay, UNCqeR, where RNA-seq of same-patient tumor specimens confirmed SNV detected by DNA-seq, if RNA-seq coverage depth was adequate. In addition, we applied the UNCseq assay on an independent lung cancer tumor tissue collection with available same-patient germline DNA (11-1115 tissue cohort) and confirmed mutations using assays performed in a CLIA-certified laboratory. We conclude that UNCseq can identify SNV, indel, and CNV in tumor specimens lacking germline DNA in a cost-efficient fashion.

  5. Main sequence mass loss

    International Nuclear Information System (INIS)

    Brunish, W.M.; Guzik, J.A.; Willson, L.A.; Bowen, G.

    1987-01-01

    It has been hypothesized that variable stars may experience mass loss, driven, at least in part, by oscillations. The class of stars we are discussing here are the δ Scuti variables. These are variable stars with masses between about 1.2 and 2.25 M⊙, lying on or very near the main sequence. According to this theory, high rotation rates enhance the rate of mass loss, so main sequence stars born in this mass range would have a range of mass loss rates, depending on their initial rotation velocity and the amplitude of the oscillations. The stars would evolve rapidly down the main sequence until (at about 1.25 M⊙) a surface convection zone began to form. The presence of this convective region would slow the rotation, perhaps allowing magnetic braking to occur, and thus sharply reduce the mass loss rate. 7 refs

  6. Inaugural Genomics Automation Congress and the coming deluge of sequencing data.

    Science.gov (United States)

    Creighton, Chad J

    2010-10-01

    Presentations at Select Biosciences's first 'Genomics Automation Congress' (Boston, MA, USA) in 2010 focused on next-generation sequencing and the platforms and methodology around them. The meeting provided an overview of sequencing technologies, both new and emerging. Speakers shared their recent work on applying sequencing to profile cells for various levels of biomolecular complexity, including DNA sequences, DNA copy, DNA methylation, mRNA and microRNA. With sequencing time and costs continuing to drop dramatically, a virtual explosion of very large sequencing datasets is at hand, which will probably present challenges and opportunities for high-level data analysis and interpretation, as well as for information technology infrastructure.

  7. Electricity sequence control

    International Nuclear Information System (INIS)

    Shin, Heung Ryeol

    2010-03-01

    The book covers: an introduction to control systems, including their classification and control signals; an introduction to electrical switches, such as push-buttons and inductive- and capacitive-type detection sensors; control machinery; solenoid valves; the representation of sequences and types of electrical circuits using diagrams, time charts, markings and terms; logic circuits such as YES, NOT, AND, OR and equivalence logic; basic electrical circuits; electrical sequence control; added conditions; special program control for program selection and jumps; motor control; additional circuits such as repeat circuits and pause circuits in a conveyor; and safety regulations and rules covering the classification of electrical accidents and protective devices for insulation.

  8. Next-generation sequencing

    DEFF Research Database (Denmark)

    Rieneck, Klaus; Bak, Mads; Jønson, Lars

    2013-01-01

    , Illumina); several millions of PCR sequences were analyzed. RESULTS: The results demonstrated the feasibility of diagnosing the fetal KEL1 or KEL2 blood group from cell-free DNA purified from maternal plasma. CONCLUSION: This method requires only one primer pair, and the large amount of sequence...... information obtained allows well for statistical analysis of the data. This general approach can be integrated into current laboratory practice and has numerous applications. Besides DNA-based predictions of blood group phenotypes, platelet phenotypes, or sickle cell anemia, and the determination of zygosity...

  9. Translational database selection and multiplexed sequence capture for up front filtering of reliable breast cancer biomarker candidates.

    Directory of Open Access Journals (Sweden)

    Patrik L Ståhl

    Full Text Available Biomarker identification is of utmost importance for the development of novel diagnostics and therapeutics. Here we make use of a translational database selection strategy, utilizing data from the Human Protein Atlas (HPA) on differentially expressed protein patterns in healthy and breast cancer tissues as a means to filter out potential biomarkers for underlying genetic causatives of the disease. DNA was isolated from ten breast cancer biopsies, and the protein coding and flanking non-coding genomic regions corresponding to the selected proteins were extracted in a multiplexed format from the samples using a single DNA sequence capture array. Deep sequencing revealed an even enrichment of the multiplexed samples and a great variation of genetic alterations in the tumors of the sampled individuals. Benefiting from the upstream filtering method, the final set of biomarker candidates could be completely verified through bidirectional Sanger sequencing, revealing a 40 percent false positive rate despite high read coverage. Of the variants encountered in translated regions, nine novel non-synonymous variations were identified and verified, two of which were present in more than one of the ten tumor samples.

  10. ATRX mutation in two adult brothers with non-specific moderate intellectual disability identified by exome sequencing.

    Science.gov (United States)

    Moncini, S; Bedeschi, M F; Castronovo, P; Crippa, M; Calvello, M; Garghentino, R R; Scuvera, G; Finelli, P; Venturin, M

    2013-12-01

    In this report, we describe two adult brothers affected by moderate non-specific intellectual disability (ID). They showed minor facial anomalies, not clearly ascribable to any specific syndromic patterns, microcephaly, brachydactyly and broad toes. Both brothers presented seizures. Karyotype, subtelomeric and FMR1 analysis were normal in both cases. We performed array-CGH analysis that revealed no copy-number variations potentially associated with ID. Subsequent exome sequence analysis allowed the identification of the ATRX c.109C>T (p.R37X) mutation in both the affected brothers. Sanger sequencing confirmed the presence of the mutation in the brothers and showed that the mother is a healthy carrier. Mutations in the ATRX gene cause the X-linked alpha thalassemia/mental retardation (ATR-X) syndrome (MIM #301040), a severe clinical condition usually associated with profound ID, facial dysmorphism and alpha thalassemia. However, the syndrome is clinically heterogeneous and some mutations, including the c.109C>T, are associated with a broad phenotypic spectrum, with patients displaying a less severe phenotype with only mild-moderate ID. In the case presented here, exome sequencing provided an effective strategy to achieve the molecular diagnosis of ATR-X syndrome, which otherwise would have been difficult to consider due to the mild non-specific phenotype and the absence of a family history with typical severe cases.

  11. Next-generation sequencing reveals a novel NDP gene mutation in a Chinese family with Norrie disease.

    Science.gov (United States)

    Huang, Xiaoyan; Tian, Mao; Li, Jiankang; Cui, Ling; Li, Min; Zhang, Jianguo

    2017-11-01

    Norrie disease (ND) is a rare X-linked genetic disorder, the main symptoms of which are congenital blindness and white pupils. It has been reported that ND is caused by mutations in the NDP gene. Although many mutations in NDP have been reported, the genetic cause for many patients remains unknown. In this study, the aim is to investigate the genetic defect in a five-generation family with typical symptoms of ND. To identify the causative gene, next-generation sequencing based target capture sequencing was performed. Segregation analysis of the candidate variant was performed in additional family members using Sanger sequencing. We identified a novel missense variant (c.314C>A) located within the NDP gene. The mutation cosegregated within all affected individuals in the family and was not found in unaffected members. By happenstance, in this family, we also detected a known pathogenic variant of retinitis pigmentosa in a healthy individual. c.314C>A mutation of NDP gene is a novel mutation and broadens the genetic spectrum of ND.

  12. Next-generation sequencing reveals a novel NDP gene mutation in a Chinese family with Norrie disease

    Directory of Open Access Journals (Sweden)

    Xiaoyan Huang

    2017-01-01

    Full Text Available Purpose: Norrie disease (ND) is a rare X-linked genetic disorder, the main symptoms of which are congenital blindness and white pupils. It has been reported that ND is caused by mutations in the NDP gene. Although many mutations in NDP have been reported, the genetic cause for many patients remains unknown. In this study, the aim is to investigate the genetic defect in a five-generation family with typical symptoms of ND. Methods: To identify the causative gene, next-generation sequencing based target capture sequencing was performed. Segregation analysis of the candidate variant was performed in additional family members using Sanger sequencing. Results: We identified a novel missense variant (c.314C>A) located within the NDP gene. The mutation cosegregated within all affected individuals in the family and was not found in unaffected members. By happenstance, in this family, we also detected a known pathogenic variant of retinitis pigmentosa in a healthy individual. Conclusion: c.314C>A mutation of NDP gene is a novel mutation and broadens the genetic spectrum of ND.

  13. Intraspecific variations in Cyt b and D-loop sequences of Testudine species, Lissemys punctata from south Karnataka

    Directory of Open Access Journals (Sweden)

    R. Lalitha

    2018-01-01

    Full Text Available The freshwater Testudine species have gained importance in recent years, as most of their population is threatened due to exploitation for delicacy and pet trade. In this regard, Lissemys punctata, a freshwater terrapin predominantly distributed in Asian countries, has gained significance as a study subject. This investigation presents a pilot study of mitochondrial markers (Cyt b and D-loop) conducted on L. punctata from southern Karnataka, India. Complete regions spanning 1.14 kb and ∼1 kb were amplified by HotStart PCR and sequenced by Sanger sequencing. The Cyt b sequence revealed 85 substitution sites, no indels and 17 parsimony informative sites, whereas D-loop showed 189 variable sites, 51 parsimony informative sites with 5′ functional domains TAS, CSB-F, CSBs (1, 2, 3) preceding tandem repeat at 3′ end. Current data highlight the intraspecific variations in these target regions, and variations validated using suitable evolutionary models indicate that the overall point mutations observed in the region are transitions leading to no structural and functional alterations. The mitochondrial data generated uncover the genetic diversity within the species, and conservationists can utilize the data to estimate the effective population size or for forensic identification of animals or their seizures during unlawful trade activities.

  14. Aspects of coverage in medical DNA sequencing

    Directory of Open Access Journals (Sweden)

    Wilson Richard K

    2008-05-01

    Full Text Available Abstract Background DNA sequencing is now emerging as an important component in biomedical studies of diseases like cancer. Short-read, highly parallel sequencing instruments are expected to be used heavily for such projects, but many design specifications have yet to be conclusively established. Perhaps the most fundamental of these is the redundancy required to detect sequence variations, which bears directly upon genomic coverage and the consequent resolving power for discerning somatic mutations. Results We address the medical sequencing coverage problem via an extension of the standard mathematical theory of haploid coverage. The expected diploid multi-fold coverage, as well as its generalization for aneuploidy are derived and these expressions can be readily evaluated for any project. The resulting theory is used as a scaling law to calibrate performance to that of standard BAC sequencing at 8× to 10× redundancy, i.e. for expected coverages that exceed 99% of the unique sequence. A differential strategy is formalized for tumor/normal studies wherein tumor samples are sequenced more deeply than normal ones. In particular, both tumor alleles should be detected at least twice, while both normal alleles are detected at least once. Our theory predicts these requirements can be met for tumor and normal redundancies of approximately 26× and 21×, respectively. We explain why these values do not differ by a factor of 2, as might intuitively be expected. Future technology developments should prompt even deeper sequencing of tumors, but the 21× value for normal samples is essentially a constant. Conclusion Given the assumptions of standard coverage theory, our model gives pragmatic estimates for required redundancy. The differential strategy should be an efficient means of identifying potential somatic mutations for further study.
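
    The coverage figures quoted above can be explored with a small numerical sketch. It is not the paper's derivation: it assumes the textbook Poisson model in which read depth at a site is Poisson-distributed with mean equal to the redundancy, and it further assumes that reads split evenly and independently between the two alleles of a diploid sample. The function names are illustrative only.

```python
import math


def poisson_at_least(k: int, lam: float) -> float:
    """Return P(X >= k) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - below


def haploid_coverage(redundancy: float) -> float:
    """Expected fraction of a haploid target covered by at least one read."""
    return poisson_at_least(1, redundancy)


def diploid_both_alleles(redundancy: float, min_reads: int) -> float:
    """Probability that BOTH alleles at a site are seen at least min_reads times.

    Assumption: each allele receives half of the total redundancy, and the
    two alleles are sampled independently.
    """
    per_allele = redundancy / 2.0
    return poisson_at_least(min_reads, per_allele) ** 2


if __name__ == "__main__":
    for c in (8, 10):
        print(f"haploid {c}x: covered fraction = {haploid_coverage(c):.5f}")
    print(f"tumor 26x, both alleles >= 2 reads: {diploid_both_alleles(26, 2):.6f}")
    print(f"normal 21x, both alleles >= 1 read: {diploid_both_alleles(21, 1):.6f}")
```

    Under these simplifying assumptions, 8× haploid redundancy already covers more than 99% of the target, and the 26× tumor and 21× normal settings both give a per-site detection probability above 0.9999. This also illustrates why the two redundancies need not differ by a factor of 2: the requirement is on a Poisson tail probability, which does not scale linearly with the number of reads demanded.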

  15. Microfluidic PCR Amplification and MiSeq Amplicon Sequencing Techniques for High-Throughput Detection and Genotyping of Human Pathogenic RNA Viruses in Human Feces, Sewage, and Oysters

    Directory of Open Access Journals (Sweden)

    Mamoru Oshiki

    2018-04-01

    Full Text Available Detection and genotyping of pathogenic RNA viruses in human and environmental samples are useful for monitoring the circulation and prevalence of these pathogens, whereas a conventional PCR assay followed by Sanger sequencing is time-consuming and laborious. The present study aimed to develop a high-throughput detection-and-genotyping tool for 11 human RNA viruses [Aichi virus; astrovirus; enterovirus; norovirus genogroup I (GI), GII, and GIV; hepatitis A virus; hepatitis E virus; rotavirus; sapovirus; and human parechovirus] using a microfluidic device and next-generation sequencer. Microfluidic nested PCR was carried out on a 48.48 Access Array chip, and the amplicons were recovered and used for MiSeq sequencing (Illumina, Tokyo, Japan); genotyping was conducted by homology searching and phylogenetic analysis of the obtained sequence reads. The detection limit of the 11 tested viruses ranged from 10^0 to 10^3 copies/μL in cDNA sample, corresponding to 10^1–10^4 copies/mL-sewage, 10^5–10^8 copies/g-human feces, and 10^2–10^5 copies/g-digestive tissues of oyster. The developed assay was successfully applied for simultaneous detection and genotyping of RNA viruses to samples of human feces, sewage, and artificially contaminated oysters. Microfluidic nested PCR followed by MiSeq sequencing enables efficient tracking of the fate of multiple RNA viruses in various environments, which is essential for a better understanding of the circulation of human pathogenic RNA viruses in the human population.

  16. Application of high-throughput DNA sequencing in phytopathology.

    Science.gov (United States)

    Studholme, David J; Glover, Rachel H; Boonham, Neil

    2011-01-01

    The new sequencing technologies are already making a big impact in academic research on medically important microbes and may soon revolutionize diagnostics, epidemiology, and infection control. Plant pathology also stands to gain from exploiting these opportunities. This manuscript reviews some applications of these high-throughput sequencing methods that are relevant to phytopathology, with emphasis on the associated computational and bioinformatics challenges and their solutions. Second-generation sequencing technologies have recently been exploited in genomics of both prokaryotic and eukaryotic plant pathogens. They are also proving to be useful in diagnostics, especially with respect to viruses. Copyright © 2011 by Annual Reviews. All rights reserved.

  17. Epigenetics and assisted reproductive technologies

    DEFF Research Database (Denmark)

    Pinborg, Anja; Loft, Anne; Romundstad, Liv Bente

    2016-01-01

    Epigenetic modification controls gene activity without changes in the DNA sequence. The genome undergoes several phases of epigenetic programming during gametogenesis and early embryo development coinciding with assisted reproductive technologies (ART) treatments. Imprinting disorders have been...

  18. 10KP: A phylodiverse genome sequencing plan

    Science.gov (United States)

    Cheng, Shifeng; Melkonian, Michael; Brockington, Samuel; Archibald, John M; Delaux, Pierre-Marc; Melkonian, Barbara; Mavrodiev, Evgeny V; Sun, Wenjing; Fu, Yuan; Yang, Huanming; Soltis, Douglas E; Graham, Sean W; Soltis, Pamela S; Liu, Xin; Xu, Xun

    2018-01-01

    Abstract Understanding plant evolution and diversity in a phylogenomic context is an enormous challenge due, in part, to limited availability of genome-scale data across phylodiverse species. The 10KP (10,000 Plants) Genome Sequencing Project will sequence and characterize representative genomes from every major clade of embryophytes, green algae, and protists (excluding fungi) within the next 5 years. By implementing and continuously improving leading-edge sequencing technologies and bioinformatics tools, 10KP will catalogue the genome content of plant and protist diversity and make these data freely available as an enduring foundation for future scientific discoveries and applications. 10KP is structured as an international consortium, open to the global community, including botanical gardens, plant research institutes, universities, and private industry. Our immediate goal is to establish a policy framework for this endeavor, the principles of which are outlined here. PMID:29618049

  19. 10KP: A phylodiverse genome sequencing plan.

    Science.gov (United States)

    Cheng, Shifeng; Melkonian, Michael; Smith, Stephen A; Brockington, Samuel; Archibald, John M; Delaux, Pierre-Marc; Li, Fay-Wei; Melkonian, Barbara; Mavrodiev, Evgeny V; Sun, Wenjing; Fu, Yuan; Yang, Huanming; Soltis, Douglas E; Graham, Sean W; Soltis, Pamela S; Liu, Xin; Xu, Xun; Wong, Gane Ka-Shu

    2018-03-01

    Understanding plant evolution and diversity in a phylogenomic context is an enormous challenge due, in part, to limited availability of genome-scale data across phylodiverse species. The 10KP (10,000 Plants) Genome Sequencing Project will sequence and characterize representative genomes from every major clade of embryophytes, green algae, and protists (excluding fungi) within the next 5 years. By implementing and continuously improving leading-edge sequencing technologies and bioinformatics tools, 10KP will catalogue the genome content of plant and protist diversity and make these data freely available as an enduring foundation for future scientific discoveries and applications. 10KP is structured as an international consortium, open to the global community, including botanical gardens, plant research institutes, universities, and private industry. Our immediate goal is to establish a policy framework for this endeavor, the principles of which are outlined here.

  20. Automated constraint checking of spacecraft command sequences

    Science.gov (United States)

    Horvath, Joan C.; Alkalaj, Leon J.; Schneider, Karl M.; Spitale, Joseph M.; Le, Dang

    1995-01-01

    Robotic spacecraft are controlled by onboard sets of commands called "sequences." Determining that sequences will have the desired effect on the spacecraft can be expensive in terms of both labor and computer coding time, with different particular costs for different types of spacecraft. Specification languages and appropriate user interface to the languages can be used to make the most effective use of engineering validation time. This paper describes one specification and verification environment ("SAVE") designed for validating that command sequences have not violated any flight rules. This SAVE system was subsequently adapted for flight use on the TOPEX/Poseidon spacecraft. The relationship of this work to rule-based artificial intelligence and to other specification techniques is discussed, as well as the issues that arise in the transfer of technology from a research prototype to a full flight system.

  1. Biological sequence analysis

    DEFF Research Database (Denmark)

    Durbin, Richard; Eddy, Sean; Krogh, Anders Stærmose

    This book provides an up-to-date and tutorial-level overview of sequence analysis methods, with particular emphasis on probabilistic modelling. Discussed methods include pairwise alignment, hidden Markov models, multiple alignment, profile searches, RNA secondary structure analysis, and phylogene...

  2. THE RHIC SEQUENCER

    International Nuclear Information System (INIS)

    VAN ZEIJTS, J.; DOTTAVIO, T.; FRAK, B.; MICHNOFF, R.

    2001-01-01

    The Relativistic Heavy Ion Collider (RHIC) has a high level asynchronous time-line driven by a controlling program called the "Sequencer". Most high-level magnet and beam related issues are orchestrated by this system. The system also plays an important task in coordinated data acquisition and saving. We present the program, operator interface, operational impact and experience.

  3. Twin anemia polycythemia sequence

    NARCIS (Netherlands)

    Slaghekke, Femke

    2014-01-01

    In this thesis we describe that Twin Anemia Polycythemia Sequence (TAPS) is a form of chronic feto-fetal transfusion in monochorionic (identical) twins based on a small amount of blood transfusion through very small anastomoses. For the antenatal diagnosis of TAPS, Middle Cerebral Artery – Peak

  4. simple sequence repeat (SSR)

    African Journals Online (AJOL)

    In the present study, 78 mapped simple sequence repeat (SSR) markers representing 11 linkage groups of adzuki bean were evaluated for transferability to mungbean and related Vigna spp. 41 markers amplified characteristic bands in at least one Vigna species. The transferability percentage across the genotypes ranged ...

  5. Sequence Matching Analysis for Curriculum Development

    Directory of Open Access Journals (Sweden)

    Liem Yenny Bendatu

    2015-06-01

    Full Text Available Many organizations apply information technologies to support their business processes. Using these technologies, actual events are recorded and checked for conformance with a predefined model. Conformance checking is an approach to measure the fitness and appropriateness between a process model and actual events. However, when there are multiple events with the same timestamp, the traditional approach is unfit to produce such measures. This study attempts to develop a sequence matching analysis. Taking conformance checking as its basis, the proposed approach utilizes the current control flow technique in the process mining domain. A case study in the field of educational process has been conducted. This study also proposes a curriculum analysis framework to test the proposed approach. By considering the learning sequence of students, it yields some measurements for curriculum development. Finally, the result of the proposed approach has been verified by relevant instructors for further development.

  6. Exome Sequencing Identifies a Novel LMNA Splice-Site Mutation and Multigenic Heterozygosity of Potential Modifiers in a Family with Sick Sinus Syndrome, Dilated Cardiomyopathy, and Sudden Cardiac Death.

    Directory of Open Access Journals (Sweden)

    Michael V Zaragoza

    Full Text Available The goals are to understand the primary genetic mechanisms that cause Sick Sinus Syndrome and to identify potential modifiers that may result in intrafamilial variability within a multigenerational family. The proband is a 63-year-old male with a family history of individuals (>10) with sinus node dysfunction, ventricular arrhythmia, cardiomyopathy, heart failure, and sudden death. We used exome sequencing of a single individual to identify a novel LMNA mutation and demonstrated the importance of Sanger validation and family studies when evaluating candidates. After initial single-gene studies were negative, we conducted exome sequencing for the proband which produced 9 gigabases of sequencing data. Bioinformatics analysis showed 94% of the reads mapped to the reference and identified 128,563 unique variants with 108,795 (85%) located in 16,319 genes of 19,056 target genes. We discovered multiple variants in known arrhythmia, cardiomyopathy, or ion channel associated genes that may serve as potential modifiers in disease expression. To identify candidate mutations, we focused on ~2,000 variants located in 237 genes of 283 known arrhythmia, cardiomyopathy, or ion channel associated genes. We filtered the candidates to 41 variants in 33 genes using zygosity, protein impact, database searches, and clinical association. Only 21 of 41 (51%) variants were validated by Sanger sequencing. We selected nine confirmed variants with minor allele frequencies [...] G, a novel heterozygous splice-site mutation as the primary mutation with rare or novel variants in HCN4, MYBPC3, PKP4, TMPO, TTN, DMPK and KCNJ10 as potential modifiers and a mechanism consistent with haploinsufficiency.

  7. Quantifying population genetic differentiation from next-generation sequencing data

    DEFF Research Database (Denmark)

    Fumagalli, Matteo; Garrett Vieira, Filipe Jorge; Korneliussen, Thorfinn Sand

    2013-01-01

    Over the last few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data... method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy to investigate population structure via Principal Components Analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled...

  8. Targeted sequencing of plant genomes

    Science.gov (United States)

    Mark D. Huynh

    2014-01-01

    Next-generation sequencing (NGS) has revolutionized the field of genetics by providing a means for fast and relatively affordable sequencing. With the advancement of NGS, whole-genome sequencing (WGS) has become more commonplace. However, sequencing an entire genome is still not cost effective or even beneficial in all cases. In studies that do not require a whole-...

  9. Almost convergence of triple sequences

    OpenAIRE

    Ayhan Esi; M.Necdet Catalbas

    2013-01-01

    In this paper we introduce and study the concepts of almost convergence and almost Cauchy for triple sequences. We show that the set of almost convergent triple sequences of 0's and 1's is of the first category and also almost every triple sequence of 0's and 1's is not almost convergent. Keywords: almost convergence, P-convergent, triple sequence.

  10. A few Smarandache Integer Sequences

    OpenAIRE

    Ibstedt, Henry

    2010-01-01

    This paper deals with the analysis of a few Smarandache Integer Sequences which first appeared in Properties of the Numbers, F. Smarandache, University of Craiova Archives, 1975. The first four sequences are recurrence generated sequences while the last three are concatenation sequences.

  11. Discovery of novel transcripts of the human tissue kallikrein (KLK1) and kallikrein-related peptidase 2 (KLK2) in human cancer cells, exploiting Next-Generation Sequencing technology.

    Science.gov (United States)

    Adamopoulos, Panagiotis G; Kontos, Christos K; Scorilas, Andreas

    2018-03-31

    Tissue kallikrein, kallikrein-related peptidases (KLKs), and plasma kallikrein form the largest group of serine proteases in the human genome, sharing many structural and functional properties. Several KLK transcripts have been found aberrantly expressed in numerous human malignancies, confirming their prognostic or/and diagnostic values. However, the process of alternative splicing can now be studied in-depth due to the development of Next-Generation Sequencing (NGS). In the present study, we used NGS to discover novel transcripts of the KLK1 and KLK2 genes, after nested touchdown PCR. Bioinformatics analysis and PCR experiments revealed a total of eleven novel KLK transcripts (two KLK1 and nine KLK2 transcripts). In addition, the expression profiles of each novel transcript were investigated with nested PCR experiments using variant-specific primers. Since KLKs are implicated in human malignancies, qualifying as potential biomarkers, the quantification of the presented novel transcripts in human samples may have clinical applications in different types of cancer. Copyright © 2018. Published by Elsevier Inc.

  12. Application of the next generation sequencing technology in preimplantation genetic detection

    Institute of Scientific and Technical Information of China (English)

    谢美娟; 杨学习; 李明

    2017-01-01

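    The rapid development of genomics technologies, represented by next-generation sequencing (NGS), has created opportunities for comprehensive, in-depth chromosome screening and genetic diagnosis. NGS has also been rapidly adopted in clinical testing for preimplantation genetic diagnosis (PGD) and preimplantation genetic screening (PGS), where it has become a routine technique; its low cost and reliability give it broad application prospects. Advances in single-cell whole genome amplification (WGA) allow NGS, in clinical PGD and PGS, to give a more complete picture of the genetic information of preimplantation embryos and to detect subtler differences; NGS-based PGS and PGD are expected to markedly improve transfer success rates and in-vitro fertilization (IVF) birth rates. This article introduces the definitions of PGD and PGS, conventional PGD/PGS detection techniques, single-cell whole genome amplification, and the application of NGS in PGD/PGS.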

  13. Transcriptome sequencing of the Microarray Quality Control (MAQC) RNA reference samples using next generation sequencing

    Directory of Open Access Journals (Sweden)

    Thierry-Mieg Danielle

    2009-06-01

    Full Text Available Abstract Background Transcriptome sequencing using next-generation sequencing platforms will soon be competing with DNA microarray technologies for global gene expression analysis. As a preliminary evaluation of these promising technologies, we performed deep sequencing of cDNA synthesized from the Microarray Quality Control (MAQC) reference RNA samples using Roche's 454 Genome Sequencer FLX. Results We generated more than 3.6 million sequence reads of average length 250 bp for the MAQC A and B samples and introduced a data analysis pipeline for translating cDNA read counts into gene expression levels. Using BLAST, 90% of the reads mapped to the human genome and 64% of the reads mapped to the RefSeq database of well annotated genes with e-values ≤ 10^-20. We measured gene expression levels in the A and B samples by counting the numbers of reads that mapped to individual RefSeq genes in multiple sequencing runs to evaluate the MAQC quality metrics for reproducibility, sensitivity, specificity, and accuracy and compared the results with DNA microarrays and Quantitative RT-PCR (QRTPCR) from the MAQC studies. In addition, 88% of the reads were successfully aligned directly to the human genome using the AceView alignment programs with an average 90% sequence similarity to identify 137,899 unique exon junctions, including 22,193 new exon junctions not yet contained in the RefSeq database. Conclusion Using the MAQC metrics for evaluating the performance of gene expression platforms, the ExpressSeq results for gene expression levels showed excellent reproducibility, sensitivity, and specificity that improved systematically with increasing shotgun sequencing depth, and quantitative accuracy that was comparable to DNA microarrays and QRTPCR. In addition, a careful mapping of the reads to the genome using the AceView alignment programs shed new light on the complexity of the human transcriptome including the discovery of thousands of new splice variants.
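
    The abstract does not spell out how read counts are translated into expression levels. One common and very simple normalization, shown here purely as an illustrative sketch and not as the authors' pipeline, is reads per million mapped reads (RPM): count the reads assigned to each gene and scale by the total number of assigned reads. The gene labels in the test data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def reads_per_million(gene_hits: Iterable[str]) -> Dict[str, float]:
    """Convert a stream of per-read gene assignments into RPM values.

    gene_hits holds one gene identifier per mapped read. Reads that did
    not map to any gene are assumed to have been dropped beforehand.
    """
    counts = Counter(gene_hits)
    total = sum(counts.values())
    # An empty input yields an empty dict, so no division by zero occurs.
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}
```

    Because RPM divides by sequencing depth, values from runs of different sizes become comparable, which is the property a depth-versus-reproducibility comparison like the one described above relies on.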

  14. Zseq: An Approach for Preprocessing Next-Generation Sequencing Data.

    Science.gov (United States)

    Alkhateeb, Abedalrhman; Rueda, Luis

    2017-08-01

    Next-generation sequencing technology generates a huge number of reads (short sequences), which contain a vast amount of genomic data. The sequencing process, however, comes with artifacts. Preprocessing of sequences is mandatory for further downstream analysis. We present Zseq, a linear method that identifies the most informative genomic sequences and reduces the number of biased sequences, sequence duplications, and ambiguous nucleotides. Zseq finds the complexity of the sequences by counting the number of unique k-mers in each sequence as its corresponding score and also takes into account other factors such as ambiguous nucleotides or high GC-content percentage in k-mers. Based on a z-score threshold, Zseq sweeps through the sequences again and filters those with a z-score less than the user-defined threshold. Zseq algorithm is able to provide a better mapping rate; it reduces the number of ambiguous bases significantly in comparison with other methods. Evaluation of the filtered reads has been conducted by aligning the reads and assembling the transcripts using the reference genome as well as de novo assembly. The assembled transcripts show a better discriminative ability to separate cancer and normal samples in comparison with another state-of-the-art method. Moreover, de novo assembled transcripts from the reads filtered by Zseq have longer genomic sequences than other tested methods. Estimating the threshold of the cutoff point is introduced using labeling rules with optimistic results.
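
    The scoring-and-filtering idea described above can be sketched in a few lines. This is a simplified illustration, not the published Zseq implementation: it scores each read by its number of distinct k-mers, ignores k-mers containing ambiguous bases, omits the GC-content adjustment, and drops reads whose z-score falls below a threshold. The function names and default parameters are assumptions made for the example.

```python
from statistics import mean, pstdev
from typing import List, Sequence

VALID_BASES = set("ACGT")


def unique_kmers(seq: str, k: int) -> int:
    """Count distinct k-mers in seq, skipping k-mers with ambiguous bases."""
    kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return sum(1 for kmer in kmers if set(kmer) <= VALID_BASES)


def zscore_filter(reads: Sequence[str], k: int = 4,
                  threshold: float = -1.0) -> List[str]:
    """Keep reads whose k-mer complexity z-score is at least threshold."""
    if not reads:
        return []
    # First pass: one complexity score per read.
    scores = [unique_kmers(read.upper(), k) for read in reads]
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        # All reads are equally complex, so nothing stands out as an outlier.
        return list(reads)
    # Second pass: filter on the standardized score.
    return [read for read, s in zip(reads, scores)
            if (s - mu) / sigma >= threshold]
```

    The two passes mirror the "sweeps through the sequences again" wording: scores must be collected for every read before the mean and standard deviation, and hence any z-score, can be computed.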

  15. Cost-effective sequencing of full-length cDNA clones powered by a de novo-reference hybrid assembly.

    Science.gov (United States)

    Kuroshu, Reginaldo M; Watanabe, Junichi; Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka; Kasahara, Masahiro

    2010-05-07

    Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence approximately 800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only approximately US$3 per clone, demonstrating a significant advantage over previous approaches.

  16. Sequencing of a QTL-rich region of the Theobroma cacao genome using pooled BACs and the identification of trait specific candidate genes

    Directory of Open Access Journals (Sweden)

    Blackmon Barbara P

    2011-07-01

    Full Text Available Abstract Background BAC-based physical maps provide for sequencing across an entire genome or a selected sub-genomic region of biological interest. Such a region can be approached with next-generation whole-genome sequencing and assembly as if it were an independent small genome. Using the minimum tiling path as a guide, specific BAC clones representing the prioritized genomic interval are selected, pooled, and used to prepare a sequencing library. Results This pooled BAC approach was taken to sequence and assemble a QTL-rich region, of ~3 Mbp and represented by twenty-seven BACs, on linkage group 5 of the Theobroma cacao cv. Matina 1-6 genome. Using various mixtures of read coverages from paired-end and linear 454 libraries, multiple assemblies of varied quality were generated. Quality was assessed by comparing the assembly of 454 reads with a subset of ten BACs individually sequenced and assembled using Sanger reads. A mixture of reads optimal for assembly was identified. We found, furthermore, that a quality assembly suitable for serving as a reference genome template could be obtained even with a reduced depth of sequencing coverage. Annotation of the resulting assembly revealed several genes potentially responsible for three T. cacao traits: black pod disease resistance, bean shape index, and pod weight. Conclusions Our results, as with other pooled BAC sequencing reports, suggest that pooling portions of a minimum tiling path derived from a BAC-based physical map is an effective method to target sub-genomic regions for sequencing. While we focused on a single QTL region, other QTL regions of importance could be similarly sequenced allowing for biological discovery to take place before a high quality whole-genome assembly is completed.

  17. Harnessing Whole Genome Sequencing in Medical Mycology.

    Science.gov (United States)

    Cuomo, Christina A

    2017-01-01

    Comparative genome sequencing studies of human fungal pathogens enable identification of genes and variants associated with virulence and drug resistance. This review describes current approaches, resources, and advances in applying whole genome sequencing to study clinically important fungal pathogens. Genomes for some important fungal pathogens were only recently assembled, revealing gene family expansions in many species and extreme gene loss in one obligate species. The scale and scope of species sequenced is rapidly expanding, leveraging technological advances to assemble and annotate genomes with higher precision. By using iteratively improved reference assemblies or those generated de novo for new species, recent studies have compared the sequence of isolates representing populations or clinical cohorts. Whole genome approaches provide the resolution necessary for comparison of closely related isolates, for example, in the analysis of outbreaks or sampled across time within a single host. Genomic analysis of fungal pathogens has enabled both basic research and diagnostic studies. The increased scale of sequencing can be applied across populations, and new metagenomic methods allow direct analysis of complex samples.

  18. OTU analysis using metagenomic shotgun sequencing data.

    Directory of Open Access Journals (Sweden)

    Xiaolin Hao

    Full Text Available Because of technological limitations, the primer and amplification biases in targeted sequencing of 16S rRNA genes have veiled the true microbial diversity underlying environmental samples. However, the protocol of metagenomic shotgun sequencing provides 16S rRNA gene fragment data with natural immunity against the biases introduced during priming, and thus the potential to uncover the true structure of the microbial community by giving more accurate predictions of operational taxonomic units (OTUs). Nonetheless, the lack of statistically rigorous comparison between 16S rRNA gene fragments and other data types makes it difficult to interpret previously reported results using 16S rRNA gene fragments. Therefore, in the present work, we established a standard analysis pipeline that helps confirm whether differences in the data are true or are just due to potential technical bias. This pipeline is built by using simulated data to find optimal mapping and OTU prediction methods. The comparison between simulated datasets revealed a relationship between 16S rRNA gene fragments and full-length 16S rRNA sequences: a 16S rRNA gene fragment longer than 150 bp provides the same accuracy as a full-length 16S rRNA sequence under our proposed pipeline. This could serve as a good starting point for experimental design and make comparison between 16S rRNA gene fragment-based and targeted 16S rRNA sequencing-based surveys possible.

  19. BLEACHING EUCALYPTUS PULPS WITH SHORT SEQUENCES

    Directory of Open Access Journals (Sweden)

    Flaviana Reis Milagres

    2011-03-01

    Full Text Available Eucalyptus spp kraft pulp, due to its high content of hexenuronic acids, is quite easy to bleach. Therefore, investigations have been made attempting to decrease the number of stages in the bleaching process in order to minimize capital costs. This study focused on the evaluation of short ECF (Elemental Chlorine Free) and TCF (Totally Chlorine Free) sequences for bleaching oxygen delignified Eucalyptus spp kraft pulp to 90% ISO brightness: PMoDP (molybdenum catalyzed acid peroxide, chlorine dioxide and hydrogen peroxide), PMoD/P (molybdenum catalyzed acid peroxide, chlorine dioxide and hydrogen peroxide, without washing), PMoD(PO) (molybdenum catalyzed acid peroxide, chlorine dioxide and pressurized peroxide), D(EPO)DP (chlorine dioxide, oxidative extraction with oxygen and peroxide, chlorine dioxide and hydrogen peroxide), PMoQ(PO) (molybdenum catalyzed acid peroxide, DTPA and pressurized peroxide), and XPMoQ(PO) (enzyme, molybdenum catalyzed acid peroxide, DTPA and pressurized peroxide). Uncommon pulp treatments, such as molybdenum catalyzed acid peroxide (PMo) and xylanase (X) bleaching stages, were used. Among the ECF alternatives, the two-stage PMoD/P sequence proved highly cost-effective without affecting pulp quality in relation to the traditional D(EPO)DP sequence and produced better quality effluent in relation to the reference. However, a four stage sequence, XPMoQ(PO), was required to achieve full brightness using the TCF technology. This sequence was highly cost-effective although it only produced pulp of acceptable quality.

  20. Human Genome Sequencing in Health and Disease

    Science.gov (United States)

    Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

    2013-01-01

    Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320

  1. Targeting the Treponemal Microbiome of Digital Dermatitis Infections by High-Resolution Phylogenetic Analyses and Comparison with Fluorescent In Situ Hybridization

    DEFF Research Database (Denmark)

    Schou, Kirstine Klitgaard; Foix Bretó, Antoni; Boye, Mette

    2013-01-01

    Modern pyrosequencing technology allows for a more comprehensive approach than traditional Sanger sequencing for elucidating the etiology of bovine digital dermatitis. We sought to describe the composition and diversity of treponemes in digital dermatitis lesions by using deep sequencing of the V...

  2. Bioinformatics assisted breeding, from QTL to candidate genes

    NARCIS (Netherlands)

    Chibon, P.Y.

    2013-01-01

    Over the last decade, the amount of data generated by a single run of an NGS sequencer has come to exceed days of work done with Sanger sequencing. Metabolomics, proteomics and transcriptomics technologies have also been producing more and more information at an ever faster rate. In addition, the

  3. Single-cell sequencing in stem cell biology.

    Science.gov (United States)

    Wen, Lu; Tang, Fuchou

    2016-04-15

    Cell-to-cell variation and heterogeneity are fundamental and intrinsic characteristics of stem cell populations, but these differences are masked when bulk cells are used for omic analysis. Single-cell sequencing technologies serve as powerful tools to dissect cellular heterogeneity comprehensively and to identify distinct phenotypic cell types, even within a 'homogeneous' stem cell population. These technologies, including single-cell genome, epigenome, and transcriptome sequencing technologies, have been developing rapidly in recent years. The application of these methods to different types of stem cells, including pluripotent stem cells and tissue-specific stem cells, has led to exciting new findings in the stem cell field. In this review, we discuss the recent progress as well as future perspectives in the methodologies and applications of single-cell omic sequencing technologies.

  4. Multilocus Sequence Typing

    OpenAIRE

    Belén, Ana; Pavón, Ibarz; Maiden, Martin C.J.

    2009-01-01

    Multilocus sequence typing (MLST) was first proposed in 1998 as a typing approach that enables the unambiguous characterization of bacterial isolates in a standardized, reproducible, and portable manner using the human pathogen Neisseria meningitidis as the exemplar organism. Since then, the approach has been applied to a large and growing number of organisms by public health laboratories and research institutions. MLST data, shared by investigators over the world via the Internet, have been ...

  5. Achalasia Carcinoma Sequence

    OpenAIRE

    Makmun, Dadang

    2001-01-01

    We report a case of carcinoma of the esophagus in a 58-year-old woman with achalasia, which had been diagnosed 30 years earlier and initially treated surgically (myotomy), with symptoms recurring over the past 3 years. Given the progress of the disease, malignancy was strongly suspected, owing to prolonged stasis and mucosal irritation caused by achalasia (achalasia carcinoma sequence). Because of these contributing factors for the development of serious complications such as Malignan...

  6. Sequencing BPS spectra

    Energy Technology Data Exchange (ETDEWEB)

    Gukov, Sergei [Walter Burke Institute for Theoretical Physics, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125 (United States); Max-Planck-Institut für Mathematik, Vivatsgasse 7, D-53111 Bonn (Germany); Nawata, Satoshi [Walter Burke Institute for Theoretical Physics, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125 (United States); Centre for Quantum Geometry of Moduli Spaces, University of Aarhus, Nordre Ringgade 1, DK-8000 (Denmark); Saberi, Ingmar [Walter Burke Institute for Theoretical Physics, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125 (United States); Stošić, Marko [CAMGSD, Departamento de Matemática, Instituto Superior Técnico, Av. Rovisco Pais, 1049-001 Lisbon (Portugal); Mathematical Institute SANU, Knez Mihajlova 36, 11000 Belgrade (Serbia); Sułkowski, Piotr [Walter Burke Institute for Theoretical Physics, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125 (United States); Faculty of Physics, University of Warsaw, ul. Pasteura 5, 02-093 Warsaw (Poland)

    2016-03-02

    This paper provides both a detailed study of color-dependence of link homologies, as realized in physics as certain spaces of BPS states, and a broad study of the behavior of BPS states in general. We consider how the spectrum of BPS states varies as continuous parameters of a theory are perturbed. This question can be posed in a wide variety of physical contexts, and we answer it by proposing that the relationship between unperturbed and perturbed BPS spectra is described by a spectral sequence. These general considerations unify previous applications of spectral sequence techniques to physics, and explain from a physical standpoint the appearance of many spectral sequences relating various link homology theories to one another. We also study structural properties of colored HOMFLY homology for links and evaluate Poincaré polynomials in numerous examples. Among these structural properties is a novel “sliding” property, which can be explained by using (refined) modular S-matrix. This leads to the identification of modular transformations in Chern-Simons theory and 3d N=2 theory via the 3d/3d correspondence. Lastly, we introduce the notion of associated varieties as classical limits of recursion relations of colored superpolynomials of links, and study their properties.

  7. Sequencing BPS spectra

    International Nuclear Information System (INIS)

    Gukov, Sergei; Nawata, Satoshi; Saberi, Ingmar; Stošić, Marko; Sułkowski, Piotr

    2016-01-01

    This paper provides both a detailed study of color-dependence of link homologies, as realized in physics as certain spaces of BPS states, and a broad study of the behavior of BPS states in general. We consider how the spectrum of BPS states varies as continuous parameters of a theory are perturbed. This question can be posed in a wide variety of physical contexts, and we answer it by proposing that the relationship between unperturbed and perturbed BPS spectra is described by a spectral sequence. These general considerations unify previous applications of spectral sequence techniques to physics, and explain from a physical standpoint the appearance of many spectral sequences relating various link homology theories to one another. We also study structural properties of colored HOMFLY homology for links and evaluate Poincaré polynomials in numerous examples. Among these structural properties is a novel “sliding” property, which can be explained by using (refined) modular S-matrix. This leads to the identification of modular transformations in Chern-Simons theory and 3d N=2 theory via the 3d/3d correspondence. Lastly, we introduce the notion of associated varieties as classical limits of recursion relations of colored superpolynomials of links, and study their properties.

  8. The diploid genome sequence of an Asian individual

    DEFF Research Database (Denmark)

    Wang, Jun; Wang, Wei; Li, Ruiqiang

    2008-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J...

  9. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes.

    Science.gov (United States)

    Hu, H; Haas, S A; Chelly, J; Van Esch, H; Raynaud, M; de Brouwer, A P M; Weinert, S; Froyen, G; Frints, S G M; Laumonnier, F; Zemojtel, T; Love, M I; Richard, H; Emde, A-K; Bienek, M; Jensen, C; Hambrock, M; Fischer, U; Langnick, C; Feldkamp, M; Wissink-Lindhout, W; Lebrun, N; Castelnau, L; Rucci, J; Montjean, R; Dorseuil, O; Billuart, P; Stuhlmann, T; Shaw, M; Corbett, M A; Gardner, A; Willis-Owen, S; Tan, C; Friend, K L; Belet, S; van Roozendaal, K E P; Jimenez-Pocquet, M; Moizard, M-P; Ronce, N; Sun, R; O'Keeffe, S; Chenna, R; van Bömmel, A; Göke, J; Hackett, A; Field, M; Christie, L; Boyle, J; Haan, E; Nelson, J; Turner, G; Baynam, G; Gillessen-Kaesbach, G; Müller, U; Steinberger, D; Budny, B; Badura-Stronka, M; Latos-Bieleńska, A; Ousager, L B; Wieacker, P; Rodríguez Criado, G; Bondeson, M-L; Annerén, G; Dufke, A; Cohen, M; Van Maldergem, L; Vincent-Delorme, C; Echenne, B; Simon-Bouy, B; Kleefstra, T; Willemsen, M; Fryns, J-P; Devriendt, K; Ullmann, R; Vingron, M; Wrogemann, K; Wienker, T F; Tzschach, A; van Bokhoven, H; Gecz, J; Jentsch, T J; Chen, W; Ropers, H-H; Kalscheuer, V M

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or loci are yet to be identified. Here, we have investigated 405 unresolved families with XLID. We employed massively parallel sequencing of all X-chromosome exons in the index males. The majority of these males were previously tested negative for copy number variations and for mutations in a subset of known XLID genes by Sanger sequencing. In total, 745 X-chromosomal genes were screened. After stringent filtering, a total of 1297 non-recurrent exonic variants remained for prioritization. Co-segregation analysis of potential clinically relevant changes revealed that 80 families (20%) carried pathogenic variants in established XLID genes. In 19 families, we detected likely causative protein truncating and missense variants in 7 novel and validated XLID genes (CLCN4, CNKSR2, FRMPD4, KLHL15, LAS1L, RLIM and USP27X) and potentially deleterious variants in 2 novel candidate XLID genes (CDK16 and TAF1). We show that the CLCN4 and CNKSR2 variants impair protein functions as indicated by electrophysiological studies and altered differentiation of cultured primary neurons from Clcn4(-/-) mice or after mRNA knock-down. The newly identified and candidate XLID proteins belong to pathways and networks with established roles in cognitive function and intellectual disability in particular. We suggest that systematic sequencing of all X-chromosomal genes in a cohort of patients with genetic evidence for X-chromosome locus involvement may resolve up to 58% of Fragile X-negative cases.

  10. Sequence and Analysis of the Genome of the Pathogenic Yeast Candida orthopsilosis

    Science.gov (United States)

    Riccombeni, Alessandro; Vidanes, Genevieve; Proux-Wéra, Estelle; Wolfe, Kenneth H.; Butler, Geraldine

    2012-01-01

    Candida orthopsilosis is closely related to the fungal pathogen Candida parapsilosis. However, whereas C. parapsilosis is a major cause of disease in immunosuppressed individuals and in premature neonates, C. orthopsilosis is more rarely associated with infection. We sequenced the C. orthopsilosis genome to facilitate the identification of genes associated with virulence. Here, we report the de novo assembly and annotation of the genome of a Type 2 isolate of C. orthopsilosis. The sequence was obtained by combining data from next generation sequencing (454 Life Sciences and Illumina) with paired-end Sanger reads from a fosmid library. The final assembly contains 12.6 Mb on 8 chromosomes. The genome was annotated using an automated pipeline based on comparative analysis of genomes of Candida species, together with manual identification of introns. We identified 5700 protein-coding genes in C. orthopsilosis, of which 5570 have an ortholog in C. parapsilosis. The time of divergence between C. orthopsilosis and C. parapsilosis is estimated to be twice as great as that between Candida albicans and Candida dubliniensis. There has been an expansion of the Hyr/Iff family of cell wall genes and the JEN family of monocarboxylic transporters in C. parapsilosis relative to C. orthopsilosis. We identified one gene from a Maltose/Galactoside O-acetyltransferase family that originated by horizontal gene transfer from a bacterium to the common ancestor of C. orthopsilosis and C. parapsilosis. We report that TFB3, a component of the general transcription factor TFIIH, undergoes alternative splicing by intron retention in multiple Candida species. We also show that an intein in the vacuolar ATPase gene VMA1 is present in C. orthopsilosis but not C. parapsilosis, and has a patchy distribution in Candida species. Our results suggest that the difference in virulence between C. parapsilosis and C. orthopsilosis may be associated with expansion of gene families. PMID:22563396

  11. Targeted next-generation sequencing analysis identifies novel mutations in families with severe familial exudative vitreoretinopathy

    Science.gov (United States)

    Huang, Xiao-Yan; Zhuang, Hong; Wu, Ji-Hong; Li, Jian-Kang; Hu, Fang-Yuan; Zheng, Yu; Tellier, Laurent Christian Asker M.; Zhang, Sheng-Hai; Gao, Feng-Juan; Zhang, Jian-Guo

    2017-01-01

    Purpose: Familial exudative vitreoretinopathy (FEVR) is a genetically and clinically heterogeneous disease, characterized by failure of vascular development of the peripheral retina. The symptoms of FEVR vary widely among patients in the same family, and even between the two eyes of a given patient. This study was designed to identify the genetic defect in a patient cohort of ten Chinese families with a definitive diagnosis of FEVR. Methods: To identify the causative gene, next-generation sequencing (NGS)-based target capture sequencing was performed. Segregation analysis of the candidate variant was performed in additional family members by using Sanger sequencing and quantitative real-time PCR (QPCR). Results: Of the cohort of ten FEVR families, six pathogenic variants were identified, including four novel and two known heterozygous mutations. Of the variants identified, four were missense variants, and two were novel heterozygous deletion mutations [LRP5, c.4053 DelC (p.Ile1351IlefsX88); TSPAN12, EX8Del]. The two novel heterozygous deletion mutations were not observed in the control subjects and could give rise to a relatively severe FEVR phenotype, which could be explained by the protein function prediction. Conclusions: We identified two novel heterozygous deletion mutations [LRP5, c.4053 DelC (p.Ile1351IlefsX88); TSPAN12, EX8Del] using targeted NGS as a causative mutation for FEVR. These genetic deletion variations exhibit a severe form of FEVR, with tractional retinal detachments compared with other known point mutations. The data further enrich the mutation spectrum of FEVR and enhance our understanding of genotype–phenotype correlations to provide useful information for disease diagnosis, prognosis, and effective genetic counseling. PMID:28867931

  12. Exome sequencing of bilateral testicular germ cell tumors suggests independent development lineages.

    Science.gov (United States)

    Brabrand, Sigmund; Johannessen, Bjarne; Axcrona, Ulrika; Kraggerud, Sigrid M; Berg, Kaja G