WorldWideScience

Sample records for high-throughput amplicon sequencing

  1. High throughput 16S rRNA gene amplicon sequencing

    DEFF Research Database (Denmark)

    Nierychlo, Marta; Larsen, Poul; Jørgensen, Mads Koustrup

    S rRNA gene amplicon sequencing has been developed over the past few years and is now ready to use for more comprehensive studies related to plant operation and optimization thanks to short analysis time, low cost, high throughput, and high taxonomic resolution. In this study we show how 16S r......RNA gene amplicon sequencing can be used to reveal factors of importance for the operation of full-scale nutrient removal plants related to settling problems and floc properties. Using optimized DNA extraction protocols, indexed primers and our in-house Illumina platform, we prepared multiple samples...... be correlated to the presence of the species that are regarded as “strong” and “weak” floc formers. In conclusion, 16S rRNA gene amplicon sequencing provides a high throughput approach for a rapid and cheap community profiling of activated sludge that in combination with multivariate statistics can be used...

  2. A priori Considerations When Conducting High-Throughput Amplicon-Based Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Aditi Sengupta

    2016-03-01

    Full Text Available Amplicon-based sequencing strategies that include 16S rRNA and functional genes, alongside “meta-omics” analyses of communities of microorganisms, have allowed researchers to pose questions and find answers to “who” is present in the environment and “what” they are doing. Next-generation sequencing approaches that aid microbial ecology studies of agricultural systems are fast gaining popularity among agronomy, crop, soil, and environmental science researchers. Given the rapid development of these high-throughput sequencing techniques, researchers with no prior experience will desire information about the best practices that can be used before actually starting high-throughput amplicon-based sequence analyses. We have outlined items that need to be carefully considered in experimental design, sampling, basic bioinformatics, sequencing of mock communities and negative controls, acquisition of metadata, and in standardization of reaction conditions as per experimental requirements. Not all considerations mentioned here may pertain to a particular study. The overall goal is to inform researchers about considerations that must be taken into account when conducting high-throughput microbial DNA sequencing and sequences analysis.

  3. SEED 2: a user-friendly platform for amplicon high-throughput sequencing data analyses.

    Science.gov (United States)

    Vetrovský, Tomáš; Baldrian, Petr; Morais, Daniel; Berger, Bonnie

    2018-02-14

    Modern molecular methods have increased our ability to describe microbial communities. Along with the advances brought by new sequencing technologies, we now require intensive computational resources to make sense of the large numbers of sequences continuously produced. The software developed by the scientific community to address this demand, although very useful, require experience of the command-line environment, extensive training and have steep learning curves, limiting their use. We created SEED 2, a graphical user interface for handling high-throughput amplicon-sequencing data under Windows operating systems. SEED 2 is the only sequence visualizer that empowers users with tools to handle amplicon-sequencing data of microbial community markers. It is suitable for any marker genes sequences obtained through Illumina, IonTorrent or Sanger sequencing. SEED 2 allows the user to process raw sequencing data, identify specific taxa, produce of OTU-tables, create sequence alignments and construct phylogenetic trees. Standard dual core laptops with 8 GB of RAM can handle ca. 8 million of Illumina PE 300 bp sequences, ca. 4GB of data. SEED 2 was implemented in Object Pascal and uses internal functions and external software for amplicon data processing. SEED 2 is a freeware software, available at http://www.biomed.cas.cz/mbu/lbwrf/seed/ as a self-contained file, including all the dependencies, and does not require installation. Supplementary data contain a comprehensive list of supported functions. daniel.morais@biomed.cas.cz. Supplementary data are available at Bioinformatics online. © The Author(s) 2018. Published by Oxford University Press.

  4. Mining environmental high-throughput sequence data sets to identify divergent amplicon clusters for phylogenetic reconstruction and morphotype visualization.

    Science.gov (United States)

    Gimmler, Anna; Stoeck, Thorsten

    2015-08-01

    Environmental high-throughput sequencing (envHTS) is a very powerful tool, which in protistan ecology is predominantly used for the exploration of diversity and its geographic and local patterns. We here used a pyrosequenced V4-SSU rDNA data set from a solar saltern pond as test case to exploit such massive protistan amplicon data sets beyond this descriptive purpose. Therefore, we combined a Swarm-based blastn network including 11 579 ciliate V4 amplicons to identify divergent amplicon clusters with targeted polymerase chain reaction (PCR) primer design for full-length small subunit of the ribosomal DNA retrieval and probe design for fluorescence in situ hybridization (FISH). This powerful strategy allows to benefit from envHTS data sets to (i) reveal the phylogenetic position of the taxon behind divergent amplicons; (ii) improve phylogenetic resolution and evolutionary history of specific taxon groups; (iii) solidly assess an amplicons (species') degree of similarity to its closest described relative; (iv) visualize the morphotype behind a divergent amplicons cluster; (v) rapidly FISH screen many environmental samples for geographic/habitat distribution and abundances of the respective organism and (vi) to monitor the success of enrichment strategies in live samples for cultivation and isolation of the respective organisms. © 2015 Society for Applied Microbiology and John Wiley & Sons Ltd.

  5. Unraveling Core Functional Microbiota in Traditional Solid-State Fermentation by High-Throughput Amplicons and Metatranscriptomics Sequencing

    Directory of Open Access Journals (Sweden)

    Zhewei Song

    2017-07-01

    Full Text Available Fermentation microbiota is specific microorganisms that generate different types of metabolites in many productions. In traditional solid-state fermentation, the structural composition and functional capacity of the core microbiota determine the quality and quantity of products. As a typical example of food fermentation, Chinese Maotai-flavor liquor production involves a complex of various microorganisms and a wide variety of metabolites. However, the microbial succession and functional shift of the core microbiota in this traditional food fermentation remain unclear. Here, high-throughput amplicons (16S rRNA gene amplicon sequencing and internal transcribed space amplicon sequencing and metatranscriptomics sequencing technologies were combined to reveal the structure and function of the core microbiota in Chinese soy sauce aroma type liquor production. In addition, ultra-performance liquid chromatography and headspace-solid phase microextraction-gas chromatography-mass spectrometry were employed to provide qualitative and quantitative analysis of the major flavor metabolites. A total of 10 fungal and 11 bacterial genera were identified as the core microbiota. In addition, metatranscriptomic analysis revealed pyruvate metabolism in yeasts (genera Pichia, Schizosaccharomyces, Saccharomyces, and Zygosaccharomyces and lactic acid bacteria (genus Lactobacillus classified into two stages in the production of flavor components. Stage I involved high-level alcohol (ethanol production, with the genus Schizosaccharomyces serving as the core functional microorganism. Stage II involved high-level acid (lactic acid and acetic acid production, with the genus Lactobacillus serving as the core functional microorganism. The functional shift from the genus Schizosaccharomyces to the genus Lactobacillus drives flavor component conversion from alcohol (ethanol to acid (lactic acid and acetic acid in Chinese Maotai-flavor liquor production. Our findings provide

  6. Unraveling Core Functional Microbiota in Traditional Solid-State Fermentation by High-Throughput Amplicons and Metatranscriptomics Sequencing.

    Science.gov (United States)

    Song, Zhewei; Du, Hai; Zhang, Yan; Xu, Yan

    2017-01-01

    Fermentation microbiota is specific microorganisms that generate different types of metabolites in many productions. In traditional solid-state fermentation, the structural composition and functional capacity of the core microbiota determine the quality and quantity of products. As a typical example of food fermentation, Chinese Maotai-flavor liquor production involves a complex of various microorganisms and a wide variety of metabolites. However, the microbial succession and functional shift of the core microbiota in this traditional food fermentation remain unclear. Here, high-throughput amplicons (16S rRNA gene amplicon sequencing and internal transcribed space amplicon sequencing) and metatranscriptomics sequencing technologies were combined to reveal the structure and function of the core microbiota in Chinese soy sauce aroma type liquor production. In addition, ultra-performance liquid chromatography and headspace-solid phase microextraction-gas chromatography-mass spectrometry were employed to provide qualitative and quantitative analysis of the major flavor metabolites. A total of 10 fungal and 11 bacterial genera were identified as the core microbiota. In addition, metatranscriptomic analysis revealed pyruvate metabolism in yeasts (genera Pichia, Schizosaccharomyces, Saccharomyces , and Zygosaccharomyces ) and lactic acid bacteria (genus Lactobacillus ) classified into two stages in the production of flavor components. Stage I involved high-level alcohol (ethanol) production, with the genus Schizosaccharomyces serving as the core functional microorganism. Stage II involved high-level acid (lactic acid and acetic acid) production, with the genus Lactobacillus serving as the core functional microorganism. The functional shift from the genus Schizosaccharomyces to the genus Lactobacillus drives flavor component conversion from alcohol (ethanol) to acid (lactic acid and acetic acid) in Chinese Maotai-flavor liquor production. Our findings provide insight into

  7. PipeCraft: Flexible open-source toolkit for bioinformatics analysis of custom high-throughput amplicon sequencing data.

    Science.gov (United States)

    Anslan, Sten; Bahram, Mohammad; Hiiesalu, Indrek; Tedersoo, Leho

    2017-11-01

    High-throughput sequencing methods have become a routine analysis tool in environmental sciences as well as in public and private sector. These methods provide vast amount of data, which need to be analysed in several steps. Although the bioinformatics may be applied using several public tools, many analytical pipelines allow too few options for the optimal analysis for more complicated or customized designs. Here, we introduce PipeCraft, a flexible and handy bioinformatics pipeline with a user-friendly graphical interface that links several public tools for analysing amplicon sequencing data. Users are able to customize the pipeline by selecting the most suitable tools and options to process raw sequences from Illumina, Pacific Biosciences, Ion Torrent and Roche 454 sequencing platforms. We described the design and options of PipeCraft and evaluated its performance by analysing the data sets from three different sequencing platforms. We demonstrated that PipeCraft is able to process large data sets within 24 hr. The graphical user interface and the automated links between various bioinformatics tools enable easy customization of the workflow. All analytical steps and options are recorded in log files and are easily traceable. © 2017 John Wiley & Sons Ltd.

  8. Biphasic Study to Characterize Agricultural Biogas Plants by High-Throughput 16S rRNA Gene Amplicon Sequencing and Microscopic Analysis.

    Science.gov (United States)

    Maus, Irena; Kim, Yong Sung; Wibberg, Daniel; Stolze, Yvonne; Off, Sandra; Antonczyk, Sebastian; Pühler, Alfred; Scherer, Paul; Schlüter, Andreas

    2017-02-28

    Process surveillance within agricultural biogas plants (BGPs) was concurrently studied by high-throughput 16S rRNA gene amplicon sequencing and an optimized quantitative microscopic fingerprinting (QMF) technique. In contrast to 16S rRNA gene amplicons, digitalized microscopy is a rapid and cost-effective method that facilitates enumeration and morphological differentiation of the most significant groups of methanogens regarding their shape and characteristic autofluorescent factor 420. Moreover, the fluorescence signal mirrors cell vitality. In this study, four different BGPs were investigated. The results indicated stable process performance in the mesophilic BGPs and in the thermophilic reactor. Bacterial subcommunity characterization revealed significant differences between the four BGPs. Most remarkably, the genera Defluviitoga and Halocella dominated the thermophilic bacterial subcommunity, whereas members of another taxon, Syntrophaceticus , were found to be abundant in the mesophilic BGP. The domain Archaea was dominated by the genus Methanoculleus in all four BGPs, followed by Methanosaeta in BGP1 and BGP3. In contrast, Methanothermobacter members were highly abundant in the thermophilic BGP4. Furthermore, a high consistency between the sequencing approach and the QMF method was shown, especially for the thermophilic BGP. The differences elucidated that using this biphasic approach for mesophilic BGPs provided novel insights regarding disaggregated single cells of Methanosarcina and Methanosaeta species. Both dominated the archaeal subcommunity and replaced coccoid Methanoculleus members belonging to the same group of Methanomicrobiales that have been frequently observed in similar BGPs. This work demonstrates that combining QMF and 16S rRNA gene amplicon sequencing is a complementary strategy to describe archaeal community structures within biogas processes.

  9. A novel ultra high-throughput 16S rRNA gene amplicon sequencing library preparation method for the Illumina HiSeq platform.

    Science.gov (United States)

    de Muinck, Eric J; Trosvik, Pål; Gilfillan, Gregor D; Hov, Johannes R; Sundaram, Arvind Y M

    2017-07-06

    Advances in sequencing technologies and bioinformatics have made the analysis of microbial communities almost routine. Nonetheless, the need remains to improve on the techniques used for gathering such data, including increasing throughput while lowering cost and benchmarking the techniques so that potential sources of bias can be better characterized. We present a triple-index amplicon sequencing strategy to sequence large numbers of samples at significantly lower c ost and in a shorter timeframe compared to existing methods. The design employs a two-stage PCR protocol, incorpo rating three barcodes to each sample, with the possibility to add a fourth-index. It also includes heterogeneity spacers to overcome low complexity issues faced when sequencing amplicons on Illumina platforms. The library preparation method was extensively benchmarked through analysis of a mock community in order to assess biases introduced by sample indexing, number of PCR cycles, and template concentration. We further evaluated the method through re-sequencing of a standardized environmental sample. Finally, we evaluated our protocol on a set of fecal samples from a small cohort of healthy adults, demonstrating good performance in a realistic experimental setting. Between-sample variation was mainly related to batch effects, such as DNA extraction, while sample indexing was also a significant source of bias. PCR cycle number strongly influenced chimera formation and affected relative abundance estimates of species with high GC content. Libraries were sequenced using the Illumina HiSeq and MiSeq platforms to demonstrate that this protocol is highly scalable to sequence thousands of samples at a very low cost. Here, we provide the most comprehensive study of performance and bias inherent to a 16S rRNA gene amplicon sequencing method to date. Triple-indexing greatly reduces the number of long custom DNA oligos required for library preparation, while the inclusion of variable length

  10. Two-stage clustering (TSC: a pipeline for selecting operational taxonomic units for the high-throughput sequencing of PCR amplicons.

    Directory of Open Access Journals (Sweden)

    Xiao-Tao Jiang

    Full Text Available Clustering 16S/18S rRNA amplicon sequences into operational taxonomic units (OTUs is a critical step for the bioinformatic analysis of microbial diversity. Here, we report a pipeline for selecting OTUs with a relatively low computational demand and a high degree of accuracy. This pipeline is referred to as two-stage clustering (TSC because it divides tags into two groups according to their abundance and clusters them sequentially. The more abundant group is clustered using a hierarchical algorithm similar to that in ESPRIT, which has a high degree of accuracy but is computationally costly for large datasets. The rarer group, which includes the majority of tags, is then heuristically clustered to improve efficiency. To further improve the computational efficiency and accuracy, two preclustering steps are implemented. To maintain clustering accuracy, all tags are grouped into an OTU depending on their pairwise Needleman-Wunsch distance. This method not only improved the computational efficiency but also mitigated the spurious OTU estimation from 'noise' sequences. In addition, OTUs clustered using TSC showed comparable or improved performance in beta-diversity comparisons compared to existing OTU selection methods. This study suggests that the distribution of sequencing datasets is a useful property for improving the computational efficiency and increasing the clustering accuracy of the high-throughput sequencing of PCR amplicons. The software and user guide are freely available at http://hwzhoulab.smu.edu.cn/paperdata/.

  11. High-throughput sequencing of 16S rRNA gene amplicons: effects of extraction procedure, primer length and annealing temperature.

    Science.gov (United States)

    Sergeant, Martin J; Constantinidou, Chrystala; Cogan, Tristan; Penn, Charles W; Pallen, Mark J

    2012-01-01

    The analysis of 16S-rDNA sequences to assess the bacterial community composition of a sample is a widely used technique that has increased with the advent of high throughput sequencing. Although considerable effort has been devoted to identifying the most informative region of the 16S gene and the optimal informatics procedures to process the data, little attention has been paid to the PCR step, in particular annealing temperature and primer length. To address this, amplicons derived from 16S-rDNA were generated from chicken caecal content DNA using different annealing temperatures, primers and different DNA extraction procedures. The amplicons were pyrosequenced to determine the optimal protocols for capture of maximum bacterial diversity from a chicken caecal sample. Even at very low annealing temperatures there was little effect on the community structure, although the abundance of some OTUs such as Bifidobacterium increased. Using shorter primers did not reveal any novel OTUs but did change the community profile obtained. Mechanical disruption of the sample by bead beating had a significant effect on the results obtained, as did repeated freezing and thawing. In conclusion, existing primers and standard annealing temperatures captured as much diversity as lower annealing temperatures and shorter primers.

  12. Sources of PCR-induced distortions in high-throughput sequencing data sets

    Science.gov (United States)

    Kebschull, Justus M.; Zador, Anthony M.

    2015-01-01

    PCR permits the exponential and sequence-specific amplification of DNA, even from minute starting quantities. PCR is a fundamental step in preparing DNA samples for high-throughput sequencing. However, there are errors associated with PCR-mediated amplification. Here we examine the effects of four important sources of error—bias, stochasticity, template switches and polymerase errors—on sequence representation in low-input next-generation sequencing libraries. We designed a pool of diverse PCR amplicons with a defined structure, and then used Illumina sequencing to search for signatures of each process. We further developed quantitative models for each process, and compared predictions of these models to our experimental data. We find that PCR stochasticity is the major force skewing sequence representation after amplification of a pool of unique DNA amplicons. Polymerase errors become very common in later cycles of PCR but have little impact on the overall sequence distribution as they are confined to small copy numbers. PCR template switches are rare and confined to low copy numbers. Our results provide a theoretical basis for removing distortions from high-throughput sequencing data. In addition, our findings on PCR stochasticity will have particular relevance to quantification of results from single cell sequencing, in which sequences are represented by only one or a few molecules. PMID:26187991

  13. Very high resolution single pass HLA genotyping using amplicon sequencing on the 454 next generation DNA sequencers: Comparison with Sanger sequencing.

    Science.gov (United States)

    Yamamoto, F; Höglund, B; Fernandez-Vina, M; Tyan, D; Rastrou, M; Williams, T; Moonsamy, P; Goodridge, D; Anderson, M; Erlich, H A; Holcomb, C L

    2015-12-01

    Compared to Sanger sequencing, next-generation sequencing offers advantages for high resolution HLA genotyping including increased throughput, lower cost, and reduced genotype ambiguity. Here we describe an enhancement of the Roche 454 GS GType HLA genotyping assay to provide very high resolution (VHR) typing, by the addition of 8 primer pairs to the original 14, to genotype 11 HLA loci. These additional amplicons help resolve common and well-documented alleles and exclude commonly found null alleles in genotype ambiguity strings. Simplification of workflow to reduce the initial preparation effort using early pooling of amplicons or the Fluidigm Access Array™ is also described. Performance of the VHR assay was evaluated on 28 well characterized cell lines using Conexio Assign MPS software which uses genomic, rather than cDNA, reference sequence. Concordance was 98.4%; 1.6% had no genotype assignment. Of concordant calls, 53% were unambiguous. To further assess the assay, 59 clinical samples were genotyped and results compared to unambiguous allele assignments obtained by prior sequence-based typing supplemented with SSO and/or SSP. Concordance was 98.7% with 58.2% as unambiguous calls; 1.3% could not be assigned. Our results show that the amplicon-based VHR assay is robust and can replace current Sanger methodology. Together with software enhancements, it has the potential to provide even higher resolution HLA typing. Copyright © 2015. Published by Elsevier Inc.

  14. Fungi Sailing the Arctic Ocean: Speciose Communities in North Atlantic Driftwood as Revealed by High-Throughput Amplicon Sequencing.

    Science.gov (United States)

    Rämä, Teppo; Davey, Marie L; Nordén, Jenni; Halvorsen, Rune; Blaalid, Rakel; Mathiassen, Geir H; Alsos, Inger G; Kauserud, Håvard

    2016-08-01

    High amounts of driftwood sail across the oceans and provide habitat for organisms tolerating the rough and saline environment. Fungi have adapted to the extremely cold and saline conditions which driftwood faces in the high north. For the first time, we applied high-throughput sequencing to fungi residing in driftwood to reveal their taxonomic richness, community composition, and ecology in the North Atlantic. Using pyrosequencing of ITS2 amplicons obtained from 49 marine logs, we found 807 fungal operational taxonomic units (OTUs) based on clustering at 97 % sequence similarity cut-off level. The phylum Ascomycota comprised 74 % of the OTUs and 20 % belonged to Basidiomycota. The richness of basidiomycetes decreased with prolonged submersion in the sea, supporting the general view of ascomycetes being more extremotolerant. However, more than one fourth of the fungal OTUs remained unassigned to any fungal class, emphasising the need for better DNA reference data from the marine habitat. Different fungal communities were detected in coniferous and deciduous logs. Our results highlight that driftwood hosts a considerably higher fungal diversity than currently known. The driftwood fungal community is not a terrestrial relic but a speciose assemblage of fungi adapted to the stressful marine environment and different kinds of wooden substrates found in it.

  15. Ultra-high resolution HLA genotyping and allele discovery by highly multiplexed cDNA amplicon pyrosequencing

    Directory of Open Access Journals (Sweden)

    Lank Simon M

    2012-08-01

    Full Text Available Abstract Background High-resolution HLA genotyping is a critical diagnostic and research assay. Current methods rarely achieve unambiguous high-resolution typing without making population-specific frequency inferences due to a lack of locus coverage and difficulty in exon-phase matching. Achieving high-resolution typing is also becoming more challenging with traditional methods as the database of known HLA alleles increases. Results We designed a cDNA amplicon-based pyrosequencing method to capture 94% of the HLA class I open-reading-frame with only two amplicons per sample, and an analogous method for class II HLA genes, with a primary focus on sequencing the DRB loci. We present a novel Galaxy server-based analysis workflow for determining genotype. During assay validation, we performed two GS Junior sequencing runs to determine the accuracy of the HLA class I amplicons and DRB amplicon at different levels of multiplexing. When 116 amplicons were multiplexed, we unambiguously resolved 99%of class I alleles to four- or six-digit resolution, as well as 100% unambiguous DRB calls. The second experiment, with 271 multiplexed amplicons, missed some alleles, but generated high-resolution, concordant typing for 93% of class I alleles, and 96% for DRB1 alleles. In a third, preliminary experiment we attempted to sequence novel amplicons for other class II loci with mixed success. Conclusions The presented assay is higher-throughput and higher-resolution than existing HLA genotyping methods, and suitable for allele discovery or large cohort sampling. The validated class I and DRB primers successfully generated unambiguously high-resolution genotypes, while further work is needed to validate additional class II genotyping amplicons.

  16. Combining Amplification Typing of L1 Active Subfamilies (ATLAS) with High-Throughput Sequencing.

    Science.gov (United States)

    Rahbari, Raheleh; Badge, Richard M

    2016-01-01

    With the advent of new generations of high-throughput sequencing technologies, the catalog of human genome variants created by retrotransposon activity is expanding rapidly. However, despite these advances in describing L1 diversity and the fact that L1 must retrotranspose in the germline or prior to germline partitioning to be evolutionarily successful, direct assessment of de novo L1 retrotransposition in the germline or early embryogenesis has not been achieved for endogenous L1 elements. A direct study of de novo L1 retrotransposition into susceptible loci within sperm DNA (Freeman et al., Hum Mutat 32(8):978-988, 2011) suggested that the rate of L1 retrotransposition in the germline is much lower than previously estimated (ATLAS L1 display technique (Badge et al., Am J Hum Genet 72(4):823-838, 2003) to investigate de novo L1 retrotransposition in human genomes. In this chapter, we describe how we combined a high-coverage ATLAS variant with high-throughput sequencing, achieving 11-25× sequence depth per single amplicon, to study L1 retrotransposition in whole genome amplified (WGA) DNAs.

  17. Improved Efficiency and Reliability of NGS Amplicon Sequencing Data Analysis for Genetic Diagnostic Procedures Using AGSA Software

    Directory of Open Access Journals (Sweden)

    Axel Poulet

    2016-01-01

    Full Text Available Screening for BRCA mutations in women with familial risk of breast or ovarian cancer is an ideal situation for high-throughput sequencing, providing large amounts of low cost data. However, 454, Roche, and Ion Torrent, Thermo Fisher, technologies produce homopolymer-associated indel errors, complicating their use in routine diagnostics. We developed software, named AGSA, which helps to detect false positive mutations in homopolymeric sequences. Seventy-two familial breast cancer cases were analysed in parallel by amplicon 454 pyrosequencing and Sanger dideoxy sequencing for genetic variations of the BRCA genes. All 565 variants detected by dideoxy sequencing were also detected by pyrosequencing. Furthermore, pyrosequencing detected 42 variants that were missed with Sanger technique. Six amplicons contained homopolymer tracts in the coding sequence that were systematically misread by the software supplied by Roche. Read data plotted as histograms by AGSA software aided the analysis considerably and allowed validation of the majority of homopolymers. As an optimisation, additional 250 patients were analysed using microfluidic amplification of regions of interest (Access Array Fluidigm of the BRCA genes, followed by 454 sequencing and AGSA analysis. AGSA complements a complete line of high-throughput diagnostic sequence analysis, reducing time and costs while increasing reliability, notably for homopolymer tracts.

  18. A quantitative and qualitative comparison of illumina MiSeq and 454 amplicon sequencing for genotyping the highly polymorphic major histocompatibility complex (MHC) in a non-model species.

    Science.gov (United States)

    Razali, Haslina; O'Connor, Emily; Drews, Anna; Burke, Terry; Westerdahl, Helena

    2017-07-28

    High-throughput sequencing enables high-resolution genotyping of extremely duplicated genes. 454 amplicon sequencing (454) has become the standard technique for genotyping the major histocompatibility complex (MHC) genes in non-model organisms. However, illumina MiSeq amplicon sequencing (MiSeq), which offers a much higher read depth, is now superseding 454. The aim of this study was to quantitatively and qualitatively evaluate the performance of MiSeq in relation to 454 for genotyping MHC class I alleles using a house sparrow (Passer domesticus) dataset with pedigree information. House sparrows provide a good study system for this comparison as their MHC class I genes have been studied previously and, consequently, we had prior expectations concerning the number of alleles per individual. We found that 454 and MiSeq performed equally well in genotyping amplicons with low diversity, i.e. amplicons from individuals that had fewer than 6 alleles. Although there was a higher rate of failure in the 454 dataset in resolving amplicons with higher diversity (6-9 alleles), the same genotypes were identified by both 454 and MiSeq in 98% of cases. We conclude that low diversity amplicons are equally well genotyped using either 454 or MiSeq, but the higher coverage afforded by MiSeq can lead to this approach outperforming 454 in amplicons with higher diversity.

  19. Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform.

    Science.gov (United States)

    Schirmer, Melanie; Ijaz, Umer Z; D'Amore, Rosalinda; Hall, Neil; Sloan, William T; Quince, Christopher

    2015-03-31

    With read lengths of currently up to 2 × 300 bp, high throughput and low sequencing costs Illumina's MiSeq is becoming one of the most utilized sequencing platforms worldwide. The platform is manageable and affordable even for smaller labs. This enables quick turnaround on a broad range of applications such as targeted gene sequencing, metagenomics, small genome sequencing and clinical molecular diagnostics. However, Illumina error profiles are still poorly understood and programs are therefore not designed for the idiosyncrasies of Illumina data. A better knowledge of the error patterns is essential for sequence analysis and vital if we are to draw valid conclusions. Studying true genetic variation in a population sample is fundamental for understanding diseases, evolution and origin. We conducted a large study on the error patterns for the MiSeq based on 16S rRNA amplicon sequencing data. We tested state-of-the-art library preparation methods for amplicon sequencing and showed that the library preparation method and the choice of primers are the most significant sources of bias and cause distinct error patterns. Furthermore we tested the efficiency of various error correction strategies and identified quality trimming (Sickle) combined with error correction (BayesHammer) followed by read overlapping (PANDAseq) as the most successful approach, reducing substitution error rates on average by 93%. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Efficient error correction for next-generation sequencing of viral amplicons.

    Science.gov (United States)

    Skums, Pavel; Dimitrova, Zoya; Campo, David S; Vaughan, Gilberto; Rossi, Livia; Forbi, Joseph C; Yokosawa, Jonny; Zelikovsky, Alex; Khudyakov, Yury

    2012-06-25

    Next-generation sequencing allows the analysis of an unprecedented number of viral sequence variants from infected patients, presenting a novel opportunity for understanding virus evolution, drug resistance and immune escape. However, sequencing in bulk is error prone. Thus, the generated data require error identification and correction. Most error-correction methods to date are not optimized for amplicon analysis and assume that the error rate is randomly distributed. Recent quality assessment of amplicon sequences obtained using 454-sequencing showed that the error rate is strongly linked to the presence and size of homopolymers, position in the sequence and length of the amplicon. All these parameters are strongly sequence specific and should be incorporated into the calibration of error-correction algorithms designed for amplicon sequencing. In this paper, we present two new efficient error correction algorithms optimized for viral amplicons: (i) k-mer-based error correction (KEC) and (ii) empirical frequency threshold (ET). Both were compared to a previously published clustering algorithm (SHORAH), in order to evaluate their relative performance on 24 experimental datasets obtained by 454-sequencing of amplicons with known sequences. All three algorithms show similar accuracy in finding true haplotypes. However, KEC and ET were significantly more efficient than SHORAH in removing false haplotypes and estimating the frequency of true ones. Both algorithms, KEC and ET, are highly suitable for rapid recovery of error-free haplotypes obtained by 454-sequencing of amplicons from heterogeneous viruses.The implementations of the algorithms and data sets used for their testing are available at: http://alan.cs.gsu.edu/NGS/?q=content/pyrosequencing-error-correction-algorithm.

  1. Automated degenerate PCR primer design for high-throughput sequencing improves efficiency of viral sequencing

    Directory of Open Access Journals (Sweden)

    Li Kelvin

    2012-11-01

    Full Text Available Abstract Background In a high-throughput environment, to PCR amplify and sequence a large set of viral isolates from populations that are potentially heterogeneous and continuously evolving, the use of degenerate PCR primers is an important strategy. Degenerate primers allow for the PCR amplification of a wider range of viral isolates with only one set of pre-mixed primers, thus increasing amplification success rates and minimizing the necessity for genome finishing activities. To successfully select a large set of degenerate PCR primers necessary to tile across an entire viral genome and maximize their success, this process is best performed computationally. Results We have developed a fully automated degenerate PCR primer design system that plays a key role in the J. Craig Venter Institute’s (JCVI high-throughput viral sequencing pipeline. A consensus viral genome, or a set of consensus segment sequences in the case of a segmented virus, is specified using IUPAC ambiguity codes in the consensus template sequence to represent the allelic diversity of the target population. PCR primer pairs are then selected computationally to produce a minimal amplicon set capable of tiling across the full length of the specified target region. As part of the tiling process, primer pairs are computationally screened to meet the criteria for successful PCR with one of two described amplification protocols. The actual sequencing success rates for designed primers for measles virus, mumps virus, human parainfluenza virus 1 and 3, human respiratory syncytial virus A and B and human metapneumovirus are described, where >90% of designed primer pairs were able to consistently successfully amplify >75% of the isolates. Conclusions Augmenting our previously developed and published JCVI Primer Design Pipeline, we achieved similarly high sequencing success rates with only minor software modifications. The recommended methodology for the construction of the consensus

  2. Microfluidic PCR Amplification and MiSeq Amplicon Sequencing Techniques for High-Throughput Detection and Genotyping of Human Pathogenic RNA Viruses in Human Feces, Sewage, and Oysters

    Directory of Open Access Journals (Sweden)

    Mamoru Oshiki

    2018-04-01

    Full Text Available Detection and genotyping of pathogenic RNA viruses in human and environmental samples are useful for monitoring the circulation and prevalence of these pathogens, whereas a conventional PCR assay followed by Sanger sequencing is time-consuming and laborious. The present study aimed to develop a high-throughput detection-and-genotyping tool for 11 human RNA viruses [Aichi virus; astrovirus; enterovirus; norovirus genogroup I (GI, GII, and GIV; hepatitis A virus; hepatitis E virus; rotavirus; sapovirus; and human parechovirus] using a microfluidic device and next-generation sequencer. Microfluidic nested PCR was carried out on a 48.48 Access Array chip, and the amplicons were recovered and used for MiSeq sequencing (Illumina, Tokyo, Japan; genotyping was conducted by homology searching and phylogenetic analysis of the obtained sequence reads. The detection limit of the 11 tested viruses ranged from 100 to 103 copies/μL in cDNA sample, corresponding to 101–104 copies/mL-sewage, 105–108 copies/g-human feces, and 102–105 copies/g-digestive tissues of oyster. The developed assay was successfully applied for simultaneous detection and genotyping of RNA viruses to samples of human feces, sewage, and artificially contaminated oysters. Microfluidic nested PCR followed by MiSeq sequencing enables efficient tracking of the fate of multiple RNA viruses in various environments, which is essential for a better understanding of the circulation of human pathogenic RNA viruses in the human population.

  3. Comprehensive evaluation and optimization of amplicon library preparation methods for high-throughput antibody sequencing.

    Science.gov (United States)

    Menzel, Ulrike; Greiff, Victor; Khan, Tarik A; Haessler, Ulrike; Hellmann, Ina; Friedensohn, Simon; Cook, Skylar C; Pogson, Mark; Reddy, Sai T

    2014-01-01

    High-throughput sequencing (HTS) of antibody repertoire libraries has become a powerful tool in the field of systems immunology. However, numerous sources of bias in HTS workflows may affect the obtained antibody repertoire data. A crucial step in antibody library preparation is the addition of short platform-specific nucleotide adapter sequences. As of yet, the impact of the method of adapter addition on experimental library preparation and the resulting antibody repertoire HTS datasets has not been thoroughly investigated. Therefore, we compared three standard library preparation methods by performing Illumina HTS on antibody variable heavy genes from murine antibody-secreting cells. Clonal overlap and rank statistics demonstrated that the investigated methods produced equivalent HTS datasets. PCR-based methods were experimentally superior to ligation with respect to speed, efficiency, and practicality. Finally, using a two-step PCR based method we established a protocol for antibody repertoire library generation, beginning from inputs as low as 1 ng of total RNA. In summary, this study represents a major advance towards a standardized experimental framework for antibody HTS, thus opening up the potential for systems-based, cross-experiment meta-analyses of antibody repertoires.

  4. High-throughput sequence alignment using Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Trapnell Cole

    2007-12-01

    Full Text Available Abstract Background The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. Results This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. Conclusion MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU.

  5. Determining the diet of larvae of western rock lobster (Panulirus cygnus using high-throughput DNA sequencing techniques.

    Directory of Open Access Journals (Sweden)

    Richard O'Rorke

    Full Text Available The Western Australian rock lobster fishery has been both a highly productive and sustainable fishery. However, a recent dramatic and unexplained decline in post-larval recruitment threatens this sustainability. Our lack of knowledge of key processes in lobster larval ecology, such as their position in the food web, limits our ability to determine what underpins this decline. The present study uses a high-throughput amplicon sequencing approach on DNA obtained from the hepatopancreas of larvae to discover significant prey items. Two short regions of the 18S rRNA gene were amplified under the presence of lobster specific PNA to prevent lobster amplification and to improve prey amplification. In the resulting sequences either little prey was recovered, indicating that the larval gut was empty, or there was a high number of reads originating from multiple zooplankton taxa. The most abundant reads included colonial Radiolaria, Thaliacea, Actinopterygii, Hydrozoa and Sagittoidea, which supports the hypothesis that the larvae feed on multiple groups of mostly transparent gelatinous zooplankton. This hypothesis has prevailed as it has been tentatively inferred from the physiology of larvae, captive feeding trials and co-occurrence in situ. However, these prey have not been observed in the larval gut as traditional microscopic techniques cannot discern between transparent and gelatinous prey items in the gut. High-throughput amplicon sequencing of gut DNA has enabled us to classify these otherwise undetectable prey. The dominance of the colonial radiolarians among the gut contents is intriguing in that this group has been historically difficult to quantify in the water column, which may explain why they have not been connected to larval diet previously. Our results indicate that a PCR based technique is a very successful approach to identify the most abundant taxa in the natural diet of lobster larvae.

  6. Network analysis of the microorganism in 25 Danish wastewater treatment plants over 7 years using high-throughput amplicon sequencing

    DEFF Research Database (Denmark)

    Albertsen, Mads; Larsen, Poul; Saunders, Aaron Marc

    to link sludge and floc properties to the microbial communities. All data was subjected to extensive network analysis and multivariate statistics through R. The 16S amplicon results confirmed the findings of relatively few core groups of organism shared by all the wastewater treatment plants......Wastewater treatment is the world’s largest biotechnological processes and a perfect model system for microbial ecology as the habitat is well defined and replicated all over the world. Extensive investigations on Danish wastewater treatment plants using fluorescent in situ hybridization have...... a year, totaling over 400 samples. All samples were subjected to 16S rDNA amplicon sequencing using V13 primers on the Illumina MiSeq platform (2x300bp) to a depth of at least 20.000 quality filtered reads per sample. The OTUs were assigned taxonomy based on a manually curated version of the greengenes...

  7. Using high-throughput sequencing to leverage surveillance of genetic diversity and oseltamivir resistance: a pilot study during the 2009 influenza A(H1N1 pandemic.

    Directory of Open Access Journals (Sweden)

    Juan Téllez-Sosa

    Full Text Available BACKGROUND: Influenza viruses display a high mutation rate and complex evolutionary patterns. Next-generation sequencing (NGS has been widely used for qualitative and semi-quantitative assessment of genetic diversity in complex biological samples. The "deep sequencing" approach, enabled by the enormous throughput of current NGS platforms, allows the identification of rare genetic viral variants in targeted genetic regions, but is usually limited to a small number of samples. METHODOLOGY AND PRINCIPAL FINDINGS: We designed a proof-of-principle study to test whether redistributing sequencing throughput from a high depth-small sample number towards a low depth-large sample number approach is feasible and contributes to influenza epidemiological surveillance. Using 454-Roche sequencing, we sequenced at a rather low depth, a 307 bp amplicon of the neuraminidase gene of the Influenza A(H1N1 pandemic (A(H1N1pdm virus from cDNA amplicons pooled in 48 barcoded libraries obtained from nasal swab samples of infected patients (n  =  299 taken from May to November, 2009 pandemic period in Mexico. This approach revealed that during the transition from the first (May-July to second wave (September-November of the pandemic, the initial genetic variants were replaced by the N248D mutation in the NA gene, and enabled the establishment of temporal and geographic associations with genetic diversity and the identification of mutations associated with oseltamivir resistance. CONCLUSIONS: NGS sequencing of a short amplicon from the NA gene at low sequencing depth allowed genetic screening of a large number of samples, providing insights to viral genetic diversity dynamics and the identification of genetic variants associated with oseltamivir resistance. Further research is needed to explain the observed replacement of the genetic variants seen during the second wave. As sequencing throughput rises and library multiplexing and automation improves, we foresee that

  8. High-Throughput Block Optical DNA Sequence Identification.

    Science.gov (United States)

    Sagar, Dodderi Manjunatha; Korshoj, Lee Erik; Hanson, Katrina Bethany; Chowdhury, Partha Pratim; Otoupal, Peter Britton; Chatterjee, Anushree; Nagpal, Prashant

    2018-01-01

    Optical techniques for molecular diagnostics or DNA sequencing generally rely on small molecule fluorescent labels, which utilize light with a wavelength of several hundred nanometers for detection. Developing a label-free optical DNA sequencing technique will require nanoscale focusing of light, a high-throughput and multiplexed identification method, and a data compression technique to rapidly identify sequences and analyze genomic heterogeneity for big datasets. Such a method should identify characteristic molecular vibrations using optical spectroscopy, especially in the "fingerprinting region" from ≈400-1400 cm -1 . Here, surface-enhanced Raman spectroscopy is used to demonstrate label-free identification of DNA nucleobases with multiplexed 3D plasmonic nanofocusing. While nanometer-scale mode volumes prevent identification of single nucleobases within a DNA sequence, the block optical technique can identify A, T, G, and C content in DNA k-mers. The content of each nucleotide in a DNA block can be a unique and high-throughput method for identifying sequences, genes, and other biomarkers as an alternative to single-letter sequencing. Additionally, coupling two complementary vibrational spectroscopy techniques (infrared and Raman) can improve block characterization. These results pave the way for developing a novel, high-throughput block optical sequencing method with lossy genomic data compression using k-mer identification from multiplexed optical data acquisition. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Polyadenylated Sequencing Primers Enable Complete Readability of PCR Amplicons Analyzed by Dideoxynucleotide Sequencing

    Directory of Open Access Journals (Sweden)

    Martin Beránek

    2012-01-01

    Full Text Available Dideoxynucleotide DNA sequencing is one of the principal procedures in molecular biology. Loss of an initial part of nucleotides behind the 3' end of the sequencing primer limits the readability of sequenced amplicons. We present a method which extends the readability by using sequencing primers modified by polyadenylated tails attached to their 5' ends. Performing a polymerase chain reaction, we amplified eight amplicons of six human genes (AMELX, APOE, HFE, MBL2, SERPINA1 and TGFB1 ranging from 106 bp to 680 bp. Polyadenylation of the sequencing primers minimized the loss of bases in all amplicons. Complete sequences of shorter products (AMELX 106 bp, SERPINA1 121 bp, HFE 208 bp, APOE 244 bp, MBL2 317 bp were obtained. In addition, in the case of TGFB1 products (366 bp, 432 bp, and 680 bp, respectively, the lengths of sequencing readings were significantly longer if adenylated primers were used. Thus, single strand dideoxynucleotide sequencing with adenylated primers enables complete or near complete readability of short PCR amplicons.

  10. Size Matters: Assessing Optimum Soil Sample Size for Fungal and Bacterial Community Structure Analyses Using High Throughput Sequencing of rRNA Gene Amplicons

    Directory of Open Access Journals (Sweden)

    Christopher Ryan Penton

    2016-06-01

    Full Text Available We examined the effect of different soil sample sizes obtained from an agricultural field, under a single cropping system uniform in soil properties and aboveground crop responses, on bacterial and fungal community structure and microbial diversity indices. DNA extracted from soil sample sizes of 0.25, 1, 5 and 10 g using MoBIO kits and from 10 and 100 g sizes using a bead-beating method (SARDI were used as templates for high-throughput sequencing of 16S and 28S rRNA gene amplicons for bacteria and fungi, respectively, on the Illumina MiSeq and Roche 454 platforms. Sample size significantly affected overall bacterial and fungal community structure, replicate dispersion and the number of operational taxonomic units (OTUs retrieved. Richness, evenness and diversity were also significantly affected. The largest diversity estimates were always associated with the 10 g MoBIO extractions with a corresponding reduction in replicate dispersion. For the fungal data, smaller MoBIO extractions identified more unclassified Eukaryota incertae sedis and unclassified glomeromycota while the SARDI method retrieved more abundant OTUs containing unclassified Pleosporales and the fungal genera Alternaria and Cercophora. Overall, these findings indicate that a 10 g soil DNA extraction is most suitable for both soil bacterial and fungal communities for retrieving optimal diversity while still capturing rarer taxa in concert with decreasing replicate variation.

  11. Model SNP development for complex genomes based on hexaploid oat using high-throughput 454 sequencing technology

    Directory of Open Access Journals (Sweden)

    Chao Shiaoman

    2011-01-01

    Full Text Available Abstract Background Genetic markers are pivotal to modern genomics research; however, discovery and genotyping of molecular markers in oat has been hindered by the size and complexity of the genome, and by a scarcity of sequence data. The purpose of this study was to generate oat expressed sequence tag (EST information, develop a bioinformatics pipeline for SNP discovery, and establish a method for rapid, cost-effective, and straightforward genotyping of SNP markers in complex polyploid genomes such as oat. Results Based on cDNA libraries of four cultivated oat genotypes, approximately 127,000 contigs were assembled from approximately one million Roche 454 sequence reads. Contigs were filtered through a novel bioinformatics pipeline to eliminate ambiguous polymorphism caused by subgenome homology, and 96 in silico SNPs were selected from 9,448 candidate loci for validation using high-resolution melting (HRM analysis. Of these, 52 (54% were polymorphic between parents of the Ogle1040 × TAM O-301 (OT mapping population, with 48 segregating as single Mendelian loci, and 44 being placed on the existing OT linkage map. Ogle and TAM amplicons from 12 primers were sequenced for SNP validation, revealing complex polymorphism in seven amplicons but general sequence conservation within SNP loci. Whole-amplicon interrogation with HRM revealed insertions, deletions, and heterozygotes in secondary oat germplasm pools, generating multiple alleles at some primer targets. To validate marker utility, 36 SNP assays were used to evaluate the genetic diversity of 34 diverse oat genotypes. Dendrogram clusters corresponded generally to known genome composition and genetic ancestry. Conclusions The high-throughput SNP discovery pipeline presented here is a rapid and effective method for identification of polymorphic SNP alleles in the oat genome. The current-generation HRM system is a simple and highly-informative platform for SNP genotyping. These techniques provide

  12. Whole Genome Sequencing of Enterovirus species C Isolates by High-throughput Sequencing: Development of Generic Primers

    Directory of Open Access Journals (Sweden)

    Maël Bessaud

    2016-08-01

    Full Text Available Enteroviruses are among the most common viruses infecting humans and can cause diverse clinical syndromes ranging from minor febrile illness to severe and potentially fatal diseases. Enterovirus species C (EV-C consists of more than 20 types, among which the 3 serotypes of polioviruses, the etiological agents of poliomyelitis, are included. Biodiversity and evolution of EV-C genomes are shaped by frequent recombination events. Therefore, identification and characterization of circulating EV-C strains require the sequencing of different genomic regions.A simple method was developed to sequence quickly the entire genome of EV-C isolates. Four overlapping fragments were produced separately by RT-PCR performed with generic primers. The four amplicons were then pooled and purified prior to be sequenced by high-throughput technique.The method was assessed on a panel of EV-Cs belonging to a wide-range of types. It can be used to determine full-length genome sequences through de novo assembly of thousands of reads. It was also able to discriminate reads from closely related viruses in mixtures.By decreasing the workload compared to classical Sanger-based techniques, this method will serve as a precious tool for sequencing large panels of EV-Cs isolated in cell cultures during environmental surveillance or from patients, including vaccine-derived polioviruses.

  13. Generic Amplicon Deep Sequencing to Determine Ilarvirus Species Diversity in Australian Prunus.

    Science.gov (United States)

    Kinoti, Wycliff M; Constable, Fiona E; Nancarrow, Narelle; Plummer, Kim M; Rodoni, Brendan

    2017-01-01

    The distribution of Ilarvirus species populations amongst 61 Australian Prunus trees was determined by next generation sequencing (NGS) of amplicons generated using a genus-based generic RT-PCR targeting a conserved region of the Ilarvirus RNA2 component that encodes the RNA dependent RNA polymerase (RdRp) gene. Presence of Ilarvirus sequences in each positive sample was further validated by Sanger sequencing of cloned amplicons of regions of each of RNA1, RNA2 and/or RNA3 that were generated by species specific PCRs and by metagenomic NGS. Prunus necrotic ringspot virus (PNRSV) was the most frequently detected Ilarvirus , occurring in 48 of the 61 Ilarvirus -positive trees and Prune dwarf virus (PDV) and Apple mosaic virus (ApMV) were detected in three trees and one tree, respectively. American plum line pattern virus (APLPV) was detected in three trees and represents the first report of APLPV detection in Australia. Two novel and distinct groups of Ilarvirus -like RNA2 amplicon sequences were also identified in several trees by the generic amplicon NGS approach. The high read depth from the amplicon NGS of the generic PCR products allowed the detection of distinct RNA2 RdRp sequence variant populations of PNRSV, PDV, ApMV, APLPV and the two novel Ilarvirus -like sequences. Mixed infections of ilarviruses were also detected in seven Prunus trees. Sanger sequencing of specific RNA1, RNA2, and/or RNA3 genome segments of each virus and total nucleic acid metagenomics NGS confirmed the presence of PNRSV, PDV, ApMV and APLPV detected by RNA2 generic amplicon NGS. However, the two novel groups of Ilarvirus -like RNA2 amplicon sequences detected by the generic amplicon NGS could not be associated to the presence of sequence from RNA1 or RNA3 genome segments or full Ilarvirus genomes, and their origin is unclear. This work highlights the sensitivity of genus-specific amplicon NGS in detection of virus sequences and their distinct populations in multiple samples, and the

  14. Generic Amplicon Deep Sequencing to Determine Ilarvirus Species Diversity in Australian Prunus

    Directory of Open Access Journals (Sweden)

    Wycliff M. Kinoti

    2017-06-01

    Full Text Available The distribution of Ilarvirus species populations amongst 61 Australian Prunus trees was determined by next generation sequencing (NGS of amplicons generated using a genus-based generic RT-PCR targeting a conserved region of the Ilarvirus RNA2 component that encodes the RNA dependent RNA polymerase (RdRp gene. Presence of Ilarvirus sequences in each positive sample was further validated by Sanger sequencing of cloned amplicons of regions of each of RNA1, RNA2 and/or RNA3 that were generated by species specific PCRs and by metagenomic NGS. Prunus necrotic ringspot virus (PNRSV was the most frequently detected Ilarvirus, occurring in 48 of the 61 Ilarvirus-positive trees and Prune dwarf virus (PDV and Apple mosaic virus (ApMV were detected in three trees and one tree, respectively. American plum line pattern virus (APLPV was detected in three trees and represents the first report of APLPV detection in Australia. Two novel and distinct groups of Ilarvirus-like RNA2 amplicon sequences were also identified in several trees by the generic amplicon NGS approach. The high read depth from the amplicon NGS of the generic PCR products allowed the detection of distinct RNA2 RdRp sequence variant populations of PNRSV, PDV, ApMV, APLPV and the two novel Ilarvirus-like sequences. Mixed infections of ilarviruses were also detected in seven Prunus trees. Sanger sequencing of specific RNA1, RNA2, and/or RNA3 genome segments of each virus and total nucleic acid metagenomics NGS confirmed the presence of PNRSV, PDV, ApMV and APLPV detected by RNA2 generic amplicon NGS. However, the two novel groups of Ilarvirus-like RNA2 amplicon sequences detected by the generic amplicon NGS could not be associated to the presence of sequence from RNA1 or RNA3 genome segments or full Ilarvirus genomes, and their origin is unclear. This work highlights the sensitivity of genus-specific amplicon NGS in detection of virus sequences and their distinct populations in multiple samples

  15. Extracellular DNA amplicon sequencing reveals high levels of benthic eukaryotic diversity in the central Red Sea

    KAUST Repository

    Pearman, John K.

    2015-11-01

    The present study aims to characterize the benthic eukaryotic biodiversity patterns at a coarse taxonomic level in three areas of the central Red Sea (a lagoon, an offshore area in Thuwal and a shallow coastal area near Jeddah) based on extracellular DNA. High-throughput amplicon sequencing targeting the V9 region of the 18S rRNA gene was undertaken for 32 sediment samples. High levels of alpha-diversity were detected with 16,089 operational taxonomic units (OTUs) being identified. The majority of the OTUs were assigned to Metazoa (29.2%), Alveolata (22.4%) and Stramenopiles (17.8%). Stramenopiles (Diatomea) and Alveolata (Ciliophora) were frequent in a lagoon and in shallower coastal stations, whereas metazoans (Arthropoda: Maxillopoda) were dominant in deeper offshore stations. Only 24.6% of total OTUs were shared among all areas. Beta-diversity was generally lower between the lagoon and Jeddah (nearshore) than between either of those and the offshore area, suggesting a nearshore–offshore biodiversity gradient. The current approach allowed for a broad-range of benthic eukaryotic biodiversity to be analysed with significantly less labour than would be required by other traditional taxonomic approaches. Our findings suggest that next generation sequencing techniques have the potential to provide a fast and standardised screening of benthic biodiversity at large spatial and temporal scales.

  16. Human papillomavirus detection using the Abbott RealTime high-risk HPV tests compared with conventional nested PCR coupled to high-throughput sequencing of amplification products in cervical smear specimens from a Gabonese female population.

    Science.gov (United States)

    Moussavou-Boundzanga, Pamela; Koumakpayi, Ismaël Hervé; Labouba, Ingrid; Leroy, Eric M; Belembaogo, Ernest; Berthet, Nicolas

    2017-12-21

    Cervical cancer is the fourth most common malignancy in women worldwide. However, screening with human papillomavirus (HPV) molecular tests holds promise for reducing cervical cancer incidence and mortality in low- and middle-income countries. The performance of the Abbott RealTime High-Risk HPV test (AbRT) was evaluated in 83 cervical smear specimens and compared with a conventional nested PCR coupled to high-throughput sequencing (HTS) to identify the amplicons. The AbRT assay detected at least one HPV genotype in 44.57% of women regardless of the grade of cervical abnormalities. Except for one case, good concordance was observed for the genotypes detected with the AbRT assay in the high-risk HPV category determined with HTS of the amplicon generated by conventional nested PCR. The AbRT test is an easy and reliable molecular tool and was as sensitive as conventional nested PCR in cervical smear specimens for detection HPVs associated with high-grade lesions. Moreover, sequencing amplicons using an HTS approach effectively identified the genotype of the hrHPV identified with the AbRT test.

  17. Multiplexed homogeneous proximity ligation assays for high throughput protein biomarker research in serological material

    DEFF Research Database (Denmark)

    Lundberg, Martin; Thorsen, Stine Buch; Assarsson, Erika

    2011-01-01

    A high throughput protein biomarker discovery tool has been developed based on multiplexed proximity ligation assays (PLA) in a homogeneous format in the sense of no washing steps. The platform consists of four 24-plex panels profiling 74 putative biomarkers with sub pM sensitivity each consuming...... sequences are united by DNA ligation upon simultaneous target binding forming a PCR amplicon. Multiplex PLA thereby converts multiple target analytes into real-time PCR amplicons that are individually quantificatied using microfluidic high capacity qPCR in nano liter volumes. The assay shows excellent...

  18. High Diversity of Myocyanophage in Various Aquatic Environments Revealed by High-Throughput Sequencing of Major Capsid Protein Gene With a New Set of Primers

    Directory of Open Access Journals (Sweden)

    Weiguo Hou

    2018-05-01

    Full Text Available Myocyanophages, a group of viruses infecting cyanobacteria, are abundant and play important roles in elemental cycling. Here we investigated the particle-associated viral communities retained on 0.2 μm filters and in sediment samples (representing ancient cyanophage communities from four ocean and three lake locations, using high-throughput sequencing and a newly designed primer pair targeting a gene fragment (∼145-bp in length encoding the cyanophage gp23 major capsid protein (MCP. Diverse viral communities were detected in all samples. The fragments of 142-, 145-, and 148-bp in length were most abundant in the amplicons, and most sequences (>92% belonged to cyanophages. Additionally, different sequencing depths resulted in different diversity estimates of the viral community. Operational taxonomic units obtained from deep sequencing of the MCP gene covered the majority of those obtained from shallow sequencing, suggesting that deep sequencing exhibited a more complete picture of cyanophage community than shallow sequencing. Our results also revealed a wide geographic distribution of marine myocyanophages, i.e., higher dissimilarities of the myocyanophage communities corresponded with the larger distances between the sampling sites. Collectively, this study suggests that the newly designed primer pair can be effectively used to study the community and diversity of myocyanophage from different environments, and the high-throughput sequencing represents a good method to understand viral diversity.

  19. Taxonomy of anaerobic digestion microbiome reveals biases associated with the applied high throughput sequencing strategies

    DEFF Research Database (Denmark)

    Campanaro, Stefano; Treu, Laura; Kougias, Panagiotis

    2018-01-01

    In the past few years, many studies investigated the anaerobic digestion microbiome by means of 16S rRNA amplicon sequencing. Results obtained from these studies were compared to each other without taking into consideration the followed procedure for amplicons preparation and data analysis...... specifically, the microbial compositions of three laboratory scale biogas reactors were analyzed before and after addition of sodium oleate by sequencing the microbiome with three different approaches: 16S rRNA amplicon sequencing, shotgun DNA and shotgun RNA. This comparative analysis revealed that......, in amplicon sequencing, abundance of some taxa (Euryarchaeota and Spirochaetes) was biased by the inefficiency of universal primers to hybridize all the templates. Reliability of the results obtained was also influenced by the number of hypervariable regions under investigation. Finally, amplicon sequencing...

  20. Implementation of Amplicon Parallel Sequencing Leads to Improvement of Diagnosis and Therapy of Lung Cancer Patients.

    Science.gov (United States)

    König, Katharina; Peifer, Martin; Fassunke, Jana; Ihle, Michaela A; Künstlinger, Helen; Heydt, Carina; Stamm, Katrin; Ueckeroth, Frank; Vollbrecht, Claudia; Bos, Marc; Gardizi, Masyar; Scheffler, Matthias; Nogova, Lucia; Leenders, Frauke; Albus, Kerstin; Meder, Lydia; Becker, Kerstin; Florin, Alexandra; Rommerscheidt-Fuss, Ursula; Altmüller, Janine; Kloth, Michael; Nürnberg, Peter; Henkel, Thomas; Bikár, Sven-Ernö; Sos, Martin L; Geese, William J; Strauss, Lewis; Ko, Yon-Dschun; Gerigk, Ulrich; Odenthal, Margarete; Zander, Thomas; Wolf, Jürgen; Merkelbach-Bruse, Sabine; Buettner, Reinhard; Heukamp, Lukas C

    2015-07-01

    The Network Genomic Medicine Lung Cancer was set up to rapidly translate scientific advances into early clinical trials of targeted therapies in lung cancer performing molecular analyses of more than 3500 patients annually. Because sequential analysis of the relevant driver mutations on fixated samples is challenging in terms of workload, tissue availability, and cost, we established multiplex parallel sequencing in routine diagnostics. The aim was to analyze all therapeutically relevant mutations in lung cancer samples in a high-throughput fashion while significantly reducing turnaround time and amount of input DNA compared with conventional dideoxy sequencing of single polymerase chain reaction amplicons. In this study, we demonstrate the feasibility of a 102 amplicon multiplex polymerase chain reaction followed by sequencing on an Illumina sequencer on formalin-fixed paraffin-embedded tissue in routine diagnostics. Analysis of a validation cohort of 180 samples showed this approach to require significantly less input material and to be more reliable, robust, and cost-effective than conventional dideoxy sequencing. Subsequently, 2657 lung cancer patients were analyzed. We observed that comprehensive biomarker testing provided novel information in addition to histological diagnosis and clinical staging. In 2657 consecutively analyzed lung cancer samples, we identified driver mutations at the expected prevalence. Furthermore we found potentially targetable DDR2 mutations at a frequency of 3% in both adenocarcinomas and squamous cell carcinomas. Overall, our data demonstrate the utility of systematic sequencing analysis in a clinical routine setting and highlight the dramatic impact of such an approach on the availability of therapeutic strategies for the targeted treatment of individual cancer patients.

  1. Management of High-Throughput DNA Sequencing Projects: Alpheus.

    Science.gov (United States)

    Miller, Neil A; Kingsmore, Stephen F; Farmer, Andrew; Langley, Raymond J; Mudge, Joann; Crow, John A; Gonzalez, Alvaro J; Schilkey, Faye D; Kim, Ryan J; van Velkinburgh, Jennifer; May, Gregory D; Black, C Forrest; Myers, M Kathy; Utsey, John P; Frost, Nicholas S; Sugarbaker, David J; Bueno, Raphael; Gullans, Stephen R; Baxter, Susan M; Day, Steve W; Retzel, Ernest F

    2008-12-26

    High-throughput DNA sequencing has enabled systems biology to begin to address areas in health, agricultural and basic biological research. Concomitant with the opportunities is an absolute necessity to manage significant volumes of high-dimensional and inter-related data and analysis. Alpheus is an analysis pipeline, database and visualization software for use with massively parallel DNA sequencing technologies that feature multi-gigabase throughput characterized by relatively short reads, such as Illumina-Solexa (sequencing-by-synthesis), Roche-454 (pyrosequencing) and Applied Biosystem's SOLiD (sequencing-by-ligation). Alpheus enables alignment to reference sequence(s), detection of variants and enumeration of sequence abundance, including expression levels in transcriptome sequence. Alpheus is able to detect several types of variants, including non-synonymous and synonymous single nucleotide polymorphisms (SNPs), insertions/deletions (indels), premature stop codons, and splice isoforms. Variant detection is aided by the ability to filter variant calls based on consistency, expected allele frequency, sequence quality, coverage, and variant type in order to minimize false positives while maximizing the identification of true positives. Alpheus also enables comparisons of genes with variants between cases and controls or bulk segregant pools. Sequence-based differential expression comparisons can be developed, with data export to SAS JMP Genomics for statistical analysis.

  2. High-Throughput Next-Generation Sequencing of Polioviruses

    Science.gov (United States)

    Montmayeur, Anna M.; Schmidt, Alexander; Zhao, Kun; Magaña, Laura; Iber, Jane; Castro, Christina J.; Chen, Qi; Henderson, Elizabeth; Ramos, Edward; Shaw, Jing; Tatusov, Roman L.; Dybdahl-Sissoko, Naomi; Endegue-Zanga, Marie Claire; Adeniji, Johnson A.; Oberste, M. Steven; Burns, Cara C.

    2016-01-01

    ABSTRACT The poliovirus (PV) is currently targeted for worldwide eradication and containment. Sanger-based sequencing of the viral protein 1 (VP1) capsid region is currently the standard method for PV surveillance. However, the whole-genome sequence is sometimes needed for higher resolution global surveillance. In this study, we optimized whole-genome sequencing protocols for poliovirus isolates and FTA cards using next-generation sequencing (NGS), aiming for high sequence coverage, efficiency, and throughput. We found that DNase treatment of poliovirus RNA followed by random reverse transcription (RT), amplification, and the use of the Nextera XT DNA library preparation kit produced significantly better results than other preparations. The average viral reads per total reads, a measurement of efficiency, was as high as 84.2% ± 15.6%. PV genomes covering >99 to 100% of the reference length were obtained and validated with Sanger sequencing. A total of 52 PV genomes were generated, multiplexing as many as 64 samples in a single Illumina MiSeq run. This high-throughput, sequence-independent NGS approach facilitated the detection of a diverse range of PVs, especially for those in vaccine-derived polioviruses (VDPV), circulating VDPV, or immunodeficiency-related VDPV. In contrast to results from previous studies on other viruses, our results showed that filtration and nuclease treatment did not discernibly increase the sequencing efficiency of PV isolates. However, DNase treatment after nucleic acid extraction to remove host DNA significantly improved the sequencing results. This NGS method has been successfully implemented to generate PV genomes for molecular epidemiology of the most recent PV isolates. Additionally, the ability to obtain full PV genomes from FTA cards will aid in facilitating global poliovirus surveillance. PMID:27927929

  3. Deep amplicon sequencing reveals mixed phytoplasma infection within single grapevine plants

    DEFF Research Database (Denmark)

    Nicolaisen, Mogens; Contaldo, Nicoletta; Makarova, Olga

    2011-01-01

    The diversity of phytoplasmas within single plants has not yet been fully investigated. In this project, deep amplicon sequencing was used to generate 50,926 phytoplasma sequences from 11 phytoplasma-infected grapevine samples from a PCR amplicon in the 5' end of the 16S region. After clustering ...

  4. Bacterial diversity of the Colombian fermented milk "Suero Costeño" assessed by culturing and high-throughput sequencing and DGGE analysis of 16S rRNA gene amplicons.

    Science.gov (United States)

    Motato, Karina Edith; Milani, Christian; Ventura, Marco; Valencia, Francia Elena; Ruas-Madiedo, Patricia; Delgado, Susana

    2017-12-01

    "Suero Costeño" (SC) is a traditional soured cream elaborated from raw milk in the Northern-Caribbean coast of Colombia. The natural microbiota that characterizes this popular Colombian fermented milk is unknown, although several culturing studies have previously been attempted. In this work, the microbiota associated with SC from three manufacturers in two regions, "Planeta Rica" (Córdoba) and "Caucasia" (Antioquia), was analysed by means of culturing methods in combination with high-throughput sequencing and DGGE analysis of 16S rRNA gene amplicons. The bacterial ecosystem of SC samples was revealed to be composed of lactic acid bacteria belonging to the Streptococcaceae and Lactobacillaceae families; the proportions and genera varying among manufacturers and region of elaboration. Members of the Lactobacillus acidophilus group, Lactocococcus lactis, Streptococcus infantarius and Streptococcus salivarius characterized this artisanal product. In comparison with culturing, the use of molecular in deep culture-independent techniques provides a more realistic picture of the overall bacterial communities residing in SC. Besides the descriptive purpose, these approaches will facilitate a rational strategy to follow (culture media and growing conditions) for the isolation of indigenous strains that allow standardization in the manufacture of SC. Copyright © 2017 Elsevier Ltd. All rights reserved.

  5. Using expected sequence features to improve basecalling accuracy of amplicon pyrosequencing data

    DEFF Research Database (Denmark)

    Rask, Thomas Salhøj; Petersen, Bent; Chen, Donald S.

    2016-01-01

    . The new basecalling method described here, named Multipass, implements a probabilistic framework for working with the raw flowgrams obtained by pyrosequencing. For each sequence variant Multipass calculates the likelihood and nucleotide sequence of several most likely sequences given the flowgram data....... This probabilistic approach enables integration of basecalling into a larger model where other parameters can be incorporated, such as the likelihood for observing a full-length open reading frame at the targeted region. We apply the method to 454 amplicon pyrosequencing data obtained from a malaria virulence gene...... family, where Multipass generates 20 % more error-free sequences than current state of the art methods, and provides sequence characteristics that allow generation of a set of high confidence error-free sequences. This novel method can be used to increase accuracy of existing and future amplicon...

  6. Analysis of 16S rRNA amplicon sequencing options on the Roche/454 next-generation titanium sequencing platform.

    Directory of Open Access Journals (Sweden)

    Hideyuki Tamaki

    Full Text Available BACKGROUND: 16S rRNA gene pyrosequencing approach has revolutionized studies in microbial ecology. While primer selection and short read length can affect the resulting microbial community profile, little is known about the influence of pyrosequencing methods on the sequencing throughput and the outcome of microbial community analyses. The aim of this study is to compare differences in output, ease, and cost among three different amplicon pyrosequencing methods for the Roche/454 Titanium platform METHODOLOGY/PRINCIPAL FINDINGS: The following three pyrosequencing methods for 16S rRNA genes were selected in this study: Method-1 (standard method is the recommended method for bi-directional sequencing using the LIB-A kit; Method-2 is a new option designed in this study for unidirectional sequencing with the LIB-A kit; and Method-3 uses the LIB-L kit for unidirectional sequencing. In our comparison among these three methods using 10 different environmental samples, Method-2 and Method-3 produced 1.5-1.6 times more useable reads than the standard method (Method-1, after quality-based trimming, and did not compromise the outcome of microbial community analyses. Specifically, Method-3 is the most cost-effective unidirectional amplicon sequencing method as it provided the most reads and required the least effort in consumables management. CONCLUSIONS: Our findings clearly demonstrated that alternative pyrosequencing methods for 16S rRNA genes could drastically affect sequencing output (e.g. number of reads before and after trimming but have little effect on the outcomes of microbial community analysis. This finding is important for both researchers and sequencing facilities utilizing 16S rRNA gene pyrosequencing for microbial ecological studies.

  7. Analysis and Visualization Tool for Targeted Amplicon Bisulfite Sequencing on Ion Torrent Sequencers.

    Directory of Open Access Journals (Sweden)

    Stephan Pabinger

    Full Text Available Targeted sequencing of PCR amplicons generated from bisulfite deaminated DNA is a flexible, cost-effective way to study methylation of a sample at single CpG resolution and perform subsequent multi-target, multi-sample comparisons. Currently, no platform specific protocol, support, or analysis solution is provided to perform targeted bisulfite sequencing on a Personal Genome Machine (PGM. Here, we present a novel tool, called TABSAT, for analyzing targeted bisulfite sequencing data generated on Ion Torrent sequencers. The workflow starts with raw sequencing data, performs quality assessment, and uses a tailored version of Bismark to map the reads to a reference genome. The pipeline visualizes results as lollipop plots and is able to deduce specific methylation-patterns present in a sample. The obtained profiles are then summarized and compared between samples. In order to assess the performance of the targeted bisulfite sequencing workflow, 48 samples were used to generate 53 different Bisulfite-Sequencing PCR amplicons from each sample, resulting in 2,544 amplicon targets. We obtained a mean coverage of 282X using 1,196,822 aligned reads. Next, we compared the sequencing results of these targets to the methylation level of the corresponding sites on an Illumina 450k methylation chip. The calculated average Pearson correlation coefficient of 0.91 confirms the sequencing results with one of the industry-leading CpG methylation platforms and shows that targeted amplicon bisulfite sequencing provides an accurate and cost-efficient method for DNA methylation studies, e.g., to provide platform-independent confirmation of Illumina Infinium 450k methylation data. TABSAT offers a novel way to analyze data generated by Ion Torrent instruments and can also be used with data from the Illumina MiSeq platform. It can be easily accessed via the Platomics platform, which offers a web-based graphical user interface along with sample and parameter storage

  8. Automated cleaning and pre-processing of immunoglobulin gene sequences from high-throughput sequencing

    Directory of Open Access Journals (Sweden)

    Miri eMichaeli

    2012-12-01

    Full Text Available High throughput sequencing (HTS yields tens of thousands to millions of sequences that require a large amount of pre-processing work to clean various artifacts. Such cleaning cannot be performed manually. Existing programs are not suitable for immunoglobulin (Ig genes, which are variable and often highly mutated. This paper describes Ig-HTS-Cleaner (Ig High Throughput Sequencing Cleaner, a program containing a simple cleaning procedure that successfully deals with pre-processing of Ig sequences derived from HTS, and Ig-Indel-Identifier (Ig Insertion – Deletion Identifier, a program for identifying legitimate and artifact insertions and/or deletions (indels. Our programs were designed for analyzing Ig gene sequences obtained by 454 sequencing, but they are applicable to all types of sequences and sequencing platforms. Ig-HTS-Cleaner and Ig-Indel-Identifier have been implemented in Java and saved as executable JAR files, supported on Linux and MS Windows. No special requirements are needed in order to run the programs, except for correctly constructing the input files as explained in the text. The programs' performance has been tested and validated on real and simulated data sets.

  9. The bias associated with amplicon sequencing does not affect the quantitative assessment of bacterial community dynamics.

    Directory of Open Access Journals (Sweden)

    Federico M Ibarbalz

    Full Text Available The performance of two sets of primers targeting variable regions of the 16S rRNA gene V1-V3 and V4 was compared in their ability to describe changes of bacterial diversity and temporal turnover in full-scale activated sludge. Duplicate sets of high-throughput amplicon sequencing data of the two 16S rRNA regions shared a collection of core taxa that were observed across a series of twelve monthly samples, although the relative abundance of each taxon was substantially different between regions. A case in point was the changes in the relative abundance of filamentous bacteria Thiothrix, which caused a large effect on diversity indices, but only in the V1-V3 data set. Yet the relative abundance of Thiothrix in the amplicon sequencing data from both regions correlated with the estimation of its abundance determined using fluorescence in situ hybridization. In nonmetric multidimensional analysis samples were distributed along the first ordination axis according to the sequenced region rather than according to sample identities. The dynamics of microbial communities indicated that V1-V3 and the V4 regions of the 16S rRNA gene yielded comparable patterns of: 1 the changes occurring within the communities along fixed time intervals, 2 the slow turnover of activated sludge communities and 3 the rate of species replacement calculated from the taxa-time relationships. The temperature was the only operational variable that showed significant correlation with the composition of bacterial communities over time for the sets of data obtained with both pairs of primers. In conclusion, we show that despite the bias introduced by amplicon sequencing, the variable regions V1-V3 and V4 can be confidently used for the quantitative assessment of bacterial community dynamics, and provide a proper qualitative account of general taxa in the community, especially when the data are obtained over a convenient time window rather than at a single time point.

  10. The application of the high throughput sequencing technology in the transposable elements.

    Science.gov (United States)

    Liu, Zhen; Xu, Jian-hong

    2015-09-01

    High throughput sequencing technology has dramatically improved the efficiency of DNA sequencing, and decreased the costs to a great extent. Meanwhile, this technology usually has advantages of better specificity, higher sensitivity and accuracy. Therefore, it has been applied to the research on genetic variations, transcriptomics and epigenomics. Recently, this technology has been widely employed in the studies of transposable elements and has achieved fruitful results. In this review, we summarize the application of high throughput sequencing technology in the fields of transposable elements, including the estimation of transposon content, preference of target sites and distribution, insertion polymorphism and population frequency, identification of rare copies, transposon horizontal transfers as well as transposon tagging. We also briefly introduce the major common sequencing strategies and algorithms, their advantages and disadvantages, and the corresponding solutions. Finally, we envision the developing trends of high throughput sequencing technology, especially the third generation sequencing technology, and its application in transposon studies in the future, hopefully providing a comprehensive understanding and reference for related scientific researchers.

  11. Large-scale benchmarking reveals false discoveries and count transformation sensitivity in 16S rRNA gene amplicon data analysis methods used in microbiome studies

    DEFF Research Database (Denmark)

    Thorsen, Jonathan; Brejnrod, Asker Daniel; Mortensen, Martin Steen

    2016-01-01

    BACKGROUND: There is an immense scientific interest in the human microbiome and its effects on human physiology, health, and disease. A common approach for examining bacterial communities is high-throughput sequencing of 16S rRNA gene hypervariable regions, aggregating sequence-similar amplicons...

  12. CLOTU: An online pipeline for processing and clustering of 454 amplicon reads into OTUs followed by taxonomic annotation

    Directory of Open Access Journals (Sweden)

    Shalchian-Tabrizi Kamran

    2011-05-01

    Full Text Available Abstract Background The implementation of high throughput sequencing for exploring biodiversity poses high demands on bioinformatics applications for automated data processing. Here we introduce CLOTU, an online and open access pipeline for processing 454 amplicon reads. CLOTU has been constructed to be highly user-friendly and flexible, since different types of analyses are needed for different datasets. Results In CLOTU, the user can filter out low quality sequences, trim tags, primers, adaptors, perform clustering of sequence reads, and run BLAST against NCBInr or a customized database in a high performance computing environment. The resulting data may be browsed in a user-friendly manner and easily forwarded to downstream analyses. Although CLOTU is specifically designed for analyzing 454 amplicon reads, other types of DNA sequence data can also be processed. A fungal ITS sequence dataset generated by 454 sequencing of environmental samples is used to demonstrate the utility of CLOTU. Conclusions CLOTU is a flexible and easy to use bioinformatics pipeline that includes different options for filtering, trimming, clustering and taxonomic annotation of high throughput sequence reads. Some of these options are not included in comparable pipelines. CLOTU is implemented in a Linux computer cluster and is freely accessible to academic users through the Bioportal web-based bioinformatics service (http://www.bioportal.uio.no.

  13. JRC GMO-Amplicons: a collection of nucleic acid sequences related to genetically modified organisms.

    Science.gov (United States)

    Petrillo, Mauro; Angers-Loustau, Alexandre; Henriksson, Peter; Bonfini, Laura; Patak, Alex; Kreysa, Joachim

    2015-01-01

    The DNA target sequence is the key element in designing detection methods for genetically modified organisms (GMOs). Unfortunately this information is frequently lacking, especially for unauthorized GMOs. In addition, patent sequences are generally poorly annotated, buried in complex and extensive documentation and hard to link to the corresponding GM event. Here, we present the JRC GMO-Amplicons, a database of amplicons collected by screening public nucleotide sequence databanks by in silico determination of PCR amplification with reference methods for GMO analysis. The European Union Reference Laboratory for Genetically Modified Food and Feed (EU-RL GMFF) provides these methods in the GMOMETHODS database to support enforcement of EU legislation and GM food/feed control. The JRC GMO-Amplicons database is composed of more than 240 000 amplicons, which can be easily accessed and screened through a web interface. To our knowledge, this is the first attempt at pooling and collecting publicly available sequences related to GMOs in food and feed. The JRC GMO-Amplicons supports control laboratories in the design and assessment of GMO methods, providing inter-alia in silico prediction of primers specificity and GM targets coverage. The new tool can assist the laboratories in the analysis of complex issues, such as the detection and identification of unauthorized GMOs. Notably, the JRC GMO-Amplicons database allows the retrieval and characterization of GMO-related sequences included in patents documentation. Finally, it can help annotating poorly described GM sequences and identifying new relevant GMO-related sequences in public databases. The JRC GMO-Amplicons is freely accessible through a web-based portal that is hosted on the EU-RL GMFF website. Database URL: http://gmo-crl.jrc.ec.europa.eu/jrcgmoamplicons/. © The Author(s) 2015. Published by Oxford University Press.

  14. Roche genome sequencer FLX based high-throughput sequencing of ancient DNA

    DEFF Research Database (Denmark)

    Alquezar-Planas, David E; Fordyce, Sarah Louise

    2012-01-01

    Since the development of so-called "next generation" high-throughput sequencing in 2005, this technology has been applied to a variety of fields. Such applications include disease studies, evolutionary investigations, and ancient DNA. Each application requires a specialized protocol to ensure...... that the data produced is optimal. Although much of the procedure can be followed directly from the manufacturer's protocols, the key differences lie in the library preparation steps. This chapter presents an optimized protocol for the sequencing of fossil remains and museum specimens, commonly referred...

  15. High-throughput Sequencing Based Immune Repertoire Study during Infectious Disease

    Directory of Open Access Journals (Sweden)

    Dongni Hou

    2016-08-01

    Full Text Available The selectivity of the adaptive immune response is based on the enormous diversity of T and B cell antigen-specific receptors. The immune repertoire, the collection of T and B cells with functional diversity in the circulatory system at any given time, is dynamic and reflects the essence of immune selectivity. In this article, we review the recent advances in immune repertoire study of infectious diseases that achieved by traditional techniques and high-throughput sequencing techniques. High-throughput sequencing techniques enable the determination of complementary regions of lymphocyte receptors with unprecedented efficiency and scale. This progress in methodology enhances the understanding of immunologic changes during pathogen challenge, and also provides a basis for further development of novel diagnostic markers, immunotherapies and vaccines.

  16. Application of high-throughput DNA sequencing in phytopathology.

    Science.gov (United States)

    Studholme, David J; Glover, Rachel H; Boonham, Neil

    2011-01-01

    The new sequencing technologies are already making a big impact in academic research on medically important microbes and may soon revolutionize diagnostics, epidemiology, and infection control. Plant pathology also stands to gain from exploiting these opportunities. This manuscript reviews some applications of these high-throughput sequencing methods that are relevant to phytopathology, with emphasis on the associated computational and bioinformatics challenges and their solutions. Second-generation sequencing technologies have recently been exploited in genomics of both prokaryotic and eukaryotic plant pathogens. They are also proving to be useful in diagnostics, especially with respect to viruses. Copyright © 2011 by Annual Reviews. All rights reserved.

  17. Evaluation of the reproducibility of amplicon sequencing with Illumina MiSeq platform.

    Science.gov (United States)

    Wen, Chongqing; Wu, Liyou; Qin, Yujia; Van Nostrand, Joy D; Ning, Daliang; Sun, Bo; Xue, Kai; Liu, Feifei; Deng, Ye; Liang, Yuting; Zhou, Jizhong

    2017-01-01

    Illumina's MiSeq has become the dominant platform for gene amplicon sequencing in microbial ecology studies; however, various technical concerns, such as reproducibility, still exist. To assess reproducibility, 16S rRNA gene amplicons from 18 soil samples of a reciprocal transplantation experiment were sequenced on an Illumina MiSeq. The V4 region of 16S rRNA gene from each sample was sequenced in triplicate with each replicate having a unique barcode. The average OTU overlap, without considering sequence abundance, at a rarefaction level of 10,323 sequences was 33.4±2.1% and 20.2±1.7% between two and among three technical replicates, respectively. When OTU sequence abundance was considered, the average sequence abundance weighted OTU overlap was 85.6±1.6% and 81.2±2.1% for two and three replicates, respectively. Removing singletons significantly increased the overlap for both (~1-3%, pdeep sequencing increased OTU overlap both when sequence abundance was considered (95%) and when not (44%). However, if singletons were not removed the overlap between two technical replicates (not considering sequence abundance) plateaus at 39% with 30,000 sequences. Diversity measures were not affected by the low overlap as α-diversities were similar among technical replicates while β-diversities (Bray-Curtis) were much smaller among technical replicates than among treatment replicates (e.g., 0.269 vs. 0.374). Higher diversity coverage, but lower OTU overlap, was observed when replicates were sequenced in separate runs. Detrended correspondence analysis indicated that while there was considerable variation among technical replicates, the reproducibility was sufficient for detecting treatment effects for the samples examined. These results suggest that although there is variation among technical replicates, amplicon sequencing on MiSeq is useful for analyzing microbial community structure if used appropriately and with caution. For example, including technical replicates

  18. Removing Noise From Pyrosequenced Amplicons

    Directory of Open Access Journals (Sweden)

    Davenport Russell J

    2011-01-01

    Full Text Available Abstract Background In many environmental genomics applications a homologous region of DNA from a diverse sample is first amplified by PCR and then sequenced. The next generation sequencing technology, 454 pyrosequencing, has allowed much larger read numbers from PCR amplicons than ever before. This has revolutionised the study of microbial diversity as it is now possible to sequence a substantial fraction of the 16S rRNA genes in a community. However, there is a growing realisation that because of the large read numbers and the lack of consensus sequences it is vital to distinguish noise from true sequence diversity in this data. Otherwise this leads to inflated estimates of the number of types or operational taxonomic units (OTUs present. Three sources of error are important: sequencing error, PCR single base substitutions and PCR chimeras. We present AmpliconNoise, a development of the PyroNoise algorithm that is capable of separately removing 454 sequencing errors and PCR single base errors. We also introduce a novel chimera removal program, Perseus, that exploits the sequence abundances associated with pyrosequencing data. We use data sets where samples of known diversity have been amplified and sequenced to quantify the effect of each of the sources of error on OTU inflation and to validate these algorithms. Results AmpliconNoise outperforms alternative algorithms substantially reducing per base error rates for both the GS FLX and latest Titanium protocol. All three sources of error lead to inflation of diversity estimates. In particular, chimera formation has a hitherto unrealised importance which varies according to amplification protocol. We show that AmpliconNoise allows accurate estimates of OTU number. Just as importantly AmpliconNoise generates the right OTUs even at low sequence differences. We demonstrate that Perseus has very high sensitivity, able to find 99% of chimeras, which is critical when these are present at high

  19. Discriminating activated sludge flocs from biofilm microbial communities in a novel pilot-scale reciprocation MBR using high-throughput 16S rRNA gene sequencing.

    Science.gov (United States)

    De Sotto, Ryan; Ho, Jaeho; Lee, Woonyoung; Bae, Sungwoo

    2018-03-29

    Membrane bioreactors (MBRs) are a well-established filtration technology that has become a popular solution for treating wastewater. One of the drawbacks of MBRs, however, is the formation of biofilm on the surface of membrane modules. The occurrence of biofilms leads to biofouling, which eventually compromises water quality and damages the membranes. To prevent this, it is vital to understand the mechanism of biofilm formation on membrane surfaces. In this pilot-scale study, a novel reciprocation membrane bioreactor was operated for a period of 8 months and fed with domestic wastewater from an aerobic tank of a local WWTP. Water quality parameters were monitored and the microbial composition of the attached biofilm and suspended aggregates was evaluated in this reciprocating MBR configuration. The abundance of nitrifiers and composition of microbial communities from biofilm and suspended solids samples were investigated using qPCR and high throughput 16S amplicon sequencing. Removal efficiencies of 29%, 16%, and 15% of chemical oxygen demand, total phosphorus and total nitrogen from the influent were observed after the MBR process with average effluent concentrations of 16 mg/L, 4.6 mg/L, and 5.8 mg/L respectively. This suggests that the energy-efficient MBR, apart from reducing the total energy consumption, was able to maintain effluent concentrations that are within regulatory standards for discharge. Molecular analysis showed the presence of amoA Bacteria and 16S Nitrospira genes with the occurrence of nitrification. Candidatus Accumulibacter, a genus with organisms that can accumulate phosphorus, was found to be present in both groups which explains why phosphorus removal was observed in the system. High-throughput 16S rRNA amplicon sequencing revealed the genus Saprospira to be the most abundant species from the total OTUs of both the membrane tank and biofilm samples. Copyright © 2018 Elsevier Ltd. All rights reserved.

  20. On the optimal trimming of high-throughput mRNA sequence data

    Directory of Open Access Journals (Sweden)

    Matthew D MacManes

    2014-01-01

    Full Text Available The widespread and rapid adoption of high-throughput sequencing technologies has afforded researchers the opportunity to gain a deep understanding of genome level processes that underlie evolutionary change, and perhaps more importantly, the links between genotype and phenotype. In particular, researchers interested in functional biology and adaptation have used these technologies to sequence mRNA transcriptomes of specific tissues, which in turn are often compared to other tissues, or other individuals with different phenotypes. While these techniques are extremely powerful, careful attention to data quality is required. In particular, because high-throughput sequencing is more error-prone than traditional Sanger sequencing, quality trimming of sequence reads should be an important step in all data processing pipelines. While several software packages for quality trimming exist, no general guidelines for the specifics of trimming have been developed. Here, using empirically derived sequence data, I provide general recommendations regarding the optimal strength of trimming, specifically in mRNA-Seq studies. Although very aggressive quality trimming is common, this study suggests that a more gentle trimming, specifically of those nucleotides whose Phred score < 2 or < 5, is optimal for most studies across a wide variety of metrics.

  1. Effort versus Reward: Preparing Samples for Fungal Community Characterization in High-Throughput Sequencing Surveys of Soils.

    Directory of Open Access Journals (Sweden)

    Zewei Song

    Full Text Available Next generation fungal amplicon sequencing is being used with increasing frequency to study fungal diversity in various ecosystems; however, the influence of sample preparation on the characterization of fungal community is poorly understood. We investigated the effects of four procedural modifications to library preparation for high-throughput sequencing (HTS. The following treatments were considered: 1 the amount of soil used in DNA extraction, 2 the inclusion of additional steps (freeze/thaw cycles, sonication, or hot water bath incubation in the extraction procedure, 3 the amount of DNA template used in PCR, and 4 the effect of sample pooling, either physically or computationally. Soils from two different ecosystems in Minnesota, USA, one prairie and one forest site, were used to assess the generality of our results. The first three treatments did not significantly influence observed fungal OTU richness or community structure at either site. Physical pooling captured more OTU richness compared to individual samples, but total OTU richness at each site was highest when individual samples were computationally combined. We conclude that standard extraction kit protocols are well optimized for fungal HTS surveys, but because sample pooling can significantly influence OTU richness estimates, it is important to carefully consider the study aims when planning sampling procedures.

  2. Microbial community profiling of fresh basil and pitfalls in taxonomic assignment of enterobacterial pathogenic species based upon 16S rRNA amplicon sequencing.

    Science.gov (United States)

    Ceuppens, Siele; De Coninck, Dieter; Bottledoorn, Nadine; Van Nieuwerburgh, Filip; Uyttendaele, Mieke

    2017-09-18

    Application of 16S rRNA (gene) amplicon sequencing on food samples is increasingly applied for assessing microbial diversity but may as unintended advantage also enable simultaneous detection of any human pathogens without a priori definition. In the present study high-throughput next-generation sequencing (NGS) of the V1-V2-V3 regions of the 16S rRNA gene was applied to identify the bacteria present on fresh basil leaves. However, results were strongly impacted by variations in the bioinformatics analysis pipelines (MEGAN, SILVAngs, QIIME and MG-RAST), including the database choice (Greengenes, RDP and M5RNA) and the annotation algorithm (best hit, representative hit and lowest common ancestor). The use of pipelines with default parameters will lead to discrepancies. The estimate of microbial diversity of fresh basil using 16S rRNA (gene) amplicon sequencing is thus indicative but subject to biases. Salmonella enterica was detected at low frequencies, between 0.1% and 0.4% of bacterial sequences, corresponding with 37 to 166 reads. However, this result was dependent upon the pipeline used: Salmonella was detected by MEGAN, SILVAngs and MG-RAST, but not by QIIME. Confirmation of Salmonella sequences by real-time PCR was unsuccessful. It was shown that taxonomic resolution obtained from the short (500bp) sequence reads of the 16S rRNA gene containing the hypervariable regions V1-V3 cannot allow distinction of Salmonella with closely related enterobacterial species. In conclusion 16S amplicon sequencing, getting the status of standard method in microbial ecology studies of foods, needs expertise on both bioinformatics and microbiology for analysis of results. It is a powerful tool to estimate bacterial diversity but amenable to biases. Limitations concerning taxonomic resolution for some bacterial species or its inability to detect sub-dominant (pathogenic) species should be acknowledged in order to avoid overinterpretation of results. Copyright © 2017 Elsevier B

  3. Gene Fusions Associated with Recurrent Amplicons Represent a Class of Passenger Aberrations in Breast Cancer

    Directory of Open Access Journals (Sweden)

    Shanker Kalyana-Sundaram

    2012-08-01

    Full Text Available Application of high-throughput transcriptome sequencing has spurred highly sensitive detection and discovery of gene fusions in cancer, but distinguishing potentially oncogenic fusions from random, “passenger” aberrations has proven challenging. Here we examine a distinctive group of gene fusions that involve genes present in the loci of chromosomal amplifications—a class of oncogenic aberrations that are widely prevalent in breast cancers. Integrative analysis of a panel of 14 breast cancer cell lines comparing gene fusions discovered by high-throughput transcriptome sequencing and genome-wide copy number aberrations assessed by array comparative genomic hybridization, led to the identification of 77 gene fusions, of which more than 60% were localized to amplicons including 17q12, 17q23, 20q13, chr8q, and others. Many of these fusions appeared to be recurrent or involved highly expressed oncogenic drivers, frequently fused with multiple different partners, but sometimes displaying loss of functional domains. As illustrative examples of the “amplicon-associated” gene fusions, we examined here a recurrent gene fusion involving the mediator of mammalian target of rapamycin signaling, RPS6KB1 kinase in BT-474, and the therapeutically important receptor tyrosine kinase EGFR in MDA-MB-468 breast cancer cell line. These gene fusions comprise a minor allelic fraction relative to the highly expressed full-length transcripts and encode chimera lacking the kinase domains, which do not impart dependence on the respective cells. Our study suggests that amplicon-associated gene fusions in breast cancer primarily represent a by-product of chromosomal amplifications, which constitutes a subset of passenger aberrations and should be factored accordingly during prioritization of gene fusion candidates.

  4. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)-A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes.

    Science.gov (United States)

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare . However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop

  5. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

    Directory of Open Access Journals (Sweden)

    Karolina Chwialkowska

    2017-11-01

    Full Text Available Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq. We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation

  6. Metagenomic Analysis of Slovak Bryndza Cheese Using Next-Generation 16S rDNA Amplicon Sequencing

    Directory of Open Access Journals (Sweden)

    Planý Matej

    2016-06-01

    Full Text Available Knowledge about diversity and taxonomic structure of the microbial population present in traditional fermented foods plays a key role in starter culture selection, safety improvement and quality enhancement of the end product. Aim of this study was to investigate microbial consortia composition in Slovak bryndza cheese. For this purpose, we used culture-independent approach based on 16S rDNA amplicon sequencing using next generation sequencing platform. Results obtained by the analysis of three commercial (produced on industrial scale in winter season and one traditional (artisanal, most valued, produced in May Slovak bryndza cheese sample were compared. A diverse prokaryotic microflora composed mostly of the genera Lactococcus, Streptococcus, Lactobacillus, and Enterococcus was identified. Lactococcus lactis subsp. lactis and Lactococcus lactis subsp. cremoris were the dominant taxons in all tested samples. Second most abundant species, detected in all bryndza cheeses, were Lactococcus fujiensis and Lactococcus taiwanensis, independently by two different approaches, using different reference 16S rRNA genes databases (Greengenes and NCBI respectively. They have been detected in bryndza cheese samples in substantial amount for the first time. The narrowest microbial diversity was observed in a sample made with a starter culture from pasteurised milk. Metagenomic analysis by high-throughput sequencing using 16S rRNA genes seems to be a powerful tool for studying the structure of the microbial population in cheeses.

  7. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes.

    Science.gov (United States)

    Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

    2012-07-15

    In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1 accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license.

  8. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

    Science.gov (United States)

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop

  9. High-throughput sequencing and mutagenesis to accelerate the domestication of Microlaena stipoides as a new food crop.

    Directory of Open Access Journals (Sweden)

    Frances M Shapter

    Full Text Available Global food demand, climatic variability and reduced land availability are driving the need for domestication of new crop species. The accelerated domestication of a rice-like Australian dryland polyploid grass, Microlaena stipoides (Poaceae, was targeted using chemical mutagenesis in conjunction with high throughput sequencing of genes for key domestication traits. While M. stipoides has previously been identified as having potential as a new grain crop for human consumption, only a limited understanding of its genetic diversity and breeding system was available to aid the domestication process. Next generation sequencing of deeply-pooled target amplicons estimated allelic diversity of a selected base population at 14.3 SNP/Mb and identified novel, putatively mutation-induced polymorphisms at about 2.4 mutations/Mb. A 97% lethal dose (LD₉₇ of ethyl methanesulfonate treatment was applied without inducing sterility in this polyploid species. Forward and reverse genetic screens identified beneficial alleles for the domestication trait, seed-shattering. Unique phenotypes observed in the M2 population suggest the potential for rapid accumulation of beneficial traits without recourse to a traditional cross-breeding strategy. This approach may be applicable to other wild species, unlocking their potential as new food, fibre and fuel crops.

  10. Multiplex enrichment quantitative PCR (ME-qPCR): a high-throughput, highly sensitive detection method for GMO identification.

    Science.gov (United States)

    Fu, Wei; Zhu, Pengyu; Wei, Shuang; Zhixin, Du; Wang, Chenguang; Wu, Xiyang; Li, Feiwu; Zhu, Shuifang

    2017-04-01

    Among all of the high-throughput detection methods, PCR-based methodologies are regarded as the most cost-efficient and feasible methodologies compared with the next-generation sequencing or ChIP-based methods. However, the PCR-based methods can only achieve multiplex detection up to 15-plex due to limitations imposed by the multiplex primer interactions. The detection throughput cannot meet the demands of high-throughput detection, such as SNP or gene expression analysis. Therefore, in our study, we have developed a new high-throughput PCR-based detection method, multiplex enrichment quantitative PCR (ME-qPCR), which is a combination of qPCR and nested PCR. The GMO content detection results in our study showed that ME-qPCR could achieve high-throughput detection up to 26-plex. Compared to the original qPCR, the Ct values of ME-qPCR were lower for the same group, which showed that ME-qPCR sensitivity is higher than the original qPCR. The absolute limit of detection for ME-qPCR could achieve levels as low as a single copy of the plant genome. Moreover, the specificity results showed that no cross-amplification occurred for irrelevant GMO events. After evaluation of all of the parameters, a practical evaluation was performed with different foods. The more stable amplification results, compared to qPCR, showed that ME-qPCR was suitable for GMO detection in foods. In conclusion, ME-qPCR achieved sensitive, high-throughput GMO detection in complex substrates, such as crops or food samples. In the future, ME-qPCR-based GMO content identification may positively impact SNP analysis or multiplex gene expression of food or agricultural samples. Graphical abstract For the first-step amplification, four primers (A, B, C, and D) have been added into the reaction volume. In this manner, four kinds of amplicons have been generated. All of these four amplicons could be regarded as the target of second-step PCR. For the second-step amplification, three parallels have been taken for

  11. Application of high-throughput sequencing in understanding human oral microbiome related with health and disease

    OpenAIRE

    Chen, Hui; Jiang, Wen

    2014-01-01

    The oral microbiome is one of most diversity habitat in the human body and they are closely related with oral health and disease. As the technique developing,, high throughput sequencing has become a popular approach applied for oral microbial analysis. Oral bacterial profiles have been studied to explore the relationship between microbial diversity and oral diseases such as caries and periodontal disease. This review describes the application of high-throughput sequencing for characterizati...

  12. Evaluation of a pooled strategy for high-throughput sequencing of cosmid clones from metagenomic libraries.

    Science.gov (United States)

    Lam, Kathy N; Hall, Michael W; Engel, Katja; Vey, Gregory; Cheng, Jiujun; Neufeld, Josh D; Charles, Trevor C

    2014-01-01

    High-throughput sequencing methods have been instrumental in the growing field of metagenomics, with technological improvements enabling greater throughput at decreased costs. Nonetheless, the economy of high-throughput sequencing cannot be fully leveraged in the subdiscipline of functional metagenomics. In this area of research, environmental DNA is typically cloned to generate large-insert libraries from which individual clones are isolated, based on specific activities of interest. Sequence data are required for complete characterization of such clones, but the sequencing of a large set of clones requires individual barcode-based sample preparation; this can become costly, as the cost of clone barcoding scales linearly with the number of clones processed, and thus sequencing a large number of metagenomic clones often remains cost-prohibitive. We investigated a hybrid Sanger/Illumina pooled sequencing strategy that omits barcoding altogether, and we evaluated this strategy by comparing the pooled sequencing results to reference sequence data obtained from traditional barcode-based sequencing of the same set of clones. Using identity and coverage metrics in our evaluation, we show that pooled sequencing can generate high-quality sequence data, without producing problematic chimeras. Though caveats of a pooled strategy exist and further optimization of the method is required to improve recovery of complete clone sequences and to avoid circumstances that generate unrecoverable clone sequences, our results demonstrate that pooled sequencing represents an effective and low-cost alternative for sequencing large sets of metagenomic clones.

  13. Probabilistic Methods for Processing High-Throughput Sequencing Signals

    DEFF Research Database (Denmark)

    Sørensen, Lasse Maretty

    High-throughput sequencing has the potential to answer many of the big questions in biology and medicine. It can be used to determine the ancestry of species, to chart complex ecosystems and to understand and diagnose disease. However, going from raw sequencing data to biological or medical insig....... By estimating the genotypes on a set of candidate variants obtained from both a standard mapping-based approach as well as de novo assemblies, we are able to find considerably more structural variation than previous studies...... for reconstructing transcript sequences from RNA sequencing data. The method is based on a novel sparse prior distribution over transcript abundances and is markedly more accurate than existing approaches. The second chapter describes a new method for calling genotypes from a fixed set of candidate variants....... The method queries the reads using a graph representation of the variants and hereby mitigates the reference-bias that characterise standard genotyping methods. In the last chapter, we apply this method to call the genotypes of 50 deeply sequencing parent-offspring trios from the GenomeDenmark project...

  14. Targeted Capture and High-Throughput Sequencing Using Molecular Inversion Probes (MIPs).

    Science.gov (United States)

    Cantsilieris, Stuart; Stessman, Holly A; Shendure, Jay; Eichler, Evan E

    2017-01-01

    Molecular inversion probes (MIPs) in combination with massively parallel DNA sequencing represent a versatile, yet economical tool for targeted sequencing of genomic DNA. Several thousand genomic targets can be selectively captured using long oligonucleotides containing unique targeting arms and universal linkers. The ability to append sequencing adaptors and sample-specific barcodes allows large-scale pooling and subsequent high-throughput sequencing at relatively low cost per sample. Here, we describe a "wet bench" protocol detailing the capture and subsequent sequencing of >2000 genomic targets from 192 samples, representative of a single lane on the Illumina HiSeq 2000 platform.

  15. A high-throughput protocol for mutation scanning of the BRCA1 and BRCA2 genes

    International Nuclear Information System (INIS)

    Hondow, Heather L; Fox, Stephen B; Mitchell, Gillian; Scott, Rodney J; Beshay, Victoria; Wong, Stephen Q; Dobrovic, Alexander

    2011-01-01

    Detection of mutations by DNA sequencing can be facilitated by scanning methods to identify amplicons which may have mutations. Current scanning methods used for the detection of germline sequence variants are laborious as they require post-PCR manipulation. High resolution melting (HRM) is a cost-effective rapid screening strategy, which readily detects heterozygous variants by melting curve analysis of PCR products. It is well suited to screening genes such as BRCA1 and BRCA2 as germline pathogenic mutations in these genes are always heterozygous. Assays for the analysis of all coding regions and intron-exon boundaries of BRCA1 and BRCA2 were designed, and optimised. A final set of 94 assays which ran under identical amplification conditions were chosen for BRCA1 (36) and BRCA2 (58). Significant attention was placed on primer design to enable reproducible detection of mutations within the amplicon while minimising unnecessary detection of polymorphisms. Deoxyinosine residues were incorporated into primers that overlay intronic polymorphisms. Multiple 384 well plates were used to facilitate high throughput. 169 BRCA1 and 239 BRCA2 known sequence variants were used to test the amplicons. We also performed an extensive blinded validation of the protocol with 384 separate patient DNAs. All heterozygous variants were detected with the optimised assays. This is the first HRM approach to screen the entire coding region of the BRCA1 and BRCA2 genes using one set of reaction conditions in a multi plate 384 well format using specifically designed primers. The parallel screening of a relatively large number of samples enables better detection of sequence variants. HRM has the advantages of decreasing the necessary sequencing by more than 90%. This markedly reduced cost of sequencing will result in BRCA1 and BRCA2 mutation testing becoming accessible to individuals who currently do not undergo mutation testing because of the significant costs involved

  16. 'Mitominis': multiplex PCR analysis of reduced size amplicons for compound sequence analysis of the entire mtDNA control region in highly degraded samples.

    Science.gov (United States)

    Eichmann, Cordula; Parson, Walther

    2008-09-01

    The traditional protocol for forensic mitochondrial DNA (mtDNA) analyses involves the amplification and sequencing of the two hypervariable segments HVS-I and HVS-II of the mtDNA control region. The primers usually span fragment sizes of 300-400 bp each region, which may result in weak or failed amplification in highly degraded samples. Here we introduce an improved and more stable approach using shortened amplicons in the fragment range between 144 and 237 bp. Ten such amplicons were required to produce overlapping fragments that cover the entire human mtDNA control region. These were co-amplified in two multiplex polymerase chain reactions and sequenced with the individual amplification primers. The primers were carefully selected to minimize binding on homoplasic and haplogroup-specific sites that would otherwise result in loss of amplification due to mis-priming. The multiplexes have successfully been applied to ancient and forensic samples such as bones and teeth that showed a high degree of degradation.

  17. BioMaS: a modular pipeline for Bioinformatic analysis of Metagenomic AmpliconS.

    Science.gov (United States)

    Fosso, Bruno; Santamaria, Monica; Marzano, Marinella; Alonso-Alemany, Daniel; Valiente, Gabriel; Donvito, Giacinto; Monaco, Alfonso; Notarangelo, Pasquale; Pesole, Graziano

    2015-07-01

    Substantial advances in microbiology, molecular evolution and biodiversity have been carried out in recent years thanks to Metagenomics, which allows to unveil the composition and functions of mixed microbial communities in any environmental niche. If the investigation is aimed only at the microbiome taxonomic structure, a target-based metagenomic approach, here also referred as Meta-barcoding, is generally applied. This approach commonly involves the selective amplification of a species-specific genetic marker (DNA meta-barcode) in the whole taxonomic range of interest and the exploration of its taxon-related variants through High-Throughput Sequencing (HTS) technologies. The accessibility to proper computational systems for the large-scale bioinformatic analysis of HTS data represents, currently, one of the major challenges in advanced Meta-barcoding projects. BioMaS (Bioinformatic analysis of Metagenomic AmpliconS) is a new bioinformatic pipeline designed to support biomolecular researchers involved in taxonomic studies of environmental microbial communities by a completely automated workflow, comprehensive of all the fundamental steps, from raw sequence data upload and cleaning to final taxonomic identification, that are absolutely required in an appropriately designed Meta-barcoding HTS-based experiment. In its current version, BioMaS allows the analysis of both bacterial and fungal environments starting directly from the raw sequencing data from either Roche 454 or Illumina HTS platforms, following two alternative paths, respectively. BioMaS is implemented into a public web service available at https://recasgateway.ba.infn.it/ and is also available in Galaxy at http://galaxy.cloud.ba.infn.it:8080 (only for Illumina data). BioMaS is a friendly pipeline for Meta-barcoding HTS data analysis specifically designed for users without particular computing skills. A comparative benchmark, carried out by using a simulated dataset suitably designed to broadly represent

  18. Quack: A quality assurance tool for high throughput sequence data.

    Science.gov (United States)

    Thrash, Adam; Arick, Mark; Peterson, Daniel G

    2018-05-01

    The quality of data generated by high-throughput DNA sequencing tools must be rapidly assessed in order to determine how useful the data may be in making biological discoveries; higher quality data leads to more confident results and conclusions. Due to the ever-increasing size of data sets and the importance of rapid quality assessment, tools that analyze sequencing data should quickly produce easily interpretable graphics. Quack addresses these issues by generating information-dense visualizations from FASTQ files at a speed far surpassing other publicly available quality assurance tools in a manner independent of sequencing technology. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  19. Applications of high-throughput sequencing to chromatin structure and function in mammals

    OpenAIRE

    Dunham, Ian

    2009-01-01

    High-throughput DNA sequencing approaches have enabled direct interrogation of chromatin samples from mammalian cells. We are beginning to develop a genome-wide description of nuclear function during development, but further data collection, refinement, and integration are needed.

  20. Analysis of intra-host genetic diversity of Prunus necrotic ringspot virus (PNRSV) using amplicon next generation sequencing.

    Science.gov (United States)

    Kinoti, Wycliff M; Constable, Fiona E; Nancarrow, Narelle; Plummer, Kim M; Rodoni, Brendan

    2017-01-01

    PCR amplicon next generation sequencing (NGS) analysis offers a broadly applicable and targeted approach to detect populations of both high- or low-frequency virus variants in one or more plant samples. In this study, amplicon NGS was used to explore the diversity of the tripartite genome virus, Prunus necrotic ringspot virus (PNRSV) from 53 PNRSV-infected trees using amplicons from conserved gene regions of each of PNRSV RNA1, RNA2 and RNA3. Sequencing of the amplicons from 53 PNRSV-infected trees revealed differing levels of polymorphism across the three different components of the PNRSV genome with a total number of 5040, 2083 and 5486 sequence variants observed for RNA1, RNA2 and RNA3 respectively. The RNA2 had the lowest diversity of sequences compared to RNA1 and RNA3, reflecting the lack of flexibility tolerated by the replicase gene that is encoded by this RNA component. Distinct PNRSV phylo-groups, consisting of closely related clusters of sequence variants, were observed in each of PNRSV RNA1, RNA2 and RNA3. Most plant samples had a single phylo-group for each RNA component. Haplotype network analysis showed that smaller clusters of PNRSV sequence variants were genetically connected to the largest sequence variant cluster within a phylo-group of each RNA component. Some plant samples had sequence variants occurring in multiple PNRSV phylo-groups in at least one of each RNA and these phylo-groups formed distinct clades that represent PNRSV genetic strains. Variants within the same phylo-group of each Prunus plant sample had ≥97% similarity and phylo-groups within a Prunus plant sample and between samples had less ≤97% similarity. Based on the analysis of diversity, a definition of a PNRSV genetic strain was proposed. The proposed definition was applied to determine the number of PNRSV genetic strains in each of the plant samples and the complexity in defining genetic strains in multipartite genome viruses was explored.

  1. Analysis of intra-host genetic diversity of Prunus necrotic ringspot virus (PNRSV using amplicon next generation sequencing.

    Directory of Open Access Journals (Sweden)

    Wycliff M Kinoti

    Full Text Available PCR amplicon next generation sequencing (NGS analysis offers a broadly applicable and targeted approach to detect populations of both high- or low-frequency virus variants in one or more plant samples. In this study, amplicon NGS was used to explore the diversity of the tripartite genome virus, Prunus necrotic ringspot virus (PNRSV from 53 PNRSV-infected trees using amplicons from conserved gene regions of each of PNRSV RNA1, RNA2 and RNA3. Sequencing of the amplicons from 53 PNRSV-infected trees revealed differing levels of polymorphism across the three different components of the PNRSV genome with a total number of 5040, 2083 and 5486 sequence variants observed for RNA1, RNA2 and RNA3 respectively. The RNA2 had the lowest diversity of sequences compared to RNA1 and RNA3, reflecting the lack of flexibility tolerated by the replicase gene that is encoded by this RNA component. Distinct PNRSV phylo-groups, consisting of closely related clusters of sequence variants, were observed in each of PNRSV RNA1, RNA2 and RNA3. Most plant samples had a single phylo-group for each RNA component. Haplotype network analysis showed that smaller clusters of PNRSV sequence variants were genetically connected to the largest sequence variant cluster within a phylo-group of each RNA component. Some plant samples had sequence variants occurring in multiple PNRSV phylo-groups in at least one of each RNA and these phylo-groups formed distinct clades that represent PNRSV genetic strains. Variants within the same phylo-group of each Prunus plant sample had ≥97% similarity and phylo-groups within a Prunus plant sample and between samples had less ≤97% similarity. Based on the analysis of diversity, a definition of a PNRSV genetic strain was proposed. The proposed definition was applied to determine the number of PNRSV genetic strains in each of the plant samples and the complexity in defining genetic strains in multipartite genome viruses was explored.

  2. Does universal 16S rRNA gene amplicon sequencing of environmental communities provide an accurate description of nitrifying guilds?

    DEFF Research Database (Denmark)

    Diwan, Vaibhav; Albrechtsen, Hans-Jørgen; Smets, Barth F.

    2018-01-01

    amplicon sequencing and from guild targeted approaches. The universal amplicon sequencing provided 1) accurate estimates of nitrifier composition, 2) clustering of the samples based on these compositions consistent with sample origin, 3) estimates of the relative abundance of the guilds correlated...

  3. Library Design-Facilitated High-Throughput Sequencing of Synthetic Peptide Libraries.

    Science.gov (United States)

    Vinogradov, Alexander A; Gates, Zachary P; Zhang, Chi; Quartararo, Anthony J; Halloran, Kathryn H; Pentelute, Bradley L

    2017-11-13

    A methodology to achieve high-throughput de novo sequencing of synthetic peptide mixtures is reported. The approach leverages shotgun nanoliquid chromatography coupled with tandem mass spectrometry-based de novo sequencing of library mixtures (up to 2000 peptides) as well as automated data analysis protocols to filter away incorrect assignments, noise, and synthetic side-products. For increasing the confidence in the sequencing results, mass spectrometry-friendly library designs were developed that enabled unambiguous decoding of up to 600 peptide sequences per hour while maintaining greater than 85% sequence identification rates in most cases. The reliability of the reported decoding strategy was additionally confirmed by matching fragmentation spectra for select authentic peptides identified from library sequencing samples. The methods reported here are directly applicable to screening techniques that yield mixtures of active compounds, including particle sorting of one-bead one-compound libraries and affinity enrichment of synthetic library mixtures performed in solution.

  4. Amplicon-Based Pyrosequencing Reveals High Diversity of Protistan Parasites in Ships' Ballast Water: Implications for Biogeography and Infectious Diseases.

    Science.gov (United States)

    Pagenkopp Lohan, K M; Fleischer, R C; Carney, K J; Holzer, K K; Ruiz, G M

    2016-04-01

    Ships' ballast water (BW) commonly moves macroorganisms and microorganisms across the world's oceans and along coasts; however, the majority of these microbial transfers have gone undetected. We applied high-throughput sequencing methods to identify microbial eukaryotes, specifically emphasizing the protistan parasites, in ships' BW collected from vessels calling to the Chesapeake Bay (Virginia and Maryland, USA) from European and Eastern Canadian ports. We utilized tagged-amplicon 454 pyrosequencing with two general primer sets, amplifying either the V4 or V9 domain of the small subunit (SSU) of the ribosomal RNA (rRNA) gene complex, from total DNA extracted from water samples collected from the ballast tanks of bulk cargo vessels. We detected a diverse group of protistan taxa, with some known to contain important parasites in marine systems, including Apicomplexa (unidentified apicomplexans, unidentified gregarines, Cryptosporidium spp.), Dinophyta (Blastodinium spp., Euduboscquella sp., unidentified syndinids, Karlodinium spp., Syndinium spp.), Perkinsea (Parvilucifera sp.), Opisthokonta (Ichthyosporea sp., Pseudoperkinsidae, unidentified ichthyosporeans), and Stramenopiles (Labyrinthulomycetes). Further characterization of groups with parasitic taxa, consisting of phylogenetic analyses for four taxa (Cryptosporidium spp., Parvilucifera spp., Labyrinthulomycetes, and Ichthyosporea), revealed that sequences were obtained from both known and novel lineages. This study demonstrates that high-throughput sequencing is a viable and sensitive method for detecting parasitic protists when present and transported in the ballast water of ships. These data also underscore the potential importance of human-aided dispersal in the biogeography of these microbes and emerging diseases in the world's oceans.

  5. Improvements and impacts of GRCh38 human reference on high throughput sequencing data analysis.

    Science.gov (United States)

    Guo, Yan; Dai, Yulin; Yu, Hui; Zhao, Shilin; Samuels, David C; Shyr, Yu

    2017-03-01

    Analyses of high throughput sequencing data starts with alignment against a reference genome, which is the foundation for all re-sequencing data analyses. Each new release of the human reference genome has been augmented with improved accuracy and completeness. It is presumed that the latest release of human reference genome, GRCh38 will contribute more to high throughput sequencing data analysis by providing more accuracy. But the amount of improvement has not yet been quantified. We conducted a study to compare the genomic analysis results between the GRCh38 reference and its predecessor GRCh37. Through analyses of alignment, single nucleotide polymorphisms, small insertion/deletions, copy number and structural variants, we show that GRCh38 offers overall more accurate analysis of human sequencing data. More importantly, GRCh38 produced fewer false positive structural variants. In conclusion, GRCh38 is an improvement over GRCh37 not only from the genome assembly aspect, but also yields more reliable genomic analysis results. Copyright © 2017. Published by Elsevier Inc.

  6. Using high-throughput barcode sequencing to efficiently map connectomes.

    Science.gov (United States)

    Peikon, Ian D; Kebschull, Justus M; Vagin, Vasily V; Ravens, Diana I; Sun, Yu-Chi; Brouzes, Eric; Corrêa, Ivan R; Bressan, Dario; Zador, Anthony M

    2017-07-07

    The function of a neural circuit is determined by the details of its synaptic connections. At present, the only available method for determining a neural wiring diagram with single synapse precision-a 'connectome'-is based on imaging methods that are slow, labor-intensive and expensive. Here, we present SYNseq, a method for converting the connectome into a form that can exploit the speed and low cost of modern high-throughput DNA sequencing. In SYNseq, each neuron is labeled with a unique random nucleotide sequence-an RNA 'barcode'-which is targeted to the synapse using engineered proteins. Barcodes in pre- and postsynaptic neurons are then associated through protein-protein crosslinking across the synapse, extracted from the tissue, and joined into a form suitable for sequencing. Although our failure to develop an efficient barcode joining scheme precludes the widespread application of this approach, we expect that with further development SYNseq will enable tracing of complex circuits at high speed and low cost. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Intracellular diversity of the V4 and V9 regions of the 18S rRNA in marine protists (radiolarians) assessed by high-throughput sequencing.

    Science.gov (United States)

    Decelle, Johan; Romac, Sarah; Sasaki, Eriko; Not, Fabrice; Mahé, Frédéric

    2014-01-01

    Metabarcoding is a powerful tool for exploring microbial diversity in the environment, but its accurate interpretation is impeded by diverse technical (e.g. PCR and sequencing errors) and biological biases (e.g. intra-individual polymorphism) that remain poorly understood. To help interpret environmental metabarcoding datasets, we investigated the intracellular diversity of the V4 and V9 regions of the 18S rRNA gene from Acantharia and Nassellaria (radiolarians) using 454 pyrosequencing. Individual cells of radiolarians were isolated, and PCRs were performed with generalist primers to amplify the V4 and V9 regions. Different denoising procedures were employed to filter the pyrosequenced raw amplicons (Acacia, AmpliconNoise, Linkage method). For each of the six isolated cells, an average of 541 V4 and 562 V9 amplicons assigned to radiolarians were obtained, from which one numerically dominant sequence and several minor variants were found. At the 97% identity, a diversity metrics commonly used in environmental surveys, up to 5 distinct OTUs were detected in a single cell. However, most amplicons grouped within a single OTU whereas other OTUs contained very few amplicons. Different analytical methods provided evidence that most minor variants forming different OTUs correspond to PCR and sequencing artifacts. Duplicate PCR and sequencing from the same DNA extract of a single cell had only 9 to 16% of unique amplicons in common, and alignment visualization of V4 and V9 amplicons showed that most minor variants contained substitutions in highly-conserved regions. We conclude that intracellular variability of the 18S rRNA in radiolarians is very limited despite its multi-copy nature and the existence of multiple nuclei in these protists. Our study recommends some technical guidelines to conservatively discard artificial amplicons from metabarcoding datasets, and thus properly assess the diversity and richness of protists in the environment.

  8. Quartz-Seq2: a high-throughput single-cell RNA-sequencing method that effectively uses limited sequence reads.

    Science.gov (United States)

    Sasagawa, Yohei; Danno, Hiroki; Takada, Hitomi; Ebisawa, Masashi; Tanaka, Kaori; Hayashi, Tetsutaro; Kurisaki, Akira; Nikaido, Itoshi

    2018-03-09

    High-throughput single-cell RNA-seq methods assign limited unique molecular identifier (UMI) counts as gene expression values to single cells from shallow sequence reads and detect limited gene counts. We thus developed a high-throughput single-cell RNA-seq method, Quartz-Seq2, to overcome these issues. Our improvements in the reaction steps make it possible to effectively convert initial reads to UMI counts, at a rate of 30-50%, and detect more genes. To demonstrate the power of Quartz-Seq2, we analyzed approximately 10,000 transcriptomes from in vitro embryonic stem cells and an in vivo stromal vascular fraction with a limited number of reads.

  9. High-throughput gender identification of penguin species using melting curve analysis.

    Science.gov (United States)

    Tseng, Chao-Neng; Chang, Yung-Ting; Chiu, Hui-Tzu; Chou, Yii-Cheng; Huang, Hurng-Wern; Cheng, Chien-Chung; Liao, Ming-Hui; Chang, Hsueh-Wei

    2014-04-03

    Most species of penguins are sexual monomorphic and therefore it is difficult to visually identify their genders for monitoring population stability in terms of sex ratio analysis. In this study, we evaluated the suitability using melting curve analysis (MCA) for high-throughput gender identification of penguins. Preliminary test indicated that the Griffiths's P2/P8 primers were not suitable for MCA analysis. Based on sequence alignment of Chromo-Helicase-DNA binding protein (CHD)-W and CHD-Z genes from four species of penguins (Pygoscelis papua, Aptenodytes patagonicus, Spheniscus magellanicus, and Eudyptes chrysocome), we redesigned forward primers for the CHD-W/CHD-Z-common region (PGU-ZW2) and the CHD-W-specific region (PGU-W2) to be used in combination with the reverse Griffiths's P2 primer. When tested with P. papua samples, PCR using P2/PGU-ZW2 and P2/PGU-W2 primer sets generated two amplicons of 148- and 356-bp, respectively, which were easily resolved in 1.5% agarose gels. MCA analysis indicated the melting temperature (Tm) values for P2/PGU-ZW2 and P2/PGU-W2 amplicons of P. papua samples were 79.75°C-80.5°C and 81.0°C-81.5°C, respectively. Females displayed both ZW-common and W-specific Tm peaks, whereas male was positive only for ZW-common peak. Taken together, our redesigned primers coupled with MCA analysis allows precise high throughput gender identification for P. papua, and potentially for other penguin species such as A. patagonicus, S. magellanicus, and E. chrysocome as well.

  10. High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

    Science.gov (United States)

    Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca

    2015-01-01

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  11. High Resolution Melting (HRM for High-Throughput Genotyping—Limitations and Caveats in Practical Case Studies

    Directory of Open Access Journals (Sweden)

    Marcin Słomka

    2017-11-01

    Full Text Available High resolution melting (HRM is a convenient method for gene scanning as well as genotyping of individual and multiple single nucleotide polymorphisms (SNPs. This rapid, simple, closed-tube, homogenous, and cost-efficient approach has the capacity for high specificity and sensitivity, while allowing easy transition to high-throughput scale. In this paper, we provide examples from our laboratory practice of some problematic issues which can affect the performance and data analysis of HRM results, especially with regard to reference curve-based targeted genotyping. We present those examples in order of the typical experimental workflow, and discuss the crucial significance of the respective experimental errors and limitations for the quality and analysis of results. The experimental details which have a decisive impact on correct execution of a HRM genotyping experiment include type and quality of DNA source material, reproducibility of isolation method and template DNA preparation, primer and amplicon design, automation-derived preparation and pipetting inconsistencies, as well as physical limitations in melting curve distinction for alternative variants and careful selection of samples for validation by sequencing. We provide a case-by-case analysis and discussion of actual problems we encountered and solutions that should be taken into account by researchers newly attempting HRM genotyping, especially in a high-throughput setup.

  12. High Resolution Melting (HRM) for High-Throughput Genotyping—Limitations and Caveats in Practical Case Studies

    Science.gov (United States)

    Słomka, Marcin; Sobalska-Kwapis, Marta; Wachulec, Monika; Bartosz, Grzegorz

    2017-01-01

    High resolution melting (HRM) is a convenient method for gene scanning as well as genotyping of individual and multiple single nucleotide polymorphisms (SNPs). This rapid, simple, closed-tube, homogenous, and cost-efficient approach has the capacity for high specificity and sensitivity, while allowing easy transition to high-throughput scale. In this paper, we provide examples from our laboratory practice of some problematic issues which can affect the performance and data analysis of HRM results, especially with regard to reference curve-based targeted genotyping. We present those examples in order of the typical experimental workflow, and discuss the crucial significance of the respective experimental errors and limitations for the quality and analysis of results. The experimental details which have a decisive impact on correct execution of a HRM genotyping experiment include type and quality of DNA source material, reproducibility of isolation method and template DNA preparation, primer and amplicon design, automation-derived preparation and pipetting inconsistencies, as well as physical limitations in melting curve distinction for alternative variants and careful selection of samples for validation by sequencing. We provide a case-by-case analysis and discussion of actual problems we encountered and solutions that should be taken into account by researchers newly attempting HRM genotyping, especially in a high-throughput setup. PMID:29099791

  13. High Resolution Melting (HRM) for High-Throughput Genotyping-Limitations and Caveats in Practical Case Studies.

    Science.gov (United States)

    Słomka, Marcin; Sobalska-Kwapis, Marta; Wachulec, Monika; Bartosz, Grzegorz; Strapagiel, Dominik

    2017-11-03

    High resolution melting (HRM) is a convenient method for gene scanning as well as genotyping of individual and multiple single nucleotide polymorphisms (SNPs). This rapid, simple, closed-tube, homogenous, and cost-efficient approach has the capacity for high specificity and sensitivity, while allowing easy transition to high-throughput scale. In this paper, we provide examples from our laboratory practice of some problematic issues which can affect the performance and data analysis of HRM results, especially with regard to reference curve-based targeted genotyping. We present those examples in order of the typical experimental workflow, and discuss the crucial significance of the respective experimental errors and limitations for the quality and analysis of results. The experimental details which have a decisive impact on correct execution of a HRM genotyping experiment include type and quality of DNA source material, reproducibility of isolation method and template DNA preparation, primer and amplicon design, automation-derived preparation and pipetting inconsistencies, as well as physical limitations in melting curve distinction for alternative variants and careful selection of samples for validation by sequencing. We provide a case-by-case analysis and discussion of actual problems we encountered and solutions that should be taken into account by researchers newly attempting HRM genotyping, especially in a high-throughput setup.

  14. Discovery of viruses and virus-like pathogens in pistachio using high-throughput sequencing

    Science.gov (United States)

    Pistachio (Pistacia vera L.) trees from the National Clonal Germplasm Repository (NCGR) and orchards in California were surveyed for viruses and virus-like agents by high-throughput sequencing (HTS). Analyses of 60 trees including clonal UCB-1 hybrid rootstock (P. atlantica × P. integerrima) identif...

  15. Next-generation sequencing of multiple individuals per barcoded library by deconvolution of sequenced amplicons using endonuclease fragment analysis

    DEFF Research Database (Denmark)

    Andersen, Jeppe D; Pereira, Vania; Pietroni, Carlotta

    2014-01-01

    The simultaneous sequencing of samples from multiple individuals increases the efficiency of next-generation sequencing (NGS) while also reducing costs. Here we describe a novel and simple approach for sequencing DNA from multiple individuals per barcode. Our strategy relies on the endonuclease...... digestion of PCR amplicons prior to library preparation, creating a specific fragment pattern for each individual that can be resolved after sequencing. By using both barcodes and restriction fragment patterns, we demonstrate the ability to sequence the human melanocortin 1 receptor (MC1R) genes from 72...... individuals using only 24 barcoded libraries....

  16. High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

    Directory of Open Access Journals (Sweden)

    Soichi Inagaki

    Full Text Available Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  17. Targeted DNA Methylation Analysis by High Throughput Sequencing in Porcine Peri-attachment Embryos

    OpenAIRE

    MORRILL, Benson H.; COX, Lindsay; WARD, Anika; HEYWOOD, Sierra; PRATHER, Randall S.; ISOM, S. Clay

    2013-01-01

    Abstract The purpose of this experiment was to implement and evaluate the effectiveness of a next-generation sequencing-based method for DNA methylation analysis in porcine embryonic samples. Fourteen discrete genomic regions were amplified by PCR using bisulfite-converted genomic DNA derived from day 14 in vivo-derived (IVV) and parthenogenetic (PA) porcine embryos as template DNA. Resulting PCR products were subjected to high-throughput sequencing using the Illumina Genome Analyzer IIx plat...

  18. Yeast diversity during the fermentation of Andean chicha: A comparison of high-throughput sequencing and culture-dependent approaches.

    Science.gov (United States)

    Mendoza, Lucía M; Neef, Alexander; Vignolo, Graciela; Belloch, Carmela

    2017-10-01

    Diversity and dynamics of yeasts associated with the fermentation of Argentinian maize-based beverage chicha was investigated. Samples taken at different stages from two chicha productions were analyzed by culture-dependent and culture-independent methods. Five hundred and ninety six yeasts were isolated by classical microbiological methods and 16 species identified by RFLPs and sequencing of D1/D2 26S rRNA gene. Genetic typing of isolates from the dominant species, Saccharomyces cerevisiae, by PCR of delta elements revealed up to 42 different patterns. High-throughput sequencing (HTS) of D1/D2 26S rRNA gene amplicons from chicha samples detected more than one hundred yeast species and almost fifty filamentous fungi taxa. Analysis of the data revealed that yeasts dominated the fermentation, although, a significant percentage of filamentous fungi appeared in the first step of the process. Statistical analysis of results showed that very few taxa were represented by more than 1% of the reads per sample at any step of the process. S. cerevisiae represented more than 90% of the reads in the fermentative samples. Other yeast species dominated the pre-fermentative steps and abounded in fermented samples when S. cerevisiae was in percentages below 90%. Most yeasts species detected by pyrosequencing were not recovered by cultivation. In contrast, the cultivation-based methodology detected very few yeast taxa, and most of them corresponded with very few reads in the pyrosequencing analysis. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. High-throughput sequencing of forensic genetic samples using punches of FTA cards with buccal swabs

    DEFF Research Database (Denmark)

    Kampmann, Marie-Louise; Buchard, Anders; Børsting, Claus

    2016-01-01

    Here, we demonstrate that punches from buccal swab samples preserved on FTA cards can be used for high-throughput DNA sequencing, also known as massively parallel sequencing (MPS). We typed 44 reference samples with the HID-Ion AmpliSeq Identity Panel using washed 1.2 mm punches from FTA cards...

  20. SNP-PHAGE – High throughput SNP discovery pipeline

    Directory of Open Access Journals (Sweden)

    Cregan Perry B

    2006-10-01

    solution for high throughput SNP discovery, identification of common haplotypes within an amplicon, and GenBank (dbSNP submissions. SNP selection and visualization are aided through a user-friendly web interface. This tool is useful for analyzing sequence tagged sites (STSs of genomic sequences, and this software can serve as a starting point for groups interested in developing SNP markers.

  1. Characterizing ncRNAs in human pathogenic protists using high-throughput sequencing technology

    Directory of Open Access Journals (Sweden)

    Lesley Joan Collins

    2011-12-01

    Full Text Available ncRNAs are key genes in many human diseases including cancer and viral infection, as well as providing critical functions in pathogenic organisms such as fungi, bacteria, viruses and protists. Until now the identification and characterization of ncRNAs associated with disease has been slow or inaccurate requiring many years of testing to understand complicated RNA and protein gene relationships. High-throughput sequencing now offers the opportunity to characterize miRNAs, siRNAs, snoRNAs and long ncRNAs on a genomic scale making it faster and easier to clarify how these ncRNAs contribute to the disease state. However, this technology is still relatively new, and ncRNA discovery is not an application of high priority for streamlined bioinformatics. Here we summarize background concepts and practical approaches for ncRNA analysis using high-throughput sequencing, and how it relates to understanding human disease. As a case study, we focus on the parasitic protists Giardia lamblia and Trichomonas vaginalis, where large evolutionary distance has meant difficulties in comparing ncRNAs with those from model eukaryotes. A combination of biological, computational and sequencing approaches has enabled easier classification of ncRNA classes such as snoRNAs, but has also aided the identification of novel classes. It is hoped that a higher level of understanding of ncRNA expression and interaction may aid in the development of less harsh treatment for protist-based diseases.

  2. Characterizing ncRNAs in Human Pathogenic Protists Using High-Throughput Sequencing Technology

    Science.gov (United States)

    Collins, Lesley Joan

    2011-01-01

    ncRNAs are key genes in many human diseases including cancer and viral infection, as well as providing critical functions in pathogenic organisms such as fungi, bacteria, viruses, and protists. Until now the identification and characterization of ncRNAs associated with disease has been slow or inaccurate requiring many years of testing to understand complicated RNA and protein gene relationships. High-throughput sequencing now offers the opportunity to characterize miRNAs, siRNAs, small nucleolar RNAs (snoRNAs), and long ncRNAs on a genomic scale, making it faster and easier to clarify how these ncRNAs contribute to the disease state. However, this technology is still relatively new, and ncRNA discovery is not an application of high priority for streamlined bioinformatics. Here we summarize background concepts and practical approaches for ncRNA analysis using high-throughput sequencing, and how it relates to understanding human disease. As a case study, we focus on the parasitic protists Giardia lamblia and Trichomonas vaginalis, where large evolutionary distance has meant difficulties in comparing ncRNAs with those from model eukaryotes. A combination of biological, computational, and sequencing approaches has enabled easier classification of ncRNA classes such as snoRNAs, but has also aided the identification of novel classes. It is hoped that a higher level of understanding of ncRNA expression and interaction may aid in the development of less harsh treatment for protist-based diseases. PMID:22303390

  3. SUGAR: graphical user interface-based data refiner for high-throughput DNA sequencing.

    Science.gov (United States)

    Sato, Yukuto; Kojima, Kaname; Nariai, Naoki; Yamaguchi-Kabata, Yumi; Kawai, Yosuke; Takahashi, Mamoru; Mimori, Takahiro; Nagasaki, Masao

    2014-08-08

    Next-generation sequencers (NGSs) have become one of the main tools for current biology. To obtain useful insights from the NGS data, it is essential to control low-quality portions of the data affected by technical errors such as air bubbles in sequencing fluidics. We develop a software SUGAR (subtile-based GUI-assisted refiner) which can handle ultra-high-throughput data with user-friendly graphical user interface (GUI) and interactive analysis capability. The SUGAR generates high-resolution quality heatmaps of the flowcell, enabling users to find possible signals of technical errors during the sequencing. The sequencing data generated from the error-affected regions of a flowcell can be selectively removed by automated analysis or GUI-assisted operations implemented in the SUGAR. The automated data-cleaning function based on sequence read quality (Phred) scores was applied to a public whole human genome sequencing data and we proved the overall mapping quality was improved. The detailed data evaluation and cleaning enabled by SUGAR would reduce technical problems in sequence read mapping, improving subsequent variant analysis that require high-quality sequence data and mapping results. Therefore, the software will be especially useful to control the quality of variant calls to the low population cells, e.g., cancers, in a sample with technical errors of sequencing procedures.

  4. Investigation of Human Cancers for Retrovirus by Low-Stringency Target Enrichment and High-Throughput Sequencing

    DEFF Research Database (Denmark)

    Vinner, Lasse; Mourier, Tobias; Friis-Nielsen, Jens

    2015-01-01

    -stringency in-solution hybridization method enables detection of discovery of hitherto unknown viral sequences by high-throughput sequencing. The sensitivity was sufficient to detect retroviral...... sequences in clinical samples. We used this method to conduct an investigation for novel retrovirus in samples from three cancer types. In accordance with recent studies our investigation revealed no retroviral infections in human B-cell lymphoma cells, cutaneous T-cell lymphoma or colorectal cancer...

  5. The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing.

    Directory of Open Access Journals (Sweden)

    Jonas Binladen

    2007-02-01

    Full Text Available The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources.We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences. Each DNA sequence is subsequently traced back to its individual source through 5'tag-analysis.We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (miss-assignment rate<0.4%. Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regards to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5'primer tagging is a useful method in which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products. The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial

  6. Identification and Evaluation of Single-Nucleotide Polymorphisms in Allotetraploid Peanut (Arachis hypogaea L.) Based on Amplicon Sequencing Combined with High Resolution Melting (HRM) Analysis.

    Science.gov (United States)

    Hong, Yanbin; Pandey, Manish K; Liu, Ying; Chen, Xiaoping; Liu, Hong; Varshney, Rajeev K; Liang, Xuanqiang; Huang, Shangzhi

    2015-01-01

    The cultivated peanut (Arachis hypogaea L.) is an allotetraploid (AABB) species derived from the A-genome (Arachis duranensis) and B-genome (Arachis ipaensis) progenitors. Presence of two versions of a DNA sequence based on the two progenitor genomes poses a serious technical and analytical problem during single nucleotide polymorphism (SNP) marker identification and analysis. In this context, we have analyzed 200 amplicons derived from expressed sequence tags (ESTs) and genome survey sequences (GSS) to identify SNPs in a panel of genotypes consisting of 12 cultivated peanut varieties and two diploid progenitors representing the ancestral genomes. A total of 18 EST-SNPs and 44 genomic-SNPs were identified in 12 peanut varieties by aligning the sequence of A. hypogaea with diploid progenitors. The average frequency of sequence polymorphism was higher for genomic-SNPs than the EST-SNPs with one genomic-SNP every 1011 bp as compared to one EST-SNP every 2557 bp. In order to estimate the potential and further applicability of these identified SNPs, 96 peanut varieties were genotyped using high resolution melting (HRM) method. Polymorphism information content (PIC) values for EST-SNPs ranged between 0.021 and 0.413 with a mean of 0.172 in the set of peanut varieties, while genomic-SNPs ranged between 0.080 and 0.478 with a mean of 0.249. Total 33 SNPs were used for polymorphism detection among the parents and 10 selected lines from mapping population Y13Zh (Zhenzhuhei × Yueyou13). Of the total 33 SNPs, nine SNPs showed polymorphism in the mapping population Y13Zh, and seven SNPs were successfully mapped into five linkage groups. Our results showed that SNPs can be identified in allotetraploid peanut with high accuracy through amplicon sequencing and HRM assay. The identified SNPs were very informative and can be used for different genetic and breeding applications in peanut.

  7. Target-dependent enrichment of virions determines the reduction of high-throughput sequencing in virus discovery.

    Directory of Open Access Journals (Sweden)

    Randi Holm Jensen

    Full Text Available Viral infections cause many different diseases stemming both from well-characterized viral pathogens but also from emerging viruses, and the search for novel viruses continues to be of great importance. High-throughput sequencing is an important technology for this purpose. However, viral nucleic acids often constitute a minute proportion of the total genetic material in a sample from infected tissue. Techniques to enrich viral targets in high-throughput sequencing have been reported, but the sensitivity of such methods is not well established. This study compares different library preparation techniques targeting both DNA and RNA with and without virion enrichment. By optimizing the selection of intact virus particles, both by physical and enzymatic approaches, we assessed the effectiveness of the specific enrichment of viral sequences as compared to non-enriched sample preparations by selectively looking for and counting read sequences obtained from shotgun sequencing. Using shotgun sequencing of total DNA or RNA, viral targets were detected at concentrations corresponding to the predicted level, providing a foundation for estimating the effectiveness of virion enrichment. Virion enrichment typically produced a 1000-fold increase in the proportion of DNA virus sequences. For RNA virions the gain was less pronounced with a maximum 13-fold increase. This enrichment varied between the different sample concentrations, with no clear trend. Despite that less sequencing was required to identify target sequences, it was not evident from our data that a lower detection level was achieved by virion enrichment compared to shotgun sequencing.

  8. High Throughput Sequencing for Detection of Foodborne Pathogens

    Directory of Open Access Journals (Sweden)

    Camilla Sekse

    2017-10-01

    Full Text Available High-throughput sequencing (HTS is becoming the state-of-the-art technology for typing of microbial isolates, especially in clinical samples. Yet, its application is still in its infancy for monitoring and outbreak investigations of foods. Here we review the published literature, covering not only bacterial but also viral and Eukaryote food pathogens, to assess the status and potential of HTS implementation to inform stakeholders, improve food safety and reduce outbreak impacts. The developments in sequencing technology and bioinformatics have outpaced the capacity to analyze and interpret the sequence data. The influence of sample processing, nucleic acid extraction and purification, harmonized protocols for generation and interpretation of data, and properly annotated and curated reference databases including non-pathogenic “natural” strains are other major obstacles to the realization of the full potential of HTS in analytical food surveillance, epidemiological and outbreak investigations, and in complementing preventive approaches for the control and management of foodborne pathogens. Despite significant obstacles, the achieved progress in capacity and broadening of the application range over the last decade is impressive and unprecedented, as illustrated with the chosen examples from the literature. Large consortia, often with broad international participation, are making coordinated efforts to cope with many of the mentioned obstacles. Further rapid progress can therefore be prospected for the next decade.

  9. ISRNA: an integrative online toolkit for short reads from high-throughput sequencing data.

    Science.gov (United States)

    Luo, Guan-Zheng; Yang, Wei; Ma, Ying-Ke; Wang, Xiu-Jie

    2014-02-01

    Integrative Short Reads NAvigator (ISRNA) is an online toolkit for analyzing high-throughput small RNA sequencing data. Besides the high-speed genome mapping function, ISRNA provides statistics for genomic location, length distribution and nucleotide composition bias analysis of sequence reads. Number of reads mapped to known microRNAs and other classes of short non-coding RNAs, coverage of short reads on genes, expression abundance of sequence reads as well as some other analysis functions are also supported. The versatile search functions enable users to select sequence reads according to their sub-sequences, expression abundance, genomic location, relationship to genes, etc. A specialized genome browser is integrated to visualize the genomic distribution of short reads. ISRNA also supports management and comparison among multiple datasets. ISRNA is implemented in Java/C++/Perl/MySQL and can be freely accessed at http://omicslab.genetics.ac.cn/ISRNA/.

  10. Alignment of high-throughput sequencing data inside in-memory databases.

    Science.gov (United States)

    Firnkorn, Daniel; Knaup-Gregori, Petra; Lorenzo Bermejo, Justo; Ganzinger, Matthias

    2014-01-01

    In times of high-throughput DNA sequencing techniques, performance-capable analysis of DNA sequences is of high importance. Computer supported DNA analysis is still an intensive time-consuming task. In this paper we explore the potential of a new In-Memory database technology by using SAP's High Performance Analytic Appliance (HANA). We focus on read alignment as one of the first steps in DNA sequence analysis. In particular, we examined the widely used Burrows-Wheeler Aligner (BWA) and implemented stored procedures in both, HANA and the free database system MySQL, to compare execution time and memory management. To ensure that the results are comparable, MySQL has been running in memory as well, utilizing its integrated memory engine for database table creation. We implemented stored procedures, containing exact and inexact searching of DNA reads within the reference genome GRCh37. Due to technical restrictions in SAP HANA concerning recursion, the inexact matching problem could not be implemented on this platform. Hence, performance analysis between HANA and MySQL was made by comparing the execution time of the exact search procedures. Here, HANA was approximately 27 times faster than MySQL which means, that there is a high potential within the new In-Memory concepts, leading to further developments of DNA analysis procedures in the future.

  11. The efficacy of high-throughput sequencing and target enrichment on charred archaeobotanical remains

    DEFF Research Database (Denmark)

    Nistelberger, H. M.; Smith, O.; Wales, Nathan

    2016-01-01

    . It has been suggested that high-throughput sequencing (HTS) technologies coupled with DNA enrichment techniques may overcome some of these limitations. Here we report the findings of HTS and target enrichment on four important archaeological crops (barley, grape, maize and rice) performed in three...... lightly-charred maize cob. Even with target enrichment, this sample failed to yield adequate data required to address fundamental questions in archaeology and biology. We further reanalysed part of an existing dataset on charred plant material, and found all purported endogenous DNA sequences were likely...

  12. eRNA: a graphic user interface-based tool optimized for large data analysis from high-throughput RNA sequencing.

    Science.gov (United States)

    Yuan, Tiezheng; Huang, Xiaoyi; Dittmar, Rachel L; Du, Meijun; Kohli, Manish; Boardman, Lisa; Thibodeau, Stephen N; Wang, Liang

    2014-03-05

    RNA sequencing (RNA-seq) is emerging as a critical approach in biological research. However, its high-throughput advantage is significantly limited by the capacity of bioinformatics tools. The research community urgently needs user-friendly tools to efficiently analyze the complicated data generated by high throughput sequencers. We developed a standalone tool with graphic user interface (GUI)-based analytic modules, known as eRNA. The capacity of performing parallel processing and sample management facilitates large data analyses by maximizing hardware usage and freeing users from tediously handling sequencing data. The module miRNA identification" includes GUIs for raw data reading, adapter removal, sequence alignment, and read counting. The module "mRNA identification" includes GUIs for reference sequences, genome mapping, transcript assembling, and differential expression. The module "Target screening" provides expression profiling analyses and graphic visualization. The module "Self-testing" offers the directory setups, sample management, and a check for third-party package dependency. Integration of other GUIs including Bowtie, miRDeep2, and miRspring extend the program's functionality. eRNA focuses on the common tools required for the mapping and quantification analysis of miRNA-seq and mRNA-seq data. The software package provides an additional choice for scientists who require a user-friendly computing environment and high-throughput capacity for large data analysis. eRNA is available for free download at https://sourceforge.net/projects/erna/?source=directory.

  13. High-Throughput Mapping of Single-Neuron Projections by Sequencing of Barcoded RNA.

    Science.gov (United States)

    Kebschull, Justus M; Garcia da Silva, Pedro; Reid, Ashlan P; Peikon, Ian D; Albeanu, Dinu F; Zador, Anthony M

    2016-09-07

    Neurons transmit information to distant brain regions via long-range axonal projections. In the mouse, area-to-area connections have only been systematically mapped using bulk labeling techniques, which obscure the diverse projections of intermingled single neurons. Here we describe MAPseq (Multiplexed Analysis of Projections by Sequencing), a technique that can map the projections of thousands or even millions of single neurons by labeling large sets of neurons with random RNA sequences ("barcodes"). Axons are filled with barcode mRNA, each putative projection area is dissected, and the barcode mRNA is extracted and sequenced. Applying MAPseq to the locus coeruleus (LC), we find that individual LC neurons have preferred cortical targets. By recasting neuroanatomy, which is traditionally viewed as a problem of microscopy, as a problem of sequencing, MAPseq harnesses advances in sequencing technology to permit high-throughput interrogation of brain circuits. Copyright © 2016 Elsevier Inc. All rights reserved.

  14. Exploring fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing

    Science.gov (United States)

    Zhang, Xiao-Yong; Wang, Guang-Hua; Xu, Xin-Ya; Nong, Xu-Hua; Wang, Jie; Amin, Muhammad; Qi, Shu-Hua

    2016-10-01

    The present study investigated the fungal diversity in four different deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing of the nuclear ribosomal internal transcribed spacer-1 (ITS1). A total of 40,297 fungal ITS1 sequences clustered into 420 operational taxonomic units (OTUs) with 97% sequence similarity and 170 taxa were recovered from these sediments. Most ITS1 sequences (78%) belonged to the phylum Ascomycota, followed by Basidiomycota (17.3%), Zygomycota (1.5%) and Chytridiomycota (0.8%), and a small proportion (2.4%) belonged to unassigned fungal phyla. Compared with previous studies on fungal diversity of sediments from deep-sea environments by culture-dependent approach and clone library analysis, the present result suggested that Illumina sequencing had been dramatically accelerating the discovery of fungal community of deep-sea sediments. Furthermore, our results revealed that Sordariomycetes was the most diverse and abundant fungal class in this study, challenging the traditional view that the diversity of Sordariomycetes phylotypes was low in the deep-sea environments. In addition, more than 12 taxa accounted for 21.5% sequences were found to be rarely reported as deep-sea fungi, suggesting the deep-sea sediments from Okinawa Trough harbored a plethora of different fungal communities compared with other deep-sea environments. To our knowledge, this study is the first exploration of the fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing.

  15. High-throughput sequencing enhanced phage display enables the identification of patient-specific epitope motifs in serum

    DEFF Research Database (Denmark)

    Christiansen, Anders; Kringelum, Jens Vindahl; Hansen, Christian Skjødt

    2015-01-01

    of the bioinformatic approach was demonstrated by identifying epitopes of a prominent peanut allergen, Ara h 1, in sera from patients with severe peanut allergy. The identified epitopes were confirmed by high-density peptide micro-arrays. The present study demonstrates that high-throughput sequencing can empower phage...

  16. HTSeq--a Python framework to work with high-throughput sequencing data.

    Science.gov (United States)

    Anders, Simon; Pyl, Paul Theodor; Huber, Wolfgang

    2015-01-15

    A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed. We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes. HTSeq is released as an open-source software under the GNU General Public Licence and available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq. © The Author 2014. Published by Oxford University Press.

  17. HeurAA: accurate and fast detection of genetic variations with a novel heuristic amplicon aligner program for next generation sequencing.

    Directory of Open Access Journals (Sweden)

    Lőrinc S Pongor

    Full Text Available Next generation sequencing (NGS of PCR amplicons is a standard approach to detect genetic variations in personalized medicine such as cancer diagnostics. Computer programs used in the NGS community often miss insertions and deletions (indels that constitute a large part of known human mutations. We have developed HeurAA, an open source, heuristic amplicon aligner program. We tested the program on simulated datasets as well as experimental data from multiplex sequencing of 40 amplicons in 12 oncogenes collected on a 454 Genome Sequencer from lung cancer cell lines. We found that HeurAA can accurately detect all indels, and is more than an order of magnitude faster than previous programs. HeurAA can compare reads and reference sequences up to several thousand base pairs in length, and it can evaluate data from complex mixtures containing reads of different gene-segments from different samples. HeurAA is written in C and Perl for Linux operating systems, the code and the documentation are available for research applications at http://sourceforge.net/projects/heuraa/

  18. A high-throughput splinkerette-PCR method for the isolation and sequencing of retroviral insertion sites

    DEFF Research Database (Denmark)

    Uren, Anthony G; Mikkers, Harald; Kool, Jaap

    2009-01-01

    sites has been a major limitation to performing screens on this scale. Here we present a method for the high-throughput isolation of insertion sites using a highly efficient splinkerette-PCR method coupled with capillary or 454 sequencing. This protocol includes a description of the procedure for DNA......Insertional mutagens such as viruses and transposons are a useful tool for performing forward genetic screens in mice to discover cancer genes. These screens are most effective when performed using hundreds of mice; however, until recently, the cost-effective isolation and sequencing of insertion...

  19. Global Perspectives on Activated Sludge Community Composition analyzed using 16S rRNA amplicon sequencing

    DEFF Research Database (Denmark)

    Nierychlo, Marta; Saunders, Aaron Marc; Albertsen, Mads

    communities, and in this study activated sludge sampled from 32 Wastewater Treatment Plants (WWTPs) around the world was described and compared. The top abundant bacteria in the global activated sludge ecosystem were found and the core population shared by multiple samples was investigated. The results......Activated sludge is the most commonly applied bioprocess throughout the world for wastewater treatment. Microorganisms are key to the process, yet our knowledge of their identity and function is still limited. High-througput16S rRNA amplicon sequencing can reliably characterize microbial...

  20. Centroid based clustering of high throughput sequencing reads based on n-mer counts.

    Science.gov (United States)

    Solovyov, Alexander; Lipkin, W Ian

    2013-09-08

    Many problems in computational biology require alignment-free sequence comparisons. One of the common tasks involving sequence comparison is sequence clustering. Here we apply methods of alignment-free comparison (in particular, comparison using sequence composition) to the challenge of sequence clustering. We study several centroid based algorithms for clustering sequences based on word counts. Study of their performance shows that using k-means algorithm with or without the data whitening is efficient from the computational point of view. A higher clustering accuracy can be achieved using the soft expectation maximization method, whereby each sequence is attributed to each cluster with a specific probability. We implement an open source tool for alignment-free clustering. It is publicly available from github: https://github.com/luscinius/afcluster. We show the utility of alignment-free sequence clustering for high throughput sequencing analysis despite its limitations. In particular, it allows one to perform assembly with reduced resources and a minimal loss of quality. The major factor affecting performance of alignment-free read clustering is the length of the read.

  1. WebPrInSeS: automated full-length clone sequence identification and verification using high-throughput sequencing data.

    Science.gov (United States)

    Massouras, Andreas; Decouttere, Frederik; Hens, Korneel; Deplancke, Bart

    2010-07-01

    High-throughput sequencing (HTS) is revolutionizing our ability to obtain cheap, fast and reliable sequence information. Many experimental approaches are expected to benefit from the incorporation of such sequencing features in their pipeline. Consequently, software tools that facilitate such an incorporation should be of great interest. In this context, we developed WebPrInSeS, a web server tool allowing automated full-length clone sequence identification and verification using HTS data. WebPrInSeS encompasses two separate software applications. The first is WebPrInSeS-C which performs automated sequence verification of user-defined open-reading frame (ORF) clone libraries. The second is WebPrInSeS-E, which identifies positive hits in cDNA or ORF-based library screening experiments such as yeast one- or two-hybrid assays. Both tools perform de novo assembly using HTS data from any of the three major sequencing platforms. Thus, WebPrInSeS provides a highly integrated, cost-effective and efficient way to sequence-verify or identify clones of interest. WebPrInSeS is available at http://webprinses.epfl.ch/ and is open to all users.

  2. Analysis of high-throughput sequencing and annotation strategies for phage genomes.

    Directory of Open Access Journals (Sweden)

    Matthew R Henn

    Full Text Available BACKGROUND: Bacterial viruses (phages play a critical role in shaping microbial populations as they influence both host mortality and horizontal gene transfer. As such, they have a significant impact on local and global ecosystem function and human health. Despite their importance, little is known about the genomic diversity harbored in phages, as methods to capture complete phage genomes have been hampered by the lack of knowledge about the target genomes, and difficulties in generating sufficient quantities of genomic DNA for sequencing. Of the approximately 550 phage genomes currently available in the public domain, fewer than 5% are marine phage. METHODOLOGY/PRINCIPAL FINDINGS: To advance the study of phage biology through comparative genomic approaches we used marine cyanophage as a model system. We compared DNA preparation methodologies (DNA extraction directly from either phage lysates or CsCl purified phage particles, and sequencing strategies that utilize either Sanger sequencing of a linker amplification shotgun library (LASL or of a whole genome shotgun library (WGSL, or 454 pyrosequencing methods. We demonstrate that genomic DNA sample preparation directly from a phage lysate, combined with 454 pyrosequencing, is best suited for phage genome sequencing at scale, as this method is capable of capturing complete continuous genomes with high accuracy. In addition, we describe an automated annotation informatics pipeline that delivers high-quality annotation and yields few false positives and negatives in ORF calling. CONCLUSIONS/SIGNIFICANCE: These DNA preparation, sequencing and annotation strategies enable a high-throughput approach to the burgeoning field of phage genomics.

  3. Association Study of Gut Flora in Coronary Heart Disease through High-Throughput Sequencing

    OpenAIRE

    Cui, Li; Zhao, Tingting; Hu, Haibing; Zhang, Wen; Hua, Xiuguo

    2017-01-01

    Objectives. We aimed to explore the impact of gut microbiota in coronary heart disease (CHD) patients through high-throughput sequencing. Methods. A total of 29 CHD in-hospital patients and 35 healthy volunteers as controls were included. Nucleic acids were extracted from fecal samples, followed by ? diversity and principal coordinate analysis (PCoA). Based on unweighted UniFrac distance matrices, unweighted-pair group method with arithmetic mean (UPGMA) trees were created. Results. After dat...

  4. High-throughput sequencing of forensic genetic samples using punches of FTA cards with buccal swabs.

    Science.gov (United States)

    Kampmann, Marie-Louise; Buchard, Anders; Børsting, Claus; Morling, Niels

    2016-01-01

    Here, we demonstrate that punches from buccal swab samples preserved on FTA cards can be used for high-throughput DNA sequencing, also known as massively parallel sequencing (MPS). We typed 44 reference samples with the HID-Ion AmpliSeq Identity Panel using washed 1.2 mm punches from FTA cards with buccal swabs and compared the results with those obtained with DNA extracted using the EZ1 DNA Investigator Kit. Concordant profiles were obtained for all samples. Our protocol includes simple punch, wash, and PCR steps, reducing cost and hands-on time in the laboratory. Furthermore, it facilitates automation of DNA sequencing.

  5. Exploring the sources of bacterial spoilers in beefsteaks by culture-independent high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Francesca De Filippis

    Full Text Available Microbial growth on meat to unacceptable levels contributes significantly to change meat structure, color and flavor and to cause meat spoilage. The types of microorganisms initially present in meat depend on several factors and multiple sources of contamination can be identified. The aims of this study were to evaluate the microbial diversity in beefsteaks before and after aerobic storage at 4°C and to investigate the sources of microbial contamination by examining the microbiota of carcasses wherefrom the steaks originated and of the processing environment where the beef was handled. Carcass, environmental (processing plant and meat samples were analyzed by culture-independent high-throughput sequencing of 16S rRNA gene amplicons. The microbiota of carcass swabs was very complex, including more than 600 operational taxonomic units (OTUs belonging to 15 different phyla. A significant association was found between beef microbiota and specific beef cuts (P<0.01 indicating that different cuts of the same carcass can influence the microbial contamination of beef. Despite the initially high complexity of the carcass microbiota, the steaks after aerobic storage at 4°C showed a dramatic decrease in microbial complexity. Pseudomonas sp. and Brochothrix thermosphacta were the main contaminants, and Acinetobacter, Psychrobacter and Enterobacteriaceae were also found. Comparing the relative abundance of OTUs in the different samples it was shown that abundant OTUs in beefsteaks after storage occurred in the corresponding carcass. However, the abundance of these same OTUs clearly increased in environmental samples taken in the processing plant suggesting that spoilage-associated microbial species originate from carcasses, they are carried to the processing environment where the meat is handled and there they become a resident microbiota. Such microbiota is then further spread on meat when it is handled and it represents the starting microbial association

  6. Accurate molecular diagnosis of phenylketonuria and tetrahydrobiopterin-deficient hyperphenylalaninemias using high-throughput targeted sequencing

    Science.gov (United States)

    Trujillano, Daniel; Perez, Belén; González, Justo; Tornador, Cristian; Navarrete, Rosa; Escaramis, Georgia; Ossowski, Stephan; Armengol, Lluís; Cornejo, Verónica; Desviat, Lourdes R; Ugarte, Magdalena; Estivill, Xavier

    2014-01-01

    Genetic diagnostics of phenylketonuria (PKU) and tetrahydrobiopterin (BH4) deficient hyperphenylalaninemia (BH4DH) rely on methods that scan for known mutations or on laborious molecular tools that use Sanger sequencing. We have implemented a novel and much more efficient strategy based on high-throughput multiplex-targeted resequencing of four genes (PAH, GCH1, PTS, and QDPR) that, when affected by loss-of-function mutations, cause PKU and BH4DH. We have validated this approach in a cohort of 95 samples with the previously known PAH, GCH1, PTS, and QDPR mutations and one control sample. Pooled barcoded DNA libraries were enriched using a custom NimbleGen SeqCap EZ Choice array and sequenced using a HiSeq2000 sequencer. The combination of several robust bioinformatics tools allowed us to detect all known pathogenic mutations (point mutations, short insertions/deletions, and large genomic rearrangements) in the 95 samples, without detecting spurious calls in these genes in the control sample. We then used the same capture assay in a discovery cohort of 11 uncharacterized HPA patients using a MiSeq sequencer. In addition, we report the precise characterization of the breakpoints of four genomic rearrangements in PAH, including a novel deletion of 899 bp in intron 3. Our study is a proof-of-principle that high-throughput-targeted resequencing is ready to substitute classical molecular methods to perform differential genetic diagnosis of hyperphenylalaninemias, allowing the establishment of specifically tailored treatments a few days after birth. PMID:23942198

  7. Leveraging the Power of High Performance Computing for Next Generation Sequencing Data Analysis: Tricks and Twists from a High Throughput Exome Workflow

    Science.gov (United States)

    Wonczak, Stephan; Thiele, Holger; Nieroda, Lech; Jabbari, Kamel; Borowski, Stefan; Sinha, Vishal; Gunia, Wilfried; Lang, Ulrich; Achter, Viktor; Nürnberg, Peter

    2015-01-01

    Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. In order to run these analyses fast, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files. PMID:25942438

  8. Fine grained compositional analysis of Port Everglades Inlet microbiome using high throughput DNA sequencing.

    Science.gov (United States)

    O'Connell, Lauren; Gao, Song; McCorquodale, Donald; Fleisher, Jay; Lopez, Jose V

    2018-01-01

    Similar to natural rivers, manmade inlets connect inland runoff to the ocean. Port Everglades Inlet (PEI) is a busy cargo and cruise ship port in South Florida, which can act as a source of pollution to surrounding beaches and offshore coral reefs. Understanding the composition and fluctuations of bacterioplankton communities ("microbiomes") in major port inlets is important due to potential impacts on surrounding environments. We hypothesize seasonal microbial fluctuations, which were profiled by high throughput 16S rRNA amplicon sequencing and analysis. Surface water samples were collected every week for one year. A total of four samples per month, two from each sampling location, were used for statistical analysis creating a high sampling frequency and finer sampling scale than previous inlet microbiome studies. We observed significant differences in community alpha diversity between months and seasons. Analysis of composition of microbiomes (ANCOM) tests were run in QIIME 2 at genus level taxonomic classification to determine which genera were differentially abundant between seasons and months. Beta diversity results yielded significant differences in PEI community composition in regard to month, season, water temperature, and salinity. Analysis of potentially pathogenic genera showed presence of Staphylococcus and Streptococcus . However, statistical analysis indicated that these organisms were not present in significantly high abundances throughout the year or between seasons. Significant differences in alpha diversity were observed when comparing microbial communities with respect to time. This observation stems from the high community evenness and low community richness in August. This indicates that only a few organisms dominated the community during this month. August had lower than average rainfall levels for a wet season, which may have contributed to less runoff, and fewer bacterial groups introduced into the port surface waters. Bacterioplankton beta

  9. Fine grained compositional analysis of Port Everglades Inlet microbiome using high throughput DNA sequencing

    Directory of Open Access Journals (Sweden)

    Lauren O’Connell

    2018-05-01

    Full Text Available Background Similar to natural rivers, manmade inlets connect inland runoff to the ocean. Port Everglades Inlet (PEI is a busy cargo and cruise ship port in South Florida, which can act as a source of pollution to surrounding beaches and offshore coral reefs. Understanding the composition and fluctuations of bacterioplankton communities (“microbiomes” in major port inlets is important due to potential impacts on surrounding environments. We hypothesize seasonal microbial fluctuations, which were profiled by high throughput 16S rRNA amplicon sequencing and analysis. Methods & Results Surface water samples were collected every week for one year. A total of four samples per month, two from each sampling location, were used for statistical analysis creating a high sampling frequency and finer sampling scale than previous inlet microbiome studies. We observed significant differences in community alpha diversity between months and seasons. Analysis of composition of microbiomes (ANCOM tests were run in QIIME 2 at genus level taxonomic classification to determine which genera were differentially abundant between seasons and months. Beta diversity results yielded significant differences in PEI community composition in regard to month, season, water temperature, and salinity. Analysis of potentially pathogenic genera showed presence of Staphylococcus and Streptococcus. However, statistical analysis indicated that these organisms were not present in significantly high abundances throughout the year or between seasons. Discussion Significant differences in alpha diversity were observed when comparing microbial communities with respect to time. This observation stems from the high community evenness and low community richness in August. This indicates that only a few organisms dominated the community during this month. August had lower than average rainfall levels for a wet season, which may have contributed to less runoff, and fewer bacterial groups

  10. High-throughput genome sequencing of two Listeria monocytogenes clinical isolates during a large foodborne outbreak

    Directory of Open Access Journals (Sweden)

    Trout-Yakel Keri M

    2010-02-01

    Full Text Available Abstract Background A large, multi-province outbreak of listeriosis associated with ready-to-eat meat products contaminated with Listeria monocytogenes serotype 1/2a occurred in Canada in 2008. Subtyping of outbreak-associated isolates using pulsed-field gel electrophoresis (PFGE revealed two similar but distinct AscI PFGE patterns. High-throughput pyrosequencing of two L. monocytogenes isolates was used to rapidly provide the genome sequence of the primary outbreak strain and to investigate the extent of genetic diversity associated with a change of a single restriction enzyme fragment during PFGE. Results The chromosomes were collinear, but differences included 28 single nucleotide polymorphisms (SNPs and three indels, including a 33 kbp prophage that accounted for the observed difference in AscI PFGE patterns. The distribution of these traits was assessed within further clinical, environmental and food isolates associated with the outbreak, and this comparison indicated that three distinct, but highly related strains may have been involved in this nationwide outbreak. Notably, these two isolates were found to harbor a 50 kbp putative mobile genomic island encoding translocation and efflux functions that has not been observed in other Listeria genomes. Conclusions High-throughput genome sequencing provided a more detailed real-time assessment of genetic traits characteristic of the outbreak strains than could be achieved with routine subtyping methods. This study confirms that the latest generation of DNA sequencing technologies can be applied during high priority public health events, and laboratories need to prepare for this inevitability and assess how to properly analyze and interpret whole genome sequences in the context of molecular epidemiology.

  11. HTSstation: a web application and open-access libraries for high-throughput sequencing data analysis.

    Science.gov (United States)

    David, Fabrice P A; Delafontaine, Julien; Carat, Solenne; Ross, Frederick J; Lefebvre, Gregory; Jarosz, Yohan; Sinclair, Lucas; Noordermeer, Daan; Rougemont, Jacques; Leleu, Marion

    2014-01-01

    The HTSstation analysis portal is a suite of simple web forms coupled to modular analysis pipelines for various applications of High-Throughput Sequencing including ChIP-seq, RNA-seq, 4C-seq and re-sequencing. HTSstation offers biologists the possibility to rapidly investigate their HTS data using an intuitive web application with heuristically pre-defined parameters. A number of open-source software components have been implemented and can be used to build, configure and run HTS analysis pipelines reactively. Besides, our programming framework empowers developers with the possibility to design their own workflows and integrate additional third-party software. The HTSstation web application is accessible at http://htsstation.epfl.ch.

  12. Environmental microbiology through the lens of high-throughput DNA sequencing: synopsis of current platforms and bioinformatics approaches.

    Science.gov (United States)

    Logares, Ramiro; Haverkamp, Thomas H A; Kumar, Surendra; Lanzén, Anders; Nederbragt, Alexander J; Quince, Christopher; Kauserud, Håvard

    2012-10-01

    The incursion of High-Throughput Sequencing (HTS) in environmental microbiology brings unique opportunities and challenges. HTS now allows a high-resolution exploration of the vast taxonomic and metabolic diversity present in the microbial world, which can provide an exceptional insight on global ecosystem functioning, ecological processes and evolution. This exploration has also economic potential, as we will have access to the evolutionary innovation present in microbial metabolisms, which could be used for biotechnological development. HTS is also challenging the research community, and the current bottleneck is present in the data analysis side. At the moment, researchers are in a sequence data deluge, with sequencing throughput advancing faster than the computer power needed for data analysis. However, new tools and approaches are being developed constantly and the whole process could be depicted as a fast co-evolution between sequencing technology, informatics and microbiologists. In this work, we examine the most popular and recently commercialized HTS platforms as well as bioinformatics methods for data handling and analysis used in microbial metagenomics. This non-exhaustive review is intended to serve as a broad state-of-the-art guide to researchers expanding into this rapidly evolving field. Copyright © 2012 Elsevier B.V. All rights reserved.

  13. High-Throughput DNA sequencing of ancient wood.

    Science.gov (United States)

    Wagner, Stefanie; Lagane, Frédéric; Seguin-Orlando, Andaine; Schubert, Mikkel; Leroy, Thibault; Guichoux, Erwan; Chancerel, Emilie; Bech-Hebelstrup, Inger; Bernard, Vincent; Billard, Cyrille; Billaud, Yves; Bolliger, Matthias; Croutsch, Christophe; Čufar, Katarina; Eynaud, Frédérique; Heussner, Karl Uwe; Köninger, Joachim; Langenegger, Fabien; Leroy, Frédéric; Lima, Christine; Martinelli, Nicoletta; Momber, Garry; Billamboz, André; Nelle, Oliver; Palomo, Antoni; Piqué, Raquel; Ramstein, Marianne; Schweichel, Roswitha; Stäuble, Harald; Tegel, Willy; Terradas, Xavier; Verdin, Florence; Plomion, Christophe; Kremer, Antoine; Orlando, Ludovic

    2018-03-01

    Reconstructing the colonization and demographic dynamics that gave rise to extant forests is essential to forecasts of forest responses to environmental changes. Classical approaches to map how population of trees changed through space and time largely rely on pollen distribution patterns, with only a limited number of studies exploiting DNA molecules preserved in wooden tree archaeological and subfossil remains. Here, we advance such analyses by applying high-throughput (HTS) DNA sequencing to wood archaeological and subfossil material for the first time, using a comprehensive sample of 167 European white oak waterlogged remains spanning a large temporal (from 550 to 9,800 years) and geographical range across Europe. The successful characterization of the endogenous DNA and exogenous microbial DNA of 140 (~83%) samples helped the identification of environmental conditions favouring long-term DNA preservation in wood remains, and started to unveil the first trends in the DNA decay process in wood material. Additionally, the maternally inherited chloroplast haplotypes of 21 samples from three periods of forest human-induced use (Neolithic, Bronze Age and Middle Ages) were found to be consistent with those of modern populations growing in the same geographic areas. Our work paves the way for further studies aiming at using ancient DNA preserved in wood to reconstruct the micro-evolutionary response of trees to climate change and human forest management. © 2018 John Wiley & Sons Ltd.

  14. [Study on Microbial Diversity of Peri-implantitis Subgingival by High-throughput Sequencing].

    Science.gov (United States)

    Li, Zhi-jie; Wang, Shao-guo; Li, Yue-hong; Tu, Dong-xiang; Liu, Shi-yun; Nie, Hong-bing; Li, Zhi-qiang; Zhang, Ju-mei

    2015-07-01

    To study microbial diversity of peri-implantitis subgingival with high-throughput sequencing, and investigate microbiological etiology of peri-implantitis. Subgingival plaques were sampled from the patients with peri-implantitis (D group) and non-peri-implantitis subjects (N group). The microbiological diversity of the subgingival plaques was detected by sequencing V4 region of 16S rRNA with Illumina Miseq platform. The diversity of the community structure was analyzed using Mothur software. A total of 156 507 gene sequences were detected in nine samples and 4 402 operational taxonomic units (OTUs) were found. Selenomonas, Pseudomonas, and Fusobacterium were dominant bacteria in D group, while Fusobacterium, Veillonella and Streptococcus were dominant bacteria in N group. Differences between peri-implantitis and non-peri-implantitis bacterial communities were observed at all phylogenetic levels by LEfSe, which was also found in PcoA test. The occurrence of peri-implantitis is not only related to periodontitis pathogenic microbe, but also related with the changes of oral microbial community structure. Treponema, Herbaspirillum, Butyricimonas and Phaeobacte may be closely related to the occurrence and development of peri-implantitis.

  15. Cytomegalovirus sequence variability, amplicon length, and DNase-sensitive non-encapsidated genomes are obstacles to standardization and commutability of plasma viral load results.

    Science.gov (United States)

    Naegele, Klaudia; Lautenschlager, Irmeli; Gosert, Rainer; Loginov, Raisa; Bir, Katia; Helanterä, Ilkka; Schaub, Stefan; Khanna, Nina; Hirsch, Hans H

    2018-04-22

    Cytomegalovirus (CMV) management post-transplantation relies on quantification in blood, but inter-laboratory and inter-assay variability impairs commutability. An international multicenter study demonstrated that variability is mitigated by standardizing plasma volumes, automating DNA extraction and amplification, and calibration to the 1st-CMV-WHO-International-Standard as in the FDA-approved Roche-CAP/CTM-CMV. However, Roche-CAP/CTM-CMV showed under-quantification and false-negative results in a quality assurance program (UK-NEQAS-2014). To evaluate factors contributing to quantification variability of CMV viral load and to develop optimized CMV-UL54-QNAT. The UL54 target of the UK-NEQAS-2014 variant was sequenced and compared to 329 available CMV GenBank sequences. Four Basel-CMV-UL54-QNAT assays of 361 bp, 254 bp, 151 bp, and 95 bp amplicons were developed that only differed in reverse primer positions. The assays were validated using plasmid dilutions, UK-NEQAS-2014 sample, as well as 107 frozen and 69 prospectively collected plasma samples from transplant patients submitted for CMV QNAT, with and without DNase-digestion prior to nucleic acid extraction. Eight of 43 mutations were identified as relevant in the UK-NEQAS-2014 target. All Basel-CMV-UL54 QNATs quantified the UK-NEQAS-2014 but revealed 10-fold increasing CMV loads as amplicon size decreased. The inverse correlation of amplicon size and viral loads was confirmed using 1st-WHO-International-Standard and patient samples. DNase pre-treatment reduced plasma CMV loads by >90% indicating the presence of unprotected CMV genomic DNA. Sequence variability, amplicon length, and non-encapsidated genomes obstruct standardization and commutability of CMV loads needed to develop thresholds for clinical research and management. Besides regular sequence surveys, matrix and extraction standardization, we propose developing reference calibrators using 100 bp amplicons. Copyright © 2018 Elsevier B.V. All

  16. SNP calling using genotype model selection on high-throughput sequencing data

    KAUST Repository

    You, Na

    2012-01-16

    Motivation: A review of the available single nucleotide polymorphism (SNP) calling procedures for Illumina high-throughput sequencing (HTS) platform data reveals that most rely mainly on base-calling and mapping qualities as sources of error when calling SNPs. Thus, errors not involved in base-calling or alignment, such as those in genomic sample preparation, are not accounted for.Results: A novel method of consensus and SNP calling, Genotype Model Selection (GeMS), is given which accounts for the errors that occur during the preparation of the genomic sample. Simulations and real data analyses indicate that GeMS has the best performance balance of sensitivity and positive predictive value among the tested SNP callers. © The Author 2012. Published by Oxford University Press. All rights reserved.

  17. Amplicon sequencing of bacterial microbiota in abortion material from cattle.

    Science.gov (United States)

    Vidal, Sara; Kegler, Kristel; Posthaus, Horst; Perreten, Vincent; Rodriguez-Campos, Sabrina

    2017-10-10

    Abortions in cattle have a significant economic impact on animal husbandry and require prompt diagnosis for surveillance of epizootic infectious agents. Since most abortions are not epizootic but sporadic with often undetected etiologies, this study examined the bacterial community present in the placenta (PL, n = 32) and fetal abomasal content (AC, n = 49) in 64 cases of bovine abortion by next generation sequencing (NGS) of the 16S rRNA gene. The PL and AC from three fetuses of dams that died from non-infectious reasons were included as controls. All samples were analyzed by bacterial culture, and 17 were examined by histopathology. We observed 922 OTUs overall and 267 taxa at the genus level. No detectable bacterial DNA was present in the control samples. The microbial profiles of the PL and AC differed significantly, both in their composition (PERMANOVA), species richness and Chao-1 (Mann-Whitney test). In both organs, Pseudomonas was the most abundant genus. The combination of NGS and culture identified opportunistic pathogens of interest in placentas with lesions, such as Vibrio metschnikovii, Streptococcus uberis, Lactococcus lactis and Escherichia coli. In placentas with lesions where culturing was unsuccessful, Pseudomonas and unidentified Aeromonadaceae were identified by NGS displaying high number of reads. Three cases with multiple possible etiologies and placentas presenting lesions were detected by NGS. Amplicon sequencing has the potential to uncover unknown etiological agents. These new insights on cattle abortion extend our focus to previously understudied opportunistic abortive bacteria.

  18. Experimental design-based functional mining and characterization of high-throughput sequencing data in the sequence read archive.

    Directory of Open Access Journals (Sweden)

    Takeru Nakazato

    Full Text Available High-throughput sequencing technology, also called next-generation sequencing (NGS, has the potential to revolutionize the whole process of genome sequencing, transcriptomics, and epigenetics. Sequencing data is captured in a public primary data archive, the Sequence Read Archive (SRA. As of January 2013, data from more than 14,000 projects have been submitted to SRA, which is double that of the previous year. Researchers can download raw sequence data from SRA website to perform further analyses and to compare with their own data. However, it is extremely difficult to search entries and download raw sequences of interests with SRA because the data structure is complicated, and experimental conditions along with raw sequences are partly described in natural language. Additionally, some sequences are of inconsistent quality because anyone can submit sequencing data to SRA with no quality check. Therefore, as a criterion of data quality, we focused on SRA entries that were cited in journal articles. We extracted SRA IDs and PubMed IDs (PMIDs from SRA and full-text versions of journal articles and retrieved 2748 SRA ID-PMID pairs. We constructed a publication list referring to SRA entries. Since, one of the main themes of -omics analyses is clarification of disease mechanisms, we also characterized SRA entries by disease keywords, according to the Medical Subject Headings (MeSH extracted from articles assigned to each SRA entry. We obtained 989 SRA ID-MeSH disease term pairs, and constructed a disease list referring to SRA data. We previously developed feature profiles of diseases in a system called "Gendoo". We generated hyperlinks between diseases extracted from SRA and the feature profiles of it. The developed project, publication and disease lists resulting from this study are available at our web service, called "DBCLS SRA" (http://sra.dbcls.jp/. This service will improve accessibility to high-quality data from SRA.

  19. Galaxy Workflows for Web-based Bioinformatics Analysis of Aptamer High-throughput Sequencing Data

    Directory of Open Access Journals (Sweden)

    William H Thiel

    2016-01-01

    Full Text Available Development of RNA and DNA aptamers for diagnostic and therapeutic applications is a rapidly growing field. Aptamers are identified through iterative rounds of selection in a process termed SELEX (Systematic Evolution of Ligands by EXponential enrichment. High-throughput sequencing (HTS revolutionized the modern SELEX process by identifying millions of aptamer sequences across multiple rounds of aptamer selection. However, these vast aptamer HTS datasets necessitated bioinformatics techniques. Herein, we describe a semiautomated approach to analyze aptamer HTS datasets using the Galaxy Project, a web-based open source collection of bioinformatics tools that were originally developed to analyze genome, exome, and transcriptome HTS data. Using a series of Workflows created in the Galaxy webserver, we demonstrate efficient processing of aptamer HTS data and compilation of a database of unique aptamer sequences. Additional Workflows were created to characterize the abundance and persistence of aptamer sequences within a selection and to filter sequences based on these parameters. A key advantage of this approach is that the online nature of the Galaxy webserver and its graphical interface allow for the analysis of HTS data without the need to compile code or install multiple programs.

  20. A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection.

    Science.gov (United States)

    Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike; Khan, Arifa S

    2018-01-01

    Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2, with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publically available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank. IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have

  1. High-throughput sequencing of three Lemnoideae (duckweeds chloroplast genomes from total DNA.

    Directory of Open Access Journals (Sweden)

    Wenqin Wang

    Full Text Available BACKGROUND: Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaption for aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily of Lemnoideae represents such a collection of aquatic species that because of photosynthesis represents one of the fastest growing plant species on earth. METHODS: We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. CONCLUSIONS: This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. Over 1,000-time coverage of chloroplast from total DNA were reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution, abundant deletions and insertions occurred in non-coding regions of these genomes, indicating a greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Noticeably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power.

  2. Polymorphism discovery and allele frequency estimation using high-throughput DNA sequencing of target-enriched pooled DNA samples

    Directory of Open Access Journals (Sweden)

    Mullen Michael P

    2012-01-01

    Full Text Available Abstract Background The central role of the somatotrophic axis in animal post-natal growth, development and fertility is well established. Therefore, the identification of genetic variants affecting quantitative traits within this axis is an attractive goal. However, large sample numbers are a pre-requisite for the identification of genetic variants underlying complex traits and although technologies are improving rapidly, high-throughput sequencing of large numbers of complete individual genomes remains prohibitively expensive. Therefore using a pooled DNA approach coupled with target enrichment and high-throughput sequencing, the aim of this study was to identify polymorphisms and estimate allele frequency differences across 83 candidate genes of the somatotrophic axis, in 150 Holstein-Friesian dairy bulls divided into two groups divergent for genetic merit for fertility. Results In total, 4,135 SNPs and 893 indels were identified during the resequencing of the 83 candidate genes. Nineteen percent (n = 952 of variants were located within 5' and 3' UTRs. Seventy-two percent (n = 3,612 were intronic and 9% (n = 464 were exonic, including 65 indels and 236 SNPs resulting in non-synonymous substitutions (NSS. Significant (P ® MassARRAY. No significant differences (P > 0.1 were observed between the two methods for any of the 43 SNPs across both pools (i.e., 86 tests in total. Conclusions The results of the current study support previous findings of the use of DNA sample pooling and high-throughput sequencing as a viable strategy for polymorphism discovery and allele frequency estimation. Using this approach we have characterised the genetic variation within genes of the somatotrophic axis and related pathways, central to mammalian post-natal growth and development and subsequent lactogenesis and fertility. We have identified a large number of variants segregating at significantly different frequencies between cattle groups divergent for calving

  3. Detecting DNA double-stranded breaks in mammalian genomes by linear amplification-mediated high-throughput genome-wide translocation sequencing.

    Science.gov (United States)

    Hu, Jiazhi; Meyers, Robin M; Dong, Junchao; Panchakshari, Rohit A; Alt, Frederick W; Frock, Richard L

    2016-05-01

    Unbiased, high-throughput assays for detecting and quantifying DNA double-stranded breaks (DSBs) across the genome in mammalian cells will facilitate basic studies of the mechanisms that generate and repair endogenous DSBs. They will also enable more applied studies, such as those to evaluate the on- and off-target activities of engineered nucleases. Here we describe a linear amplification-mediated high-throughput genome-wide sequencing (LAM-HTGTS) method for the detection of genome-wide 'prey' DSBs via their translocation in cultured mammalian cells to a fixed 'bait' DSB. Bait-prey junctions are cloned directly from isolated genomic DNA using LAM-PCR and unidirectionally ligated to bridge adapters; subsequent PCR steps amplify the single-stranded DNA junction library in preparation for Illumina Miseq paired-end sequencing. A custom bioinformatics pipeline identifies prey sequences that contribute to junctions and maps them across the genome. LAM-HTGTS differs from related approaches because it detects a wide range of broken end structures with nucleotide-level resolution. Familiarity with nucleic acid methods and next-generation sequencing analysis is necessary for library generation and data interpretation. LAM-HTGTS assays are sensitive, reproducible, relatively inexpensive, scalable and straightforward to implement with a turnaround time of <1 week.

  4. The high throughput biomedicine unit at the institute for molecular medicine Finland: high throughput screening meets precision medicine.

    Science.gov (United States)

    Pietiainen, Vilja; Saarela, Jani; von Schantz, Carina; Turunen, Laura; Ostling, Paivi; Wennerberg, Krister

    2014-05-01

    The High Throughput Biomedicine (HTB) unit at the Institute for Molecular Medicine Finland FIMM was established in 2010 to serve as a national and international academic screening unit providing access to state of the art instrumentation for chemical and RNAi-based high throughput screening. The initial focus of the unit was multiwell plate based chemical screening and high content microarray-based siRNA screening. However, over the first four years of operation, the unit has moved to a more flexible service platform where both chemical and siRNA screening is performed at different scales primarily in multiwell plate-based assays with a wide range of readout possibilities with a focus on ultraminiaturization to allow for affordable screening for the academic users. In addition to high throughput screening, the equipment of the unit is also used to support miniaturized, multiplexed and high throughput applications for other types of research such as genomics, sequencing and biobanking operations. Importantly, with the translational research goals at FIMM, an increasing part of the operations at the HTB unit is being focused on high throughput systems biological platforms for functional profiling of patient cells in personalized and precision medicine projects.

  5. Hi-Plex for Simple, Accurate, and Cost-Effective Amplicon-based Targeted DNA Sequencing.

    Science.gov (United States)

    Pope, Bernard J; Hammet, Fleur; Nguyen-Dumont, Tu; Park, Daniel J

    2018-01-01

    Hi-Plex is a suite of methods to enable simple, accurate, and cost-effective highly multiplex PCR-based targeted sequencing (Nguyen-Dumont et al., Biotechniques 58:33-36, 2015). At its core is the principle of using gene-specific primers (GSPs) to "seed" (or target) the reaction and universal primers to "drive" the majority of the reaction. In this manner, effects on amplification efficiencies across the target amplicons can, to a large extent, be restricted to early seeding cycles. Product sizes are defined within a relatively narrow range to enable high-specificity size selection, replication uniformity across target sites (including in the context of fragmented input DNA such as that derived from fixed tumor specimens (Nguyen-Dumont et al., Biotechniques 55:69-74, 2013; Nguyen-Dumont et al., Anal Biochem 470:48-51, 2015), and application of high-specificity genetic variant calling algorithms (Pope et al., Source Code Biol Med 9:3, 2014; Park et al., BMC Bioinformatics 17:165, 2016). Hi-Plex offers a streamlined workflow that is suitable for testing large numbers of specimens without the need for automation.

  6. Tracking TCRβ sequence clonotype expansions during antiviral therapy using high-throughput sequencing of the hypervariable region

    Directory of Open Access Journals (Sweden)

    Mark W Robinson

    2016-04-01

    Full Text Available To maintain a persistent infection viruses such as hepatitis C virus (HCV employ a range of mechanisms that subvert protective T cell responses. The suppression of antigen-specific T cell responses by HCV hinders efforts to profile T cell responses during chronic infection and antiviral therapy. Conventional methods of detecting antigen-specific T cells utilise either antigen stimulation (e.g. ELISpot, proliferation assays, cytokine production or antigen-loaded tetramer staining. This limits the ability to profile T cell responses during chronic infection due to suppressed effector function and the requirement for prior knowledge of antigenic viral peptide sequences. Recently high-throughput sequencing (HTS technologies have been developed for the analysis of T cell repertoires. In the present study we have assessed the feasibility of HTS of the TCRβ complementarity determining region (CDR3 to track T cell expansions in an antigen-independent manner. Using sequential blood samples from HCV-infected individuals undergoing anti-viral therapy we were able to measure the population frequencies of >35,000 TCRβ sequence clonotypes in each individual over the course of 12 weeks. TRBV/TRBJ gene segment usage varied markedly between individuals but remained relatively constant within individuals across the course of therapy. Despite this stable TRBV/TRBJ gene segment usage, a number of TCRβ sequence clonotypes showed dramatic changes in read frequency. These changes could not be linked to therapy outcomes in the present study however the TCRβ CDR3 sequences with the largest fold changes did include sequences with identical TRBV/TRBJ gene segment usage and high joining region homology to previously published CDR3 sequences from HCV-specific T cells targeting the HLA-B*0801-restricted 1395HSKKKCDEL1403 and HLA-A*0101–restricted 1435ATDALMTGY1443 epitopes. The pipeline developed in this proof of concept study provides a platform for the design of

  7. Barcoding the food chain: from Sanger to high-throughput sequencing.

    Science.gov (United States)

    Littlefair, Joanne E; Clare, Elizabeth L

    2016-11-01

    Society faces the complex challenge of supporting biodiversity and ecosystem functioning, while ensuring food security by providing safe traceable food through an ever-more-complex global food chain. The increase in human mobility brings the added threat of pests, parasites, and invaders that further complicate our agro-industrial efforts. DNA barcoding technologies allow researchers to identify both individual species, and, when combined with universal primers and high-throughput sequencing techniques, the diversity within mixed samples (metabarcoding). These tools are already being employed to detect market substitutions, trace pests through the forensic evaluation of trace "environmental DNA", and to track parasitic infections in livestock. The potential of DNA barcoding to contribute to increased security of the food chain is clear, but challenges remain in regulation and the need for validation of experimental analysis. Here, we present an overview of the current uses and challenges of applied DNA barcoding in agriculture, from agro-ecosystems within farmland to the kitchen table.

  8. Digital PCR provides sensitive and absolute calibration for high throughput sequencing

    Directory of Open Access Journals (Sweden)

    Fan H Christina

    2009-03-01

    Full Text Available Abstract Background Next-generation DNA sequencing on the 454, Solexa, and SOLiD platforms requires absolute calibration of the number of molecules to be sequenced. This requirement has two unfavorable consequences. First, large amounts of sample-typically micrograms-are needed for library preparation, thereby limiting the scope of samples which can be sequenced. For many applications, including metagenomics and the sequencing of ancient, forensic, and clinical samples, the quantity of input DNA can be critically limiting. Second, each library requires a titration sequencing run, thereby increasing the cost and lowering the throughput of sequencing. Results We demonstrate the use of digital PCR to accurately quantify 454 and Solexa sequencing libraries, enabling the preparation of sequencing libraries from nanogram quantities of input material while eliminating costly and time-consuming titration runs of the sequencer. We successfully sequenced low-nanogram scale bacterial and mammalian DNA samples on the 454 FLX and Solexa DNA sequencing platforms. This study is the first to definitively demonstrate the successful sequencing of picogram quantities of input DNA on the 454 platform, reducing the sample requirement more than 1000-fold without pre-amplification and the associated bias and reduction in library depth. Conclusion The digital PCR assay allows absolute quantification of sequencing libraries, eliminates uncertainties associated with the construction and application of standard curves to PCR-based quantification, and with a coefficient of variation close to 10%, is sufficiently precise to enable direct sequencing without titration runs.

  9. Reliable Detection of Herpes Simplex Virus Sequence Variation by High-Throughput Resequencing.

    Science.gov (United States)

    Morse, Alison M; Calabro, Kaitlyn R; Fear, Justin M; Bloom, David C; McIntyre, Lauren M

    2017-08-16

    High-throughput sequencing (HTS) has resulted in data for a number of herpes simplex virus (HSV) laboratory strains and clinical isolates. The knowledge of these sequences has been critical for investigating viral pathogenicity. However, the assembly of complete herpesviral genomes, including HSV, is complicated due to the existence of large repeat regions and arrays of smaller reiterated sequences that are commonly found in these genomes. In addition, the inherent genetic variation in populations of isolates for viruses and other microorganisms presents an additional challenge to many existing HTS sequence assembly pipelines. Here, we evaluate two approaches for the identification of genetic variants in HSV1 strains using Illumina short read sequencing data. The first, a reference-based approach, identifies variants from reads aligned to a reference sequence and the second, a de novo assembly approach, identifies variants from reads aligned to de novo assembled consensus sequences. Of critical importance for both approaches is the reduction in the number of low complexity regions through the construction of a non-redundant reference genome. We compared variants identified in the two methods. Our results indicate that approximately 85% of variants are identified regardless of the approach. The reference-based approach to variant discovery captures an additional 15% representing variants divergent from the HSV1 reference possibly due to viral passage. Reference-based approaches are significantly less labor-intensive and identify variants across the genome where de novo assembly-based approaches are limited to regions where contigs have been successfully assembled. In addition, regions of poor quality assembly can lead to false variant identification in de novo consensus sequences. For viruses with a well-assembled reference genome, a reference-based approach is recommended.

  10. Massively parallel amplicon sequencing reveals isotype-specific variability of antimicrobial peptide transcripts in Mytilus galloprovincialis.

    Directory of Open Access Journals (Sweden)

    Umberto Rosani

    Full Text Available BACKGROUND: Effective innate responses against potential pathogens are essential in the living world and possibly contributed to the evolutionary success of invertebrates. Taken together, antimicrobial peptide (AMP precursors of defensin, mytilin, myticin and mytimycin can represent about 40% of the hemocyte transcriptome in mussels injected with viral-like and bacterial preparations, and unique profiles of myticin C variants are expressed in single mussels. Based on amplicon pyrosequencing, we have ascertained and compared the natural and Vibrio-induced diversity of AMP transcripts in mussel hemocytes from three European regions. METHODOLOGY/PRINCIPAL FINDINGS: Hemolymph was collected from mussels farmed in the coastal regions of Palavas (France, Vigo (Spain and Venice (Italy. To represent the AMP families known in M. galloprovincialis, nine transcript sequences have been selected, amplified from hemocyte RNA and subjected to pyrosequencing. Hemolymph from farmed (offshore and wild (lagoon Venice mussels, both injected with 10(7 Vibrio cells, were similarly processed. Amplicon pyrosequencing emphasized the AMP transcript diversity, with Single Nucleotide Changes (SNC minimal for mytilin B/C and maximal for arthropod-like defensin and myticin C. Ratio of non-synonymous vs. synonymous changes also greatly differed between AMP isotypes. Overall, each amplicon revealed similar levels of nucleotidic variation across geographical regions, with two main sequence patterns confirmed for mytimycin and no substantial changes after immunostimulation. CONCLUSIONS/SIGNIFICANCE: Barcoding and bidirectional pyrosequencing allowed us to map and compare the transcript diversity of known mussel AMPs. Though most of the genuine cds variation was common to the analyzed samples we could estimate from 9 to 106 peptide variants in hemolymph pools representing 100 mussels, depending on the AMP isoform and sampling site. In this study, no prevailing SNC patterns related

  11. Enhanced throughput for infrared automated DNA sequencing

    Science.gov (United States)

    Middendorf, Lyle R.; Gartside, Bill O.; Humphrey, Pat G.; Roemer, Stephen C.; Sorensen, David R.; Steffens, David L.; Sutter, Scott L.

    1995-04-01

    Several enhancements have been developed and applied to infrared automated DNA sequencing resulting in significantly higher throughput. A 41 cm sequencing gel (31 cm well- to-read distance) combines high resolution of DNA sequencing fragments with optimized run times yielding two runs per day of 500 bases per sample. A 66 cm sequencing gel (56 cm well-to-read distance) produces sequence read lengths of up to 1000 bases for ds and ss templates using either T7 polymerase or cycle-sequencing protocols. Using a multichannel syringe to load 64 lanes allows 16 samples (compatible with 96-well format) to be visualized for each run. The 41 cm gel configuration allows 16,000 bases per day (16 samples X 500 bases/sample X 2 ten hour runs/day) to be sequenced with the advantages of infrared technology. Enhancements to internal labeling techniques using an infrared-labeled dATP molecule (Boehringer Mannheim GmbH, Penzberg, Germany; Sequenase (U.S. Biochemical) have also been made. The inclusion of glycerol in the sequencing reactions yields greatly improved results for some primer and template combinations. The inclusion of (alpha) -Thio-dNTP's in the labeling reaction increases signal intensity two- to three-fold.

  12. A performance evaluation of Nextera XT and KAPA HyperPlus for rapid Illumina library preparation of long-range mitogenome amplicons.

    Science.gov (United States)

    Ring, Joseph D; Sturk-Andreaggi, Kimberly; Peck, Michelle A; Marshall, Charla

    2017-07-01

    Next-generation sequencing (NGS) facilitates the rapid and high-throughput generation of human mitochondrial genome (mitogenome) data to build population and reference databases for forensic comparisons. To this end, long-range amplification provides an effective method of target enrichment that is amenable to library preparation assays employing DNA fragmentation. This study compared the Nextera XT DNA Library Preparation Kit (Illumina, San Diego, CA) and the KAPA HyperPlus Library Preparation Kit (Kapa Biosystems, Wilmington, MA) for enzymatic fragmentation and indexing of ∼8500bp mitogenome amplicons for Illumina sequencing. The Nextera XT libraries produced low-coverage regions that were consistent across all samples, while the HyperPlus libraries resulted in uniformly high coverage across the mitogenome, even with reduced-volume reaction conditions. The balanced coverage observed from KAPA HyperPlus libraries enables not only low-level variant calling across the mitogenome but also increased sample multiplexing for greater processing efficiency. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. Characterization of the indigenous microflora in raw and pasteurized buffalo milk during storage at refrigeration temperature by high-throughput sequencing

    Science.gov (United States)

    The effect of refrigeration on bacterial communities within raw and pasteurized buffalo milk was studied using high-throughput sequencing. High quality samples of raw buffalo milk were obtained from five dairy farms in the Guangxi province of China. A sample of each milk was pasteurized, and both r...

  14. Pyicos: a versatile toolkit for the analysis of high-throughput sequencing data.

    Science.gov (United States)

    Althammer, Sonja; González-Vallinas, Juan; Ballaré, Cecilia; Beato, Miguel; Eyras, Eduardo

    2011-12-15

    High-throughput sequencing (HTS) has revolutionized gene regulation studies and is now fundamental for the detection of protein-DNA and protein-RNA binding, as well as for measuring RNA expression. With increasing variety and sequencing depth of HTS datasets, the need for more flexible and memory-efficient tools to analyse them is growing. We describe Pyicos, a powerful toolkit for the analysis of mapped reads from diverse HTS experiments: ChIP-Seq, either punctuated or broad signals, CLIP-Seq and RNA-Seq. We prove the effectiveness of Pyicos to select for significant signals and show that its accuracy is comparable and sometimes superior to that of methods specifically designed for each particular type of experiment. Pyicos facilitates the analysis of a variety of HTS datatypes through its flexibility and memory efficiency, providing a useful framework for data integration into models of regulatory genomics. Open-source software, with tutorials and protocol files, is available at http://regulatorygenomics.upf.edu/pyicos or as a Galaxy server at http://regulatorygenomics.upf.edu/galaxy eduardo.eyras@upf.edu Supplementary data are available at Bioinformatics online.

  15. Bacterial Pathogens and Community Composition in Advanced Sewage Treatment Systems Revealed by Metagenomics Analysis Based on High-Throughput Sequencing

    Science.gov (United States)

    Lu, Xin; Zhang, Xu-Xiang; Wang, Zhu; Huang, Kailong; Wang, Yuan; Liang, Weigang; Tan, Yunfei; Liu, Bo; Tang, Junying

    2015-01-01

    This study used 454 pyrosequencing, Illumina high-throughput sequencing and metagenomic analysis to investigate bacterial pathogens and their potential virulence in a sewage treatment plant (STP) applying both conventional and advanced treatment processes. Pyrosequencing and Illumina sequencing consistently demonstrated that Arcobacter genus occupied over 43.42% of total abundance of potential pathogens in the STP. At species level, potential pathogens Arcobacter butzleri, Aeromonas hydrophila and Klebsiella pneumonia dominated in raw sewage, which was also confirmed by quantitative real time PCR. Illumina sequencing also revealed prevalence of various types of pathogenicity islands and virulence proteins in the STP. Most of the potential pathogens and virulence factors were eliminated in the STP, and the removal efficiency mainly depended on oxidation ditch. Compared with sand filtration, magnetic resin seemed to have higher removals in most of the potential pathogens and virulence factors. However, presence of the residual A. butzleri in the final effluent still deserves more concerns. The findings indicate that sewage acts as an important source of environmental pathogens, but STPs can effectively control their spread in the environment. Joint use of the high-throughput sequencing technologies is considered a reliable method for deep and comprehensive overview of environmental bacterial virulence. PMID:25938416

  16. Illumina MiSeq 16S amplicon sequence analysis of bovine respiratory disease associated bacteria in lung and mediastinal lymph node tissue.

    Science.gov (United States)

    Johnston, Dayle; Earley, Bernadette; Cormican, Paul; Murray, Gerard; Kenny, David Anthony; Waters, Sinead Mary; McGee, Mark; Kelly, Alan Kieran; McCabe, Matthew Sean

    2017-05-02

    Bovine respiratory disease (BRD) is caused by growth of single or multiple species of pathogenic bacteria in lung tissue following stress and/or viral infection. Next generation sequencing of 16S ribosomal RNA gene PCR amplicons (NGS 16S amplicon analysis) is a powerful culture-independent open reference method that has recently been used to increase understanding of BRD-associated bacteria in the upper respiratory tract of BRD cattle. However, it has not yet been used to examine the microbiome of the bovine lower respiratory tract. The objective of this study was to use NGS 16S amplicon analysis to identify bacteria in post-mortem lung and lymph node tissue samples harvested from fatal BRD cases and clinically healthy animals. Cranial lobe and corresponding mediastinal lymph node post-mortem tissue samples were collected from calves diagnosed as BRD cases by veterinary laboratory pathologists and from clinically healthy calves. NGS 16S amplicon libraries, targeting the V3-V4 region of the bacterial 16S rRNA gene were prepared and sequenced on an Illumina MiSeq. Quantitative insights into microbial ecology (QIIME) was used to determine operational taxonomic units (OTUs) which corresponded to the 16S rRNA gene sequences. Leptotrichiaceae, Mycoplasma, Pasteurellaceae, and Fusobacterium were the most abundant OTUs identified in the lungs and lymph nodes of the calves which died from BRD. Leptotrichiaceae, Fusobacterium, Mycoplasma, Trueperella and Bacteroides had greater relative abundances in post-mortem lung samples collected from fatal cases of BRD in dairy calves, compared with clinically healthy calves without lung lesions. Leptotrichiaceae, Mycoplasma and Pasteurellaceae showed higher relative abundances in post-mortem lymph node samples collected from fatal cases of BRD in dairy calves, compared with clinically healthy calves without lung lesions. Two Leptotrichiaceae sequence contigs were subsequently assembled from bacterial DNA-enriched shotgun sequences

  17. BOOGIE: Predicting Blood Groups from High Throughput Sequencing Data.

    Science.gov (United States)

    Giollo, Manuel; Minervini, Giovanni; Scalzotto, Marta; Leonardi, Emanuela; Ferrari, Carlo; Tosatto, Silvio C E

    2015-01-01

    Over the last decade, we have witnessed an incredible growth in the amount of available genotype data due to high throughput sequencing (HTS) techniques. This information may be used to predict phenotypes of medical relevance, and pave the way towards personalized medicine. Blood phenotypes (e.g. ABO and Rh) are a purely genetic trait that has been extensively studied for decades, with currently over thirty known blood groups. Given the public availability of blood group data, it is of interest to predict these phenotypes from HTS data which may translate into more accurate blood typing in clinical practice. Here we propose BOOGIE, a fast predictor for the inference of blood groups from single nucleotide variant (SNV) databases. We focus on the prediction of thirty blood groups ranging from the well known ABO and Rh, to the less studied Junior or Diego. BOOGIE correctly predicted the blood group with 94% accuracy for the Personal Genome Project whole genome profiles where good quality SNV annotation was available. Additionally, our tool produces a high quality haplotype phase, which is of interest in the context of ethnicity-specific polymorphisms or traits. The versatility and simplicity of the analysis make it easily interpretable and allow easy extension of the protocol towards other phenotypes. BOOGIE can be downloaded from URL http://protein.bio.unipd.it/download/.

  18. A high-throughput multiplex method adapted for GMO detection.

    Science.gov (United States)

    Chaouachi, Maher; Chupeau, Gaëlle; Berard, Aurélie; McKhann, Heather; Romaniuk, Marcel; Giancola, Sandra; Laval, Valérie; Bertheau, Yves; Brunel, Dominique

    2008-12-24

    A high-throughput multiplex assay for the detection of genetically modified organisms (GMO) was developed on the basis of the existing SNPlex method designed for SNP genotyping. This SNPlex assay allows the simultaneous detection of up to 48 short DNA sequences (approximately 70 bp; "signature sequences") from taxa endogenous reference genes, from GMO constructions, screening targets, construct-specific, and event-specific targets, and finally from donor organisms. This assay avoids certain shortcomings of multiplex PCR-based methods already in widespread use for GMO detection. The assay demonstrated high specificity and sensitivity. The results suggest that this assay is reliable, flexible, and cost- and time-effective for high-throughput GMO detection.

  19. High throughput protein production screening

    Science.gov (United States)

    Beernink, Peter T [Walnut Creek, CA; Coleman, Matthew A [Oakland, CA; Segelke, Brent W [San Ramon, CA

    2009-09-08

    Methods, compositions, and kits for the cell-free production and analysis of proteins are provided. The invention allows for the production of proteins from prokaryotic sequences or eukaryotic sequences, including human cDNAs using PCR and IVT methods and detecting the proteins through fluorescence or immunoblot techniques. This invention can be used to identify optimized PCR and WT conditions, codon usages and mutations. The methods are readily automated and can be used for high throughput analysis of protein expression levels, interactions, and functional states.

  20. Targeted amplicon sequencing (TAS): a scalable next-gen approach to multilocus, multitaxa phylogenetics.

    Science.gov (United States)

    Bybee, Seth M; Bracken-Grissom, Heather; Haynes, Benjamin D; Hermansen, Russell A; Byers, Robert L; Clement, Mark J; Udall, Joshua A; Wilcox, Edward R; Crandall, Keith A

    2011-01-01

    Next-gen sequencing technologies have revolutionized data collection in genetic studies and advanced genome biology to novel frontiers. However, to date, next-gen technologies have been used principally for whole genome sequencing and transcriptome sequencing. Yet many questions in population genetics and systematics rely on sequencing specific genes of known function or diversity levels. Here, we describe a targeted amplicon sequencing (TAS) approach capitalizing on next-gen capacity to sequence large numbers of targeted gene regions from a large number of samples. Our TAS approach is easily scalable, simple in execution, neither time-nor labor-intensive, relatively inexpensive, and can be applied to a broad diversity of organisms and/or genes. Our TAS approach includes a bioinformatic application, BarcodeCrucher, to take raw next-gen sequence reads and perform quality control checks and convert the data into FASTA format organized by gene and sample, ready for phylogenetic analyses. We demonstrate our approach by sequencing targeted genes of known phylogenetic utility to estimate a phylogeny for the Pancrustacea. We generated data from 44 taxa using 68 different 10-bp multiplexing identifiers. The overall quality of data produced was robust and was informative for phylogeny estimation. The potential for this method to produce copious amounts of data from a single 454 plate (e.g., 325 taxa for 24 loci) significantly reduces sequencing expenses incurred from traditional Sanger sequencing. We further discuss the advantages and disadvantages of this method, while offering suggestions to enhance the approach.

  1. Bulk segregant analysis by high-throughput sequencing reveals a novel xylose utilization gene from Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Jared W Wenger

    2010-05-01

    Full Text Available Fermentation of xylose is a fundamental requirement for the efficient production of ethanol from lignocellulosic biomass sources. Although they aggressively ferment hexoses, it has long been thought that native Saccharomyces cerevisiae strains cannot grow fermentatively or non-fermentatively on xylose. Population surveys have uncovered a few naturally occurring strains that are weakly xylose-positive, and some S. cerevisiae have been genetically engineered to ferment xylose, but no strain, either natural or engineered, has yet been reported to ferment xylose as efficiently as glucose. Here, we used a medium-throughput screen to identify Saccharomyces strains that can increase in optical density when xylose is presented as the sole carbon source. We identified 38 strains that have this xylose utilization phenotype, including strains of S. cerevisiae, other sensu stricto members, and hybrids between them. All the S. cerevisiae xylose-utilizing strains we identified are wine yeasts, and for those that could produce meiotic progeny, the xylose phenotype segregates as a single gene trait. We mapped this gene by Bulk Segregant Analysis (BSA using tiling microarrays and high-throughput sequencing. The gene is a putative xylitol dehydrogenase, which we name XDH1, and is located in the subtelomeric region of the right end of chromosome XV in a region not present in the S288c reference genome. We further characterized the xylose phenotype by performing gene expression microarrays and by genetically dissecting the endogenous Saccharomyces xylose pathway. We have demonstrated that natural S. cerevisiae yeasts are capable of utilizing xylose as the sole carbon source, characterized the genetic basis for this trait as well as the endogenous xylose utilization pathway, and demonstrated the feasibility of BSA using high-throughput sequencing.

  2. Identification of antigen-specific human monoclonal antibodies using high-throughput sequencing of the antibody repertoire.

    Science.gov (United States)

    Liu, Ju; Li, Ruihua; Liu, Kun; Li, Liangliang; Zai, Xiaodong; Chi, Xiangyang; Fu, Ling; Xu, Junjie; Chen, Wei

    2016-04-22

    High-throughput sequencing of the antibody repertoire provides a large number of antibody variable region sequences that can be used to generate human monoclonal antibodies. However, current screening methods for identifying antigen-specific antibodies are inefficient. In the present study, we developed an antibody clone screening strategy based on clone dynamics and relative frequency, and used it to identify antigen-specific human monoclonal antibodies. Enzyme-linked immunosorbent assay showed that at least 52% of putative positive immunoglobulin heavy chains composed antigen-specific antibodies. Combining information on dynamics and relative frequency improved identification of positive clones and elimination of negative clones. and increase the credibility of putative positive clones. Therefore the screening strategy could simplify the subsequent experimental screening and may facilitate the generation of antigen-specific antibodies. Copyright © 2016 Elsevier Inc. All rights reserved.

  3. High-Throughput Sequencing, a VersatileWeapon to Support Genome-Based Diagnosis in Infectious Diseases: Applications to Clinical Bacteriology

    Directory of Open Access Journals (Sweden)

    Ségolène Caboche

    2014-04-01

    Full Text Available The recent progresses of high-throughput sequencing (HTS technologies enable easy and cost-reduced access to whole genome sequencing (WGS or re-sequencing. HTS associated with adapted, automatic and fast bioinformatics solutions for sequencing applications promises an accurate and timely identification and characterization of pathogenic agents. Many studies have demonstrated that data obtained from HTS analysis have allowed genome-based diagnosis, which has been consistent with phenotypic observations. These proofs of concept are probably the first steps toward the future of clinical microbiology. From concept to routine use, many parameters need to be considered to promote HTS as a powerful tool to help physicians and clinicians in microbiological investigations. This review highlights the milestones to be completed toward this purpose.

  4. Amplicon-based semiconductor sequencing of human exomes: performance evaluation and optimization strategies.

    Science.gov (United States)

    Damiati, E; Borsani, G; Giacopuzzi, Edoardo

    2016-05-01

    The Ion Proton platform allows to perform whole exome sequencing (WES) at low cost, providing rapid turnaround time and great flexibility. Products for WES on Ion Proton system include the AmpliSeq Exome kit and the recently introduced HiQ sequencing chemistry. Here, we used gold standard variants from GIAB consortium to assess the performances in variants identification, characterize the erroneous calls and develop a filtering strategy to reduce false positives. The AmpliSeq Exome kit captures a large fraction of bases (>94 %) in human CDS, ClinVar genes and ACMG genes, but with 2,041 (7 %), 449 (13 %) and 11 (19 %) genes not fully represented, respectively. Overall, 515 protein coding genes contain hard-to-sequence regions, including 90 genes from ClinVar. Performance in variants detection was maximum at mean coverage >120×, while at 90× and 70× we measured a loss of variants of 3.2 and 4.5 %, respectively. WES using HiQ chemistry showed ~71/97.5 % sensitivity, ~37/2 % FDR and ~0.66/0.98 F1 score for indels and SNPs, respectively. The proposed low, medium or high-stringency filters reduced the amount of false positives by 10.2, 21.2 and 40.4 % for indels and 21.2, 41.9 and 68.2 % for SNP, respectively. Amplicon-based WES on Ion Proton platform using HiQ chemistry emerged as a competitive approach, with improved accuracy in variants identification. False-positive variants remain an issue for the Ion Torrent technology, but our filtering strategy can be applied to reduce erroneous variants.

  5. Fixing Formalin: A Method to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed Museum Specimens Using High-Throughput Sequencing.

    Directory of Open Access Journals (Sweden)

    Sarah M Hykin

    Full Text Available For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles, attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens-particularly for use in phylogenetic analyses-has been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp. We suspect this could be due either to the conditions of preservation and/or the amount of tissue used for extraction purposes. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens

  6. Fixing Formalin: A Method to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed Museum Specimens Using High-Throughput Sequencing.

    Science.gov (United States)

    Hykin, Sarah M; Bi, Ke; McGuire, Jimmy A

    2015-01-01

    For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens-particularly for use in phylogenetic analyses-has been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due either to the conditions of preservation and/or the amount of tissue used for extraction purposes. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens available for

  7. glbase: a framework for combining, analyzing and displaying heterogeneous genomic and high-throughput sequencing data

    Directory of Open Access Journals (Sweden)

    Andrew Paul Hutchins

    2014-01-01

    Full Text Available Genomic datasets and the tools to analyze them have proliferated at an astonishing rate. However, such tools are often poorly integrated with each other: each program typically produces its own custom output in a variety of non-standard file formats. Here we present glbase, a framework that uses a flexible set of descriptors that can quickly parse non-binary data files. glbase includes many functions to intersect two lists of data, including operations on genomic interval data and support for the efficient random access to huge genomic data files. Many glbase functions can produce graphical outputs, including scatter plots, heatmaps, boxplots and other common analytical displays of high-throughput data such as RNA-seq, ChIP-seq and microarray expression data. glbase is designed to rapidly bring biological data into a Python-based analytical environment to facilitate analysis and data processing. In summary, glbase is a flexible and multifunctional toolkit that allows the combination and analysis of high-throughput data (especially next-generation sequencing and genome-wide data, and which has been instrumental in the analysis of complex data sets. glbase is freely available at http://bitbucket.org/oaxiom/glbase/.

  8. glbase: a framework for combining, analyzing and displaying heterogeneous genomic and high-throughput sequencing data.

    Science.gov (United States)

    Hutchins, Andrew Paul; Jauch, Ralf; Dyla, Mateusz; Miranda-Saavedra, Diego

    2014-01-01

    Genomic datasets and the tools to analyze them have proliferated at an astonishing rate. However, such tools are often poorly integrated with each other: each program typically produces its own custom output in a variety of non-standard file formats. Here we present glbase, a framework that uses a flexible set of descriptors that can quickly parse non-binary data files. glbase includes many functions to intersect two lists of data, including operations on genomic interval data and support for the efficient random access to huge genomic data files. Many glbase functions can produce graphical outputs, including scatter plots, heatmaps, boxplots and other common analytical displays of high-throughput data such as RNA-seq, ChIP-seq and microarray expression data. glbase is designed to rapidly bring biological data into a Python-based analytical environment to facilitate analysis and data processing. In summary, glbase is a flexible and multifunctional toolkit that allows the combination and analysis of high-throughput data (especially next-generation sequencing and genome-wide data), and which has been instrumental in the analysis of complex data sets. glbase is freely available at http://bitbucket.org/oaxiom/glbase/.

  9. Direct metagenomic detection of viral pathogens in nasal and fecal specimens using an unbiased high-throughput sequencing approach.

    Directory of Open Access Journals (Sweden)

    Shota Nakamura

    Full Text Available With the severe acute respiratory syndrome epidemic of 2003 and renewed attention on avian influenza viral pandemics, new surveillance systems are needed for the earlier detection of emerging infectious diseases. We applied a "next-generation" parallel sequencing platform for viral detection in nasopharyngeal and fecal samples collected during seasonal influenza virus (Flu infections and norovirus outbreaks from 2005 to 2007 in Osaka, Japan. Random RT-PCR was performed to amplify RNA extracted from 0.1-0.25 ml of nasopharyngeal aspirates (N = 3 and fecal specimens (N = 5, and more than 10 microg of cDNA was synthesized. Unbiased high-throughput sequencing of these 8 samples yielded 15,298-32,335 (average 24,738 reads in a single 7.5 h run. In nasopharyngeal samples, although whole genome analysis was not available because the majority (>90% of reads were host genome-derived, 20-460 Flu-reads were detected, which was sufficient for subtype identification. In fecal samples, bacteria and host cells were removed by centrifugation, resulting in gain of 484-15,260 reads of norovirus sequence (78-98% of the whole genome was covered, except for one specimen that was under-detectable by RT-PCR. These results suggest that our unbiased high-throughput sequencing approach is useful for directly detecting pathogenic viruses without advance genetic information. Although its cost and technological availability make it unlikely that this system will very soon be the diagnostic standard worldwide, this system could be useful for the earlier discovery of novel emerging viruses and bioterrorism, which are difficult to detect with conventional procedures.

  10. High throughput sequencing identifies chilling responsive genes in sweetpotato (Ipomoea batatas Lam.) during storage.

    Science.gov (United States)

    Xie, Zeyi; Zhou, Zhilin; Li, Hongmin; Yu, Jingjing; Jiang, Jiaojiao; Tang, Zhonghou; Ma, Daifu; Zhang, Baohong; Han, Yonghua; Li, Zongyun

    2018-05-21

    Sweetpotato (Ipomoea batatas L.) is a globally important economic food crop. It belongs to Convolvulaceae family and origins in the tropics; however, sweetpotato is sensitive to cold stress during storage. In this study, we performed transcriptome sequencing to investigate the sweetpotato response to chilling stress during storage. A total of 110,110 unigenes were generated via high-throughput sequencing. Differentially expressed genes (DEGs) analysis showed that 18,681 genes were up-regulated and 21,983 genes were down-regulated in low temperature condition. Many DEGs were related to the cell membrane system, antioxidant enzymes, carbohydrate metabolism, and hormone metabolism, which are potentially associated with sweetpotato resistance to low temperature. The existence of DEGs suggests a molecular basis for the biochemical and physiological consequences of sweetpotato in low temperature storage conditions. Our analysis will provide a new target for enhancement of sweetpotato cold stress tolerance in postharvest storage through genetic manipulation. Copyright © 2018. Published by Elsevier Inc.

  11. Improving High-Throughput Sequencing Approaches for Reconstructing the Evolutionary Dynamics of Upper Paleolithic Human Groups

    DEFF Research Database (Denmark)

    Seguin-Orlando, Andaine

    the development and testing of innovative molecular approaches aiming at improving the amount of informative HTS data one can recover from ancient DNA extracts. We have characterized important ligation and amplification biases in the sequencing library building and enrichment steps, which can impede further...... been mainly driven by the development of High-Throughput DNA Sequencing (HTS) technologies but also by the implementation of novel molecular tools tailored to the manipulation of ultra short and damaged DNA molecules. Our ability to retrieve traces of genetic material has tremendously improved, pushing......, that impact on the overall efficacy of the method. In a second part, we implemented some of these molecular tools to the processing of five Upper Paleolithic human samples from the Kostenki and Sunghir sites in Western Eurasia, in order to reconstruct the deep genomic history of European populations...

  12. Polymerase chain reaction-hybridization method using urease gene sequences for high-throughput Ureaplasma urealyticum and Ureaplasma parvum detection and differentiation.

    Science.gov (United States)

    Xu, Chen; Zhang, Nan; Huo, Qianyu; Chen, Minghui; Wang, Rengfeng; Liu, Zhili; Li, Xue; Liu, Yunde; Bao, Huijing

    2016-04-15

    In this article, we discuss the polymerase chain reaction (PCR)-hybridization assay that we developed for high-throughput simultaneous detection and differentiation of Ureaplasma urealyticum and Ureaplasma parvum using one set of primers and two specific DNA probes based on urease gene nucleotide sequence differences. First, U. urealyticum and U. parvum DNA samples were specifically amplified using one set of biotin-labeled primers. Furthermore, amine-modified DNA probes, which can specifically react with U. urealyticum or U. parvum DNA, were covalently immobilized to a DNA-BIND plate surface. The plate was then incubated with the PCR products to facilitate sequence-specific DNA binding. Horseradish peroxidase-streptavidin conjugation and a colorimetric assay were used. Based on the results, the PCR-hybridization assay we developed can specifically differentiate U. urealyticum and U. parvum with high sensitivity (95%) compared with cultivation (72.5%). Hence, this study demonstrates a new method for high-throughput simultaneous differentiation and detection of U. urealyticum and U. parvum with high sensitivity. Based on these observations, the PCR-hybridization assay developed in this study is ideal for detecting and discriminating U. urealyticum and U. parvum in clinical applications. Copyright © 2016 Elsevier Inc. All rights reserved.

  13. A massive parallel sequencing workflow for diagnostic genetic testing of mismatch repair genes

    Science.gov (United States)

    Hansen, Maren F; Neckmann, Ulrike; Lavik, Liss A S; Vold, Trine; Gilde, Bodil; Toft, Ragnhild K; Sjursen, Wenche

    2014-01-01

    The purpose of this study was to develop a massive parallel sequencing (MPS) workflow for diagnostic analysis of mismatch repair (MMR) genes using the GS Junior system (Roche). A pathogenic variant in one of four MMR genes, (MLH1, PMS2, MSH6, and MSH2), is the cause of Lynch Syndrome (LS), which mainly predispose to colorectal cancer. We used an amplicon-based sequencing method allowing specific and preferential amplification of the MMR genes including PMS2, of which several pseudogenes exist. The amplicons were pooled at different ratios to obtain coverage uniformity and maximize the throughput of a single-GS Junior run. In total, 60 previously identified and distinct variants (substitutions and indels), were sequenced by MPS and successfully detected. The heterozygote detection range was from 19% to 63% and dependent on sequence context and coverage. We were able to distinguish between false-positive and true-positive calls in homopolymeric regions by cross-sample comparison and evaluation of flow signal distributions. In addition, we filtered variants according to a predefined status, which facilitated variant annotation. Our study shows that implementation of MPS in routine diagnostics of LS can accelerate sample throughput and reduce costs without compromising sensitivity, compared to Sanger sequencing. PMID:24689082

  14. SSR_pipeline--computer software for the identification of microsatellite sequences from paired-end Illumina high-throughput DNA sequence data

    Science.gov (United States)

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (SSRs; for example, microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains three analysis modules along with a fourth control module that can be used to automate analyses of large volumes of data. The modules are used to (1) identify the subset of paired-end sequences that pass quality standards, (2) align paired-end reads into a single composite DNA sequence, and (3) identify sequences that possess microsatellites conforming to user specified parameters. Each of the three separate analysis modules also can be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc). All modules are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, Windows). The program suite relies on a compiled Python extension module to perform paired-end alignments. Instructions for compiling the extension from source code are provided in the documentation. Users who do not have Python installed on their computers or who do not have the ability to compile software also may choose to download packaged executable files. These files include all Python scripts, a copy of the compiled extension module, and a minimal installation of Python in a single binary executable. See program documentation for more information.

  15. High throughput resistance profiling of Plasmodium falciparum infections based on custom dual indexing and Illumina next generation sequencing-technology

    DEFF Research Database (Denmark)

    Nag, Sidsel; Dalgaard, Marlene Danner; Kofoed, Poul-Erik

    2017-01-01

    Genetic polymorphisms in P. falciparum can be used to indicate the parasite's susceptibility to antimalarial drugs as well as its geographical origin. Both of these factors are key to monitoring development and spread of antimalarial drug resistance. In this study, we combine multiplex PCR, custom...... designed dual indexing and Miseq sequencing for high throughput SNP-profiling of 457 malaria infections from Guinea-Bissau, at the cost of 10 USD per sample. By amplifying and sequencing 15 genetic fragments, we cover 20 resistance-conferring SNPs occurring in pfcrt, pfmdr1, pfdhfr, pfdhps, as well...

  16. Integrated analysis of RNA-binding protein complexes using in vitro selection and high-throughput sequencing and sequence specificity landscapes (SEQRS).

    Science.gov (United States)

    Lou, Tzu-Fang; Weidmann, Chase A; Killingsworth, Jordan; Tanaka Hall, Traci M; Goldstrohm, Aaron C; Campbell, Zachary T

    2017-04-15

    RNA-binding proteins (RBPs) collaborate to control virtually every aspect of RNA function. Tremendous progress has been made in the area of global assessment of RBP specificity using next-generation sequencing approaches both in vivo and in vitro. Understanding how protein-protein interactions enable precise combinatorial regulation of RNA remains a significant problem. Addressing this challenge requires tools that can quantitatively determine the specificities of both individual proteins and multimeric complexes in an unbiased and comprehensive way. One approach utilizes in vitro selection, high-throughput sequencing, and sequence-specificity landscapes (SEQRS). We outline a SEQRS experiment focused on obtaining the specificity of a multi-protein complex between Drosophila RBPs Pumilio (Pum) and Nanos (Nos). We discuss the necessary controls in this type of experiment and examine how the resulting data can be complemented with structural and cell-based reporter assays. Additionally, SEQRS data can be integrated with functional genomics data to uncover biological function. Finally, we propose extensions of the technique that will enhance our understanding of multi-protein regulatory complexes assembled onto RNA. Copyright © 2016 Elsevier Inc. All rights reserved.

  17. The main challenges that remain in applying high-throughput sequencing to clinical diagnostics.

    Science.gov (United States)

    Loeffelholz, Michael; Fofanov, Yuriy

    2015-01-01

    Over the last 10 years, the quality, price and availability of high-throughput sequencing instruments have improved to the point that this technology may be close to becoming a routine tool in the diagnostic microbiology laboratory. Two groups of challenges, however, have to be resolved in order to move this powerful research technology into routine use in the clinical microbiology laboratory. The computational/bioinformatics challenges include data storage cost and privacy concerns, requiring analysis to be performed without access to cloud storage or expensive computational infrastructure. The logistical challenges include interpretation of complex results and acceptance and understanding of the advantages and limitations of this technology by the medical community. This article focuses on the approaches to address these challenges, such as file formats, algorithms, data collection, reporting and good laboratory practices.

  18. Spatially conserved regulatory elements identified within human and mouse Cd247 gene using high-throughput sequencing data from the ENCODE project

    DEFF Research Database (Denmark)

    Pundhir, Sachin; Hannibal, Tine Dahlbæk; Bang-Berthelsen, Claus Heiner

    2014-01-01

    . In this study, we have utilized the wealth of high-throughput sequencing data produced during the Encyclopedia of DNA Elements (ENCODE) project to identify spatially conserved regulatory elements within the Cd247 gene from human and mouse. We show the presence of two transcription factor binding sites...

  19. Evolutionary dynamics of an expressed MHC class IIβ locus in the Ranidae (Anura) uncovered by genome walking and high-throughput amplicon sequencing

    Science.gov (United States)

    Mulder, Kevin P.; Cortazar-Chinarro, Maria; Harris, D. James; Crottini, Angelica; Grant, Evan H. Campbell; Fleischer, Robert C.; Savage, Anna E.

    2017-01-01

    The Major Histocompatibility Complex (MHC) is a genomic region encoding immune loci that are important and frequently used markers in studies of adaptive genetic variation and disease resistance. Given the primary role of infectious diseases in contributing to global amphibian declines, we characterized the hypervariable exon 2 and flanking introns of the MHC Class IIβ chain for 17 species of frogs in the Ranidae, a speciose and cosmopolitan family facing widespread pathogen infections and declines. We find high levels of genetic variation concentrated in the Peptide Binding Region (PBR) of the exon. Ten codons are under positive selection, nine of which are located in the mammal-defined PBR. We hypothesize that the tenth codon (residue 21) is an amphibian-specific PBR site that may be important in disease resistance. Trans-species and trans-generic polymorphisms are evident from exon-based genealogies, and co-phylogenetic analyses between intron, exon and mitochondrial based reconstructions reveal incongruent topologies, likely due to different locus histories. We developed two sets of barcoded adapters that reliably amplify a single and likely functional locus in all screened species using both 454 and Illumina based sequencing methods. These primers provide a resource for multiplexing and directly sequencing hundreds of samples in a single sequencing run, avoiding the labour and chimeric sequences associated with cloning, and enabling MHC population genetic analyses. Although the primers are currently limited to the 17 species we tested, these sequences and protocols provide a useful genetic resource and can serve as a starting point for future disease, adaptation and conservation studies across a range of anuran taxa.

  20. Fine mapping of a Phytophthora-resistance gene RpsWY in soybean (Glycine max L.) by high-throughput genome-wide sequencing.

    Science.gov (United States)

    Cheng, Yanbo; Ma, Qibin; Ren, Hailong; Xia, Qiuju; Song, Enliang; Tan, Zhiyuan; Li, Shuxian; Zhang, Gengyun; Nian, Hai

    2017-05-01

    Using a combination of phenotypic screening, genetic and statistical analyses, and high-throughput genome-wide sequencing, we have finely mapped a dominant Phytophthora resistance gene in soybean cultivar Wayao. Phytophthora root rot (PRR) caused by Phytophthora sojae is one of the most important soil-borne diseases in many soybean-production regions in the world. Identification of resistant gene(s) and incorporating them into elite varieties are an effective way for breeding to prevent soybean from being harmed by this disease. Two soybean populations of 191 F 2 individuals and 196 F 7:8 recombinant inbred lines (RILs) were developed to map Rps gene by crossing a susceptible cultivar Huachun 2 with the resistant cultivar Wayao. Genetic analysis of the F 2 population indicated that PRR resistance in Wayao was controlled by a single dominant gene, temporarily named RpsWY, which was mapped on chromosome 3. A high-density genetic linkage bin map was constructed using 3469 recombination bins of the RILs to explore the candidate genes by the high-throughput genome-wide sequencing. The results of genotypic analysis showed that the RpsWY gene was located in bin 401 between 4466230 and 4502773 bp on chromosome 3 through line 71 and 100 of the RILs. Four predicted genes (Glyma03g04350, Glyma03g04360, Glyma03g04370, and Glyma03g04380) were found at the narrowed region of 36.5 kb in bin 401. These results suggest that the high-throughput genome-wide resequencing is an effective method to fine map PRR candidate genes.

  1. High-throughput sequencing of the T cell receptor β gene identifies aggressive early-stage mycosis fungoides.

    Science.gov (United States)

    de Masson, Adele; O'Malley, John T; Elco, Christopher P; Garcia, Sarah S; Divito, Sherrie J; Lowry, Elizabeth L; Tawa, Marianne; Fisher, David C; Devlin, Phillip M; Teague, Jessica E; Leboeuf, Nicole R; Kirsch, Ilan R; Robins, Harlan; Clark, Rachael A; Kupper, Thomas S

    2018-05-09

    Mycosis fungoides (MF), the most common cutaneous T cell lymphoma (CTCL) is a malignancy of skin-tropic memory T cells. Most MF cases present as early stage (stage I A/B, limited to the skin), and these patients typically have a chronic, indolent clinical course. However, a small subset of early-stage cases develop progressive and fatal disease. Because outcomes can be so different, early identification of this high-risk population is an urgent unmet clinical need. We evaluated the use of next-generation high-throughput DNA sequencing of the T cell receptor β gene ( TCRB ) in lesional skin biopsies to predict progression and survival in a discovery cohort of 208 patients with CTCL (177 with MF) from a 15-year longitudinal observational clinical study. We compared these data to the results in an independent validation cohort of 101 CTCL patients (87 with MF). The tumor clone frequency (TCF) in lesional skin, measured by high-throughput sequencing of the TCRB gene, was an independent prognostic factor of both progression-free and overall survival in patients with CTCL and MF in particular. In early-stage patients, a TCF of >25% in the skin was a stronger predictor of progression than any other established prognostic factor (stage IB versus IA, presence of plaques, high blood lactate dehydrogenase concentration, large-cell transformation, or age). The TCF therefore may accurately predict disease progression in early-stage MF. Early identification of patients at high risk for progression could help identify candidates who may benefit from allogeneic hematopoietic stem cell transplantation before their disease becomes treatment-refractory. Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.

  2. A high throughput DNA extraction method with high yield and quality

    Directory of Open Access Journals (Sweden)

    Xin Zhanguo

    2012-07-01

    Full Text Available Abstract Background Preparation of large quantity and high quality genomic DNA from a large number of plant samples is a major bottleneck for most genetic and genomic analyses, such as, genetic mapping, TILLING (Targeting Induced Local Lesion IN Genome, and next-generation sequencing directly from sheared genomic DNA. A variety of DNA preparation methods and commercial kits are available. However, they are either low throughput, low yield, or costly. Here, we describe a method for high throughput genomic DNA isolation from sorghum [Sorghum bicolor (L. Moench] leaves and dry seeds with high yield, high quality, and affordable cost. Results We developed a high throughput DNA isolation method by combining a high yield CTAB extraction method with an improved cleanup procedure based on MagAttract kit. The method yielded large quantity and high quality DNA from both lyophilized sorghum leaves and dry seeds. The DNA yield was improved by nearly 30 fold with 4 times less consumption of MagAttract beads. The method can also be used in other plant species, including cotton leaves and pine needles. Conclusion A high throughput system for DNA extraction from sorghum leaves and seeds was developed and validated. The main advantages of the method are low cost, high yield, high quality, and high throughput. One person can process two 96-well plates in a working day at a cost of $0.10 per sample of magnetic beads plus other consumables that other methods will also need.

  3. LncRNA Expression Profile of Human Thoracic Aortic Dissection by High-Throughput Sequencing.

    Science.gov (United States)

    Sun, Jie; Chen, Guojun; Jing, Yuanwen; He, Xiang; Dong, Jianting; Zheng, Junmeng; Zou, Meisheng; Li, Hairui; Wang, Shifei; Sun, Yili; Liao, Wangjun; Liao, Yulin; Feng, Li; Bin, Jianping

    2018-01-01

    In this study, the long non-coding RNA (lncRNA) expression profile in human thoracic aortic dissection (TAD), a highly lethal cardiovascular disease, was investigated. Human TAD (n=3) and normal aortic tissues (NA) (n=3) were examined by high-throughput sequencing. Bioinformatics analyses were performed to predict the roles of aberrantly expressed lncRNAs. Quantitative real-time polymerase chain reaction (qRT-PCR) was applied to validate the results. A total of 269 lncRNAs (159 up-regulated and 110 down-regulated) and 2, 255 mRNAs (1 294 up-regulated and 961 down-regulated) were aberrantly expressed in human TAD (fold-change> 1.5, PTAD than in NA. The predicted binding motifs of three up-regulated lncRNAs (ENSG00000248508, ENSG00000226530, and EG00000259719) were correlated with up-regulated RUNX1 (R=0.982, PTAD. These findings suggest that lncRNAs are novel potential therapeutic targets for human TAD. © 2018 The Author(s). Published by S. Karger AG, Basel.

  4. Evolution of blue-flowered species of genus Linum based on high-throughput sequencing of ribosomal RNA genes.

    Science.gov (United States)

    Bolsheva, Nadezhda L; Melnikova, Nataliya V; Kirov, Ilya V; Speranskaya, Anna S; Krinitsina, Anastasia A; Dmitriev, Alexey A; Belenikin, Maxim S; Krasnov, George S; Lakunina, Valentina A; Snezhkina, Anastasiya V; Rozhmina, Tatiana A; Samatadze, Tatiana E; Yurkevich, Olga Yu; Zoshchuk, Svyatoslav A; Amosova, Аlexandra V; Kudryavtseva, Anna V; Muravenko, Olga V

    2017-12-28

    The species relationships within the genus Linum have already been studied several times by means of different molecular and phylogenetic approaches. Nevertheless, a number of ambiguities in phylogeny of Linum still remain unresolved. In particular, the species relationships within the sections Stellerolinum and Dasylinum need further clarification. Also, the question of independence of the species of the section Adenolinum still remains unanswered. Moreover, the relationships of L. narbonense and other species of the section Linum require further clarification. Additionally, the origin of tetraploid species of the section Linum (2n = 30) including the cultivated species L. usitatissimum has not been explored. The present study examines the phylogeny of blue-flowered species of Linum by comparisons of 5S rRNA gene sequences as well as ITS1 and ITS2 sequences of 35S rRNA genes. High-throughput sequencing has been used for analysis of multicopy rRNA gene families. In addition to the molecular phylogenetic analysis, the number and chromosomal localization of 5S and 35S rDNA sites has been determined by FISH. Our findings confirm that L. stelleroides forms a basal branch from the clade of blue-flowered flaxes which is independent of the branch formed by species of the sect. Dasylinum. The current molecular phylogenetic approaches, the cytogenetic analysis as well as different genomic DNA fingerprinting methods applied previously did not discriminate certain species within the sect. Adenolinum. The allotetraploid cultivated species L. usitatissimum and its wild ancestor L. angustifolium (2n = 30) could originate either as the result of hybridization of two diploid species (2n = 16) related to the modern L. gandiflorum and L. decumbens, or hybridization of a diploid species (2n = 16) and a diploid ancestor of modern L. narbonense (2n = 14). High-throughput sequencing of multicopy rRNA gene families allowed us to make several adjustments to the

  5. CSReport: A New Computational Tool Designed for Automatic Analysis of Class Switch Recombination Junctions Sequenced by High-Throughput Sequencing.

    Science.gov (United States)

    Boyer, François; Boutouil, Hend; Dalloul, Iman; Dalloul, Zeinab; Cook-Moreau, Jeanne; Aldigier, Jean-Claude; Carrion, Claire; Herve, Bastien; Scaon, Erwan; Cogné, Michel; Péron, Sophie

    2017-05-15

    B cells ensure humoral immune responses due to the production of Ag-specific memory B cells and Ab-secreting plasma cells. In secondary lymphoid organs, Ag-driven B cell activation induces terminal maturation and Ig isotype class switch (class switch recombination [CSR]). CSR creates a virtually unique IgH locus in every B cell clone by intrachromosomal recombination between two switch (S) regions upstream of each C region gene. Amount and structural features of CSR junctions reveal valuable information about the CSR mechanism, and analysis of CSR junctions is useful in basic and clinical research studies of B cell functions. To provide an automated tool able to analyze large data sets of CSR junction sequences produced by high-throughput sequencing (HTS), we designed CSReport, a software program dedicated to support analysis of CSR recombination junctions sequenced with a HTS-based protocol (Ion Torrent technology). CSReport was assessed using simulated data sets of CSR junctions and then used for analysis of Sμ-Sα and Sμ-Sγ1 junctions from CH12F3 cells and primary murine B cells, respectively. CSReport identifies junction segment breakpoints on reference sequences and junction structure (blunt-ended junctions or junctions with insertions or microhomology). Besides the ability to analyze unprecedentedly large libraries of junction sequences, CSReport will provide a unified framework for CSR junction studies. Our results show that CSReport is an accurate tool for analysis of sequences from our HTS-based protocol for CSR junctions, thereby facilitating and accelerating their study. Copyright © 2017 by The American Association of Immunologists, Inc.

  6. BioVLAB-MMIA-NGS: microRNA-mRNA integrated analysis using high-throughput sequencing data.

    Science.gov (United States)

    Chae, Heejoon; Rhee, Sungmin; Nephew, Kenneth P; Kim, Sun

    2015-01-15

    It is now well established that microRNAs (miRNAs) play a critical role in regulating gene expression in a sequence-specific manner, and genome-wide efforts are underway to predict known and novel miRNA targets. However, the integrated miRNA-mRNA analysis remains a major computational challenge, requiring powerful informatics systems and bioinformatics expertise. The objective of this study was to modify our widely recognized Web server for the integrated mRNA-miRNA analysis (MMIA) and its subsequent deployment on the Amazon cloud (BioVLAB-MMIA) to be compatible with high-throughput platforms, including next-generation sequencing (NGS) data (e.g. RNA-seq). We developed a new version called the BioVLAB-MMIA-NGS, deployed on both Amazon cloud and on a high-performance publicly available server called MAHA. By using NGS data and integrating various bioinformatics tools and databases, BioVLAB-MMIA-NGS offers several advantages. First, sequencing data is more accurate than array-based methods for determining miRNA expression levels. Second, potential novel miRNAs can be detected by using various computational methods for characterizing miRNAs. Third, because miRNA-mediated gene regulation is due to hybridization of an miRNA to its target mRNA, sequencing data can be used to identify many-to-many relationship between miRNAs and target genes with high accuracy. http://epigenomics.snu.ac.kr/biovlab_mmia_ngs/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  7. Ultra-deep pyrosequencing (UDPS data treatment to study amplicon HCV minor variants.

    Directory of Open Access Journals (Sweden)

    Josep Gregori

    Full Text Available We have investigated the reliability and reproducibility of HCV viral quasispecies quantification by ultra-deep pyrosequencing (UDPS methods. Our study has been divided in two parts. First of all, by UDPS sequencing of clone mixes samples we have established the global noise level of UDPS and fine tuned a data treatment workflow previously optimized for HBV sequence analysis. Secondly, we have studied the reproducibility of the methodology by comparing 5 amplicons from two patient samples on three massive sequencing platforms (FLX+, FLX and Junior after applying the error filters developed from the clonal/control study. After noise filtering the UDPS results, the three replicates showed the same 12 polymorphic sites above 0.7%, with a mean CV of 4.86%. Two polymorphic sites below 0.6% were identified by two replicates and one replicate respectively. A total of 25, 23 and 26 haplotypes were detected by GS-Junior, GS-FLX and GS-FLX+. The observed CVs for the normalized Shannon entropy (Sn, the mutation frequency (Mf, and the nucleotidic diversity (Pi were 1.46%, 3.96% and 3.78%. The mean absolute difference in the two patients (5 amplicons each, in the GS-FLX and GS-FLX+, were 1.46%, 3.96% and 3.78% for Sn, Mf and Pi. No false polymorphic site was observed above 0.5%. Our results indicate that UDPS is an optimal alternative to molecular cloning for quantitative study of HCV viral quasispecies populations, both in complexity and composition. We propose an UDPS data treatment workflow for amplicons from the RNA viral quasispecies which, at a sequencing depth of at least 10,000 reads per strand, enables to obtain sequences and frequencies of consensus haplotypes above 0.5% abundance with no erroneous mutations, with high confidence, resistant mutants as minor variants at the level of 1%, with high confidence that variants are not missed, and highly confident measures of quasispecies complexity.

  8. Canary: an atomic pipeline for clinical amplicon assays.

    Science.gov (United States)

    Doig, Kenneth D; Ellul, Jason; Fellowes, Andrew; Thompson, Ella R; Ryland, Georgina; Blombery, Piers; Papenfuss, Anthony T; Fox, Stephen B

    2017-12-15

    High throughput sequencing requires bioinformatics pipelines to process large volumes of data into meaningful variants that can be translated into a clinical report. These pipelines often suffer from a number of shortcomings: they lack robustness and have many components written in multiple languages, each with a variety of resource requirements. Pipeline components must be linked together with a workflow system to achieve the processing of FASTQ files through to a VCF file of variants. Crafting these pipelines requires considerable bioinformatics and IT skills beyond the reach of many clinical laboratories. Here we present Canary, a single program that can be run on a laptop, which takes FASTQ files from amplicon assays through to an annotated VCF file ready for clinical analysis. Canary can be installed and run with a single command using Docker containerization or run as a single JAR file on a wide range of platforms. Although it is a single utility, Canary performs all the functions present in more complex and unwieldy pipelines. All variants identified by Canary are 3' shifted and represented in their most parsimonious form to provide a consistent nomenclature, irrespective of sequencing variation. Further, proximate in-phase variants are represented as a single HGVS 'delins' variant. This allows for correct nomenclature and consequences to be ascribed to complex multi-nucleotide polymorphisms (MNPs), which are otherwise difficult to represent and interpret. Variants can also be annotated with hundreds of attributes sourced from MyVariant.info to give up to date details on pathogenicity, population statistics and in-silico predictors. Canary has been used at the Peter MacCallum Cancer Centre in Melbourne for the last 2 years for the processing of clinical sequencing data. By encapsulating clinical features in a single, easily installed executable, Canary makes sequencing more accessible to all pathology laboratories. Canary is available for download as source

  9. Swarm v2: highly-scalable and high-resolution amplicon clustering.

    Science.gov (United States)

    Mahé, Frédéric; Rognes, Torbjørn; Quince, Christopher; de Vargas, Colomban; Dunthorn, Micah

    2015-01-01

    Previously we presented Swarm v1, a novel and open source amplicon clustering program that produced fine-scale molecular operational taxonomic units (OTUs), free of arbitrary global clustering thresholds and input-order dependency. Swarm v1 worked with an initial phase that used iterative single-linkage with a local clustering threshold (d), followed by a phase that used the internal abundance structures of clusters to break chained OTUs. Here we present Swarm v2, which has two important novel features: (1) a new algorithm for d = 1 that allows the computation time of the program to scale linearly with increasing amounts of data; and (2) the new fastidious option that reduces under-grouping by grafting low abundant OTUs (e.g., singletons and doubletons) onto larger ones. Swarm v2 also directly integrates the clustering and breaking phases, dereplicates sequencing reads with d = 0, outputs OTU representatives in fasta format, and plots individual OTUs as two-dimensional networks.

  10. Swarm v2: highly-scalable and high-resolution amplicon clustering

    Directory of Open Access Journals (Sweden)

    Frédéric Mahé

    2015-12-01

    Full Text Available Previously we presented Swarm v1, a novel and open source amplicon clustering program that produced fine-scale molecular operational taxonomic units (OTUs, free of arbitrary global clustering thresholds and input-order dependency. Swarm v1 worked with an initial phase that used iterative single-linkage with a local clustering threshold (d, followed by a phase that used the internal abundance structures of clusters to break chained OTUs. Here we present Swarm v2, which has two important novel features: (1 a new algorithm for d = 1 that allows the computation time of the program to scale linearly with increasing amounts of data; and (2 the new fastidious option that reduces under-grouping by grafting low abundant OTUs (e.g., singletons and doubletons onto larger ones. Swarm v2 also directly integrates the clustering and breaking phases, dereplicates sequencing reads with d = 0, outputs OTU representatives in fasta format, and plots individual OTUs as two-dimensional networks.

  11. High-specificity detection of rare alleles with Paired-End Low Error Sequencing (PELE-Seq).

    Science.gov (United States)

    Preston, Jessica L; Royall, Ariel E; Randel, Melissa A; Sikkink, Kristin L; Phillips, Patrick C; Johnson, Eric A

    2016-06-14

    Polymorphic loci exist throughout the genomes of a population and provide the raw genetic material needed for a species to adapt to changes in the environment. The minor allele frequencies of rare Single Nucleotide Polymorphisms (SNPs) within a population have been difficult to track with Next-Generation Sequencing (NGS), due to the high error rate of standard methods such as Illumina sequencing. We have developed a wet-lab protocol and variant-calling method that identifies both sequencing and PCR errors, called Paired-End Low Error Sequencing (PELE-Seq). To test the specificity and sensitivity of the PELE-Seq method, we sequenced control E. coli DNA libraries containing known rare alleles present at frequencies ranging from 0.2-0.4 % of the total reads. PELE-Seq had higher specificity and sensitivity than standard libraries. We then used PELE-Seq to characterize rare alleles in a Caenorhabditis remanei nematode worm population before and after laboratory adaptation, and found that minor and rare alleles can undergo large changes in frequency during lab-adaptation. We have developed a method of rare allele detection that mitigates both sequencing and PCR errors, called PELE-Seq. PELE-Seq was evaluated using control E. coli populations and was then used to compare a wild C. remanei population to a lab-adapted population. The PELE-Seq method is ideal for investigating the dynamics of rare alleles in a broad range of reduced-representation sequencing methods, including targeted amplicon sequencing, RAD-Seq, ddRAD, and GBS. PELE-Seq is also well-suited for whole genome sequencing of mitochondria and viruses, and for high-throughput rare mutation screens.

  12. Perchlorate reduction by hydrogen autotrophic bacteria and microbial community analysis using high-throughput sequencing.

    Science.gov (United States)

    Wan, Dongjin; Liu, Yongde; Niu, Zhenhua; Xiao, Shuhu; Li, Daorong

    2016-02-01

    Hydrogen autotrophic reduction of perchlorate have advantages of high removal efficiency and harmless to drinking water. But so far the reported information about the microbial community structure was comparatively limited, changes in the biodiversity and the dominant bacteria during acclimation process required detailed study. In this study, perchlorate-reducing hydrogen autotrophic bacteria were acclimated by hydrogen aeration from activated sludge. For the first time, high-throughput sequencing was applied to analyze changes in biodiversity and the dominant bacteria during acclimation process. The Michaelis-Menten model described the perchlorate reduction kinetics well. Model parameters q(max) and K(s) were 2.521-3.245 (mg ClO4(-)/gVSS h) and 5.44-8.23 (mg/l), respectively. Microbial perchlorate reduction occurred across at pH range 5.0-11.0; removal was highest at pH 9.0. The enriched mixed bacteria could use perchlorate, nitrate and sulfate as electron accepter, and the sequence of preference was: NO3(-) > ClO4(-) > SO4(2-). Compared to the feed culture, biodiversity decreased greatly during acclimation process, the microbial community structure gradually stabilized after 9 acclimation cycles. The Thauera genus related to Rhodocyclales was the dominated perchlorate reducing bacteria (PRB) in the mixed culture.

  13. Identification of protoplast-isolation responsive microRNAs in Citrus reticulata Blanco by high-throughput sequencing.

    Science.gov (United States)

    Xu, Xiaoyong; Xu, Xiaoling; Zhou, Yipeng; Zeng, Shaohua; Kong, Weiwen

    2017-01-01

    Protoplast isolation is a stress-inducing process, during which a variety of physiological and molecular alterations take place. Such stress response affects the expression of totipotency of cultured protoplasts. MicroRNAs (miRNAs) play important roles in plant growth, development and stress responses. However, the underlying mechanism of miRNAs involved in the protoplast totipotency remains unclear. In this study, high-throughput sequencing technology was used to sequence two populations of small RNA from calli and callus-derived protoplasts in Citrus reticulata Blanco. A total of 67 known miRNAs from 35 families and 277 novel miRNAs were identified. Among these miRNAs, 18 known miRNAs and 64 novel miRNAs were identified by differentially expressed miRNAs (DEMs) analysis. The expression patterns of the eight DEMs were verified by qRT-PCR. Target prediction showed most targets of the miRNAs were transcription factors. The expression levels of half targets showed a negative correlation to those of the miRNAs. Furthermore, the physiological analysis showed high levels of antioxidant activities in isolated protoplasts. In short, our results indicated that miRNAs may play important roles in protoplast-isolation response.

  14. Identification and characterization of microRNAs in Humulus lupulus using high-throughput sequencing and their response to Citrus bark cracking viroid (CBCVd) infection

    Czech Academy of Sciences Publication Activity Database

    Mishra, Ajay Kumar; Duraisamy, Ganesh Selvaraj; Matoušek, Jaroslav; Radišek, S.; Javornik, B.; Jakše, J.

    2016-01-01

    Roč. 17, č. 919 (2016) ISSN 1471-2164 R&D Projects: GA MŠk(CZ) LH14255 Institutional support: RVO:60077344 Keywords : Humulus lupulus * High-throughput sequencing * Citrus bark cracking viroid Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 3.729, year: 2016

  15. Detection of genomic variation by selection of a 9 mb DNA region and high throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Sergey I Nikolaev

    Full Text Available Detection of the rare polymorphisms and causative mutations of genetic diseases in a targeted genomic area has become a major goal in order to understand genomic and phenotypic variability. We have interrogated repeat-masked regions of 8.9 Mb on human chromosomes 21 (7.8 Mb and 7 (1.1 Mb from an individual from the International HapMap Project (NA12872. We have optimized a method of genomic selection for high throughput sequencing. Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. 83% of SNPs in the targeted region had at least 4-fold sequence coverage and 54% at least 15-fold. When assaying HapMap SNPs in NA12872, our sequence genotypes are 91.3% concordant in regions with coverage > or = 4-fold, and 97.9% concordant in regions with coverage > or = 15-fold. About 81% of the SNPs recovered with both thresholds are listed in dbSNP. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15-20 fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.

  16. Survey of Microbial Diversity in Flood Areas during Thailand 2011 Flood Crisis Using High-Throughput Tagged Amplicon Pyrosequencing.

    Science.gov (United States)

    Mhuantong, Wuttichai; Wongwilaiwalin, Sarunyou; Laothanachareon, Thanaporn; Eurwilaichitr, Lily; Tangphatsornruang, Sithichoke; Boonchayaanant, Benjaporn; Limpiyakorn, Tawan; Pattaragulwanit, Kobchai; Punmatharith, Thantip; McEvoy, John; Khan, Eakalak; Rachakornkij, Manaskorn; Champreda, Verawat

    2015-01-01

    The Thailand flood crisis in 2011 was one of the largest recorded floods in modern history, causing enormous damage to the economy and ecological habitats of the country. In this study, bacterial and fungal diversity in sediments and waters collected from ten flood areas in Bangkok and its suburbs, covering residential and agricultural areas, were analyzed using high-throughput 454 pyrosequencing of 16S rRNA gene and internal transcribed spacer sequences. Analysis of microbial community showed differences in taxa distribution in water and sediment with variations in the diversity of saprophytic microbes and sulfate/nitrate reducers among sampling locations, suggesting differences in microbial activity in the habitats. Overall, Proteobacteria represented a major bacterial group in waters, while this group co-existed with Firmicutes, Bacteroidetes, and Actinobacteria in sediments. Anaeromyxobacter, Steroidobacter, and Geobacter were the dominant bacterial genera in sediments, while Sulfuricurvum, Thiovirga, and Hydrogenophaga predominated in waters. For fungi in sediments, Ascomycota, Glomeromycota, and Basidiomycota, particularly in genera Philipsia, Rozella, and Acaulospora, were most frequently detected. Chytridiomycota and Ascomycota were the major fungal phyla, and Rhizophlyctis and Mortierella were the most frequently detected fungal genera in water. Diversity of sulfate-reducing bacteria, related to odor problems, was further investigated using analysis of the dsrB gene which indicated the presence of sulfate-reducing bacteria of families Desulfobacteraceae, Desulfobulbaceae, Syntrobacteraceae, and Desulfoarculaceae in the flood sediments. The work provides an insight into the diversity and function of microbes related to biological processes in flood areas.

  17. A high-throughput and quantitative method to assess the mutagenic potential of translesion DNA synthesis

    Science.gov (United States)

    Taggart, David J.; Camerlengo, Terry L.; Harrison, Jason K.; Sherrer, Shanen M.; Kshetry, Ajay K.; Taylor, John-Stephen; Huang, Kun; Suo, Zucai

    2013-01-01

    Cellular genomes are constantly damaged by endogenous and exogenous agents that covalently and structurally modify DNA to produce DNA lesions. Although most lesions are mended by various DNA repair pathways in vivo, a significant number of damage sites persist during genomic replication. Our understanding of the mutagenic outcomes derived from these unrepaired DNA lesions has been hindered by the low throughput of existing sequencing methods. Therefore, we have developed a cost-effective high-throughput short oligonucleotide sequencing assay that uses next-generation DNA sequencing technology for the assessment of the mutagenic profiles of translesion DNA synthesis catalyzed by any error-prone DNA polymerase. The vast amount of sequencing data produced were aligned and quantified by using our novel software. As an example, the high-throughput short oligonucleotide sequencing assay was used to analyze the types and frequencies of mutations upstream, downstream and at a site-specifically placed cis–syn thymidine–thymidine dimer generated individually by three lesion-bypass human Y-family DNA polymerases. PMID:23470999

  18. Identification of microRNAs from Eugenia uniflora by high-throughput sequencing and bioinformatics analysis.

    Science.gov (United States)

    Guzman, Frank; Almerão, Mauricio P; Körbes, Ana P; Loss-Morais, Guilherme; Margis, Rogerio

    2012-01-01

    microRNAs or miRNAs are small non-coding regulatory RNAs that play important functions in the regulation of gene expression at the post-transcriptional level by targeting mRNAs for degradation or inhibiting protein translation. Eugenia uniflora is a plant native to tropical America with pharmacological and ecological importance, and there have been no previous studies concerning its gene expression and regulation. To date, no miRNAs have been reported in Myrtaceae species. Small RNA and RNA-seq libraries were constructed to identify miRNAs and pre-miRNAs in Eugenia uniflora. Solexa technology was used to perform high throughput sequencing of the library, and the data obtained were analyzed using bioinformatics tools. From 14,489,131 small RNA clean reads, we obtained 1,852,722 mature miRNA sequences representing 45 conserved families that have been identified in other plant species. Further analysis using contigs assembled from RNA-seq allowed the prediction of secondary structures of 25 known and 17 novel pre-miRNAs. The expression of twenty-seven identified miRNAs was also validated using RT-PCR assays. Potential targets were predicted for the most abundant mature miRNAs in the identified pre-miRNAs based on sequence homology. This study is the first large scale identification of miRNAs and their potential targets from a species of the Myrtaceae family without genomic sequence resources. Our study provides more information about the evolutionary conservation of the regulatory network of miRNAs in plants and highlights species-specific miRNAs.

  19. REDItools: high-throughput RNA editing detection made easy.

    Science.gov (United States)

    Picardi, Ernesto; Pesole, Graziano

    2013-07-15

    The reliable detection of RNA editing sites from massive sequencing data remains challenging and, although several methodologies have been proposed, no computational tools have been released to date. Here, we introduce REDItools a suite of python scripts to perform high-throughput investigation of RNA editing using next-generation sequencing data. REDItools are in python programming language and freely available at http://code.google.com/p/reditools/. ernesto.picardi@uniba.it or graziano.pesole@uniba.it Supplementary data are available at Bioinformatics online.

  20. The use of genus-specific amplicon pyrosequencing to assess phytophthora species diversity using eDNA from soil and water in Northern Spain.

    Science.gov (United States)

    Català, Santiago; Pérez-Sierra, Ana; Abad-Campos, Paloma

    2015-01-01

    Phytophthora is one of the most important and aggressive plant pathogenic genera in agriculture and forestry. Early detection and identification of its pathways of infection and spread are of high importance to minimize the threat they pose to natural ecosystems. eDNA was extracted from soil and water from forests and plantations in the north of Spain. Phytophthora-specific primers were adapted for use in high-throughput Sequencing (HTS). Primers were tested in a control reaction containing eight Phytophthora species and applied to water and soil eDNA samples from northern Spain. Different score coverage threshold values were tested for optimal Phytophthora species separation in a custom-curated database and in the control reaction. Clustering at 99% was the optimal criteria to separate most of the Phytophthora species. Multiple Molecular Operational Taxonomic Units (MOTUs) corresponding to 36 distinct Phytophthora species were amplified in the environmental samples. Pyrosequencing of amplicons from soil samples revealed low Phytophthora diversity (13 species) in comparison with the 35 species detected in water samples. Thirteen of the MOTUs detected in rivers and streams showed no close match to sequences in international sequence databases, revealing that eDNA pyrosequencing is a useful strategy to assess Phytophthora species diversity in natural ecosystems.

  1. Optimization and high-throughput screening of antimicrobial peptides.

    Science.gov (United States)

    Blondelle, Sylvie E; Lohner, Karl

    2010-01-01

    While a well-established process for lead compound discovery in for-profit companies, high-throughput screening is becoming more popular in basic and applied research settings in academia. The development of combinatorial libraries combined with easy and less expensive access to new technologies have greatly contributed to the implementation of high-throughput screening in academic laboratories. While such techniques were earlier applied to simple assays involving single targets or based on binding affinity, they have now been extended to more complex systems such as whole cell-based assays. In particular, the urgent need for new antimicrobial compounds that would overcome the rapid rise of drug-resistant microorganisms, where multiple target assays or cell-based assays are often required, has forced scientists to focus onto high-throughput technologies. Based on their existence in natural host defense systems and their different mode of action relative to commercial antibiotics, antimicrobial peptides represent a new hope in discovering novel antibiotics against multi-resistant bacteria. The ease of generating peptide libraries in different formats has allowed a rapid adaptation of high-throughput assays to the search for novel antimicrobial peptides. Similarly, the availability nowadays of high-quantity and high-quality antimicrobial peptide data has permitted the development of predictive algorithms to facilitate the optimization process. This review summarizes the various library formats that lead to de novo antimicrobial peptide sequences as well as the latest structural knowledge and optimization processes aimed at improving the peptides selectivity.

  2. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

    Science.gov (United States)

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).

  3. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data.

    Science.gov (United States)

    Miller, Mark P; Knaus, Brian J; Mullins, Thomas D; Haig, Susan M

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).

  4. Filtering high-throughput protein-protein interaction data using a combination of genomic features

    Directory of Open Access Journals (Sweden)

    Patil Ashwini

    2005-04-01

    Full Text Available Abstract Background Protein-protein interaction data used in the creation or prediction of molecular networks is usually obtained from large scale or high-throughput experiments. This experimental data is liable to contain a large number of spurious interactions. Hence, there is a need to validate the interactions and filter out the incorrect data before using them in prediction studies. Results In this study, we use a combination of 3 genomic features – structurally known interacting Pfam domains, Gene Ontology annotations and sequence homology – as a means to assign reliability to the protein-protein interactions in Saccharomyces cerevisiae determined by high-throughput experiments. Using Bayesian network approaches, we show that protein-protein interactions from high-throughput data supported by one or more genomic features have a higher likelihood ratio and hence are more likely to be real interactions. Our method has a high sensitivity (90% and good specificity (63%. We show that 56% of the interactions from high-throughput experiments in Saccharomyces cerevisiae have high reliability. We use the method to estimate the number of true interactions in the high-throughput protein-protein interaction data sets in Caenorhabditis elegans, Drosophila melanogaster and Homo sapiens to be 27%, 18% and 68% respectively. Our results are available for searching and downloading at http://helix.protein.osaka-u.ac.jp/htp/. Conclusion A combination of genomic features that include sequence, structure and annotation information is a good predictor of true interactions in large and noisy high-throughput data sets. The method has a very high sensitivity and good specificity and can be used to assign a likelihood ratio, corresponding to the reliability, to each interaction.

  5. Robust DNA Isolation and High-throughput Sequencing Library Construction for Herbarium Specimens.

    Science.gov (United States)

    Saeidi, Saman; McKain, Michael R; Kellogg, Elizabeth A

    2018-03-08

    Herbaria are an invaluable source of plant material that can be used in a variety of biological studies. The use of herbarium specimens is associated with a number of challenges including sample preservation quality, degraded DNA, and destructive sampling of rare specimens. In order to more effectively use herbarium material in large sequencing projects, a dependable and scalable method of DNA isolation and library preparation is needed. This paper demonstrates a robust, beginning-to-end protocol for DNA isolation and high-throughput library construction from herbarium specimens that does not require modification for individual samples. This protocol is tailored for low quality dried plant material and takes advantage of existing methods by optimizing tissue grinding, modifying library size selection, and introducing an optional reamplification step for low yield libraries. Reamplification of low yield DNA libraries can rescue samples derived from irreplaceable and potentially valuable herbarium specimens, negating the need for additional destructive sampling and without introducing discernible sequencing bias for common phylogenetic applications. The protocol has been tested on hundreds of grass species, but is expected to be adaptable for use in other plant lineages after verification. This protocol can be limited by extremely degraded DNA, where fragments do not exist in the desired size range, and by secondary metabolites present in some plant material that inhibit clean DNA isolation. Overall, this protocol introduces a fast and comprehensive method that allows for DNA isolation and library preparation of 24 samples in less than 13 h, with only 8 h of active hands-on time with minimal modifications.

  6. Genetic profiles of cervical tumors by high-throughput sequencing for personalized medical care

    International Nuclear Information System (INIS)

    Muller, Etienne; Brault, Baptiste; Holmes, Allyson; Legros, Angelina; Jeannot, Emmanuelle; Campitelli, Maura; Rousselin, Antoine; Goardon, Nicolas; Frébourg, Thierry; Krieger, Sophie; Crouet, Hubert; Nicolas, Alain; Sastre, Xavier; Vaur, Dominique; Castéra, Laurent

    2015-01-01

    Cancer treatment is facing major evolution since the advent of targeted therapies. Building genetic profiles could predict sensitivity or resistance to these therapies and highlight disease-specific abnormalities, supporting personalized patient care. In the context of biomedical research and clinical diagnosis, our laboratory has developed an oncogenic panel comprised of 226 genes and a dedicated bioinformatic pipeline to explore somatic mutations in cervical carcinomas, using high-throughput sequencing. Twenty-nine tumors were sequenced for exons within 226 genes. The automated pipeline used includes a database and a filtration system dedicated to identifying mutations of interest and excluding false positive and germline mutations. One-hundred and seventy-six total mutational events were found among the 29 tumors. Our cervical tumor mutational landscape shows that most mutations are found in PIK3CA (E545K, E542K) and KRAS (G12D, G13D) and others in FBXW7 (R465C, R505G, R479Q). Mutations have also been found in ALK (V1149L, A1266T) and EGFR (T259M). These results showed that 48% of patients display at least one deleterious mutation in genes that have been already targeted by the Food and Drug Administration approved therapies. Considering deleterious mutations, 59% of patients could be eligible for clinical trials. Sequencing hundreds of genes in a clinical context has become feasible, in terms of time and cost. In the near future, such an analysis could be a part of a battery of examinations along the diagnosis and treatment of cancer, helping to detect sensitivity or resistance to targeted therapies and allow advancements towards personalized oncology

  7. Scrutinizing virus genome termini by high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Shasha Li

    Full Text Available Analysis of genomic terminal sequences has been a major step in studies on viral DNA replication and packaging mechanisms. However, traditional methods to study genome termini are challenging due to the time-consuming protocols and their inefficiency where critical details are lost easily. Recent advances in next generation sequencing (NGS have enabled it to be a powerful tool to study genome termini. In this study, using NGS we sequenced one iridovirus genome and twenty phage genomes and confirmed for the first time that the high frequency sequences (HFSs found in the NGS reads are indeed the terminal sequences of viral genomes. Further, we established a criterion to distinguish the type of termini and the viral packaging mode. We also obtained additional terminal details such as terminal repeats, multi-termini, asymmetric termini. With this approach, we were able to simultaneously detect details of the genome termini as well as obtain the complete sequence of bacteriophage genomes. Theoretically, this application can be further extended to analyze larger and more complicated genomes of plant and animal viruses. This study proposed a novel and efficient method for research on viral replication, packaging, terminase activity, transcription regulation, and metabolism of the host cell.

  8. A standardized framework for accurate, high-throughput genotyping of recombinant and non-recombinant viral sequences.

    Science.gov (United States)

    Alcantara, Luiz Carlos Junior; Cassol, Sharon; Libin, Pieter; Deforche, Koen; Pybus, Oliver G; Van Ranst, Marc; Galvão-Castro, Bernardo; Vandamme, Anne-Mieke; de Oliveira, Tulio

    2009-07-01

    Human immunodeficiency virus type-1 (HIV-1), hepatitis B and C and other rapidly evolving viruses are characterized by extremely high levels of genetic diversity. To facilitate diagnosis and the development of prevention and treatment strategies that efficiently target the diversity of these viruses, and other pathogens such as human T-lymphotropic virus type-1 (HTLV-1), human herpes virus type-8 (HHV8) and human papillomavirus (HPV), we developed a rapid high-throughput-genotyping system. The method involves the alignment of a query sequence with a carefully selected set of pre-defined reference strains, followed by phylogenetic analysis of multiple overlapping segments of the alignment using a sliding window. Each segment of the query sequence is assigned the genotype and sub-genotype of the reference strain with the highest bootstrap (>70%) and bootscanning (>90%) scores. Results from all windows are combined and displayed graphically using color-coded genotypes. The new Virus-Genotyping Tools provide accurate classification of recombinant and non-recombinant viruses and are currently being assessed for their diagnostic utility. They have incorporated into several HIV drug resistance algorithms including the Stanford (http://hivdb.stanford.edu) and two European databases (http://www.umcutrecht.nl/subsite/spread-programme/ and http://www.hivrdb.org.uk/) and have been successfully used to genotype a large number of sequences in these and other databases. The tools are a PHP/JAVA web application and are freely accessible on a number of servers including: http://bioafrica.mrc.ac.za/rega-genotype/html/, http://lasp.cpqgm.fiocruz.br/virus-genotype/html/, http://jose.med.kuleuven.be/genotypetool/html/.

  9. A CRISPR CASe for High-Throughput Silencing

    Directory of Open Access Journals (Sweden)

    Jacob eHeintze

    2013-10-01

    Full Text Available Manipulation of gene expression on a genome-wide level is one of the most important systematic tools in the post-genome era. Such manipulations have largely been enabled by expression cloning approaches using sequence-verified cDNA libraries, large-scale RNA interference libraries (shRNA or siRNA and zinc finger nuclease technologies. More recently, the CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated (Cas9-mediated gene editing technology has been described that holds great promise for future use of this technology in genomic manipulation. It was suggested that the CRISPR system has the potential to be used in high-throughput, large-scale loss of function screening. Here we discuss some of the challenges in engineering of CRISPR/Cas genomic libraries and some of the aspects that need to be addressed in order to use this technology on a high-throughput scale.

  10. Bioassessment of a Drinking Water Reservoir Using Plankton: High Throughput Sequencing vs. Traditional Morphological Method

    Directory of Open Access Journals (Sweden)

    Wanli Gao

    2018-01-01

    Full Text Available Drinking water safety is increasingly perceived as one of the top global environmental issues. Plankton has been commonly used as a bioindicator for water quality in lakes and reservoirs. Recently, DNA sequencing technology has been applied to bioassessment. In this study, we compared the effectiveness of the 16S and 18S rRNA high throughput sequencing method (HTS and the traditional optical microscopy method (TOM in the bioassessment of drinking water quality. Five stations reflecting different habitats and hydrological conditions in Danjiangkou Reservoir, one of the largest drinking water reservoirs in Asia, were sampled May 2016. Non-metric multi-dimensional scaling (NMDS analysis showed that plankton assemblages varied among the stations and the spatial patterns revealed by the two methods were consistent. The correlation between TOM and HTS in a symmetric Procrustes analysis was 0.61, revealing overall good concordance between the two methods. Procrustes analysis also showed that site-specific differences between the two methods varied among the stations. Station Heijizui (H, a site heavily influenced by two tributaries, had the largest difference while station Qushou (Q, a confluence site close to the outlet dam, had the smallest difference between the two methods. Our results show that DNA sequencing has the potential to provide consistent identification of taxa, and reliable bioassessment in a long-term biomonitoring and assessment program for drinking water reservoirs.

  11. Evaluation of the microbial diversity in amyotrophic lateral sclerosis using high-throughput sequencing

    Directory of Open Access Journals (Sweden)

    Xin Fang

    2016-09-01

    Full Text Available More and more evidences indicate that diseases of the central nervous system (CNS have been seriously affected by faecal microbes. However, little work is done to explore interaction between amyotrophic lateral sclerosis (ALS and faecal microbes. In the present study, high-throughput sequencing method was used to compare the intestinal microbial diversity of healthy people and ALS patients. The principal coordinate analysis (PCoA, Venn and unweighted pair-group method using arithmetic averages (UPGMA showed an obvious microbial changes between healthy people (group H and ALS patients (group A, and the average ratios of Bacteroides, Faecalibacterium, Anaerostipes, Prevotella, Escherichia and Lachnospira at genus level between ALS patients and healthy people were 0.78, 2.18, 3.41, 0.35, 0.79 and 13.07. Furthermore, the decreased Firmicutes/Bacteroidetes ratio at phylum level using LEfSE (LDA >4.0, together with the significant increased genus Dorea (harmful microorganisms and significant reduced genus Oscillibacter, Anaerostipes, Lachnospiraceae (beneficial microorganisms in ALS patients, indicated that the imbalance in intestinal microflora constitution had a strong association with the pathogenesis of ALS.

  12. Evaluation of the Microbial Diversity in Amyotrophic Lateral Sclerosis Using High-Throughput Sequencing.

    Science.gov (United States)

    Fang, Xin; Wang, Xin; Yang, Shaoguo; Meng, Fanjing; Wang, Xiaolei; Wei, Hua; Chen, Tingtao

    2016-01-01

    More and more evidences indicate that diseases of the central nervous system have been seriously affected by fecal microbes. However, little work is done to explore interaction between amyotrophic lateral sclerosis (ALS) and fecal microbes. In the present study, high-throughput sequencing method was used to compare the intestinal microbial diversity of healthy people and ALS patients. The principal coordinate analysis, Venn and unweighted pair-group method using arithmetic averages (UPGMA) showed an obvious microbial changes between healthy people (group H) and ALS patients (group A), and the average ratios of Bacteroides , Faecalibacterium , Anaerostipes , Prevotella , Escherichia , and Lachnospira at genus level between ALS patients and healthy people were 0.78, 2.18, 3.41, 0.35, 0.79, and 13.07. Furthermore, the decreased Firmicutes/Bacteroidetes ratio at phylum level using LEfSE (LDA > 4.0), together with the significant increased genus Dorea (harmful microorganisms) and significant reduced genus Oscillibacter , Anaerostipes , Lachnospiraceae (beneficial microorganisms) in ALS patients, indicated that the imbalance in intestinal microflora constitution had a strong association with the pathogenesis of ALS.

  13. High-throughput open source computational methods for genetics and genomics

    NARCIS (Netherlands)

    Prins, J.C.P.

    2015-01-01

    Biology is increasingly data driven by virtue of the development of high-throughput technologies, such as DNA and RNA sequencing. Computational biology and bioinformatics are scientific disciplines that cross-over between the disciplines of biology, informatics and statistics; which is clearly

  14. Insight into the transcriptome of Arthrobotrys conoides using high throughput sequencing.

    Science.gov (United States)

    Ramesh, Pandit; Reena, Patel; Amitbikram, Mohapatra; Chaitanya, Joshi; Anju, Kunjadia

    2015-12-01

    Arthrobotrys conoides is a nematode-trapping fungus belonging to Orbiliales, Ascomycota group, and traps prey nematodes by means of adhesive network. Fungus has a potential to be used as a biocontrol agent against plant parasitic nematodes. In the present study, we characterized the transcriptome of A. conoides using high-throughput sequencing technology and characterized its virulence unigenes. Total 7,255 cDNA contigs with an average length of 425 bp were generated and 6184 (61.81%) transcripts were functionally annotated and characterized. Majority of unigenes were found analogous to the genes of plant pathogenic fungi. A total of 1749 transcripts were found to be orthologous with eukaryotic proteins of KOG database. Several carbohydrate active enzymes and peptidases were identified. We also analyzed classically and nonclassically secreted proteins and confirmed by BLASTP against fungal secretome database. A total of 916 contigs were analogous to 556 unique proteins of Pathogen Host Interaction (PHI) database. Further, we identified 91 unigenes homologous to the database of fungal virulence factor (DFVF). A total of 104 putative protein kinases coding transcripts were identified by BLASTP against KinBase database, which are major players in signaling pathways. This study provides a comprehensive look at the transcriptome of A. conoides and the identified unigenes might have a role in catching and killing prey nematodes by A. conoides. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  15. A morphogenetic survey on ciliate plankton from a mountain lake pinpoints the necessity of lineage-specific barcode markers in microbial ecology.

    Science.gov (United States)

    Stoeck, Thorsten; Breiner, Hans-Werner; Filker, Sabine; Ostermaier, Veronika; Kammerlander, Barbara; Sonntag, Bettina

    2014-02-01

    Analyses of high-throughput environmental sequencing data have become the 'gold-standard' to address fundamental questions of microbial diversity, ecology and biogeography. Findings that emerged from sequencing are, e.g. the discovery of the extensive 'rare microbial biosphere' and its potential function as a seed-bank. Even though applied since several years, results from high-throughput environmental sequencing have hardly been validated. We assessed how well pyrosequenced amplicons [the hypervariable eukaryotic V4 region of the small subunit ribosomal RNA (SSU rRNA) gene] reflected morphotype ciliate plankton. Moreover, we assessed if amplicon sequencing had the potential to detect the annual ciliate plankton stock. In both cases, we identified significant quantitative and qualitative differences. Our study makes evident that taxon abundance distributions inferred from amplicon data are highly biased and do not mirror actual morphotype abundances at all. Potential reasons included cell losses after fixation, cryptic morphotypes, resting stages, insufficient sequence data availability of morphologically described species and the unsatisfying resolution of the V4 SSU rRNA fragment for accurate taxonomic assignments. The latter two underline the necessity of barcoding initiatives for eukaryotic microbes to better and fully exploit environmental amplicon data sets, which then will also allow studying the potential of seed-bank taxa as a buffer for environmental changes. © 2013 The Authors. Environmental Microbiology published by Society for Applied Microbiology and John Wiley & Sons Ltd.

  16. Identification of microRNAs and their targets in Finger millet by high throughput sequencing.

    Science.gov (United States)

    Usha, S; Jyothi, M N; Sharadamma, N; Dixit, Rekha; Devaraj, V R; Nagesh Babu, R

    2015-12-15

    MicroRNAs are short non-coding RNAs which play an important role in regulating gene expression by mRNA cleavage or by translational repression. The majority of identified miRNAs were evolutionarily conserved; however, others expressed in a species-specific manner. Finger millet is an important cereal crop; nonetheless, no practical information is available on microRNAs to date. In this study, we have identified 95 conserved microRNAs belonging to 39 families and 3 novel microRNAs by high throughput sequencing. For the identified conserved and novel miRNAs a total of 507 targets were predicted. 11 miRNAs were validated and tissue specificity was determined by stem loop RT-qPCR, Northern blot. GO analyses revealed targets of miRNA were involved in wide range of regulatory functions. This study implies large number of known and novel miRNAs found in Finger millet which may play important role in growth and development. Copyright © 2015 Elsevier B.V. All rights reserved.

  17. ImmuneDB: a system for the analysis and exploration of high-throughput adaptive immune receptor sequencing data.

    Science.gov (United States)

    Rosenfeld, Aaron M; Meng, Wenzhao; Luning Prak, Eline T; Hershberg, Uri

    2017-01-15

    As high-throughput sequencing of B cells becomes more common, the need for tools to analyze the large quantity of data also increases. This article introduces ImmuneDB, a system for analyzing vast amounts of heavy chain variable region sequences and exploring the resulting data. It can take as input raw FASTA/FASTQ data, identify genes, determine clones, construct lineages, as well as provide information such as selection pressure and mutation analysis. It uses an industry leading database, MySQL, to provide fast analysis and avoid the complexities of using error prone flat-files. ImmuneDB is freely available at http://immunedb.comA demo of the ImmuneDB web interface is available at: http://immunedb.com/demo CONTACT: Uh25@drexel.eduSupplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  18. Discovery of precursor and mature microRNAs and their putative gene targets using high-throughput sequencing in pineapple (Ananas comosus var. comosus).

    Science.gov (United States)

    Yusuf, Noor Hydayaty Md; Ong, Wen Dee; Redwan, Raimi Mohamed; Latip, Mariam Abd; Kumar, S Vijay

    2015-10-15

    MicroRNAs (miRNAs) are a class of small, endogenous non-coding RNAs that negatively regulate gene expression, resulting in the silencing of target mRNA transcripts through mRNA cleavage or translational inhibition. MiRNAs play significant roles in various biological and physiological processes in plants. However, the miRNA-mediated gene regulatory network in pineapple, the model tropical non-climacteric fruit, remains largely unexplored. Here, we report a complete list of pineapple mature miRNAs obtained from high-throughput small RNA sequencing and precursor miRNAs (pre-miRNAs) obtained from ESTs. Two small RNA libraries were constructed from pineapple fruits and leaves, respectively, using Illumina's Solexa technology. Sequence similarity analysis using miRBase revealed 579,179 reads homologous to 153 miRNAs from 41 miRNA families. In addition, a pineapple fruit transcriptome library consisting of approximately 30,000 EST contigs constructed using Solexa sequencing was used for the discovery of pre-miRNAs. In all, four pre-miRNAs were identified (MIR156, MIR399, MIR444 and MIR2673). Furthermore, the same pineapple transcriptome was used to dissect the function of the miRNAs in pineapple by predicting their putative targets in conjunction with their regulatory networks. In total, 23 metabolic pathways were found to be regulated by miRNAs in pineapple. The use of high-throughput sequencing in pineapples to unveil the presence of miRNAs and their regulatory pathways provides insight into the repertoire of miRNA regulation used exclusively in this non-climacteric model plant. Copyright © 2015 Elsevier B.V. All rights reserved.

  19. Rapid and Accurate Sequencing of Enterovirus Genomes Using MinION Nanopore Sequencer.

    Science.gov (United States)

    Wang, Ji; Ke, Yue Hua; Zhang, Yong; Huang, Ke Qiang; Wang, Lei; Shen, Xin Xin; Dong, Xiao Ping; Xu, Wen Bo; Ma, Xue Jun

    2017-10-01

    Knowledge of an enterovirus genome sequence is very important in epidemiological investigation to identify transmission patterns and ascertain the extent of an outbreak. The MinION sequencer is increasingly used to sequence various viral pathogens in many clinical situations because of its long reads, portability, real-time accessibility of sequenced data, and very low initial costs. However, information is lacking on MinION sequencing of enterovirus genomes. In this proof-of-concept study using Enterovirus 71 (EV71) and Coxsackievirus A16 (CA16) strains as examples, we established an amplicon-based whole genome sequencing method using MinION. We explored the accuracy, minimum sequencing time, discrimination and high-throughput sequencing ability of MinION, and compared its performance with Sanger sequencing. Within the first minute (min) of sequencing, the accuracy of MinION was 98.5% for the single EV71 strain and 94.12%-97.33% for 10 genetically-related CA16 strains. In as little as 14 min, 99% identity was reached for the single EV71 strain, and in 17 min (on average), 99% identity was achieved for 10 CA16 strains in a single run. MinION is suitable for whole genome sequencing of enteroviruses with sufficient accuracy and fine discrimination and has the potential as a fast, reliable and convenient method for routine use. Copyright © 2017 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.

  20. High-Throughput Tabular Data Processor - Platform independent graphical tool for processing large data sets.

    Science.gov (United States)

    Madanecki, Piotr; Bałut, Magdalena; Buckley, Patrick G; Ochocka, J Renata; Bartoszewski, Rafał; Crossman, David K; Messiaen, Ludwine M; Piotrowski, Arkadiusz

    2018-01-01

    High-throughput technologies generate considerable amount of data which often requires bioinformatic expertise to analyze. Here we present High-Throughput Tabular Data Processor (HTDP), a platform independent Java program. HTDP works on any character-delimited column data (e.g. BED, GFF, GTF, PSL, WIG, VCF) from multiple text files and supports merging, filtering and converting of data that is produced in the course of high-throughput experiments. HTDP can also utilize itemized sets of conditions from external files for complex or repetitive filtering/merging tasks. The program is intended to aid global, real-time processing of large data sets using a graphical user interface (GUI). Therefore, no prior expertise in programming, regular expression, or command line usage is required of the user. Additionally, no a priori assumptions are imposed on the internal file composition. We demonstrate the flexibility and potential of HTDP in real-life research tasks including microarray and massively parallel sequencing, i.e. identification of disease predisposing variants in the next generation sequencing data as well as comprehensive concurrent analysis of microarray and sequencing results. We also show the utility of HTDP in technical tasks including data merge, reduction and filtering with external criteria files. HTDP was developed to address functionality that is missing or rudimentary in other GUI software for processing character-delimited column data from high-throughput technologies. Flexibility, in terms of input file handling, provides long term potential functionality in high-throughput analysis pipelines, as the program is not limited by the currently existing applications and data formats. HTDP is available as the Open Source software (https://github.com/pmadanecki/htdp).

  1. Evolution of MHC class I genes in the endangered loggerhead sea turtle (Caretta caretta) revealed by 454 amplicon sequencing.

    Science.gov (United States)

    Stiebens, Victor A; Merino, Sonia E; Chain, Frédéric J J; Eizaguirre, Christophe

    2013-04-30

    In evolutionary and conservation biology, parasitism is often highlighted as a major selective pressure. To fight against parasites and pathogens, genetic diversity of the immune genes of the major histocompatibility complex (MHC) are particularly important. However, the extensive degree of polymorphism observed in these genes makes it difficult to conduct thorough population screenings. We utilized a genotyping protocol that uses 454 amplicon sequencing to characterize the MHC class I in the endangered loggerhead sea turtle (Caretta caretta) and to investigate their evolution at multiple relevant levels of organization. MHC class I genes revealed signatures of trans-species polymorphism across several reptile species. In the studied loggerhead turtle individuals, it results in the maintenance of two ancient allelic lineages. We also found that individuals carrying an intermediate number of MHC class I alleles are larger than those with either a low or high number of alleles. Multiple modes of evolution seem to maintain MHC diversity in the loggerhead turtles, with relatively high polymorphism for an endangered species.

  2. Diversity and Structure of Diazotrophic Communities in Mangrove Rhizosphere, Revealed by High-Throughput Sequencing.

    Science.gov (United States)

    Zhang, Yanying; Yang, Qingsong; Ling, Juan; Van Nostrand, Joy D; Shi, Zhou; Zhou, Jizhong; Dong, Junde

    2017-01-01

    Diazotrophic communities make an essential contribution to the productivity through providing new nitrogen. However, knowledge of the roles that both mangrove tree species and geochemical parameters play in shaping mangove rhizosphere diazotrophic communities is still elusive. Here, a comprehensive examination of the diversity and structure of microbial communities in the rhizospheres of three mangrove species, Rhizophora apiculata , Avicennia marina , and Ceriops tagal , was undertaken using high - throughput sequencing of the 16S rRNA and nifH genes. Our results revealed a great diversity of both the total microbial composition and the diazotrophic composition specifically in the mangrove rhizosphere. Deltaproteobacteria and Gammaproteobacteria were both ubiquitous and dominant, comprising an average of 45.87 and 86.66% of total microbial and diazotrophic communities, respectively. Sulfate-reducing bacteria belonging to the Desulfobacteraceae and Desulfovibrionaceae were the dominant diazotrophs. Community statistical analyses suggested that both mangrove tree species and additional environmental variables played important roles in shaping total microbial and potential diazotroph communities in mangrove rhizospheres. In contrast to the total microbial community investigated by analysis of 16S rRNA gene sequences, most of the dominant diazotrophic groups identified by nifH gene sequences were significantly different among mangrove species. The dominant diazotrophs of the family Desulfobacteraceae were positively correlated with total phosphorus, but negatively correlated with the nitrogen to phosphorus ratio. The Pseudomonadaceae were positively correlated with the concentration of available potassium, suggesting that diazotrophs potentially play an important role in biogeochemical cycles, such as those of nitrogen, phosphorus, sulfur, and potassium, in the mangrove ecosystem.

  3. Diversity and Structure of Diazotrophic Communities in Mangrove Rhizosphere, Revealed by High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Yanying Zhang

    2017-10-01

    Full Text Available Diazotrophic communities make an essential contribution to the productivity through providing new nitrogen. However, knowledge of the roles that both mangrove tree species and geochemical parameters play in shaping mangove rhizosphere diazotrophic communities is still elusive. Here, a comprehensive examination of the diversity and structure of microbial communities in the rhizospheres of three mangrove species, Rhizophora apiculata, Avicennia marina, and Ceriops tagal, was undertaken using high-throughput sequencing of the 16S rRNA and nifH genes. Our results revealed a great diversity of both the total microbial composition and the diazotrophic composition specifically in the mangrove rhizosphere. Deltaproteobacteria and Gammaproteobacteria were both ubiquitous and dominant, comprising an average of 45.87 and 86.66% of total microbial and diazotrophic communities, respectively. Sulfate-reducing bacteria belonging to the Desulfobacteraceae and Desulfovibrionaceae were the dominant diazotrophs. Community statistical analyses suggested that both mangrove tree species and additional environmental variables played important roles in shaping total microbial and potential diazotroph communities in mangrove rhizospheres. In contrast to the total microbial community investigated by analysis of 16S rRNA gene sequences, most of the dominant diazotrophic groups identified by nifH gene sequences were significantly different among mangrove species. The dominant diazotrophs of the family Desulfobacteraceae were positively correlated with total phosphorus, but negatively correlated with the nitrogen to phosphorus ratio. The Pseudomonadaceae were positively correlated with the concentration of available potassium, suggesting that diazotrophs potentially play an important role in biogeochemical cycles, such as those of nitrogen, phosphorus, sulfur, and potassium, in the mangrove ecosystem.

  4. Comparative analysis of transcriptomes in aerial stems and roots of Ephedra sinica based on high-throughput mRNA sequencing

    Directory of Open Access Journals (Sweden)

    Taketo Okada

    2016-12-01

    Full Text Available Ephedra plants are taxonomically classified as gymnosperms, and are medicinally important as the botanical origin of crude drugs and as bioresources that contain pharmacologically active chemicals. Here we show a comparative analysis of the transcriptomes of aerial stems and roots of Ephedra sinica based on high-throughput mRNA sequencing by RNA-Seq. De novo assembly of short cDNA sequence reads generated 23,358, 13,373, and 28,579 contigs longer than 200 bases from aerial stems, roots, or both aerial stems and roots, respectively. The presumed functions encoded by these contig sequences were annotated by BLAST (blastx. Subsequently, these contigs were classified based on gene ontology slims, Enzyme Commission numbers, and the InterPro database. Furthermore, comparative gene expression analysis was performed between aerial stems and roots. These transcriptome analyses revealed differences and similarities between the transcriptomes of aerial stems and roots in E. sinica. Deep transcriptome sequencing of Ephedra should open the door to molecular biological studies based on the entire transcriptome, tissue- or organ-specific transcriptomes, or targeted genes of interest.

  5. A Proteomic Workflow Using High-Throughput De Novo Sequencing Towards Complementation of Genome Information for Improved Comparative Crop Science.

    Science.gov (United States)

    Turetschek, Reinhard; Lyon, David; Desalegn, Getinet; Kaul, Hans-Peter; Wienkoop, Stefanie

    2016-01-01

    The proteomic study of non-model organisms, such as many crop plants, is challenging due to the lack of comprehensive genome information. Changing environmental conditions require the study and selection of adapted cultivars. Mutations, inherent to cultivars, hamper protein identification and thus considerably complicate the qualitative and quantitative comparison in large-scale systems biology approaches. With this workflow, cultivar-specific mutations are detected from high-throughput comparative MS analyses, by extracting sequence polymorphisms with de novo sequencing. Stringent criteria are suggested to filter for confidential mutations. Subsequently, these polymorphisms complement the initially used database, which is ready to use with any preferred database search algorithm. In our example, we thereby identified 26 specific mutations in two cultivars of Pisum sativum and achieved an increased number (17 %) of peptide spectrum matches.

  6. Error correction and statistical analyses for intra-host comparisons of feline immunodeficiency virus diversity from high-throughput sequencing data.

    Science.gov (United States)

    Liu, Yang; Chiaromonte, Francesca; Ross, Howard; Malhotra, Raunaq; Elleder, Daniel; Poss, Mary

    2015-06-30

    Infection with feline immunodeficiency virus (FIV) causes an immunosuppressive disease whose consequences are less severe if cats are co-infected with an attenuated FIV strain (PLV). We use virus diversity measurements, which reflect replication ability and the virus response to various conditions, to test whether diversity of virulent FIV in lymphoid tissues is altered in the presence of PLV. Our data consisted of the 3' half of the FIV genome from three tissues of animals infected with FIV alone, or with FIV and PLV, sequenced by 454 technology. Since rare variants dominate virus populations, we had to carefully distinguish sequence variation from errors due to experimental protocols and sequencing. We considered an exponential-normal convolution model used for background correction of microarray data, and modified it to formulate an error correction approach for minor allele frequencies derived from high-throughput sequencing. Similar to accounting for over-dispersion in counts, this accounts for error-inflated variability in frequencies - and quite effectively reproduces empirically observed distributions. After obtaining error-corrected minor allele frequencies, we applied ANalysis Of VAriance (ANOVA) based on a linear mixed model and found that conserved sites and transition frequencies in FIV genes differ among tissues of dual and single infected cats. Furthermore, analysis of minor allele frequencies at individual FIV genome sites revealed 242 sites significantly affected by infection status (dual vs. single) or infection status by tissue interaction. All together, our results demonstrated a decrease in FIV diversity in bone marrow in the presence of PLV. Importantly, these effects were weakened or undetectable when error correction was performed with other approaches (thresholding of minor allele frequencies; probabilistic clustering of reads). We also queried the data for cytidine deaminase activity on the viral genome, which causes an asymmetric increase

  7. Sequence-specific validation of LAMP amplicons in real-time optomagnetic detection of Dengue serotype 2 synthetic DNA

    DEFF Research Database (Denmark)

    Minero, Gabriel Khose Antonio; Nogueira, Catarina; Rizzi, Giovanni

    2017-01-01

    We report on an optomagnetic technique optimised for real-time molecular detection of Dengue fever virus under ideal as well as non-ideal laboratory conditions using two different detection approaches. The first approach is based on the detection of the hydrodynamic volume of streptavidin coated...... magnetic nanoparticles attached to biotinylated LAMP amplicons. We demonstrate detection of sub-femtomolar Dengue DNA target concentrations in the ideal contamination-free lab environment within 20 min. The second detection approach is based on sequence-specific binding of functionalised magnetic...... claim detection of down to 100 fM of Dengue target after 20 min of LAMP with a contamination background....

  8. High-throughput verification of transcriptional starting sites by Deep-RACE

    DEFF Research Database (Denmark)

    Olivarius, Signe; Plessy, Charles; Carninci, Piero

    2009-01-01

    We present a high-throughput method for investigating the transcriptional starting sites of genes of interest, which we named Deep-RACE (Deep–rapid amplification of cDNA ends). Taking advantage of the latest sequencing technology, it allows the parallel analysis of multiple genes and is free...

  9. ASAP: Amplification, sequencing & annotation of plastomes

    Directory of Open Access Journals (Sweden)

    Folta Kevin M

    2005-12-01

    Full Text Available Abstract Background Availability of DNA sequence information is vital for pursuing structural, functional and comparative genomics studies in plastids. Traditionally, the first step in mining the valuable information within a chloroplast genome requires sequencing a chloroplast plasmid library or BAC clones. These activities involve complicated preparatory procedures like chloroplast DNA isolation or identification of the appropriate BAC clones to be sequenced. Rolling circle amplification (RCA is being used currently to amplify the chloroplast genome from purified chloroplast DNA and the resulting products are sheared and cloned prior to sequencing. Herein we present a universal high-throughput, rapid PCR-based technique to amplify, sequence and assemble plastid genome sequence from diverse species in a short time and at reasonable cost from total plant DNA, using the large inverted repeat region from strawberry and peach as proof of concept. The method exploits the highly conserved coding regions or intergenic regions of plastid genes. Using an informatics approach, chloroplast DNA sequence information from 5 available eudicot plastomes was aligned to identify the most conserved regions. Cognate primer pairs were then designed to generate ~1 – 1.2 kb overlapping amplicons from the inverted repeat region in 14 diverse genera. Results 100% coverage of the inverted repeat region was obtained from Arabidopsis, tobacco, orange, strawberry, peach, lettuce, tomato and Amaranthus. Over 80% coverage was obtained from distant species, including Ginkgo, loblolly pine and Equisetum. Sequence from the inverted repeat region of strawberry and peach plastome was obtained, annotated and analyzed. Additionally, a polymorphic region identified from gel electrophoresis was sequenced from tomato and Amaranthus. Sequence analysis revealed large deletions in these species relative to tobacco plastome thus exhibiting the utility of this method for structural and

  10. Fluorescence-based high-throughput screening of dicer cleavage activity.

    Science.gov (United States)

    Podolska, Katerina; Sedlak, David; Bartunek, Petr; Svoboda, Petr

    2014-03-01

    Production of small RNAs by ribonuclease III Dicer is a key step in microRNA and RNA interference pathways, which employ Dicer-produced small RNAs as sequence-specific silencing guides. Further studies and manipulations of microRNA and RNA interference pathways would benefit from identification of small-molecule modulators. Here, we report a study of a fluorescence-based in vitro Dicer cleavage assay, which was adapted for high-throughput screening. The kinetic assay can be performed under single-turnover conditions (35 nM substrate and 70 nM Dicer) in a small volume (5 µL), which makes it suitable for high-throughput screening in a 1536-well format. As a proof of principle, a small library of bioactive compounds was analyzed, demonstrating potential of the assay.

  11. Transcriptomic analysis of Petunia hybrida in response to salt stress using high throughput RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Gonzalo H Villarino

    Full Text Available Salinity and drought stress are the primary cause of crop losses worldwide. In sodic saline soils sodium chloride (NaCl disrupts normal plant growth and development. The complex interactions of plant systems with abiotic stress have made RNA sequencing a more holistic and appealing approach to study transcriptome level responses in a single cell and/or tissue. In this work, we determined the Petunia transcriptome response to NaCl stress by sequencing leaf samples and assembling 196 million Illumina reads with Trinity software. Using our reference transcriptome we identified more than 7,000 genes that were differentially expressed within 24 h of acute NaCl stress. The proposed transcriptome can also be used as an excellent tool for biological and bioinformatics in the absence of an available Petunia genome and it is available at the SOL Genomics Network (SGN http://solgenomics.net. Genes related to regulation of reactive oxygen species, transport, and signal transductions as well as novel and undescribed transcripts were among those differentially expressed in response to salt stress. The candidate genes identified in this study can be applied as markers for breeding or to genetically engineer plants to enhance salt tolerance. Gene Ontology analyses indicated that most of the NaCl damage happened at 24 h inducing genotoxicity, affecting transport and organelles due to the high concentration of Na+ ions. Finally, we report a modification to the library preparation protocol whereby cDNA samples were bar-coded with non-HPLC purified primers, without affecting the quality and quantity of the RNA-seq data. The methodological improvement presented here could substantially reduce the cost of sample preparation for future high-throughput RNA sequencing experiments.

  12. Transcriptomic analysis of Petunia hybrida in response to salt stress using high throughput RNA sequencing.

    Science.gov (United States)

    Villarino, Gonzalo H; Bombarely, Aureliano; Giovannoni, James J; Scanlon, Michael J; Mattson, Neil S

    2014-01-01

    Salinity and drought stress are the primary cause of crop losses worldwide. In sodic saline soils sodium chloride (NaCl) disrupts normal plant growth and development. The complex interactions of plant systems with abiotic stress have made RNA sequencing a more holistic and appealing approach to study transcriptome level responses in a single cell and/or tissue. In this work, we determined the Petunia transcriptome response to NaCl stress by sequencing leaf samples and assembling 196 million Illumina reads with Trinity software. Using our reference transcriptome we identified more than 7,000 genes that were differentially expressed within 24 h of acute NaCl stress. The proposed transcriptome can also be used as an excellent tool for biological and bioinformatics in the absence of an available Petunia genome and it is available at the SOL Genomics Network (SGN) http://solgenomics.net. Genes related to regulation of reactive oxygen species, transport, and signal transductions as well as novel and undescribed transcripts were among those differentially expressed in response to salt stress. The candidate genes identified in this study can be applied as markers for breeding or to genetically engineer plants to enhance salt tolerance. Gene Ontology analyses indicated that most of the NaCl damage happened at 24 h inducing genotoxicity, affecting transport and organelles due to the high concentration of Na+ ions. Finally, we report a modification to the library preparation protocol whereby cDNA samples were bar-coded with non-HPLC purified primers, without affecting the quality and quantity of the RNA-seq data. The methodological improvement presented here could substantially reduce the cost of sample preparation for future high-throughput RNA sequencing experiments.

  13. Preliminary High-Throughput Metagenome Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Dusheyko, Serge; Furman, Craig; Pangilinan, Jasmyn; Shapiro, Harris; Tu, Hank

    2007-03-26

    Metagenome data sets present a qualitatively different assembly problem than traditional single-organism whole-genome shotgun (WGS) assembly. The unique aspects of such projects include the presence of a potentially large number of distinct organisms and their representation in the data set at widely different fractions. In addition, multiple closely related strains could be present, which would be difficult to assemble separately. Failure to take these issues into account can result in poor assemblies that either jumble together different strains or which fail to yield useful results. The DOE Joint Genome Institute has sequenced a number of metagenomic projects and plans to considerably increase this number in the coming year. As a result, the JGI has a need for high-throughput tools and techniques for handling metagenome projects. We present the techniques developed to handle metagenome assemblies in a high-throughput environment. This includes a streamlined assembly wrapper, based on the JGI?s in-house WGS assembler, Jazz. It also includes the selection of sensible defaults targeted for metagenome data sets, as well as quality control automation for cleaning up the raw results. While analysis is ongoing, we will discuss preliminary assessments of the quality of the assembly results (http://fames.jgi-psf.org).

  14. Global repeat discovery and estimation of genomic copy number in a large, complex genome using a high-throughput 454 sequence survey

    Directory of Open Access Journals (Sweden)

    Varala Kranthi

    2007-05-01

    Full Text Available Abstract Background Extensive computational and database tools are available to mine genomic and genetic databases for model organisms, but little genomic data is available for many species of ecological or agricultural significance, especially those with large genomes. Genome surveys using conventional sequencing techniques are powerful, particularly for detecting sequences present in many copies per genome. However these methods are time-consuming and have potential drawbacks. High throughput 454 sequencing provides an alternative method by which much information can be gained quickly and cheaply from high-coverage surveys of genomic DNA. Results We sequenced 78 million base-pairs of randomly sheared soybean DNA which passed our quality criteria. Computational analysis of the survey sequences provided global information on the abundant repetitive sequences in soybean. The sequence was used to determine the copy number across regions of large genomic clones or contigs and discover higher-order structures within satellite repeats. We have created an annotated, online database of sequences present in multiple copies in the soybean genome. The low bias of pyrosequencing against repeat sequences is demonstrated by the overall composition of the survey data, which matches well with past estimates of repetitive DNA content obtained by DNA re-association kinetics (Cot analysis. Conclusion This approach provides a potential aid to conventional or shotgun genome assembly, by allowing rapid assessment of copy number in any clone or clone-end sequence. In addition, we show that partial sequencing can provide access to partial protein-coding sequences.

  15. High-throughput sequencing of plasma microRNA in chronic fatigue syndrome/myalgic encephalomyelitis.

    Directory of Open Access Journals (Sweden)

    Ekua W Brenu

    Full Text Available BACKGROUND: MicroRNAs (miRNAs are known to regulate many biological processes and their dysregulation has been associated with a variety of diseases including Chronic Fatigue Syndrome/Myalgic Encephalomyelitis (CFS/ME. The recent discovery of stable and reproducible miRNA in plasma has raised the possibility that circulating miRNAs may serve as novel diagnostic markers. The objective of this study was to determine the role of plasma miRNA in CFS/ME. RESULTS: Using Illumina high-throughput sequencing we identified 19 miRNAs that were differentially expressed in the plasma of CFS/ME patients in comparison to non-fatigued controls. Following RT-qPCR analysis, we were able to confirm the significant up-regulation of three miRNAs (hsa-miR-127-3p, hsa-miR-142-5p and hsa-miR-143-3p in the CFS/ME patients. CONCLUSION: Our study is the first to identify circulating miRNAs from CFS/ME patients and also to confirm three differentially expressed circulating miRNAs in CFS/ME patients, providing a basis for further study to find useful CFS/ME biomarkers.

  16. High-Throughput Sequencing of Plasma MicroRNA in Chronic Fatigue Syndrome/Myalgic Encephalomyelitis

    Science.gov (United States)

    Brenu, Ekua W.; Ashton, Kevin J.; Batovska, Jana; Staines, Donald R.; Marshall-Gradisnik, Sonya M.

    2014-01-01

    Background MicroRNAs (miRNAs) are known to regulate many biological processes and their dysregulation has been associated with a variety of diseases including Chronic Fatigue Syndrome/Myalgic Encephalomyelitis (CFS/ME). The recent discovery of stable and reproducible miRNA in plasma has raised the possibility that circulating miRNAs may serve as novel diagnostic markers. The objective of this study was to determine the role of plasma miRNA in CFS/ME. Results Using Illumina high-throughput sequencing we identified 19 miRNAs that were differentially expressed in the plasma of CFS/ME patients in comparison to non-fatigued controls. Following RT-qPCR analysis, we were able to confirm the significant up-regulation of three miRNAs (hsa-miR-127-3p, hsa-miR-142-5p and hsa-miR-143-3p) in the CFS/ME patients. Conclusion Our study is the first to identify circulating miRNAs from CFS/ME patients and also to confirm three differentially expressed circulating miRNAs in CFS/ME patients, providing a basis for further study to find useful CFS/ME biomarkers. PMID:25238588

  17. High Throughput Facility

    Data.gov (United States)

    Federal Laboratory Consortium — Argonne?s high throughput facility provides highly automated and parallel approaches to material and materials chemistry development. The facility allows scientists...

  18. A comprehensive analysis of in vitro and in vivo genetic fitness of Pseudomonas aeruginosa using high-throughput sequencing of transposon libraries.

    Directory of Open Access Journals (Sweden)

    David Skurnik

    Full Text Available High-throughput sequencing of transposon (Tn libraries created within entire genomes identifies and quantifies the contribution of individual genes and operons to the fitness of organisms in different environments. We used insertion-sequencing (INSeq to analyze the contribution to fitness of all non-essential genes in the chromosome of Pseudomonas aeruginosa strain PA14 based on a library of ∼300,000 individual Tn insertions. In vitro growth in LB provided a baseline for comparison with the survival of the Tn insertion strains following 6 days of colonization of the murine gastrointestinal tract as well as a comparison with Tn-inserts subsequently able to systemically disseminate to the spleen following induction of neutropenia. Sequencing was performed following DNA extraction from the recovered bacteria, digestion with the MmeI restriction enzyme that hydrolyzes DNA 16 bp away from the end of the Tn insert, and fractionation into oligonucleotides of 1,200-1,500 bp that were prepared for high-throughput sequencing. Changes in frequency of Tn inserts into the P. aeruginosa genome were used to quantify in vivo fitness resulting from loss of a gene. 636 genes had <10 sequencing reads in LB, thus defined as unable to grow in this medium. During in vivo infection there were major losses of strains with Tn inserts in almost all known virulence factors, as well as respiration, energy utilization, ion pumps, nutritional genes and prophages. Many new candidates for virulence factors were also identified. There were consistent changes in the recovery of Tn inserts in genes within most operons and Tn insertions into some genes enhanced in vivo fitness. Strikingly, 90% of the non-essential genes were required for in vivo survival following systemic dissemination during neutropenia. These experiments resulted in the identification of the P. aeruginosa strain PA14 genes necessary for optimal survival in the mucosal and systemic environments of a mammalian

  19. Evaluation of a transposase protocol for rapid generation of shotgun high-throughput sequencing libraries from nanogram quantities of DNA.

    Science.gov (United States)

    Marine, Rachel; Polson, Shawn W; Ravel, Jacques; Hatfull, Graham; Russell, Daniel; Sullivan, Matthew; Syed, Fraz; Dumas, Michael; Wommack, K Eric

    2011-11-01

    Construction of DNA fragment libraries for next-generation sequencing can prove challenging, especially for samples with low DNA yield. Protocols devised to circumvent the problems associated with low starting quantities of DNA can result in amplification biases that skew the distribution of genomes in metagenomic data. Moreover, sample throughput can be slow, as current library construction techniques are time-consuming. This study evaluated Nextera, a new transposon-based method that is designed for quick production of DNA fragment libraries from a small quantity of DNA. The sequence read distribution across nine phage genomes in a mock viral assemblage met predictions for six of the least-abundant phages; however, the rank order of the most abundant phages differed slightly from predictions. De novo genome assemblies from Nextera libraries provided long contigs spanning over half of the phage genome; in four cases where full-length genome sequences were available for comparison, consensus sequences were found to match over 99% of the genome with near-perfect identity. Analysis of areas of low and high sequence coverage within phage genomes indicated that GC content may influence coverage of sequences from Nextera libraries. Comparisons of phage genomes prepared using both Nextera and a standard 454 FLX Titanium library preparation protocol suggested that the coverage biases according to GC content observed within the Nextera libraries were largely attributable to bias in the Nextera protocol rather than to the 454 sequencing technology. Nevertheless, given suitable sequence coverage, the Nextera protocol produced high-quality data for genomic studies. For metagenomics analyses, effects of GC amplification bias would need to be considered; however, the library preparation standardization that Nextera provides should benefit comparative metagenomic analyses.

  20. MicroRNA from Moringa oleifera: Identification by High Throughput Sequencing and Their Potential Contribution to Plant Medicinal Value.

    Science.gov (United States)

    Pirrò, Stefano; Zanella, Letizia; Kenzo, Maurice; Montesano, Carla; Minutolo, Antonella; Potestà, Marina; Sobze, Martin Sanou; Canini, Antonella; Cirilli, Marco; Muleo, Rosario; Colizzi, Vittorio; Galgani, Andrea

    2016-01-01

    Moringa oleifera is a widespread plant with substantial nutritional and medicinal value. We postulated that microRNAs (miRNAs), which are endogenous, noncoding small RNAs regulating gene expression at the post-transcriptional level, might contribute to the medicinal properties of plants of this species after ingestion into human body, regulating human gene expression. However, the knowledge is scarce about miRNA in Moringa. Furthermore, in order to test the hypothesis on the pharmacological potential properties of miRNA, we conducted a high-throughput sequencing analysis using the Illumina platform. A total of 31,290,964 raw reads were produced from a library of small RNA isolated from M. oleifera seeds. We identified 94 conserved and two novel miRNAs that were validated by qRT-PCR assays. Results from qRT-PCR trials conducted on the expression of 20 Moringa miRNA showed that are conserved across multiple plant species as determined by their detection in tissue of other common crop plants. In silico analyses predicted target genes for the conserved miRNA that in turn allowed to relate the miRNAs to the regulation of physiological processes. Some of the predicted plant miRNAs have functional homology to their mammalian counterparts and regulated human genes when they were transfected into cell lines. To our knowledge, this is the first report of discovering M. oleifera miRNAs based on high-throughput sequencing and bioinformatics analysis and we provided new insight into a potential cross-species control of human gene expression. The widespread cultivation and consumption of M. oleifera, for nutritional and medicinal purposes, brings humans into close contact with products and extracts of this plant species. The potential for miRNA transfer should be evaluated as one possible mechanism of action to account for beneficial properties of this valuable species.

  1. Temporal dynamics of soil microbial communities under different moisture regimes: high-throughput sequencing and bioinformatics analysis

    Science.gov (United States)

    Semenov, Mikhail; Zhuravleva, Anna; Semenov, Vyacheslav; Yevdokimov, Ilya; Larionova, Alla

    2017-04-01

    Recent climate scenarios predict not only continued global warming but also an increased frequency and intensity of extreme climatic events such as strong changes in temperature and precipitation regimes. Microorganisms are well known to be more sensitive to changes in environmental conditions than to other soil chemical and physical parameters. In this study, we determined the shifts in soil microbial community structure as well as indicative taxa in soils under three moisture regimes using high-throughput Illumina sequencing and range of bioinformatics approaches for the assessment of sequence data. Incubation experiments were performed in soil-filled (Greyic Phaeozems Albic) rhizoboxes with maize and without plants. Three contrasting moisture regimes were being simulated: 1) optimal wetting (OW), a watering 2-3 times per week to maintain soil moisture of 20-25% by weight; 2) periodic wetting (PW), with alternating periods of wetting and drought; and 3) constant insufficient wetting (IW), while soil moisture of 12% by weight was permanently maintained. Sampled fresh soils were homogenized, and the total DNA of three replicates was extracted using the FastDNA® SPIN kit for Soil. DNA replicates were combined in a pooled sample and the DNA was used for PCR with specific primers for the 16S V3 and V4 regions. In order to compare variability between different samples and replicates within a single sample, some DNA replicates treated separately. The products were purified and submitted to Illumina MiSeq sequencing. Sequence data were evaluated by alpha-diversity (Chao1 and Shannon H' diversity indexes), beta-diversity (UniFrac and Bray-Curtis dissimilarity), heatmap, tagcloud, and plot-bar analyses using the MiSeq Reporter Metagenomics Workflow and R packages (phyloseq, vegan, tagcloud). Shannon index varied in a rather narrow range (4.4-4.9) with the lowest values for microbial communities under PW treatment. Chao1 index varied from 385 to 480, being a more flexible

  2. High-throughput sequencing of the B-cell receptor in African Burkitt lymphoma reveals clues to pathogenesis.

    Science.gov (United States)

    Lombardo, Katharine A; Coffey, David G; Morales, Alicia J; Carlson, Christopher S; Towlerton, Andrea M H; Gerdts, Sarah E; Nkrumah, Francis K; Neequaye, Janet; Biggar, Robert J; Orem, Jackson; Casper, Corey; Mbulaiteye, Sam M; Bhatia, Kishor G; Warren, Edus H

    2017-03-28

    Burkitt lymphoma (BL), the most common pediatric cancer in sub-Saharan Africa, is a malignancy of antigen-experienced B lymphocytes. High-throughput sequencing (HTS) of the immunoglobulin heavy ( IGH ) and light chain ( IGK / IGL ) loci was performed on genomic DNA from 51 primary BL tumors: 19 from Uganda and 32 from Ghana. Reverse transcription polymerase chain reaction analysis and tumor RNA sequencing (RNAseq) was performed on the Ugandan tumors to confirm and extend the findings from the HTS of tumor DNA. Clonal IGH and IGK / IGL rearrangements were identified in 41 and 46 tumors, respectively. Evidence for rearrangement of the second IGH allele was observed in only 6 of 41 tumor samples with a clonal IGH rearrangement, suggesting that the normal process of biallelic IGHD to IGHJ diversity-joining (DJ) rearrangement is often disrupted in BL progenitor cells. Most tumors, including those with a sole dominant, nonexpressed DJ rearrangement, contained many IGH and IGK / IGL sequences that differed from the dominant rearrangement by < 10 nucleotides, suggesting that the target of ongoing mutagenesis of these loci in BL tumor cells is not limited to expressed alleles. IGHV usage in both BL tumor cohorts revealed enrichment for IGHV genes that are infrequently used in memory B cells from healthy subjects. Analysis of publicly available DNA sequencing and RNAseq data revealed that these same IGHV genes were overrepresented in dominant tumor-associated IGH rearrangements in several independent BL tumor cohorts. These data suggest that BL derives from an abnormal B-cell progenitor and that aberrant mutational processes are active on the immunoglobulin loci in BL cells.

  3. Genotyping by PCR and High-Throughput Sequencing of Commercial Probiotic Products Reveals Composition Biases.

    Directory of Open Access Journals (Sweden)

    Wesley Morovic

    2016-11-01

    Full Text Available Recent advances in microbiome research have brought renewed focus on beneficial bacteria, many of which are available in food and dietary supplements. Although probiotics have historically been defined as microorganisms that convey health benefits when ingested in sufficient viable amounts, this description now includes the stipulation well defined strains, encompassing definitive taxonomy for consumer consideration and regulatory oversight. Here, we evaluated 52 commercial dietary supplements covering a range of labeled species, and determined their content using plate counting, targeted genotyping. Additionally, strain identities were assessed using methods recently published by the United States Pharmacopeial Convention. We also determined the relative abundance of individual bacteria by high-throughput sequencing (HTS of the 16S rRNA sequence using paired-end 2x250bp Illumina MiSeq technology. Using multiple methods, we tested the hypothesis that products do contain the quantitative amount of labeled bacteria, and qualitative list of labeled microbial species. We found that 17 samples (33% were below label claim for CFU prior to their expiration dates. A multiplexed-PCR scheme showed that only 30/52 (58% of the products contained a correctly labeled classification, with issues encompassing incorrect taxonomy, missing species and un-labeled species. The HTS revealed that many blended products consisted predominantly of Lactobacillus acidophilus and Bifidobacterium animalis subsp. lactis. These results highlight the need for reliable methods to qualitatively determine the correct taxonomy and quantitatively ascertain the relative amounts of mixed microbial populations in commercial probiotic products.

  4. High-throughput sequencing of RNA silencing-associated small RNAs in olive (Olea europaea L..

    Directory of Open Access Journals (Sweden)

    Livia Donaire

    Full Text Available Small RNAs (sRNAs of 20 to 25 nucleotides (nt in length maintain genome integrity and control gene expression in a multitude of developmental and physiological processes. Despite RNA silencing has been primarily studied in model plants, the advent of high-throughput sequencing technologies has enabled profiling of the sRNA component of more than 40 plant species. Here, we used deep sequencing and molecular methods to report the first inventory of sRNAs in olive (Olea europaea L.. sRNA libraries prepared from juvenile and adult shoots revealed that the 24-nt class dominates the sRNA transcriptome and atypically accumulates to levels never seen in other plant species, suggesting an active role of heterochromatin silencing in the maintenance and integrity of its large genome. A total of 18 known miRNA families were identified in the libraries. Also, 5 other sRNAs derived from potential hairpin-like precursors remain as plausible miRNA candidates. RNA blots confirmed miRNA expression and suggested tissue- and/or developmental-specific expression patterns. Target mRNAs of conserved miRNAs were computationally predicted among the olive cDNA collection and experimentally validated through endonucleolytic cleavage assays. Finally, we use expression data to uncover genetic components of the miR156, miR172 and miR390/TAS3-derived trans-acting small interfering RNA (tasiRNA regulatory nodes, suggesting that these interactive networks controlling developmental transitions are fully operational in olive.

  5. Universal and blocking primer mismatches limit the use of high-throughput DNA sequencing for the quantitative metabarcoding of arthropods.

    Science.gov (United States)

    Piñol, J; Mir, G; Gomez-Polo, P; Agustí, N

    2015-07-01

    The quantification of the biological diversity in environmental samples using high-throughput DNA sequencing is hindered by the PCR bias caused by variable primer-template mismatches of the individual species. In some dietary studies, there is the added problem that samples are enriched with predator DNA, so often a predator-specific blocking oligonucleotide is used to alleviate the problem. However, specific blocking oligonucleotides could coblock nontarget species to some degree. Here, we accurately estimate the extent of the PCR biases induced by universal and blocking primers on a mock community prepared with DNA of twelve species of terrestrial arthropods. We also compare universal and blocking primer biases with those induced by variable annealing temperature and number of PCR cycles. The results show that reads of all species were recovered after PCR enrichment at our control conditions (no blocking oligonucleotide, 45 °C annealing temperature and 40 cycles) and high-throughput sequencing. They also show that the four factors considered biased the final proportions of the species to some degree. Among these factors, the number of primer-template mismatches of each species had a disproportionate effect (up to five orders of magnitude) on the amplification efficiency. In particular, the number of primer-template mismatches explained most of the variation (~3/4) in the amplification efficiency of the species. The effect of blocking oligonucleotide concentration on nontarget species relative abundance was also significant, but less important (below one order of magnitude). Considering the results reported here, the quantitative potential of the technique is limited, and only qualitative results (the species list) are reliable, at least when targeting the barcoding COI region. © 2014 John Wiley & Sons Ltd.

  6. Metagenomic analysis and functional characterization of the biogas microbiome using high throughput shotgun sequencing and a novel binning strategy.

    Science.gov (United States)

    Campanaro, Stefano; Treu, Laura; Kougias, Panagiotis G; De Francisci, Davide; Valle, Giorgio; Angelidaki, Irini

    2016-01-01

    Biogas production is an economically attractive technology that has gained momentum worldwide over the past years. Biogas is produced by a biologically mediated process, widely known as "anaerobic digestion." This process is performed by a specialized and complex microbial community, in which different members have distinct roles in the establishment of a collective organization. Deciphering the complex microbial community engaged in this process is interesting both for unraveling the network of bacterial interactions and for applicability potential to the derived knowledge. In this study, we dissect the bioma involved in anaerobic digestion by means of high throughput Illumina sequencing (~51 gigabases of sequence data), disclosing nearly one million genes and extracting 106 microbial genomes by a novel strategy combining two binning processes. Microbial phylogeny and putative taxonomy performed using >400 proteins revealed that the biogas community is a trove of new species. A new approach based on functional properties as per network representation was developed to assign roles to the microbial species. The organization of the anaerobic digestion microbiome is resembled by a funnel concept, in which the microbial consortium presents a progressive functional specialization while reaching the final step of the process (i.e., methanogenesis). Key microbial genomes encoding enzymes involved in specific metabolic pathways, such as carbohydrates utilization, fatty acids degradation, amino acids fermentation, and syntrophic acetate oxidation, were identified. Additionally, the analysis identified a new uncultured archaeon that was putatively related to Methanomassiliicoccales but surprisingly having a methylotrophic methanogenic pathway. This study is a pioneer research on the phylogenetic and functional characterization of the microbial community populating biogas reactors. By applying for the first time high-throughput sequencing and a novel binning strategy, the

  7. Rhea: a transparent and modular R pipeline for microbial profiling based on 16S rRNA gene amplicons.

    Science.gov (United States)

    Lagkouvardos, Ilias; Fischer, Sandra; Kumar, Neeraj; Clavel, Thomas

    2017-01-01

    The importance of 16S rRNA gene amplicon profiles for understanding the influence of microbes in a variety of environments coupled with the steep reduction in sequencing costs led to a surge of microbial sequencing projects. The expanding crowd of scientists and clinicians wanting to make use of sequencing datasets can choose among a range of multipurpose software platforms, the use of which can be intimidating for non-expert users. Among available pipeline options for high-throughput 16S rRNA gene analysis, the R programming language and software environment for statistical computing stands out for its power and increased flexibility, and the possibility to adhere to most recent best practices and to adjust to individual project needs. Here we present the Rhea pipeline, a set of R scripts that encode a series of well-documented choices for the downstream analysis of Operational Taxonomic Units (OTUs) tables, including normalization steps, alpha - and beta -diversity analysis, taxonomic composition, statistical comparisons, and calculation of correlations. Rhea is primarily a straightforward starting point for beginners, but can also be a framework for advanced users who can modify and expand the tool. As the community standards evolve, Rhea will adapt to always represent the current state-of-the-art in microbial profiles analysis in the clear and comprehensive way allowed by the R language. Rhea scripts and documentation are freely available at https://lagkouvardos.github.io/Rhea.

  8. Rhea: a transparent and modular R pipeline for microbial profiling based on 16S rRNA gene amplicons

    Directory of Open Access Journals (Sweden)

    Ilias Lagkouvardos

    2017-01-01

    Full Text Available The importance of 16S rRNA gene amplicon profiles for understanding the influence of microbes in a variety of environments coupled with the steep reduction in sequencing costs led to a surge of microbial sequencing projects. The expanding crowd of scientists and clinicians wanting to make use of sequencing datasets can choose among a range of multipurpose software platforms, the use of which can be intimidating for non-expert users. Among available pipeline options for high-throughput 16S rRNA gene analysis, the R programming language and software environment for statistical computing stands out for its power and increased flexibility, and the possibility to adhere to most recent best practices and to adjust to individual project needs. Here we present the Rhea pipeline, a set of R scripts that encode a series of well-documented choices for the downstream analysis of Operational Taxonomic Units (OTUs tables, including normalization steps, alpha- and beta-diversity analysis, taxonomic composition, statistical comparisons, and calculation of correlations. Rhea is primarily a straightforward starting point for beginners, but can also be a framework for advanced users who can modify and expand the tool. As the community standards evolve, Rhea will adapt to always represent the current state-of-the-art in microbial profiles analysis in the clear and comprehensive way allowed by the R language. Rhea scripts and documentation are freely available at https://lagkouvardos.github.io/Rhea.

  9. High-throughput sequencing of microbial community diversity in soil, grapes, leaves, grape juice and wine of grapevine from China.

    Science.gov (United States)

    Wei, Yu-Jie; Wu, Yun; Yan, Yin-Zhuo; Zou, Wan; Xue, Jie; Ma, Wen-Rui; Wang, Wei; Tian, Ge; Wang, Li-Ye

    2018-01-01

    In this study Illumina MiSeq was performed to investigate microbial diversity in soil, leaves, grape, grape juice and wine. A total of 1,043,102 fungal Internal Transcribed Spacer (ITS) reads and 2,422,188 high quality bacterial 16S rDNA sequences were used for taxonomic classification, revealed five fungal and eight bacterial phyla. At the genus level, the dominant fungi were Ascomycota, Sordariales, Tetracladium and Geomyces in soil, Aureobasidium and Pleosporaceae in grapes leaves, Aureobasidium in grape and grape juice. The dominant bacteria were Kaistobacter, Arthrobacter, Skermanella and Sphingomonas in soil, Pseudomonas, Acinetobacter and Kaistobacter in grape and grapes leaves, and Oenococcus in grape juice and wine. Principal coordinate analysis showed structural separation between the composition of fungi and bacteria in all samples. This is the first study to understand microbiome population in soil, grape, grapes leaves, grape juice and wine in Xinjiang through High-throughput Sequencing and identify microorganisms like Saccharomyces cerevisiae and Oenococcus spp. that may contribute to the quality and flavor of wine.

  10. High-throughput sequencing of microbial community diversity in soil, grapes, leaves, grape juice and wine of grapevine from China

    Science.gov (United States)

    Yan, Yin-zhuo; Zou, Wan; Ma, Wen-rui; Wang, Wei; Tian, Ge; Wang, Li-ye

    2018-01-01

    In this study Illumina MiSeq was performed to investigate microbial diversity in soil, leaves, grape, grape juice and wine. A total of 1,043,102 fungal Internal Transcribed Spacer (ITS) reads and 2,422,188 high quality bacterial 16S rDNA sequences were used for taxonomic classification, revealed five fungal and eight bacterial phyla. At the genus level, the dominant fungi were Ascomycota, Sordariales, Tetracladium and Geomyces in soil, Aureobasidium and Pleosporaceae in grapes leaves, Aureobasidium in grape and grape juice. The dominant bacteria were Kaistobacter, Arthrobacter, Skermanella and Sphingomonas in soil, Pseudomonas, Acinetobacter and Kaistobacter in grape and grapes leaves, and Oenococcus in grape juice and wine. Principal coordinate analysis showed structural separation between the composition of fungi and bacteria in all samples. This is the first study to understand microbiome population in soil, grape, grapes leaves, grape juice and wine in Xinjiang through High-throughput Sequencing and identify microorganisms like Saccharomyces cerevisiae and Oenococcus spp. that may contribute to the quality and flavor of wine. PMID:29565999

  11. High-throughput screening of effective siRNAs using luciferase-linked chimeric mRNA.

    Directory of Open Access Journals (Sweden)

    Shen Pang

    Full Text Available The use of siRNAs to knock down gene expression can potentially be an approach to treat various diseases. To avoid siRNA toxicity the less transcriptionally active H1 pol III promoter, rather than the U6 promoter, was proposed for siRNA expression. To identify highly efficacious siRNA sequences, extensive screening is required, since current computer programs may not render ideal results. Here, we used CCR5 gene silencing as a model to investigate a rapid and efficient screening approach. We constructed a chimeric luciferase-CCR5 gene for high-throughput screening of siRNA libraries. After screening approximately 900 shRNA clones, 12 siRNA sequences were identified. Sequence analysis demonstrated that most (11 of the 12 sequences of these siRNAs did not match those identified by available siRNA prediction algorithms. Significant inhibition of CCR5 in a T-lymphocyte cell line and primary T cells by these identified siRNAs was confirmed using the siRNA lentiviral vectors to infect these cells. The inhibition of CCR5 expression significantly protected cells from R5 HIV-1JRCSF infection. These results indicated that the high-throughput screening method allows efficient identification of siRNA sequences to inhibit the target genes at low levels of expression.

  12. Algorithm for post-clustering curation of DNA amplicon data yields reliable biodiversity estimates

    DEFF Research Database (Denmark)

    Froslev, Tobias Guldberg; Kjoller, Rasmus; Bruun, Hans Henrik

    2017-01-01

    by high-throughput sequencing of amplified marker genes. LULU identifies errors by combining sequence similarity and co-occurrence patterns. To validate the LULU method, we use a unique data set of high quality survey data of vascular plants paired with plant ITS2 metabarcoding data of DNA extracted from...

  13. High-throughput screening of carbohydrate-degrading enzymes using novel insoluble chromogenic substrate assay kits

    DEFF Research Database (Denmark)

    Schückel, Julia; Kracun, Stjepan Kresimir; Willats, William George Tycho

    2016-01-01

    for this is that advances in genome and transcriptome sequencing, together with associated bioinformatics tools allow for rapid identification of candidate CAZymes, but technology for determining an enzyme's biochemical characteristics has advanced more slowly. To address this technology gap, a novel high-throughput assay...... CPH and ICB substrates are provided in a 96-well high-throughput assay system. The CPH substrates can be made in four different colors, enabling them to be mixed together and thus increasing assay throughput. The protocol describes a 96-well plate assay and illustrates how this assay can be used...... for screening the activities of enzymes, enzyme cocktails, and broths....

  14. Applications of High Throughput Nucleotide Sequencing

    DEFF Research Database (Denmark)

    Waage, Johannes Eichler

    equally large demands in data handling, analysis and interpretation, perhaps defining the modern challenge of the computational biologist of the post-genomic era. The first part of this thesis consists of a general introduction to the history, common terms and challenges of next generation sequencing......-sequencing, a study of the effects on alternative RNA splicing of KO of the nonsense mediated RNA decay system in Mus, using digital gene expression and a custom-built exon-exon junction mapping pipeline is presented (article I). Evolved from this work, a Bioconductor package, spliceR, for classifying alternative...

  15. Isolation and characterization of antigen-specific alpaca (Lama pacos) VHH antibodies by biopanning followed by high-throughput sequencing.

    Science.gov (United States)

    Miyazaki, Nobuo; Kiyose, Norihiko; Akazawa, Yoko; Takashima, Mizuki; Hagihara, Yosihisa; Inoue, Naokazu; Matsuda, Tomonari; Ogawa, Ryu; Inoue, Seiya; Ito, Yuji

    2015-09-01

    The antigen-binding domain of camelid dimeric heavy chain antibodies, known as VHH or Nanobody, has much potential in pharmaceutical and industrial applications. To establish the isolation process of antigen-specific VHH, a VHH phage library was constructed with a diversity of 8.4 × 10(7) from cDNA of peripheral blood mononuclear cells of an alpaca (Lama pacos) immunized with a fragment of IZUMO1 (IZUMO1PFF) as a model antigen. By conventional biopanning, 13 antigen-specific VHHs were isolated. The amino acid sequences of these VHHs, designated as N-group VHHs, were very similar to each other (>93% identity). To find more diverse antibodies, we performed high-throughput sequencing (HTS) of VHH genes. By comparing the frequencies of each sequence between before and after biopanning, we found the sequences whose frequencies were increased by biopanning. The top 100 sequences of them were supplied for phylogenic tree analysis. In total 75% of them belonged to N-group VHHs, but the other were phylogenically apart from N-group VHHs (Non N-group). Two of three VHHs selected from non N-group VHHs showed sufficient antigen binding ability. These results suggested that biopanning followed by HTS provided a useful method for finding minor and diverse antigen-specific clones that could not be identified by conventional biopanning. © The Authors 2015. Published by Oxford University Press on behalf of the Japanese Biochemical Society. All rights reserved.

  16. Association Study of Gut Flora in Coronary Heart Disease through High-Throughput Sequencing.

    Science.gov (United States)

    Cui, Li; Zhao, Tingting; Hu, Haibing; Zhang, Wen; Hua, Xiuguo

    2017-01-01

    Objectives. We aimed to explore the impact of gut microbiota in coronary heart disease (CHD) patients through high-throughput sequencing. Methods. A total of 29 CHD in-hospital patients and 35 healthy volunteers as controls were included. Nucleic acids were extracted from fecal samples, followed by α diversity and principal coordinate analysis (PCoA). Based on unweighted UniFrac distance matrices, unweighted-pair group method with arithmetic mean (UPGMA) trees were created. Results. After data optimization, an average of 121312 ± 19293 reads in CHD patients and 234372 ± 108725 reads in controls was obtained. Reads corresponding to 38 phyla, 90 classes, and 584 genera were detected in CHD patients, whereas 40 phyla, 99 classes, and 775 genera were detected in controls. The proportion of phylum Bacteroidetes (56.12%) was lower and that of phylum Firmicutes was higher (37.06%) in CHD patients than those in the controls (60.92% and 32.06%, P UPGMA tree analysis showed that there were significant differences of gut microbial compositions between the two groups. Conclusion. The diversity and compositions of gut flora were different between CHD patients and healthy controls. The incidence of CHD might be associated with the alteration of gut microbiota.

  17. Association Study of Gut Flora in Coronary Heart Disease through High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Li Cui

    2017-01-01

    Full Text Available Objectives. We aimed to explore the impact of gut microbiota in coronary heart disease (CHD patients through high-throughput sequencing. Methods. A total of 29 CHD in-hospital patients and 35 healthy volunteers as controls were included. Nucleic acids were extracted from fecal samples, followed by α diversity and principal coordinate analysis (PCoA. Based on unweighted UniFrac distance matrices, unweighted-pair group method with arithmetic mean (UPGMA trees were created. Results. After data optimization, an average of 121312±19293 reads in CHD patients and 234372±108725 reads in controls was obtained. Reads corresponding to 38 phyla, 90 classes, and 584 genera were detected in CHD patients, whereas 40 phyla, 99 classes, and 775 genera were detected in controls. The proportion of phylum Bacteroidetes (56.12% was lower and that of phylum Firmicutes was higher (37.06% in CHD patients than those in the controls (60.92% and 32.06%, P<0.05. PCoA and UPGMA tree analysis showed that there were significant differences of gut microbial compositions between the two groups. Conclusion. The diversity and compositions of gut flora were different between CHD patients and healthy controls. The incidence of CHD might be associated with the alteration of gut microbiota.

  18. High-throughput continuous cryopump

    International Nuclear Information System (INIS)

    Foster, C.A.

    1986-01-01

    A cryopump with a unique method of regeneration which allows continuous operation at high throughput has been constructed and tested. Deuterium was pumped continuously at a throughput of 30 Torr.L/s at a speed of 2000 L/s and a compression ratio of 200. Argon was pumped at a throughput of 60 Torr.L/s at a speed of 1275 L/s. To produce continuous operation of the pump, a method of regeneration that does not thermally cycle the pump is employed. A small chamber (the ''snail'') passes over the pumping surface and removes the frost from it either by mechanical action with a scraper or by local heating. The material removed is topologically in a secondary vacuum system with low conductance into the primary vacuum; thus, the exhaust can be pumped at pressures up to an effective compression ratio determined by the ratio of the pumping speed to the leakage conductance of the snail. The pump, which is all-metal-sealed and dry and which regenerates every 60 s, would be an ideal system for pumping tritium. Potential fusion applications are for mpmp limiters, for repeating pneumatic pellet injection lines, and for the centrifuge pellet injector spin tank, all of which will require pumping tritium at high throughput. Industrial applications requiring ultraclean pumping of corrosive gases at high throughput, such as the reactive ion etch semiconductor process, may also be feasible

  19. A need for standardization in drinking water analysis – an investigation of DNA extraction procedure, primer choice and detection limit of 16S rRNA amplicon sequencing

    DEFF Research Database (Denmark)

    Brandt, Jakob; Nielsen, Per Halkjær; Albertsen, Mads

    have been made to illuminate the effects specifically related to bacterial communities in drinking water. In this study, we investigated the impact of the DNA extraction and primer choice on the observed community structure, and we also estimated the detection limit of the 16S rRNA amplicon sequencing...

  20. High-throughput sequencing of natively paired antibody chains provides evidence for original antigenic sin shaping the antibody response to influenza vaccination.

    Science.gov (United States)

    Tan, Yann-Chong; Blum, Lisa K; Kongpachith, Sarah; Ju, Chia-Hsin; Cai, Xiaoyong; Lindstrom, Tamsin M; Sokolove, Jeremy; Robinson, William H

    2014-03-01

    We developed a DNA barcoding method to enable high-throughput sequencing of the cognate heavy- and light-chain pairs of the antibodies expressed by individual B cells. We used this approach to elucidate the plasmablast antibody response to influenza vaccination. We show that >75% of the rationally selected plasmablast antibodies bind and neutralize influenza, and that antibodies from clonal families, defined by sharing both heavy-chain VJ and light-chain VJ sequence usage, do so most effectively. Vaccine-induced heavy-chain VJ regions contained on average >20 nucleotide mutations as compared to their predicted germline gene sequences, and some vaccine-induced antibodies exhibited higher binding affinities for hemagglutinins derived from prior years' seasonal influenza as compared to their affinities for the immunization strains. Our results show that influenza vaccination induces the recall of memory B cells that express antibodies that previously underwent affinity maturation against prior years' seasonal influenza, suggesting that 'original antigenic sin' shapes the antibody response to influenza vaccination. Published by Elsevier Inc.

  1. High-throughput computational methods and software for quantitative trait locus (QTL) mapping

    NARCIS (Netherlands)

    Arends, Danny

    2014-01-01

    De afgelopen jaren zijn vele nieuwe technologieen zoals Tiling arrays en High throughput DNA sequencing een belangrijke rol gaan spelen binnen het onderzoeksveld van de systeem genetica. Voor onderzoekers is het extreem belangrijk om te begrijpen dat deze methodes hun manier van werken zullen gaan

  2. Application of High-Throughput Next-Generation Sequencing for HLA Typing on Buccal Extracted DNA: Results from over 10,000 Donor Recruitment Samples.

    Science.gov (United States)

    Yin, Yuxin; Lan, James H; Nguyen, David; Valenzuela, Nicole; Takemura, Ping; Bolon, Yung-Tsi; Springer, Brianna; Saito, Katsuyuki; Zheng, Ying; Hague, Tim; Pasztor, Agnes; Horvath, Gyorgy; Rigo, Krisztina; Reed, Elaine F; Zhang, Qiuheng

    2016-01-01

    Unambiguous HLA typing is important in hematopoietic stem cell transplantation (HSCT), HLA disease association studies, and solid organ transplantation. However, current molecular typing methods only interrogate the antigen recognition site (ARS) of HLA genes, resulting in many cis-trans ambiguities that require additional typing methods to resolve. Here we report high-resolution HLA typing of 10,063 National Marrow Donor Program (NMDP) registry donors using long-range PCR by next generation sequencing (NGS) approach on buccal swab DNA. Multiplex long-range PCR primers amplified the full-length of HLA class I genes (A, B, C) from promotor to 3' UTR. Class II genes (DRB1, DQB1) were amplified from exon 2 through part of exon 4. PCR amplicons were pooled and sheared using Covaris fragmentation. Library preparation was performed using the Illumina TruSeq Nano kit on the Beckman FX automated platform. Each sample was tagged with a unique barcode, followed by 2×250 bp paired-end sequencing on the Illumina MiSeq. HLA typing was assigned using Omixon Twin software that combines two independent computational algorithms to ensure high confidence in allele calling. Consensus sequence and typing results were reported in Histoimmunogenetics Markup Language (HML) format. All homozygous alleles were confirmed by Luminex SSO typing and exon novelties were confirmed by Sanger sequencing. Using this automated workflow, over 10,063 NMDP registry donors were successfully typed under high-resolution by NGS. Despite known challenges of nucleic acid degradation and low DNA concentration commonly associated with buccal-based specimens, 97.8% of samples were successfully amplified using long-range PCR. Among these, 98.2% were successfully reported by NGS, with an accuracy rate of 99.84% in an independent blind Quality Control audit performed by the NDMP. In this study, NGS-HLA typing identified 23 null alleles (0.023%), 92 rare alleles (0.091%) and 42 exon novelties (0.042%). Long

  3. Application of High-Throughput Next-Generation Sequencing for HLA Typing on Buccal Extracted DNA: Results from over 10,000 Donor Recruitment Samples.

    Directory of Open Access Journals (Sweden)

    Yuxin Yin

    Full Text Available Unambiguous HLA typing is important in hematopoietic stem cell transplantation (HSCT, HLA disease association studies, and solid organ transplantation. However, current molecular typing methods only interrogate the antigen recognition site (ARS of HLA genes, resulting in many cis-trans ambiguities that require additional typing methods to resolve. Here we report high-resolution HLA typing of 10,063 National Marrow Donor Program (NMDP registry donors using long-range PCR by next generation sequencing (NGS approach on buccal swab DNA.Multiplex long-range PCR primers amplified the full-length of HLA class I genes (A, B, C from promotor to 3' UTR. Class II genes (DRB1, DQB1 were amplified from exon 2 through part of exon 4. PCR amplicons were pooled and sheared using Covaris fragmentation. Library preparation was performed using the Illumina TruSeq Nano kit on the Beckman FX automated platform. Each sample was tagged with a unique barcode, followed by 2×250 bp paired-end sequencing on the Illumina MiSeq. HLA typing was assigned using Omixon Twin software that combines two independent computational algorithms to ensure high confidence in allele calling. Consensus sequence and typing results were reported in Histoimmunogenetics Markup Language (HML format. All homozygous alleles were confirmed by Luminex SSO typing and exon novelties were confirmed by Sanger sequencing.Using this automated workflow, over 10,063 NMDP registry donors were successfully typed under high-resolution by NGS. Despite known challenges of nucleic acid degradation and low DNA concentration commonly associated with buccal-based specimens, 97.8% of samples were successfully amplified using long-range PCR. Among these, 98.2% were successfully reported by NGS, with an accuracy rate of 99.84% in an independent blind Quality Control audit performed by the NDMP. In this study, NGS-HLA typing identified 23 null alleles (0.023%, 92 rare alleles (0.091% and 42 exon novelties (0.042%.Long

  4. HTP-OligoDesigner: An Online Primer Design Tool for High-Throughput Gene Cloning and Site-Directed Mutagenesis.

    Science.gov (United States)

    Camilo, Cesar M; Lima, Gustavo M A; Maluf, Fernando V; Guido, Rafael V C; Polikarpov, Igor

    2016-01-01

    Following burgeoning genomic and transcriptomic sequencing data, biochemical and molecular biology groups worldwide are implementing high-throughput cloning and mutagenesis facilities in order to obtain a large number of soluble proteins for structural and functional characterization. Since manual primer design can be a time-consuming and error-generating step, particularly when working with hundreds of targets, the automation of primer design process becomes highly desirable. HTP-OligoDesigner was created to provide the scientific community with a simple and intuitive online primer design tool for both laboratory-scale and high-throughput projects of sequence-independent gene cloning and site-directed mutagenesis and a Tm calculator for quick queries.

  5. Use of genotyping by sequencing data to develop a high-throughput and multifunctional SNP panel for conservation applications in Pacific lamprey.

    Science.gov (United States)

    Hess, Jon E; Campbell, Nathan R; Docker, Margaret F; Baker, Cyndi; Jackson, Aaron; Lampman, Ralph; McIlraith, Brian; Moser, Mary L; Statler, David P; Young, William P; Wildbill, Andrew J; Narum, Shawn R

    2015-01-01

    Next-generation sequencing data can be mined for highly informative single nucleotide polymorphisms (SNPs) to develop high-throughput genomic assays for nonmodel organisms. However, choosing a set of SNPs to address a variety of objectives can be difficult because SNPs are often not equally informative. We developed an optimal combination of 96 high-throughput SNP assays from a total of 4439 SNPs identified in a previous study of Pacific lamprey (Entosphenus tridentatus) and used them to address four disparate objectives: parentage analysis, species identification and characterization of neutral and adaptive variation. Nine of these SNPs are FST outliers, and five of these outliers are localized within genes and significantly associated with geography, run-timing and dwarf life history. Two of the 96 SNPs were diagnostic for two other lamprey species that were morphologically indistinguishable at early larval stages and were sympatric in the Pacific Northwest. The majority (85) of SNPs in the panel were highly informative for parentage analysis, that is, putatively neutral with high minor allele frequency across the species' range. Results from three case studies are presented to demonstrate the broad utility of this panel of SNP markers in this species. As Pacific lamprey populations are undergoing rapid decline, these SNPs provide an important resource to address critical uncertainties associated with the conservation and recovery of this imperiled species. © 2014 John Wiley & Sons Ltd.

  6. High-throughput characterization methods for lithium batteries

    Directory of Open Access Journals (Sweden)

    Yingchun Lyu

    2017-09-01

    Full Text Available The development of high-performance lithium ion batteries requires the discovery of new materials and the optimization of key components. By contrast with traditional one-by-one method, high-throughput method can synthesize and characterize a large number of compositionally varying samples, which is able to accelerate the pace of discovery, development and optimization process of materials. Because of rapid progress in thin film and automatic control technologies, thousands of compounds with different compositions could be synthesized rapidly right now, even in a single experiment. However, the lack of rapid or combinatorial characterization technologies to match with high-throughput synthesis methods, limit the application of high-throughput technology. Here, we review a series of representative high-throughput characterization methods used in lithium batteries, including high-throughput structural and electrochemical characterization methods and rapid measuring technologies based on synchrotron light sources.

  7. HTSSIP: An R package for analysis of high throughput sequencing data from nucleic acid stable isotope probing (SIP experiments.

    Directory of Open Access Journals (Sweden)

    Nicholas D Youngblut

    Full Text Available Combining high throughput sequencing with stable isotope probing (HTS-SIP is a powerful method for mapping in situ metabolic processes to thousands of microbial taxa. However, accurately mapping metabolic processes to taxa is complex and challenging. Multiple HTS-SIP data analysis methods have been developed, including high-resolution stable isotope probing (HR-SIP, multi-window high-resolution stable isotope probing (MW-HR-SIP, quantitative stable isotope probing (qSIP, and ΔBD. Currently, there is no publicly available software designed specifically for analyzing HTS-SIP data. To address this shortfall, we have developed the HTSSIP R package, an open-source, cross-platform toolset for conducting HTS-SIP analyses in a straightforward and easily reproducible manner. The HTSSIP package, along with full documentation and examples, is available from CRAN at https://cran.r-project.org/web/packages/HTSSIP/index.html and Github at https://github.com/buckleylab/HTSSIP.

  8. A high-throughput method to detect RNA profiling by integration of RT-MLPA with next generation sequencing technology.

    Science.gov (United States)

    Wang, Jing; Yang, Xue; Chen, Haofeng; Wang, Xuewei; Wang, Xiangyu; Fang, Yi; Jia, Zhenyu; Gao, Jidong

    2017-07-11

    RNA in formalin-fixed and paraffin-embedded (FFPE) tissues provides large amount of information indicating disease stages, histological tumor types and grades, as well as clinical outcomes. However, Detection of RNA expression levels in formalin-fixed and paraffin-embedded samples is extremely difficult due to poor RNA quality. Here we developed a high-throughput method, Reverse Transcription-Multiple Ligation-dependent Probe Sequencing (RT-MLPSeq), to determine expression levels of multiple transcripts in FFPE samples. By combining Reverse Transcription-Multiple Ligation-dependent Amplification method and next generation sequencing technology, RT-MLPSeq overcomes the limit of probe length in multiplex ligation-dependent probe amplification assay and thus could detect expression levels of transcripts without quantitative limitations. We proved that different RT-MLPSeq probes targeting on the same transcripts have highly consistent results and the starting RNA/cDNA input could be as little as 1 ng. RT-MLPSeq also presented consistent relative RNA levels of selected 13 genes with reverse transcription quantitative PCR. Finally, we demonstrated the application of the new RT-MLPSeq method by measuring the mRNA expression levels of 21 genes which can be used for accurate calculation of the breast cancer recurrence score - an index that has been widely used for managing breast cancer patients.

  9. Seasonal diversity and dynamics of haptophytes in the Skagerrak, Norway, explored by high-throughput sequencing

    Science.gov (United States)

    Egge, Elianne Sirnæs; Johannessen, Torill Vik; Andersen, Tom; Eikrem, Wenche; Bittner, Lucie; Larsen, Aud; Sandaa, Ruth-Anne; Edvardsen, Bente

    2015-01-01

    Microalgae in the division Haptophyta play key roles in the marine ecosystem and in global biogeochemical processes. Despite their ecological importance, knowledge on seasonal dynamics, community composition and abundance at the species level is limited due to their small cell size and few morphological features visible under the light microscope. Here, we present unique data on haptophyte seasonal diversity and dynamics from two annual cycles, with the taxonomic resolution and sampling depth obtained with high-throughput sequencing. From outer Oslofjorden, S Norway, nano- and picoplanktonic samples were collected monthly for 2 years, and the haptophytes targeted by amplification of RNA/cDNA with Haptophyta-specific 18S rDNA V4 primers. We obtained 156 operational taxonomic units (OTUs), from c. 400.000 454 pyrosequencing reads, after rigorous bioinformatic filtering and clustering at 99.5%. Most OTUs represented uncultured and/or not yet 18S rDNA-sequenced species. Haptophyte OTU richness and community composition exhibited high temporal variation and significant yearly periodicity. Richness was highest in September–October (autumn) and lowest in April–May (spring). Some taxa were detected all year, such as Chrysochromulina simplex, Emiliania huxleyi and Phaeocystis cordata, whereas most calcifying coccolithophores only appeared from summer to early winter. We also revealed the seasonal dynamics of OTUs representing putative novel classes (clades HAP-3–5) or orders (clades D, E, F). Season, light and temperature accounted for 29% of the variation in OTU composition. Residual variation may be related to biotic factors, such as competition and viral infection. This study provides new, in-depth knowledge on seasonal diversity and dynamics of haptophytes in North Atlantic coastal waters. PMID:25893259

  10. Large-scale DNA Barcode Library Generation for Biomolecule Identification in High-throughput Screens.

    Science.gov (United States)

    Lyons, Eli; Sheridan, Paul; Tremmel, Georg; Miyano, Satoru; Sugano, Sumio

    2017-10-24

    High-throughput screens allow for the identification of specific biomolecules with characteristics of interest. In barcoded screens, DNA barcodes are linked to target biomolecules in a manner allowing for the target molecules making up a library to be identified by sequencing the DNA barcodes using Next Generation Sequencing. To be useful in experimental settings, the DNA barcodes in a library must satisfy certain constraints related to GC content, homopolymer length, Hamming distance, and blacklisted subsequences. Here we report a novel framework to quickly generate large-scale libraries of DNA barcodes for use in high-throughput screens. We show that our framework dramatically reduces the computation time required to generate large-scale DNA barcode libraries, compared with a naїve approach to DNA barcode library generation. As a proof of concept, we demonstrate that our framework is able to generate a library consisting of one million DNA barcodes for use in a fragment antibody phage display screening experiment. We also report generating a general purpose one billion DNA barcode library, the largest such library yet reported in literature. Our results demonstrate the value of our novel large-scale DNA barcode library generation framework for use in high-throughput screening applications.

  11. Uncovering leaf rust responsive miRNAs in wheat (Triticum aestivum L.) using high-throughput sequencing and prediction of their targets through degradome analysis.

    Science.gov (United States)

    Kumar, Dhananjay; Dutta, Summi; Singh, Dharmendra; Prabhu, Kumble Vinod; Kumar, Manish; Mukhopadhyay, Kunal

    2017-01-01

    Deep sequencing identified 497 conserved and 559 novel miRNAs in wheat, while degradome analysis revealed 701 targets genes. QRT-PCR demonstrated differential expression of miRNAs during stages of leaf rust progression. Bread wheat (Triticum aestivum L.) is an important cereal food crop feeding 30 % of the world population. Major threat to wheat production is the rust epidemics. This study was targeted towards identification and functional characterizations of micro(mi)RNAs and their target genes in wheat in response to leaf rust ingression. High-throughput sequencing was used for transcriptome-wide identification of miRNAs and their expression profiling in retort to leaf rust using mock and pathogen-inoculated resistant and susceptible near-isogenic wheat plants. A total of 1056 mature miRNAs were identified, of which 497 miRNAs were conserved and 559 miRNAs were novel. The pathogen-inoculated resistant plants manifested more miRNAs compared with the pathogen infected susceptible plants. The miRNA counts increased in susceptible isoline due to leaf rust, conversely, the counts decreased in the resistant isoline in response to pathogenesis illustrating precise spatial tuning of miRNAs during compatible and incompatible interaction. Stem-loop quantitative real-time PCR was used to profile 10 highly differentially expressed miRNAs obtained from high-throughput sequencing data. The spatio-temporal profiling validated the differential expression of miRNAs between the isolines as well as in retort to pathogen infection. Degradome analysis provided 701 predicted target genes associated with defense response, signal transduction, development, metabolism, and transcriptional regulation. The obtained results indicate that wheat isolines employ diverse arrays of miRNAs that modulate their target genes during compatible and incompatible interaction. Our findings contribute to increase knowledge on roles of microRNA in wheat-leaf rust interactions and could help in rust

  12. High-throughput sequencing, characterization and detection of new and conserved cucumber miRNAs.

    Directory of Open Access Journals (Sweden)

    Germán Martínez

    Full Text Available Micro RNAS (miRNAs are a class of endogenous small non coding RNAs involved in the post-transcriptional regulation of gene expression. In plants, a great number of conserved and specific miRNAs, mainly arising from model species, have been identified to date. However less is known about the diversity of these regulatory RNAs in vegetal species with agricultural and/or horticultural importance. Here we report a combined approach of bioinformatics prediction, high-throughput sequencing data and molecular methods to analyze miRNAs populations in cucumber (Cucumis sativus plants. A set of 19 conserved and 6 known but non-conserved miRNA families were found in our cucumber small RNA dataset. We also identified 7 (3 with their miRNA* strand not previously described miRNAs, candidates to be cucumber-specific. To validate their description these new C. sativus miRNAs were detected by northern blot hybridization. Additionally, potential targets for most conserved and new miRNAs were identified in cucumber genome.In summary, in this study we have identified, by first time, conserved, known non-conserved and new miRNAs arising from an agronomically important species such as C. sativus. The detection of this complex population of regulatory small RNAs suggests that similarly to that observe in other plant species, cucumber miRNAs may possibly play an important role in diverse biological and metabolic processes.

  13. A new sieving matrix for DNA sequencing, genotyping and mutation detection and high-throughput genotyping with a 96-capillary array system

    Energy Technology Data Exchange (ETDEWEB)

    Gao, David [Iowa State Univ., Ames, IA (United States)

    1999-11-08

    Capillary electrophoresis has been widely accepted as a fast separation technique in DNA analysis. In this dissertation, a new sieving matrix is described for DNA analysis, especially DNA sequencing, genetic typing and mutation detection. A high-throughput 96 capillary array electrophoresis system was also demonstrated for simultaneous multiple genotyping. The authors first evaluated the influence of different capillary coatings on the performance of DNA sequencing. A bare capillary was compared with a DB-wax, an FC-coated and a polyvinylpyrrolidone dynamically coated capillary with PEO as sieving matrix. It was found that covalently-coated capillaries had no better performance than bare capillaries while PVP coating provided excellent and reproducible results. The authors also developed a new sieving Matrix for DNA separation based on commercially available poly(vinylpyrrolidone) (PVP). This sieving matrix has a very low viscosity and an excellent self-coating effect. Successful separations were achieved in uncoated capillaries. Sequencing of M13mp18 showed good resolution up to 500 bases in treated PVP solution. Temperature gradient capillary electrophoresis and PVP solution was applied to mutation detection. A heteroduplex sample and a homoduplex reference were injected during a pair of continuous runs. A temperature gradient of 10 C with a ramp of 0.7 C/min was swept throughout the capillary. Detection was accomplished by laser induced fluorescence detection. Mutation detection was performed by comparing the pattern changes between the homoduplex and the heteroduplex samples. High throughput, high detection rate and easy operation were achieved in this system. They further demonstrated fast and reliable genotyping based on CTTv STR system by multiple-capillary array electrophoresis. The PCR products from individuals were mixed with pooled allelic ladder as an absolute standard and coinjected with a 96-vial tray. Simultaneous one-color laser-induced fluorescence

  14. miRanalyzer: an update on the detection and analysis of microRNAs in high-throughput sequencing experiments

    Science.gov (United States)

    Hackenberg, Michael; Rodríguez-Ezpeleta, Naiara; Aransay, Ana M.

    2011-01-01

    We present a new version of miRanalyzer, a web server and stand-alone tool for the detection of known and prediction of new microRNAs in high-throughput sequencing experiments. The new version has been notably improved regarding speed, scope and available features. Alignments are now based on the ultrafast short-read aligner Bowtie (granting also colour space support, allowing mismatches and improving speed) and 31 genomes, including 6 plant genomes, can now be analysed (previous version contained only 7). Differences between plant and animal microRNAs have been taken into account for the prediction models and differential expression of both, known and predicted microRNAs, between two conditions can be calculated. Additionally, consensus sequences of predicted mature and precursor microRNAs can be obtained from multiple samples, which increases the reliability of the predicted microRNAs. Finally, a stand-alone version of the miRanalyzer that is based on a local and easily customized database is also available; this allows the user to have more control on certain parameters as well as to use specific data such as unpublished assemblies or other libraries that are not available in the web server. miRanalyzer is available at http://bioinfo2.ugr.es/miRanalyzer/miRanalyzer.php. PMID:21515631

  15. Discovery of J chain in African lungfish (Protopterus dolloi, Sarcopterygii using high throughput transcriptome sequencing: implications in mucosal immunity.

    Directory of Open Access Journals (Sweden)

    Luca Tacchi

    Full Text Available J chain is a small polypeptide responsible for immunoglobulin (Ig polymerization and transport of Igs across mucosal surfaces in higher vertebrates. We identified a J chain in dipnoid fish, the African lungfish (Protopterus dolloi by high throughput sequencing of the transcriptome. P. dolloi J chain is 161 aa long and contains six of the eight Cys residues present in mammalian J chain. Phylogenetic studies place the lungfish J chain closer to tetrapod J chain than to the coelacanth or nurse shark sequences. J chain expression occurs in all P. dolloi immune tissues examined and it increases in the gut and kidney in response to an experimental bacterial infection. Double fluorescent in-situ hybridization shows that 88.5% of IgM⁺ cells in the gut co-express J chain, a significantly higher percentage than in the pre-pyloric spleen. Importantly, J chain expression is not restricted to the B-cell compartment since gut epithelial cells also express J chain. These results improve our current view of J chain from a phylogenetic perspective.

  16. Combining high-throughput phenotyping and genome-wide association studies to reveal natural genetic variation in rice

    OpenAIRE

    Yang, Wanneng; Guo, Zilong; Huang, Chenglong; Duan, Lingfeng; Chen, Guoxing; Jiang, Ni; Fang, Wei; Feng, Hui; Xie, Weibo; Lian, Xingming; Wang, Gongwei; Luo, Qingming; Zhang, Qifa; Liu, Qian; Xiong, Lizhong

    2014-01-01

    Even as the study of plant genomics rapidly develops through the use of high-throughput sequencing techniques, traditional plant phenotyping lags far behind. Here we develop a high-throughput rice phenotyping facility (HRPF) to monitor 13 traditional agronomic traits and 2 newly defined traits during the rice growth period. Using genome-wide association studies (GWAS) of the 15 traits, we identify 141 associated loci, 25 of which contain known genes such as the Green Revolution semi-dwarf gen...

  17. Seasonal diversity and dynamics of haptophytes in the Skagerrak, Norway, explored by high-throughput sequencing.

    Science.gov (United States)

    Egge, Elianne Sirnaes; Johannessen, Torill Vik; Andersen, Tom; Eikrem, Wenche; Bittner, Lucie; Larsen, Aud; Sandaa, Ruth-Anne; Edvardsen, Bente

    2015-06-01

    Microalgae in the division Haptophyta play key roles in the marine ecosystem and in global biogeochemical processes. Despite their ecological importance, knowledge on seasonal dynamics, community composition and abundance at the species level is limited due to their small cell size and few morphological features visible under the light microscope. Here, we present unique data on haptophyte seasonal diversity and dynamics from two annual cycles, with the taxonomic resolution and sampling depth obtained with high-throughput sequencing. From outer Oslofjorden, S Norway, nano- and picoplanktonic samples were collected monthly for 2 years, and the haptophytes targeted by amplification of RNA/cDNA with Haptophyta-specific 18S rDNA V4 primers. We obtained 156 operational taxonomic units (OTUs), from c. 400.000 454 pyrosequencing reads, after rigorous bioinformatic filtering and clustering at 99.5%. Most OTUs represented uncultured and/or not yet 18S rDNA-sequenced species. Haptophyte OTU richness and community composition exhibited high temporal variation and significant yearly periodicity. Richness was highest in September-October (autumn) and lowest in April-May (spring). Some taxa were detected all year, such as Chrysochromulina simplex, Emiliania huxleyi and Phaeocystis cordata, whereas most calcifying coccolithophores only appeared from summer to early winter. We also revealed the seasonal dynamics of OTUs representing putative novel classes (clades HAP-3-5) or orders (clades D, E, F). Season, light and temperature accounted for 29% of the variation in OTU composition. Residual variation may be related to biotic factors, such as competition and viral infection. This study provides new, in-depth knowledge on seasonal diversity and dynamics of haptophytes in North Atlantic coastal waters. © 2015 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.

  18. High throughput protease profiling comprehensively defines active site specificity for thrombin and ADAMTS13.

    Science.gov (United States)

    Kretz, Colin A; Tomberg, Kärt; Van Esbroeck, Alexander; Yee, Andrew; Ginsburg, David

    2018-02-12

    We have combined random 6 amino acid substrate phage display with high throughput sequencing to comprehensively define the active site specificity of the serine protease thrombin and the metalloprotease ADAMTS13. The substrate motif for thrombin was determined by >6,700 cleaved peptides, and was highly concordant with previous studies. In contrast, ADAMTS13 cleaved only 96 peptides (out of >10 7 sequences), with no apparent consensus motif. However, when the hexapeptide library was substituted into the P3-P3' interval of VWF73, an exosite-engaging substrate of ADAMTS13, 1670 unique peptides were cleaved. ADAMTS13 exhibited a general preference for aliphatic amino acids throughout the P3-P3' interval, except at P2 where Arg was tolerated. The cleaved peptides assembled into a motif dominated by P3 Leu, and bulky aliphatic residues at P1 and P1'. Overall, the P3-P2' amino acid sequence of von Willebrand Factor appears optimally evolved for ADAMTS13 recognition. These data confirm the critical role of exosite engagement for substrates to gain access to the active site of ADAMTS13, and define the substrate recognition motif for ADAMTS13. Combining substrate phage display with high throughput sequencing is a powerful approach for comprehensively defining the active site specificity of proteases.

  19. Analysing Microbial Community Composition through Amplicon Sequencing: From Sampling to Hypothesis Testing

    Directory of Open Access Journals (Sweden)

    Luisa W. Hugerth

    2017-09-01

    Full Text Available Microbial ecology as a scientific field is fundamentally driven by technological advance. The past decade's revolution in DNA sequencing cost and throughput has made it possible for most research groups to map microbial community composition in environments of interest. However, the computational and statistical methodology required to analyse this kind of data is often not part of the biologist training. In this review, we give a historical perspective on the use of sequencing data in microbial ecology and restate the current need for this method; but also highlight the major caveats with standard practices for handling these data, from sample collection and library preparation to statistical analysis. Further, we outline the main new analytical tools that have been developed in the past few years to bypass these caveats, as well as highlight the major requirements of common statistical practices and the extent to which they are applicable to microbial data. Besides delving into the meaning of select alpha- and beta-diversity measures, we give special consideration to techniques for finding the main drivers of community dissimilarity and for interaction network construction. While every project design has specific needs, this review should serve as a starting point for considering what options are available.

  20. PLAN: a web platform for automating high-throughput BLAST searches and for managing and mining results.

    Science.gov (United States)

    He, Ji; Dai, Xinbin; Zhao, Xuechun

    2007-02-09

    BLAST searches are widely used for sequence alignment. The search results are commonly adopted for various functional and comparative genomics tasks such as annotating unknown sequences, investigating gene models and comparing two sequence sets. Advances in sequencing technologies pose challenges for high-throughput analysis of large-scale sequence data. A number of programs and hardware solutions exist for efficient BLAST searching, but there is a lack of generic software solutions for mining and personalized management of the results. Systematically reviewing the results and identifying information of interest remains tedious and time-consuming. Personal BLAST Navigator (PLAN) is a versatile web platform that helps users to carry out various personalized pre- and post-BLAST tasks, including: (1) query and target sequence database management, (2) automated high-throughput BLAST searching, (3) indexing and searching of results, (4) filtering results online, (5) managing results of personal interest in favorite categories, (6) automated sequence annotation (such as NCBI NR and ontology-based annotation). PLAN integrates, by default, the Decypher hardware-based BLAST solution provided by Active Motif Inc. with a greatly improved efficiency over conventional BLAST software. BLAST results are visualized by spreadsheets and graphs and are full-text searchable. BLAST results and sequence annotations can be exported, in part or in full, in various formats including Microsoft Excel and FASTA. Sequences and BLAST results are organized in projects, the data publication levels of which are controlled by the registered project owners. In addition, all analytical functions are provided to public users without registration. PLAN has proved a valuable addition to the community for automated high-throughput BLAST searches, and, more importantly, for knowledge discovery, management and sharing based on sequence alignment results. The PLAN web interface is platform

  1. PLAN: a web platform for automating high-throughput BLAST searches and for managing and mining results

    Directory of Open Access Journals (Sweden)

    Zhao Xuechun

    2007-02-01

    Full Text Available Abstract Background BLAST searches are widely used for sequence alignment. The search results are commonly adopted for various functional and comparative genomics tasks such as annotating unknown sequences, investigating gene models and comparing two sequence sets. Advances in sequencing technologies pose challenges for high-throughput analysis of large-scale sequence data. A number of programs and hardware solutions exist for efficient BLAST searching, but there is a lack of generic software solutions for mining and personalized management of the results. Systematically reviewing the results and identifying information of interest remains tedious and time-consuming. Results Personal BLAST Navigator (PLAN is a versatile web platform that helps users to carry out various personalized pre- and post-BLAST tasks, including: (1 query and target sequence database management, (2 automated high-throughput BLAST searching, (3 indexing and searching of results, (4 filtering results online, (5 managing results of personal interest in favorite categories, (6 automated sequence annotation (such as NCBI NR and ontology-based annotation. PLAN integrates, by default, the Decypher hardware-based BLAST solution provided by Active Motif Inc. with a greatly improved efficiency over conventional BLAST software. BLAST results are visualized by spreadsheets and graphs and are full-text searchable. BLAST results and sequence annotations can be exported, in part or in full, in various formats including Microsoft Excel and FASTA. Sequences and BLAST results are organized in projects, the data publication levels of which are controlled by the registered project owners. In addition, all analytical functions are provided to public users without registration. Conclusion PLAN has proved a valuable addition to the community for automated high-throughput BLAST searches, and, more importantly, for knowledge discovery, management and sharing based on sequence alignment results

  2. Apple ring rot-responsive putative microRNAs revealed by high-throughput sequencing in Malus × domestica Borkh.

    Science.gov (United States)

    Yu, Xin-Yi; Du, Bei-Bei; Gao, Zhi-Hong; Zhang, Shi-Jie; Tu, Xu-Tong; Chen, Xiao-Yun; Zhang, Zhen; Qu, Shen-Chun

    2014-08-01

    MicroRNAs (miRNAs) are small non-coding RNAs, which silence target mRNA via cleavage or translational inhibition to function in regulating gene expression. MiRNAs act as important regulators of plant development and stress response. For understanding the role of miRNAs responsive to apple ring rot stress, we identified disease-responsive miRNAs using high-throughput sequencing in Malus × domestica Borkh.. Four small RNA libraries were constructed from two control strains in M. domestica, crabapple (CKHu) and Fuji Naga-fu No. 6 (CKFu), and two disease stress strains, crabapple (DSHu) and Fuji Naga-fu No. 6 (DSFu). A total of 59 miRNA families were identified and five miRNAs might be responsive to apple ring rot infection and validated via qRT-PCR. Furthermore, we predicted 76 target genes which were regulated by conserved miRNAs potentially. Our study demonstrated that miRNAs was responsive to apple ring rot infection and may have important implications on apple disease resistance.

  3. Bacterial community compositions of coking wastewater treatment plants in steel industry revealed by Illumina high-throughput sequencing.

    Science.gov (United States)

    Ma, Qiao; Qu, Yuanyuan; Shen, Wenli; Zhang, Zhaojing; Wang, Jingwei; Liu, Ziyan; Li, Duanxing; Li, Huijie; Zhou, Jiti

    2015-03-01

    In this study, Illumina high-throughput sequencing was used to reveal the community structures of nine coking wastewater treatment plants (CWWTPs) in China for the first time. The sludge systems exhibited a similar community composition at each taxonomic level. Compared to previous studies, some of the core genera in municipal wastewater treatment plants such as Zoogloea, Prosthecobacter and Gp6 were detected as minor species. Thiobacillus (20.83%), Comamonas (6.58%), Thauera (4.02%), Azoarcus (7.78%) and Rhodoplanes (1.42%) were the dominant genera shared by at least six CWWTPs. The percentages of autotrophic ammonia-oxidizing bacteria and nitrite-oxidizing bacteria were unexpectedly low, which were verified by both real-time PCR and fluorescence in situ hybridization analyses. Hierarchical clustering and canonical correspondence analysis indicated that operation mode, flow rate and temperature might be the key factors in community formation. This study provides new insights into our understanding of microbial community compositions and structures of CWWTPs. Copyright © 2014 Elsevier Ltd. All rights reserved.

  4. Metagenomic analysis and functional characterization of the biogas microbiome using high throughput shotgun sequencing and a novel binning strategy

    DEFF Research Database (Denmark)

    Campanaro, Stefano; Treu, Laura; Kougias, Panagiotis

    2016-01-01

    Biogas production is an economically attractive technology that has gained momentum worldwide over the past years. Biogas is produced by a biologically mediated process, widely known as "anaerobic digestion." This process is performed by a specialized and complex microbial community, in which...... performed using >400 proteins revealed that the biogas community is a trove of new species. A new approach based on functional properties as per network representation was developed to assign roles to the microbial species. The organization of the anaerobic digestion microbiome is resembled by a funnel...... on the phylogenetic and functional characterization of the microbial community populating biogas reactors. By applying for the first time high-throughput sequencing and a novel binning strategy, the identified genes were anchored to single genomes providing a clear understanding of their metabolic pathways...

  5. Ultraspecific probes for high throughput HLA typing

    Directory of Open Access Journals (Sweden)

    Eggers Rick

    2009-02-01

    Full Text Available Abstract Background The variations within an individual's HLA (Human Leukocyte Antigen genes have been linked to many immunological events, e.g. susceptibility to disease, response to vaccines, and the success of blood, tissue, and organ transplants. Although the microarray format has the potential to achieve high-resolution typing, this has yet to be attained due to inefficiencies of current probe design strategies. Results We present a novel three-step approach for the design of high-throughput microarray assays for HLA typing. This approach first selects sequences containing the SNPs present in all alleles of the locus of interest and next calculates the number of base changes necessary to convert a candidate probe sequences to the closest subsequence within the set of sequences that are likely to be present in the sample including the remainder of the human genome in order to identify those candidate probes which are "ultraspecific" for the allele of interest. Due to the high specificity of these sequences, it is possible that preliminary steps such as PCR amplification are no longer necessary. Lastly, the minimum number of these ultraspecific probes is selected such that the highest resolution typing can be achieved for the minimal cost of production. As an example, an array was designed and in silico results were obtained for typing of the HLA-B locus. Conclusion The assay presented here provides a higher resolution than has previously been developed and includes more alleles than previously considered. Based upon the in silico and preliminary experimental results, we believe that the proposed approach can be readily applied to any highly polymorphic gene system.

  6. Algorithms for mapping high-throughput DNA sequences

    DEFF Research Database (Denmark)

    Frellsen, Jes; Menzel, Peter; Krogh, Anders

    2014-01-01

    of data generation, new bioinformatics approaches have been developed to cope with the large amount of sequencing reads obtained in these experiments. In this chapter, we first introduce HTS technologies and their usage in molecular biology and discuss the problem of mapping sequencing reads...... to their genomic origin. We then in detail describe two approaches that offer very fast heuristics to solve the mapping problem in a feasible runtime. In particular, we describe the BLAT algorithm, and we give an introduction to the Burrows-Wheeler Transform and the mapping algorithms based on this transformation....

  7. Identification and characterization of microRNAs related to salt stress in broccoli, using high-throughput sequencing and bioinformatics analysis.

    Science.gov (United States)

    Tian, Yunhong; Tian, Yunming; Luo, Xiaojun; Zhou, Tao; Huang, Zuoping; Liu, Ying; Qiu, Yihan; Hou, Bing; Sun, Dan; Deng, Hongyu; Qian, Shen; Yao, Kaitai

    2014-09-03

    MicroRNAs (miRNAs) are a new class of endogenous regulators of a broad range of physiological processes, which act by regulating gene expression post-transcriptionally. The brassica vegetable, broccoli (Brassica oleracea var. italica), is very popular with a wide range of consumers, but environmental stresses such as salinity are a problem worldwide in restricting its growth and yield. Little is known about the role of miRNAs in the response of broccoli to salt stress. In this study, broccoli subjected to salt stress and broccoli grown under control conditions were analyzed by high-throughput sequencing. Differential miRNA expression was confirmed by real-time reverse transcription polymerase chain reaction (RT-PCR). The prediction of miRNA targets was undertaken using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology (KO) database and Gene Ontology (GO)-enrichment analyses. Two libraries of small (or short) RNAs (sRNAs) were constructed and sequenced by high-throughput Solexa sequencing. A total of 24,511,963 and 21,034,728 clean reads, representing 9,861,236 (40.23%) and 8,574,665 (40.76%) unique reads, were obtained for control and salt-stressed broccoli, respectively. Furthermore, 42 putative known and 39 putative candidate miRNAs that were differentially expressed between control and salt-stressed broccoli were revealed by their read counts and confirmed by the use of stem-loop real-time RT-PCR. Amongst these, the putative conserved miRNAs, miR393 and miR855, and two putative candidate miRNAs, miR3 and miR34, were the most strongly down-regulated when broccoli was salt-stressed, whereas the putative conserved miRNA, miR396a, and the putative candidate miRNA, miR37, were the most up-regulated. Finally, analysis of the predicted gene targets of miRNAs using the GO and KO databases indicated that a range of metabolic and other cellular functions known to be associated with salt stress were up-regulated in broccoli treated with salt. A comprehensive

  8. The simple fool's guide to population genomics via RNA-Seq: An introduction to high-throughput sequencing data analysis

    DEFF Research Database (Denmark)

    De Wit, P.; Pespeni, M.H.; Ladner, J.T.

    2012-01-01

    to Population Genomics via RNA-seq' (SFG), a document intended to serve as an easy-to-follow protocol, walking a user through one example of high-throughput sequencing data analysis of nonmodel organisms. It is by no means an exhaustive protocol, but rather serves as an introduction to the bioinformatic methods...... used in population genomics, enabling a user to gain familiarity with basic analysis steps. The SFG consists of two parts. This document summarizes the steps needed and lays out the basic themes for each and a simple approach to follow. The second document is the full SFG, publicly available at http://sfg.......stanford.edu, that includes detailed protocols for data processing and analysis, along with a repository of custom-made scripts and sample files. Steps included in the SFG range from tissue collection to de novo assembly, blast annotation, alignment, gene expression, functional enrichment, SNP detection, principal components...

  9. Rapid detection of SMARCB1 sequence variation using high resolution melting

    Directory of Open Access Journals (Sweden)

    Ashley David M

    2009-12-01

    Full Text Available Abstract Background Rhabdoid tumors are rare cancers of early childhood arising in the kidney, central nervous system and other organs. The majority are caused by somatic inactivating mutations or deletions affecting the tumor suppressor locus SMARCB1 [OMIM 601607]. Germ-line SMARCB1 inactivation has been reported in association with rhabdoid tumor, epitheloid sarcoma and familial schwannomatosis, underscoring the importance of accurate mutation screening to ascertain recurrence and transmission risks. We describe a rapid and sensitive diagnostic screening method, using high resolution melting (HRM, for detecting sequence variations in SMARCB1. Methods Amplicons, encompassing the nine coding exons of SMARCB1, flanking splice site sequences and the 5' and 3' UTR, were screened by both HRM and direct DNA sequencing to establish the reliability of HRM as a primary mutation screening tool. Reaction conditions were optimized with commercially available HRM mixes. Results The false negative rate for detecting sequence variants by HRM in our sample series was zero. Nine amplicons out of a total of 140 (6.4% showed variant melt profiles that were subsequently shown to be false positive. Overall nine distinct pathogenic SMARCB1 mutations were identified in a total of 19 possible rhabdoid tumors. Two tumors had two distinct mutations and two harbored SMARCB1 deletion. Other mutations were nonsense or frame-shifts. The detection sensitivity of the HRM screening method was influenced by both sequence context and specific nucleotide change and varied from 1: 4 to 1:1000 (variant to wild-type DNA. A novel method involving digital HRM, followed by re-sequencing, was used to confirm mutations in tumor specimens containing associated normal tissue. Conclusions This is the first report describing SMARCB1 mutation screening using HRM. HRM is a rapid, sensitive and inexpensive screening technology that is likely to be widely adopted in diagnostic laboratories to

  10. Rapid detection of SMARCB1 sequence variation using high resolution melting

    International Nuclear Information System (INIS)

    Dagar, Vinod; Chow, Chung-Wo; Ashley, David M; Algar, Elizabeth M

    2009-01-01

    Rhabdoid tumors are rare cancers of early childhood arising in the kidney, central nervous system and other organs. The majority are caused by somatic inactivating mutations or deletions affecting the tumor suppressor locus SMARCB1 [OMIM 601607]. Germ-line SMARCB1 inactivation has been reported in association with rhabdoid tumor, epitheloid sarcoma and familial schwannomatosis, underscoring the importance of accurate mutation screening to ascertain recurrence and transmission risks. We describe a rapid and sensitive diagnostic screening method, using high resolution melting (HRM), for detecting sequence variations in SMARCB1. Amplicons, encompassing the nine coding exons of SMARCB1, flanking splice site sequences and the 5' and 3' UTR, were screened by both HRM and direct DNA sequencing to establish the reliability of HRM as a primary mutation screening tool. Reaction conditions were optimized with commercially available HRM mixes. The false negative rate for detecting sequence variants by HRM in our sample series was zero. Nine amplicons out of a total of 140 (6.4%) showed variant melt profiles that were subsequently shown to be false positive. Overall nine distinct pathogenic SMARCB1 mutations were identified in a total of 19 possible rhabdoid tumors. Two tumors had two distinct mutations and two harbored SMARCB1 deletion. Other mutations were nonsense or frame-shifts. The detection sensitivity of the HRM screening method was influenced by both sequence context and specific nucleotide change and varied from 1: 4 to 1:1000 (variant to wild-type DNA). A novel method involving digital HRM, followed by re-sequencing, was used to confirm mutations in tumor specimens containing associated normal tissue. This is the first report describing SMARCB1 mutation screening using HRM. HRM is a rapid, sensitive and inexpensive screening technology that is likely to be widely adopted in diagnostic laboratories to facilitate whole gene mutation screening

  11. Amplicon sequencing for the quantification of spoilage microbiota in complex foods including bacterial spores

    NARCIS (Netherlands)

    Boer, de P.; Caspers, M.; Sanders, J.W.; Kemperman, R.; Wijman, J.; Lommerse, G.; Roeselers, G.; Montijn, R.; Abee, T.; Kort, R.

    2015-01-01

    Background
    Spoilage of food products is frequently caused by bacterial spores and lactic acid bacteria. Identification of these organisms by classic cultivation methods is limited by their ability to form colonies on nutrient agar plates. In this study, we adapted and optimized 16S rRNA amplicon

  12. High throughput sample processing and automated scoring

    Directory of Open Access Journals (Sweden)

    Gunnar eBrunborg

    2014-10-01

    Full Text Available The comet assay is a sensitive and versatile method for assessing DNA damage in cells. In the traditional version of the assay, there are many manual steps involved and few samples can be treated in one experiment. High throughput modifications have been developed during recent years, and they are reviewed and discussed. These modifications include accelerated scoring of comets; other important elements that have been studied and adapted to high throughput are cultivation and manipulation of cells or tissues before and after exposure, and freezing of treated samples until comet analysis and scoring. High throughput methods save time and money but they are useful also for other reasons: large-scale experiments may be performed which are otherwise not practicable (e.g., analysis of many organs from exposed animals, and human biomonitoring studies, and automation gives more uniform sample treatment and less dependence on operator performance. The high throughput modifications now available vary largely in their versatility, capacity, complexity and costs. The bottleneck for further increase of throughput appears to be the scoring.

  13. [Research on soil bacteria under the impact of sealed CO2 leakage by high-throughput sequencing technology].

    Science.gov (United States)

    Tian, Di; Ma, Xin; Li, Yu-E; Zha, Liang-Song; Wu, Yang; Zou, Xiao-Xia; Liu, Shuang

    2013-10-01

    Carbon dioxide Capture and Storage has provided a new option for mitigating global anthropogenic CO2 emission with its unique advantages. However, there is a risk of the sealed CO2 leakage, bringing a serious threat to the ecology system. It is widely known that soil microorganisms are closely related to soil health, while the study on the impact of sequestered CO2 leakage on soil microorganisms is quite deficient. In this study, the leakage scenarios of sealed CO2 were constructed and the 16S rRNA genes of soil bacteria were sequenced by Illumina high-throughput sequencing technology on Miseq platform, and related biological analysis was conducted to explore the changes of soil bacterial abundance, diversity and structure. There were 486,645 reads for 43,017 OTUs of 15 soil samples and the results of biological analysis showed that there were differences in the abundance, diversity and community structure of soil bacterial community under different CO, leakage scenarios while the abundance and diversity of the bacterial community declined with the amplification of CO2 leakage quantity and leakage time, and some bacteria species became the dominant bacteria species in the bacteria community, therefore the increase of Acidobacteria species would be a biological indicator for the impact of sealed CO2 leakage on soil ecology system.

  14. Assessing the Diversity of Rodent-Borne Viruses: Exploring of High-Throughput Sequencing and Classical Amplification/Sequencing Approaches.

    Science.gov (United States)

    Drewes, Stephan; Straková, Petra; Drexler, Jan F; Jacob, Jens; Ulrich, Rainer G

    2017-01-01

    Rodents are distributed throughout the world and interact with humans in many ways. They provide vital ecosystem services, some species are useful models in biomedical research and some are held as pet animals. However, many rodent species can have adverse effects such as damage to crops and stored produce, and they are of health concern because of the transmission of pathogens to humans and livestock. The first rodent viruses were discovered by isolation approaches and resulted in break-through knowledge in immunology, molecular and cell biology, and cancer research. In addition to rodent-specific viruses, rodent-borne viruses are causing a large number of zoonotic diseases. Most prominent examples are reemerging outbreaks of human hemorrhagic fever disease cases caused by arena- and hantaviruses. In addition, rodents are reservoirs for vector-borne pathogens, such as tick-borne encephalitis virus and Borrelia spp., and may carry human pathogenic agents, but likely are not involved in their transmission to human. In our days, next-generation sequencing or high-throughput sequencing (HTS) is revolutionizing the speed of the discovery of novel viruses, but other molecular approaches, such as generic RT-PCR/PCR and rolling circle amplification techniques, contribute significantly to the rapidly ongoing process. However, the current knowledge still represents only the tip of the iceberg, when comparing the known human viruses to those known for rodents, the mammalian taxon with the largest species number. The diagnostic potential of HTS-based metagenomic approaches is illustrated by their use in the discovery and complete genome determination of novel borna- and adenoviruses as causative disease agents in squirrels. In conclusion, HTS, in combination with conventional RT-PCR/PCR-based approaches, resulted in a drastically increased knowledge of the diversity of rodent viruses. Future improvements of the used workflows, including bioinformatics analysis, will further

  15. GxGrare: gene-gene interaction analysis method for rare variants from high-throughput sequencing data.

    Science.gov (United States)

    Kwon, Minseok; Leem, Sangseob; Yoon, Joon; Park, Taesung

    2018-03-19

    With the rapid advancement of array-based genotyping techniques, genome-wide association studies (GWAS) have successfully identified common genetic variants associated with common complex diseases. However, it has been shown that only a small proportion of the genetic etiology of complex diseases could be explained by the genetic factors identified from GWAS. This missing heritability could possibly be explained by gene-gene interaction (epistasis) and rare variants. There has been an exponential growth of gene-gene interaction analysis for common variants in terms of methodological developments and practical applications. Also, the recent advancement of high-throughput sequencing technologies makes it possible to conduct rare variant analysis. However, little progress has been made in gene-gene interaction analysis for rare variants. Here, we propose GxGrare which is a new gene-gene interaction method for the rare variants in the framework of the multifactor dimensionality reduction (MDR) analysis. The proposed method consists of three steps; 1) collapsing the rare variants, 2) MDR analysis for the collapsed rare variants, and 3) detect top candidate interaction pairs. GxGrare can be used for the detection of not only gene-gene interactions, but also interactions within a single gene. The proposed method is illustrated with 1080 whole exome sequencing data of the Korean population in order to identify causal gene-gene interaction for rare variants for type 2 diabetes. The proposed GxGrare performs well for gene-gene interaction detection with collapsing of rare variants. GxGrare is available at http://bibs.snu.ac.kr/software/gxgrare which contains simulation data and documentation. Supported operating systems include Linux and OS X.

  16. CRISPR-Cas9-Edited Site Sequencing (CRES-Seq): An Efficient and High-Throughput Method for the Selection of CRISPR-Cas9-Edited Clones.

    Science.gov (United States)

    Veeranagouda, Yaligara; Debono-Lagneaux, Delphine; Fournet, Hamida; Thill, Gilbert; Didier, Michel

    2018-01-16

    The emergence of clustered regularly interspaced short palindromic repeats-Cas9 (CRISPR-Cas9) gene editing systems has enabled the creation of specific mutants at low cost, in a short time and with high efficiency, in eukaryotic cells. Since a CRISPR-Cas9 system typically creates an array of mutations in targeted sites, a successful gene editing project requires careful selection of edited clones. This process can be very challenging, especially when working with multiallelic genes and/or polyploid cells (such as cancer and plants cells). Here we described a next-generation sequencing method called CRISPR-Cas9 Edited Site Sequencing (CRES-Seq) for the efficient and high-throughput screening of CRISPR-Cas9-edited clones. CRES-Seq facilitates the precise genotyping up to 96 CRISPR-Cas9-edited sites (CRES) in a single MiniSeq (Illumina) run with an approximate sequencing cost of $6/clone. CRES-Seq is particularly useful when multiple genes are simultaneously targeted by CRISPR-Cas9, and also for screening of clones generated from multiallelic genes/polyploid cells. © 2018 by John Wiley & Sons, Inc. Copyright © 2018 John Wiley & Sons, Inc.

  17. Combining high-throughput phenotyping and genome-wide association studies to reveal natural genetic variation in rice

    Science.gov (United States)

    Yang, Wanneng; Guo, Zilong; Huang, Chenglong; Duan, Lingfeng; Chen, Guoxing; Jiang, Ni; Fang, Wei; Feng, Hui; Xie, Weibo; Lian, Xingming; Wang, Gongwei; Luo, Qingming; Zhang, Qifa; Liu, Qian; Xiong, Lizhong

    2014-01-01

    Even as the study of plant genomics rapidly develops through the use of high-throughput sequencing techniques, traditional plant phenotyping lags far behind. Here we develop a high-throughput rice phenotyping facility (HRPF) to monitor 13 traditional agronomic traits and 2 newly defined traits during the rice growth period. Using genome-wide association studies (GWAS) of the 15 traits, we identify 141 associated loci, 25 of which contain known genes such as the Green Revolution semi-dwarf gene, SD1. Based on a performance evaluation of the HRPF and GWAS results, we demonstrate that high-throughput phenotyping has the potential to replace traditional phenotyping techniques and can provide valuable gene identification information. The combination of the multifunctional phenotyping tools HRPF and GWAS provides deep insights into the genetic architecture of important traits. PMID:25295980

  18. High Throughput Neuro-Imaging Informatics

    Directory of Open Access Journals (Sweden)

    Michael I Miller

    2013-12-01

    Full Text Available This paper describes neuroinformatics technologies at 1 mm anatomical scale based on high throughput 3D functional and structural imaging technologies of the human brain. The core is an abstract pipeline for converting functional and structural imagery into their high dimensional neuroinformatic representations index containing O(E3-E4 discriminating dimensions. The pipeline is based on advanced image analysis coupled to digital knowledge representations in the form of dense atlases of the human brain at gross anatomical scale. We demonstrate the integration of these high-dimensional representations with machine learning methods, which have become the mainstay of other fields of science including genomics as well as social networks. Such high throughput facilities have the potential to alter the way medical images are stored and utilized in radiological workflows. The neuroinformatics pipeline is used to examine cross-sectional and personalized analyses of neuropsychiatric illnesses in clinical applications as well as longitudinal studies. We demonstrate the use of high throughput machine learning methods for supporting (i cross-sectional image analysis to evaluate the health status of individual subjects with respect to the population data, (ii integration of image and non-image information for diagnosis and prognosis.

  19. Coupled high-throughput functional screening and next generation sequencing for identification of plant polymer decomposing enzymes in metagenomic libraries

    Directory of Open Access Journals (Sweden)

    Mari eNyyssönen

    2013-09-01

    Full Text Available Recent advances in sequencing technologies generate new predictions and hypotheses about the functional roles of environmental microorganisms. Yet, until we can test these predictions at a scale that matches our ability to generate them, most of them will remain as hypotheses. Function-based mining of metagenomic libraries can provide direct linkages between genes, metabolic traits and microbial taxa and thus bridge this gap between sequence data generation and functional predictions. Here we developed high-throughput screening assays for function-based characterization of activities involved in plant polymer decomposition from environmental metagenomic libraries. The multiplexed assays use fluorogenic and chromogenic substrates, combine automated liquid handling and use a genetically modified expression host to enable simultaneous screening of 12,160 clones for 14 activities in a total of 170,240 reactions. Using this platform we identified 374 (0.26 % cellulose, hemicellulose, chitin, starch, phosphate and protein hydrolyzing clones from fosmid libraries prepared from decomposing leaf litter. Sequencing on the Illumina MiSeq platform, followed by assembly and gene prediction of a subset of 95 fosmid clones, identified a broad range of bacterial phyla, including Actinobacteria, Bacteroidetes, multiple Proteobacteria sub-phyla in addition to some Fungi. Carbohydrate-active enzyme genes from 20 different glycoside hydrolase families were detected. Using tetranucleotide frequency binning of fosmid sequences, multiple enzyme activities from distinct fosmids were linked, demonstrating how biochemically-confirmed functional traits in environmental metagenomes may be attributed to groups of specific organisms. Overall, our results demonstrate how functional screening of metagenomic libraries can be used to connect microbial functionality to community composition and, as a result, complement large-scale metagenomic sequencing efforts.

  20. Abundance and diversity of bacterial nitrifiers and denitrifiers and their functional genes in tannery wastewater treatment plants revealed by high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Zhu Wang

    Full Text Available Biological nitrification/denitrification is frequently used to remove nitrogen from tannery wastewater containing high concentrations of ammonia. However, information is limited about the bacterial nitrifiers and denitrifiers and their functional genes in tannery wastewater treatment plants (WWTPs due to the low-throughput of the previously used methods. In this study, 454 pyrosequencing and Illumina high-throughput sequencing, combined with molecular methods, were used to comprehensively characterize structures and functions of nitrification and denitrification bacterial communities in aerobic and anaerobic sludge of two full-scale tannery WWTPs. Pyrosequencing of 16S rRNA genes showed that Proteobacteria and Synergistetes dominated in the aerobic and anaerobic sludge, respectively. Ammonia-oxidizing bacteria (AOB amoA gene cloning revealed that Nitrosomonas europaea dominated the ammonia-oxidizing community in the WWTPs. Metagenomic analysis showed that the denitrifiers mainly included the genera of Thauera, Paracoccus, Hyphomicrobium, Comamonas and Azoarcus, which may greatly contribute to the nitrogen removal in the two WWTPs. It is interesting that AOB and ammonia-oxidizing archaea had low abundance although both WWTPs demonstrated high ammonium removal efficiency. Good correlation between the qPCR and metagenomic analysis is observed for the quantification of functional genes amoA, nirK, nirS and nosZ, indicating that the metagenomic approach may be a promising method used to comprehensively investigate the abundance of functional genes of nitrifiers and denitrifiers in the environment.

  1. High Throughput Computing Impact on Meta Genomics (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    Energy Technology Data Exchange (ETDEWEB)

    Gore, Brooklin

    2011-10-12

    This presentation includes a brief background on High Throughput Computing, correlating gene transcription factors, optical mapping, genotype to phenotype mapping via QTL analysis, and current work on next gen sequencing.

  2. A high-throughput pipeline for the design of real-time PCR signatures

    Directory of Open Access Journals (Sweden)

    Reifman Jaques

    2010-06-01

    Full Text Available Abstract Background Pathogen diagnostic assays based on polymerase chain reaction (PCR technology provide high sensitivity and specificity. However, the design of these diagnostic assays is computationally intensive, requiring high-throughput methods to identify unique PCR signatures in the presence of an ever increasing availability of sequenced genomes. Results We present the Tool for PCR Signature Identification (TOPSI, a high-performance computing pipeline for the design of PCR-based pathogen diagnostic assays. The TOPSI pipeline efficiently designs PCR signatures common to multiple bacterial genomes by obtaining the shared regions through pairwise alignments between the input genomes. TOPSI successfully designed PCR signatures common to 18 Staphylococcus aureus genomes in less than 14 hours using 98 cores on a high-performance computing system. Conclusions TOPSI is a computationally efficient, fully integrated tool for high-throughput design of PCR signatures common to multiple bacterial genomes. TOPSI is freely available for download at http://www.bhsai.org/downloads/topsi.tar.gz.

  3. Always look on both sides: phylogenetic information conveyed by simple sequence repeat allele sequences.

    Directory of Open Access Journals (Sweden)

    Stéphanie Barthe

    Full Text Available Simple sequence repeat (SSR markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and that, accessorily, mutations in the target sequences follow the stepwise mutation model (SMM. Generally speaking, PCR amplicon sizes are used as direct indicators of the number of SSR repeats composing an allele with the data analysis either ignoring the extent of allele size differences or assuming that there is a direct correlation between differences in amplicon size and evolutionary distance. However, without precisely knowing the kind and distribution of polymorphism within an allele (SSR and the associated flanking region (FR sequences, it is hard to say what kind of evolutionary message is conveyed by such a synthetic descriptor of polymorphism as DNA amplicon size. In this study, we sequenced several SSR alleles in multiple populations of three divergent tree genera and disentangled the types of polymorphisms contained in each portion of the DNA amplicon containing an SSR. The patterns of diversity provided by amplicon size variation, SSR variation itself, insertions/deletions (indels, and single nucleotide polymorphisms (SNPs observed in the FRs were compared. Amplicon size variation largely reflected SSR repeat number. The amount of variation was as large in FRs as in the SSR itself. The former contributed significantly to the phylogenetic information and sometimes was the main source of differentiation among individuals and populations contained by FR and SSR regions of SSR markers. The presence of mutations occurring at different rates within a marker's sequence offers the opportunity to analyse evolutionary events occurring on various timescales, but at the same time calls for caution in the interpretation of SSR marker data when the distribution of within

  4. Microbial biogeography of drinking water: patterns in phylogenetic diversity across space and time

    NARCIS (Netherlands)

    Roeselers, G.; Coolen, J.; Wielen, P.W. van der; Jaspers, M.C.; Atsma, A.; Graaf, B. de; Schuren, F.

    2015-01-01

    In this study, we collected water from different locations in 32 drinking water distribution networks in the Netherlands and analysed the spatial and temporal variation in microbial community composition by high-throughput sequencing of 16S rRNA gene amplicons. We observed that microbial community

  5. Overexpressed Genes/ESTs and Characterization of Distinct Amplicons on 17823 in Breast Cancer Cells

    Directory of Open Access Journals (Sweden)

    Ayse E. Erson

    2001-01-01

    Full Text Available 17823 is a frequent site of gene amplification in breast cancer. Several lines of evidence suggest the presence of multiple amplicons on 17823. To characterize distinct amplicons on 17823 and localize putative oncogenes, we screened genes and expressed sequence tags (ESTs in existing physical and radiation hybrid maps for amplification and overexpression in breast cancer cell lines by semiquantitative duplex PCR, semiquantitative duplex RT-PCR, Southern blot, Northern blot analyses. We identified two distinct amplicons on 17823, one including TBX2 and another proximal region including RPS6KB1 (PS6K and MUL. In addition to these previously reported overexpressed genes, we also identified amplification and overexpression of additional uncharacterized genes and ESTs, some of which suggest potential oncogenic activity. In conclusion, we have further defined two distinct regions of gene amplification and overexpression on 17823 with identification of new potential oncogene candidates. Based on the amplification and overexpression patterns of known and as of yet unrecognized genes on 17823, it is likely that some of these genes mapping to the discrete amplicons function as oncogenes and contribute to tumor progression in breast cancer cells.

  6. Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

    Directory of Open Access Journals (Sweden)

    Jason D Thompson

    Full Text Available Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.

  7. Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

    Science.gov (United States)

    Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre

    2012-01-01

    Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.

  8. [New-generation high-throughput technologies based 'omics' research strategy in human disease].

    Science.gov (United States)

    Yang, Xu; Jiao, Rui; Yang, Lin; Wu, Li-Ping; Li, Ying-Rui; Wang, Jun

    2011-08-01

    In recent years, new-generation high-throughput technologies, including next-generation sequencing technology and mass spectrometry method, have been widely applied in solving biological problems, especially in human diseases field. This data driven, large-scale and industrialized research model enables the omnidirectional and multi-level study of human diseases from the perspectives of genomics, transcriptomics and proteomics levels, etc. In this paper, the latest development of the high-throughput technologies that applied in DNA, RNA, epigenomics, metagenomics including proteomics and some applications in translational medicine are reviewed. At genomics level, exome sequencing has been the hot spot of the recent research. However, the predominance of whole genome resequencing in detecting large structural variants within the whole genome level is coming to stand out as the drop of sequencing cost, which also makes it possible for personalized genome based medicine application. At trancriptomics level, e.g., small RNA sequencing can be used to detect known and predict unknown miRNA. Those small RNA could not only be the biomarkers for disease diagnosis and prognosis, but also show the potential of disease treatment. At proteomics level, e.g., target proteomics can be used to detect the possible disease-related protein or peptides, which can be useful index for clinical staging and typing. Furthermore, the application and development of trans-omics study in disease research are briefly introduced. By applying bioinformatics technologies for integrating multi-omics data, the mechanism, diagnosis and therapy of the disease are likely to be systemically explained and realized, so as to provide powerful tools for disease diagnosis and therapies.

  9. Advances in High-Throughput Speed, Low-Latency Communication for Embedded Instrumentation (7th Annual SFAF Meeting, 2012)

    Energy Technology Data Exchange (ETDEWEB)

    Jordan, Scott

    2012-06-01

    Scott Jordan on "Advances in high-throughput speed, low-latency communication for embedded instrumentation" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  10. High-throughput sequencing and copy number variation detection using formalin fixed embedded tissue in metastatic gastric cancer.

    Directory of Open Access Journals (Sweden)

    Seokhwi Kim

    Full Text Available In the era of targeted therapy, mutation profiling of cancer is a crucial aspect of making therapeutic decisions. To characterize cancer at a molecular level, the use of formalin-fixed paraffin-embedded tissue is important. We tested the Ion AmpliSeq Cancer Hotspot Panel v2 and nCounter Copy Number Variation Assay in 89 formalin-fixed paraffin-embedded gastric cancer samples to determine whether they are applicable in archival clinical samples for personalized targeted therapies. We validated the results with Sanger sequencing, real-time quantitative PCR, fluorescence in situ hybridization and immunohistochemistry. Frequently detected somatic mutations included TP53 (28.17%, APC (10.1%, PIK3CA (5.6%, KRAS (4.5%, SMO (3.4%, STK11 (3.4%, CDKN2A (3.4% and SMAD4 (3.4%. Amplifications of HER2, CCNE1, MYC, KRAS and EGFR genes were observed in 8 (8.9%, 4 (4.5%, 2 (2.2%, 1 (1.1% and 1 (1.1% cases, respectively. In the cases with amplification, fluorescence in situ hybridization for HER2 verified gene amplification and immunohistochemistry for HER2, EGFR and CCNE1 verified the overexpression of proteins in tumor cells. In conclusion, we successfully performed semiconductor-based sequencing and nCounter copy number variation analyses in formalin-fixed paraffin-embedded gastric cancer samples. High-throughput screening in archival clinical samples enables faster, more accurate and cost-effective detection of hotspot mutations or amplification in genes.

  11. Whole-exome sequencing and high throughput genotyping identified KCNJ11 as the thirteenth MODY gene.

    Science.gov (United States)

    Bonnefond, Amélie; Philippe, Julien; Durand, Emmanuelle; Dechaume, Aurélie; Huyvaert, Marlène; Montagne, Louise; Marre, Michel; Balkau, Beverley; Fajardy, Isabelle; Vambergue, Anne; Vatin, Vincent; Delplanque, Jérôme; Le Guilcher, David; De Graeve, Franck; Lecoeur, Cécile; Sand, Olivier; Vaxillaire, Martine; Froguel, Philippe

    2012-01-01

    Maturity-onset of the young (MODY) is a clinically heterogeneous form of diabetes characterized by an autosomal-dominant mode of inheritance, an onset before the age of 25 years, and a primary defect in the pancreatic beta-cell function. Approximately 30% of MODY families remain genetically unexplained (MODY-X). Here, we aimed to use whole-exome sequencing (WES) in a four-generation MODY-X family to identify a new susceptibility gene for MODY. WES (Agilent-SureSelect capture/Illumina-GAIIx sequencing) was performed in three affected and one non-affected relatives in the MODY-X family. We then performed a high-throughput multiplex genotyping (Illumina-GoldenGate assay) of the putative causal mutations in the whole family and in 406 controls. A linkage analysis was also carried out. By focusing on variants of interest (i.e. gains of stop codon, frameshift, non-synonymous and splice-site variants not reported in dbSNP130) present in the three affected relatives and not present in the control, we found 69 mutations. However, as WES was not uniform between samples, a total of 324 mutations had to be assessed in the whole family and in controls. Only one mutation (p.Glu227Lys in KCNJ11) co-segregated with diabetes in the family (with a LOD-score of 3.68). No KCNJ11 mutation was found in 25 other MODY-X unrelated subjects. Beyond neonatal diabetes mellitus (NDM), KCNJ11 is also a MODY gene ('MODY13'), confirming the wide spectrum of diabetes related phenotypes due to mutations in NDM genes (i.e. KCNJ11, ABCC8 and INS). Therefore, the molecular diagnosis of MODY should include KCNJ11 as affected carriers can be ideally treated with oral sulfonylureas.

  12. Whole-exome sequencing and high throughput genotyping identified KCNJ11 as the thirteenth MODY gene.

    Directory of Open Access Journals (Sweden)

    Amélie Bonnefond

    Full Text Available BACKGROUND: Maturity-onset of the young (MODY is a clinically heterogeneous form of diabetes characterized by an autosomal-dominant mode of inheritance, an onset before the age of 25 years, and a primary defect in the pancreatic beta-cell function. Approximately 30% of MODY families remain genetically unexplained (MODY-X. Here, we aimed to use whole-exome sequencing (WES in a four-generation MODY-X family to identify a new susceptibility gene for MODY. METHODOLOGY: WES (Agilent-SureSelect capture/Illumina-GAIIx sequencing was performed in three affected and one non-affected relatives in the MODY-X family. We then performed a high-throughput multiplex genotyping (Illumina-GoldenGate assay of the putative causal mutations in the whole family and in 406 controls. A linkage analysis was also carried out. PRINCIPAL FINDINGS: By focusing on variants of interest (i.e. gains of stop codon, frameshift, non-synonymous and splice-site variants not reported in dbSNP130 present in the three affected relatives and not present in the control, we found 69 mutations. However, as WES was not uniform between samples, a total of 324 mutations had to be assessed in the whole family and in controls. Only one mutation (p.Glu227Lys in KCNJ11 co-segregated with diabetes in the family (with a LOD-score of 3.68. No KCNJ11 mutation was found in 25 other MODY-X unrelated subjects. CONCLUSIONS/SIGNIFICANCE: Beyond neonatal diabetes mellitus (NDM, KCNJ11 is also a MODY gene ('MODY13', confirming the wide spectrum of diabetes related phenotypes due to mutations in NDM genes (i.e. KCNJ11, ABCC8 and INS. Therefore, the molecular diagnosis of MODY should include KCNJ11 as affected carriers can be ideally treated with oral sulfonylureas.

  13. Characteristics of MHC class I genes in house sparrows Passer domesticus as revealed by long cDNA transcripts and amplicon sequencing.

    Science.gov (United States)

    Karlsson, Maria; Westerdahl, Helena

    2013-08-01

    In birds the major histocompatibility complex (MHC) organization differs both among and within orders; chickens Gallus gallus of the order Galliformes have a simple arrangement, while many songbirds of the order Passeriformes have a more complex arrangement with larger numbers of MHC class I and II genes. Chicken MHC genes are found at two independent loci, classical MHC-B and non-classical MHC-Y, whereas non-classical MHC genes are yet to be verified in passerines. Here we characterize MHC class I transcripts (α1 to α3 domain) and perform amplicon sequencing using a next-generation sequencing technique on exon 3 from house sparrow Passer domesticus (a passerine) families. Then we use phylogenetic, selection, and segregation analyses to gain a better understanding of the MHC class I organization. Trees based on the α1 and α2 domain revealed a distinct cluster with short terminal branches for transcripts with a 6-bp deletion. Interestingly, this cluster was not seen in the tree based on the α3 domain. 21 exon 3 sequences were verified in a single individual and the average numbers within an individual were nine and five for sequences with and without a 6-bp deletion, respectively. All individuals had exon 3 sequences with and without a 6-bp deletion. The sequences with a 6-bp deletion have many characteristics in common with non-classical MHC, e.g., highly conserved amino acid positions were substituted compared with the other alleles, low nucleotide diversity and just a single site was subject to positive selection. However, these alleles also have characteristics that suggest they could be classical, e.g., complete linkage and absence of a distinct cluster in a tree based on the α3 domain. Thus, we cannot determine for certain whether or not the alleles with a 6-bp deletion are non-classical based on our present data. Further analyses on segregation patterns of these alleles in combination with dating the 6-bp deletion through MHC characterization across the

  14. High-throughput sequencing of microRNAs in peripheral blood mononuclear cells: identification of potential weight loss biomarkers.

    Directory of Open Access Journals (Sweden)

    Fermín I Milagro

    Full Text Available INTRODUCTION: MicroRNAs (miRNAs are being increasingly studied in relation to energy metabolism and body composition homeostasis. Indeed, the quantitative analysis of miRNAs expression in different adiposity conditions may contribute to understand the intimate mechanisms participating in body weight control and to find new biomarkers with diagnostic or prognostic value in obesity management. OBJECTIVE: The aim of this study was the search for miRNAs in blood cells whose expression could be used as prognostic biomarkers of weight loss. METHODS: Ten Caucasian obese women were selected among the participants in a weight-loss trial that consisted in following an energy-restricted treatment. Weight loss was considered unsuccessful when 5% (responders. At baseline, total miRNA isolated from peripheral blood mononuclear cells (PBMC was sequenced with SOLiD v4. The miRNA sequencing data were validated by RT-PCR. RESULTS: Differential baseline expression of several miRNAs was found between responders and non-responders. Two miRNAs were up-regulated in the non-responder group (mir-935 and mir-4772 and three others were down-regulated (mir-223, mir-224 and mir-376b. Both mir-935 and mir-4772 showed relevant associations with the magnitude of weight loss, although the expression of other transcripts (mir-874, mir-199b, mir-766, mir-589 and mir-148b also correlated with weight loss. CONCLUSIONS: This research addresses the use of high-throughput sequencing technologies in the search for miRNA expression biomarkers in obesity, by determining the miRNA transcriptome of PBMC. Basal expression of different miRNAs, particularly mir-935 and mir-4772, could be prognostic biomarkers and may forecast the response to a hypocaloric diet.

  15. The use of high-throughput DNA sequencing in the investigation of antigenic variation: application to Neisseria species.

    Directory of Open Access Journals (Sweden)

    John K Davies

    Full Text Available Antigenic variation occurs in a broad range of species. This process resembles gene conversion in that variant DNA is unidirectionally transferred from partial gene copies (or silent loci into an expression locus. Previous studies of antigenic variation have involved the amplification and sequencing of individual genes from hundreds of colonies. Using the pilE gene from Neisseria gonorrhoeae we have demonstrated that it is possible to use PCR amplification, followed by high-throughput DNA sequencing and a novel assembly process, to detect individual antigenic variation events. The ability to detect these events was much greater than has previously been possible. In N. gonorrhoeae most silent loci contain multiple partial gene copies. Here we show that there is a bias towards using the copy at the 3' end of the silent loci (copy 1 as the donor sequence. The pilE gene of N. gonorrhoeae and some strains of Neisseria meningitidis encode class I pilin, but strains of N. meningitidis from clonal complexes 8 and 11 encode a class II pilin. We have confirmed that the class II pili of meningococcal strain FAM18 (clonal complex 11 are non-variable, and this is also true for the class II pili of strain NMB from clonal complex 8. In addition when a gene encoding class I pilin was moved into the meningococcal strain NMB background there was no evidence of antigenic variation. Finally we investigated several members of the opa gene family of N. gonorrhoeae, where it has been suggested that limited variation occurs. Variation was detected in the opaK gene that is located close to pilE, but not at the opaJ gene located elsewhere on the genome. The approach described here promises to dramatically improve studies of the extent and nature of antigenic variation systems in a variety of species.

  16. Targeted genotyping-by-sequencing permits cost-effective identification and discrimination of pasture grass species and cultivars.

    Science.gov (United States)

    Pembleton, Luke W; Drayton, Michelle C; Bain, Melissa; Baillie, Rebecca C; Inch, Courtney; Spangenberg, German C; Wang, Junping; Forster, John W; Cogan, Noel O I

    2016-05-01

    A targeted amplicon-based genotyping-by-sequencing approach has permitted cost-effective and accurate discrimination between ryegrass species (perennial, Italian and inter-species hybrid), and identification of cultivars based on bulked samples. Perennial ryegrass and Italian ryegrass are the most important temperate forage species for global agriculture, and are represented in the commercial pasture seed market by numerous cultivars each composed of multiple highly heterozygous individuals. Previous studies have identified difficulties in the use of morphophysiological criteria to discriminate between these two closely related taxa. Recently, a highly multiplexed single nucleotide polymorphism (SNP)-based genotyping assay has been developed that permits accurate differentiation between both species and cultivars of ryegrasses at the genetic level. This assay has since been further developed into an amplicon-based genotyping-by-sequencing (GBS) approach implemented on a second-generation sequencing platform, allowing accelerated throughput and ca. sixfold reduction in cost. Using the GBS approach, 63 cultivars of perennial, Italian and interspecific hybrid ryegrasses, as well as intergeneric Festulolium hybrids, were genotyped. The genetic relationships between cultivars were interpreted in terms of known breeding histories and indistinct species boundaries within the Lolium genus, as well as suitability of current cultivar registration methodologies. An example of applicability to quality assurance and control (QA/QC) of seed purity is also described. Rapid, low-cost genotypic assays provide new opportunities for breeders to more fully explore genetic diversity within breeding programs, allowing the combination of novel unique genetic backgrounds. Such tools also offer the potential to more accurately define cultivar identities, allowing protection of varieties in the commercial market and supporting processes of cultivar accreditation and quality assurance.

  17. High Throughput Transcriptomics @ USEPA (Toxicology ...

    Science.gov (United States)

    The ideal chemical testing approach will provide complete coverage of all relevant toxicological responses. It should be sensitive and specific It should identify the mechanism/mode-of-action (with dose-dependence). It should identify responses relevant to the species of interest. Responses should ideally be translated into tissue-, organ-, and organism-level effects. It must be economical and scalable. Using a High Throughput Transcriptomics platform within US EPA provides broader coverage of biological activity space and toxicological MOAs and helps fill the toxicological data gap. Slide presentation at the 2016 ToxForum on using High Throughput Transcriptomics at US EPA for broader coverage biological activity space and toxicological MOAs.

  18. Application of ToxCast High-Throughput Screening and ...

    Science.gov (United States)

    Slide presentation at the SETAC annual meeting on High-Throughput Screening and Modeling Approaches to Identify Steroidogenesis Distruptors Slide presentation at the SETAC annual meeting on High-Throughput Screening and Modeling Approaches to Identify Steroidogenssis Distruptors

  19. High-throughput sequencing of microRNA transcriptome and expression assay in the sturgeon, Acipenser schrenckii.

    Directory of Open Access Journals (Sweden)

    Lihong Yuan

    Full Text Available Sturgeons are considered as living fossils and have very high evolutionary, economical and conservation values. The multiploidy of sturgeon that has been caused by chromosome duplication may lead to the emergence of new microRNAs (miRNAs involved in the ploidy and physiological processes. In the present study, we performed the first sturgeon miRNAs analysis by RNA-seq high-throughput sequencing combined with expression assay of microarray and real-time PCR, and aimed to discover the sturgeon-specific miRNAs, confirm the expressed pattern of miRNAs and illustrate the potential role of miRNAs-targets on sturgeon biological processes. A total of 103 miRNAs were identified, including 58 miRNAs with strongly detected signals (signal >500 and P≤0.01, which were detected by microarray. Real-time PCR assay supported the expression pattern obtained by microarray. Moreover, co-expression of 21 miRNAs in all five tissues and tissue-specific expression of 16 miRNAs implied the crucial and particular function of them in sturgeon physiological processes. Target gene prediction, especially the enriched functional gene groups (369 GO terms and pathways (37 KEGG regulated by 58 miRNAs (P<0.05, illustrated the interaction of miRNAs and putative mRNAs, and also the potential mechanism involved in these biological processes. Our new findings of sturgeon miRNAs expand the public database of transcriptome information for this species, contribute to our understanding of sturgeon biology, and also provide invaluable data that may be applied in sturgeon breeding.

  20. High-throughput sequencing of microRNA transcriptome and expression assay in the sturgeon, Acipenser schrenckii.

    Science.gov (United States)

    Yuan, Lihong; Zhang, Xiujuan; Li, Linmiao; Jiang, Haiying; Chen, Jinping

    2014-01-01

    Sturgeons are considered as living fossils and have very high evolutionary, economical and conservation values. The multiploidy of sturgeon that has been caused by chromosome duplication may lead to the emergence of new microRNAs (miRNAs) involved in the ploidy and physiological processes. In the present study, we performed the first sturgeon miRNAs analysis by RNA-seq high-throughput sequencing combined with expression assay of microarray and real-time PCR, and aimed to discover the sturgeon-specific miRNAs, confirm the expressed pattern of miRNAs and illustrate the potential role of miRNAs-targets on sturgeon biological processes. A total of 103 miRNAs were identified, including 58 miRNAs with strongly detected signals (signal >500 and P≤0.01), which were detected by microarray. Real-time PCR assay supported the expression pattern obtained by microarray. Moreover, co-expression of 21 miRNAs in all five tissues and tissue-specific expression of 16 miRNAs implied the crucial and particular function of them in sturgeon physiological processes. Target gene prediction, especially the enriched functional gene groups (369 GO terms) and pathways (37 KEGG) regulated by 58 miRNAs (P<0.05), illustrated the interaction of miRNAs and putative mRNAs, and also the potential mechanism involved in these biological processes. Our new findings of sturgeon miRNAs expand the public database of transcriptome information for this species, contribute to our understanding of sturgeon biology, and also provide invaluable data that may be applied in sturgeon breeding.

  1. Multiple and high-throughput droplet reactions via combination of microsampling technique and microfluidic chip

    KAUST Repository

    Wu, Jinbo

    2012-11-20

    Microdroplets offer unique compartments for accommodating a large number of chemical and biological reactions in tiny volume with precise control. A major concern in droplet-based microfluidics is the difficulty to address droplets individually and achieve high throughput at the same time. Here, we have combined an improved cartridge sampling technique with a microfluidic chip to perform droplet screenings and aggressive reaction with minimal (nanoliter-scale) reagent consumption. The droplet composition, distance, volume (nanoliter to subnanoliter scale), number, and sequence could be precisely and digitally programmed through the improved sampling technique, while sample evaporation and cross-contamination are effectively eliminated. Our combined device provides a simple model to utilize multiple droplets for various reactions with low reagent consumption and high throughput. © 2012 American Chemical Society.

  2. High-throughput sequencing and pathway analysis reveal alteration of the pituitary transcriptome by 17α-ethynylestradiol (EE2) in female coho salmon, Oncorhynchus kisutch

    Energy Technology Data Exchange (ETDEWEB)

    Harding, Louisa B. [School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA 98195 (United States); Schultz, Irvin R. [Battelle, Marine Sciences Laboratory – Pacific Northwest National Laboratory, 1529 West Sequim Bay Road, Sequim, WA 98382 (United States); Goetz, Giles W. [School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA 98195 (United States); Luckenbach, J. Adam [Northwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, 2725 Montlake Blvd E, Seattle, WA 98112 (United States); Center for Reproductive Biology, Washington State University, Pullman, WA 98164 (United States); Young, Graham [School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA 98195 (United States); Center for Reproductive Biology, Washington State University, Pullman, WA 98164 (United States); Goetz, Frederick W. [Northwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Manchester Research Station, P.O. Box 130, Manchester, WA 98353 (United States); Swanson, Penny, E-mail: penny.swanson@noaa.gov [Northwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, 2725 Montlake Blvd E, Seattle, WA 98112 (United States); Center for Reproductive Biology, Washington State University, Pullman, WA 98164 (United States)

    2013-10-15

    Highlights: •Studied impacts of ethynylestradiol (EE2) exposure on salmon pituitary transcriptome. •High-throughput sequencing, RNAseq, and pathway analysis were performed. •EE2 altered mRNAs for genes in circadian rhythm, GnRH, and TGFβ signaling pathways. •LH and FSH beta subunit mRNAs were most highly up- and down-regulated by EE2, respectively. •Estrogens may alter processes associated with reproductive timing in salmon. -- Abstract: Considerable research has been done on the effects of endocrine disrupting chemicals (EDCs) on reproduction and gene expression in the brain, liver and gonads of teleost fish, but information on impacts to the pituitary gland are still limited despite its central role in regulating reproduction. The aim of this study was to further our understanding of the potential effects of natural and synthetic estrogens on the brain–pituitary–gonad axis in fish by determining the effects of 17α-ethynylestradiol (EE2) on the pituitary transcriptome. We exposed sub-adult coho salmon (Oncorhynchus kisutch) to 0 or 12 ng EE2/L for up to 6 weeks and effects on the pituitary transcriptome of females were assessed using high-throughput Illumina{sup ®} sequencing, RNA-Seq and pathway analysis. After 1 or 6 weeks, 218 and 670 contiguous sequences (contigs) respectively, were differentially expressed in pituitaries of EE2-exposed fish relative to control. Two of the most highly up- and down-regulated contigs were luteinizing hormone β subunit (241-fold and 395-fold at 1 and 6 weeks, respectively) and follicle-stimulating hormone β subunit (−3.4-fold at 6 weeks). Additional contigs related to gonadotropin synthesis and release were differentially expressed in EE2-exposed fish relative to controls. These included contigs involved in gonadotropin releasing hormone (GNRH) and transforming growth factor-β signaling. There was an over-representation of significantly affected contigs in 33 and 18 canonical pathways at 1 and 6 weeks

  3. High-throughput sequencing and pathway analysis reveal alteration of the pituitary transcriptome by 17α-ethynylestradiol (EE2) in female coho salmon, Oncorhynchus kisutch

    International Nuclear Information System (INIS)

    Harding, Louisa B.; Schultz, Irvin R.; Goetz, Giles W.; Luckenbach, J. Adam; Young, Graham; Goetz, Frederick W.; Swanson, Penny

    2013-01-01

    Highlights: •Studied impacts of ethynylestradiol (EE2) exposure on salmon pituitary transcriptome. •High-throughput sequencing, RNAseq, and pathway analysis were performed. •EE2 altered mRNAs for genes in circadian rhythm, GnRH, and TGFβ signaling pathways. •LH and FSH beta subunit mRNAs were most highly up- and down-regulated by EE2, respectively. •Estrogens may alter processes associated with reproductive timing in salmon. -- Abstract: Considerable research has been done on the effects of endocrine disrupting chemicals (EDCs) on reproduction and gene expression in the brain, liver and gonads of teleost fish, but information on impacts to the pituitary gland are still limited despite its central role in regulating reproduction. The aim of this study was to further our understanding of the potential effects of natural and synthetic estrogens on the brain–pituitary–gonad axis in fish by determining the effects of 17α-ethynylestradiol (EE2) on the pituitary transcriptome. We exposed sub-adult coho salmon (Oncorhynchus kisutch) to 0 or 12 ng EE2/L for up to 6 weeks and effects on the pituitary transcriptome of females were assessed using high-throughput Illumina ® sequencing, RNA-Seq and pathway analysis. After 1 or 6 weeks, 218 and 670 contiguous sequences (contigs) respectively, were differentially expressed in pituitaries of EE2-exposed fish relative to control. Two of the most highly up- and down-regulated contigs were luteinizing hormone β subunit (241-fold and 395-fold at 1 and 6 weeks, respectively) and follicle-stimulating hormone β subunit (−3.4-fold at 6 weeks). Additional contigs related to gonadotropin synthesis and release were differentially expressed in EE2-exposed fish relative to controls. These included contigs involved in gonadotropin releasing hormone (GNRH) and transforming growth factor-β signaling. There was an over-representation of significantly affected contigs in 33 and 18 canonical pathways at 1 and 6 weeks

  4. MUSCLE: multiple sequence alignment with high accuracy and high throughput.

    Science.gov (United States)

    Edgar, Robert C

    2004-01-01

    We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5. com/muscle.

  5. Metabolomic and high-throughput sequencing analysis—modern approach for the assessment of biodeterioration of materials from historic buildings

    Science.gov (United States)

    Gutarowska, Beata; Celikkol-Aydin, Sukriye; Bonifay, Vincent; Otlewska, Anna; Aydin, Egemen; Oldham, Athenia L.; Brauer, Jonathan I.; Duncan, Kathleen E.; Adamiak, Justyna; Sunner, Jan A.; Beech, Iwona B.

    2015-01-01

    Preservation of cultural heritage is of paramount importance worldwide. Microbial colonization of construction materials, such as wood, brick, mortar, and stone in historic buildings can lead to severe deterioration. The aim of the present study was to give modern insight into the phylogenetic diversity and activated metabolic pathways of microbial communities colonized historic objects located in the former Auschwitz II–Birkenau concentration and extermination camp in Oświecim, Poland. For this purpose we combined molecular, microscopic and chemical methods. Selected specimens were examined using Field Emission Scanning Electron Microscopy (FESEM), metabolomic analysis and high-throughput Illumina sequencing. FESEM imaging revealed the presence of complex microbial communities comprising diatoms, fungi and bacteria, mainly cyanobacteria and actinobacteria, on sample surfaces. Microbial diversity of brick specimens appeared higher than that of the wood and was dominated by algae and cyanobacteria, while wood was mainly colonized by fungi. DNA sequences documented the presence of 15 bacterial phyla representing 99 genera including Halomonas, Halorhodospira, Salinisphaera, Salinibacterium, Rubrobacter, Streptomyces, Arthrobacter and nine fungal classes represented by 113 genera including Cladosporium, Acremonium, Alternaria, Engyodontium, Penicillium, Rhizopus, and Aureobasidium. Most of the identified sequences were characteristic of organisms implicated in deterioration of wood and brick. Metabolomic data indicated the activation of numerous metabolic pathways, including those regulating the production of primary and secondary metabolites, for example, metabolites associated with the production of antibiotics, organic acids and deterioration of organic compounds. The study demonstrated that a combination of electron microscopy imaging with metabolomic and genomic techniques allows to link the phylogenetic information and metabolic profiles of microbial

  6. Metabolomic and high-throughput sequencing analysis-modern approach for the assessment of biodeterioration of materials from historic buildings.

    Science.gov (United States)

    Gutarowska, Beata; Celikkol-Aydin, Sukriye; Bonifay, Vincent; Otlewska, Anna; Aydin, Egemen; Oldham, Athenia L; Brauer, Jonathan I; Duncan, Kathleen E; Adamiak, Justyna; Sunner, Jan A; Beech, Iwona B

    2015-01-01

    Preservation of cultural heritage is of paramount importance worldwide. Microbial colonization of construction materials, such as wood, brick, mortar, and stone in historic buildings can lead to severe deterioration. The aim of the present study was to give modern insight into the phylogenetic diversity and activated metabolic pathways of microbial communities colonized historic objects located in the former Auschwitz II-Birkenau concentration and extermination camp in Oświecim, Poland. For this purpose we combined molecular, microscopic and chemical methods. Selected specimens were examined using Field Emission Scanning Electron Microscopy (FESEM), metabolomic analysis and high-throughput Illumina sequencing. FESEM imaging revealed the presence of complex microbial communities comprising diatoms, fungi and bacteria, mainly cyanobacteria and actinobacteria, on sample surfaces. Microbial diversity of brick specimens appeared higher than that of the wood and was dominated by algae and cyanobacteria, while wood was mainly colonized by fungi. DNA sequences documented the presence of 15 bacterial phyla representing 99 genera including Halomonas, Halorhodospira, Salinisphaera, Salinibacterium, Rubrobacter, Streptomyces, Arthrobacter and nine fungal classes represented by 113 genera including Cladosporium, Acremonium, Alternaria, Engyodontium, Penicillium, Rhizopus, and Aureobasidium. Most of the identified sequences were characteristic of organisms implicated in deterioration of wood and brick. Metabolomic data indicated the activation of numerous metabolic pathways, including those regulating the production of primary and secondary metabolites, for example, metabolites associated with the production of antibiotics, organic acids and deterioration of organic compounds. The study demonstrated that a combination of electron microscopy imaging with metabolomic and genomic techniques allows to link the phylogenetic information and metabolic profiles of microbial communities

  7. Who's for dinner? High-throughput sequencing reveals bat dietary differentiation in a biodiversity hotspot where prey taxonomy is largely undescribed.

    Science.gov (United States)

    Burgar, Joanna M; Murray, Daithi C; Craig, Michael D; Haile, James; Houston, Jayne; Stokes, Vicki; Bunce, Michael

    2014-08-01

    Effective management and conservation of biodiversity requires understanding of predator-prey relationships to ensure the continued existence of both predator and prey populations. Gathering dietary data from predatory species, such as insectivorous bats, often presents logistical challenges, further exacerbated in biodiversity hot spots because prey items are highly speciose, yet their taxonomy is largely undescribed. We used high-throughput sequencing (HTS) and bioinformatic analyses to phylogenetically group DNA sequences into molecular operational taxonomic units (MOTUs) to examine predator-prey dynamics of three sympatric insectivorous bat species in the biodiversity hotspot of south-western Australia. We could only assign between 4% and 20% of MOTUs to known genera or species, depending on the method used, underscoring the importance of examining dietary diversity irrespective of taxonomic knowledge in areas lacking a comprehensive genetic reference database. MOTU analysis confirmed that resource partitioning occurred, with dietary divergence positively related to the ecomorphological divergence of the three bat species. We predicted that bat species' diets would converge during times of high energetic requirements, that is, the maternity season for females and the mating season for males. There was an interactive effect of season on female, but not male, bat species' diets, although small sample sizes may have limited our findings. Contrary to our predictions, females of two ecomorphologically similar species showed dietary convergence during the mating season rather than the maternity season. HTS-based approaches can help elucidate complex predator-prey relationships in highly speciose regions, which should facilitate the conservation of biodiversity in genetically uncharacterized areas, such as biodiversity hotspots. © 2013 John Wiley & Sons Ltd.

  8. PCR cycles above routine numbers do not compromise high-throughput DNA barcoding results.

    Science.gov (United States)

    Vierna, J; Doña, J; Vizcaíno, A; Serrano, D; Jovani, R

    2017-10-01

    High-throughput DNA barcoding has become essential in ecology and evolution, but some technical questions still remain. Increasing the number of PCR cycles above the routine 20-30 cycles is a common practice when working with old-type specimens, which provide little amounts of DNA, or when facing annealing issues with the primers. However, increasing the number of cycles can raise the number of artificial mutations due to polymerase errors. In this work, we sequenced 20 COI libraries in the Illumina MiSeq platform. Libraries were prepared with 40, 45, 50, 55, and 60 PCR cycles from four individuals belonging to four species of four genera of cephalopods. We found no relationship between the number of PCR cycles and the number of mutations despite using a nonproofreading polymerase. Moreover, even when using a high number of PCR cycles, the resulting number of mutations was low enough not to be an issue in the context of high-throughput DNA barcoding (but may still remain an issue in DNA metabarcoding due to chimera formation). We conclude that the common practice of increasing the number of PCR cycles should not negatively impact the outcome of a high-throughput DNA barcoding study in terms of the occurrence of point mutations.

  9. Laboratory Information Management Software for genotyping workflows: applications in high throughput crop genotyping

    Directory of Open Access Journals (Sweden)

    Prasanth VP

    2006-08-01

    Full Text Available Abstract Background With the advances in DNA sequencer-based technologies, it has become possible to automate several steps of the genotyping process leading to increased throughput. To efficiently handle the large amounts of genotypic data generated and help with quality control, there is a strong need for a software system that can help with the tracking of samples and capture and management of data at different steps of the process. Such systems, while serving to manage the workflow precisely, also encourage good laboratory practice by standardizing protocols, recording and annotating data from every step of the workflow. Results A laboratory information management system (LIMS has been designed and implemented at the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT that meets the requirements of a moderately high throughput molecular genotyping facility. The application is designed as modules and is simple to learn and use. The application leads the user through each step of the process from starting an experiment to the storing of output data from the genotype detection step with auto-binning of alleles; thus ensuring that every DNA sample is handled in an identical manner and all the necessary data are captured. The application keeps track of DNA samples and generated data. Data entry into the system is through the use of forms for file uploads. The LIMS provides functions to trace back to the electrophoresis gel files or sample source for any genotypic data and for repeating experiments. The LIMS is being presently used for the capture of high throughput SSR (simple-sequence repeat genotyping data from the legume (chickpea, groundnut and pigeonpea and cereal (sorghum and millets crops of importance in the semi-arid tropics. Conclusion A laboratory information management system is available that has been found useful in the management of microsatellite genotype data in a moderately high throughput genotyping

  10. Applications of High-Throughput Nucleotide Sequencing (PhD)

    DEFF Research Database (Denmark)

    Waage, Johannes

    equally large demands in data handling, analysis and interpretation, perhaps defining the modern challenge of the computational biologist of the post-genomic era. The first part of this thesis consists of a general introduction to the history, common terms and challenges of next generation sequencing......-sequencing, a study of the effects on alternative RNA splicing of KO of the nonsense mediated RNA decay system in Mus, using digital gene expression and a custom-built exon-exon junction mapping pipeline is presented (article I). Evolved from this work, a Bioconductor package, spliceR, for classifying alternative...

  11. Searching for resistance genes to Bursaphelenchus xylophilus using high throughput screening

    Directory of Open Access Journals (Sweden)

    Santos Carla S

    2012-11-01

    Full Text Available Abstract Background Pine wilt disease (PWD, caused by the pinewood nematode (PWN; Bursaphelenchus xylophilus, damages and kills pine trees and is causing serious economic damage worldwide. Although the ecological mechanism of infestation is well described, the plant’s molecular response to the pathogen is not well known. This is due mainly to the lack of genomic information and the complexity of the disease. High throughput sequencing is now an efficient approach for detecting the expression of genes in non-model organisms, thus providing valuable information in spite of the lack of the genome sequence. In an attempt to unravel genes potentially involved in the pine defense against the pathogen, we hereby report the high throughput comparative sequence analysis of infested and non-infested stems of Pinus pinaster (very susceptible to PWN and Pinus pinea (less susceptible to PWN. Results Four cDNA libraries from infested and non-infested stems of P. pinaster and P. pinea were sequenced in a full 454 GS FLX run, producing a total of 2,083,698 reads. The putative amino acid sequences encoded by the assembled transcripts were annotated according to Gene Ontology, to assign Pinus contigs into Biological Processes, Cellular Components and Molecular Functions categories. Most of the annotated transcripts corresponded to Picea genes-25.4-39.7%, whereas a smaller percentage, matched Pinus genes, 1.8-12.8%, probably a consequence of more public genomic information available for Picea than for Pinus. The comparative transcriptome analysis showed that when P. pinaster was infested with PWN, the genes malate dehydrogenase, ABA, water deficit stress related genes and PAR1 were highly expressed, while in PWN-infested P. pinea, the highly expressed genes were ricin B-related lectin, and genes belonging to the SNARE and high mobility group families. Quantitative PCR experiments confirmed the differential gene expression between the two pine species

  12. Searching for resistance genes to Bursaphelenchus xylophilus using high throughput screening

    Science.gov (United States)

    2012-01-01

    Background Pine wilt disease (PWD), caused by the pinewood nematode (PWN; Bursaphelenchus xylophilus), damages and kills pine trees and is causing serious economic damage worldwide. Although the ecological mechanism of infestation is well described, the plant’s molecular response to the pathogen is not well known. This is due mainly to the lack of genomic information and the complexity of the disease. High throughput sequencing is now an efficient approach for detecting the expression of genes in non-model organisms, thus providing valuable information in spite of the lack of the genome sequence. In an attempt to unravel genes potentially involved in the pine defense against the pathogen, we hereby report the high throughput comparative sequence analysis of infested and non-infested stems of Pinus pinaster (very susceptible to PWN) and Pinus pinea (less susceptible to PWN). Results Four cDNA libraries from infested and non-infested stems of P. pinaster and P. pinea were sequenced in a full 454 GS FLX run, producing a total of 2,083,698 reads. The putative amino acid sequences encoded by the assembled transcripts were annotated according to Gene Ontology, to assign Pinus contigs into Biological Processes, Cellular Components and Molecular Functions categories. Most of the annotated transcripts corresponded to Picea genes-25.4-39.7%, whereas a smaller percentage, matched Pinus genes, 1.8-12.8%, probably a consequence of more public genomic information available for Picea than for Pinus. The comparative transcriptome analysis showed that when P. pinaster was infested with PWN, the genes malate dehydrogenase, ABA, water deficit stress related genes and PAR1 were highly expressed, while in PWN-infested P. pinea, the highly expressed genes were ricin B-related lectin, and genes belonging to the SNARE and high mobility group families. Quantitative PCR experiments confirmed the differential gene expression between the two pine species. Conclusions Defense-related genes

  13. Nuclear Species-Diagnostic SNP Markers Mined from 454 Amplicon Sequencing Reveal Admixture Genomic Structure of Modern Citrus Varieties

    Science.gov (United States)

    Curk, Franck; Ancillo, Gema; Ollitrault, Frédérique; Perrier, Xavier; Jacquemoud-Collet, Jean-Pierre; Garcia-Lor, Andres; Navarro, Luis; Ollitrault, Patrick

    2015-01-01

    Most cultivated Citrus species originated from interspecific hybridisation between four ancestral taxa (C. reticulata, C. maxima, C. medica, and C. micrantha) with limited further interspecific recombination due to vegetative propagation. This evolution resulted in admixture genomes with frequent interspecific heterozygosity. Moreover, a major part of the phenotypic diversity of edible citrus results from the initial differentiation between these taxa. Deciphering the phylogenomic structure of citrus germplasm is therefore essential for an efficient utilization of citrus biodiversity in breeding schemes. The objective of this work was to develop a set of species-diagnostic single nucleotide polymorphism (SNP) markers for the four Citrus ancestral taxa covering the nine chromosomes, and to use these markers to infer the phylogenomic structure of secondary species and modern cultivars. Species-diagnostic SNPs were mined from 454 amplicon sequencing of 57 gene fragments from 26 genotypes of the four basic taxa. Of the 1,053 SNPs mined from 28,507 kb sequence, 273 were found to be highly diagnostic for a single basic taxon. Species-diagnostic SNP markers (105) were used to analyse the admixture structure of varieties and rootstocks. This revealed C. maxima introgressions in most of the old and in all recent selections of mandarins, and suggested that C. reticulata × C. maxima reticulation and introgression processes were important in edible mandarin domestication. The large range of phylogenomic constitutions between C. reticulata and C. maxima revealed in mandarins, tangelos, tangors, sweet oranges, sour oranges, grapefruits, and orangelos is favourable for genetic association studies based on phylogenomic structures of the germplasm. Inferred admixture structures were in agreement with previous hypotheses regarding the origin of several secondary species and also revealed the probable origin of several acid citrus varieties. The developed species-diagnostic SNP

  14. The First Report of miRNAs from a Thysanopteran Insect, Thrips palmi Karny Using High-Throughput Sequencing.

    Directory of Open Access Journals (Sweden)

    K B Rebijith

    Full Text Available Thrips palmi Karny (Thysanoptera: Thripidae is the sole vector of Watermelon bud necrosis tospovirus, where the crop loss has been estimated to be around USD 50 million annually. Chemical insecticides are of limited use in the management of T. palmi due to the thigmokinetic behaviour and development of high levels of resistance to insecticides. There is an urgent need to find out an effective futuristic management strategy, where the small RNAs especially microRNAs hold great promise as a key player in the growth and development. miRNAs are a class of short non-coding RNAs involved in regulation of gene expression either by mRNA cleavage or by translational repression. We identified and characterized a total of 77 miRNAs from T. palmi using high-throughput deep sequencing. Functional classifications of the targets for these miRNAs revealed that majority of them are involved in the regulation of transcription and translation, nucleotide binding and signal transduction. We have also validated few of these miRNAs employing stem-loop RT-PCR, qRT-PCR and Northern blot. The present study not only provides an in-depth understanding of the biological and physiological roles of miRNAs in governing gene expression but may also lead as an invaluable tool for the management of thysanopteran insects in the future.

  15. High-Resolution Melt Analysis for Rapid Comparison of Bacterial Community Compositions

    DEFF Research Database (Denmark)

    Hjelmsø, Mathis Hjort; Hansen, Lars Hestbjerg; Bælum, Jacob

    2014-01-01

    In the study of bacterial community composition, 16S rRNA gene amplicon sequencing is today among the preferred methods of analysis. The cost of nucleotide sequence analysis, including requisite computational and bioinformatic steps, however, takes up a large part of many research budgets. High......-resolution melt (HRM) analysis is the study of the melt behavior of specific PCR products. Here we describe a novel high-throughput approach in which we used HRM analysis targeting the 16S rRNA gene to rapidly screen multiple complex samples for differences in bacterial community composition. We hypothesized...... that HRM analysis of amplified 16S rRNA genes from a soil ecosystem could be used as a screening tool to identify changes in bacterial community structure. This hypothesis was tested using a soil microcosm setup exposed to a total of six treatments representing different combinations of pesticide...

  16. Towards high-throughput phenotyping of complex patterned behaviors in rodents: focus on mouse self-grooming and its sequencing.

    Science.gov (United States)

    Kyzar, Evan; Gaikwad, Siddharth; Roth, Andrew; Green, Jeremy; Pham, Mimi; Stewart, Adam; Liang, Yiqing; Kobla, Vikrant; Kalueff, Allan V

    2011-12-01

    Increasingly recognized in biological psychiatry, rodent self-grooming is a complex patterned behavior with evolutionarily conserved cephalo-caudal progression. While grooming is traditionally assessed by the latency, frequency and duration, its sequencing represents another important domain sensitive to various experimental manipulations. Such behavioral complexity requires novel objective approaches to quantify rodent grooming, in addition to time-consuming and highly variable manual observation. The present study combined modern behavior-recognition video-tracking technologies (CleverSys, Inc.) with manual observation to characterize in-depth spontaneous (novelty-induced) and artificial (water-induced) self-grooming in adult male C57BL/6J mice. We specifically focused on individual episodes of grooming (paw licking, head washing, body/leg washing, and tail/genital grooming), their duration and transitions between episodes. Overall, the frequency, duration and transitions detected using the automated approach significantly correlated with manual observations (R=0.51-0.7, pgrooming, also indicating that behavior-recognition tools can be applied to characterize both the amount and sequential organization (patterning) of rodent grooming. Together with further refinement and methodological advancement, this approach will foster high-throughput neurophenotyping of grooming, with multiple applications in drug screening and testing of genetically modified animals. Copyright © 2011 Elsevier B.V. All rights reserved.

  17. Arioc: high-throughput read alignment with GPU-accelerated exploration of the seed-and-extend search space

    Directory of Open Access Journals (Sweden)

    Richard Wilton

    2015-03-01

    Full Text Available When computing alignments of DNA sequences to a large genome, a key element in achieving high processing throughput is to prioritize locations in the genome where high-scoring mappings might be expected. We formulated this task as a series of list-processing operations that can be efficiently performed on graphics processing unit (GPU hardware.We followed this approach in implementing a read aligner called Arioc that uses GPU-based parallel sort and reduction techniques to identify high-priority locations where potential alignments may be found. We then carried out a read-by-read comparison of Arioc’s reported alignments with the alignments found by several leading read aligners. With simulated reads, Arioc has comparable or better accuracy than the other read aligners we tested. With human sequencing reads, Arioc demonstrates significantly greater throughput than the other aligners we evaluated across a wide range of sensitivity settings. The Arioc software is available at https://github.com/RWilton/Arioc. It is released under a BSD open-source license.

  18. High-throughput metagenomic technologies for complex microbial community analysis: open and closed formats.

    Science.gov (United States)

    Zhou, Jizhong; He, Zhili; Yang, Yunfeng; Deng, Ye; Tringe, Susannah G; Alvarez-Cohen, Lisa

    2015-01-27

    Understanding the structure, functions, activities and dynamics of microbial communities in natural environments is one of the grand challenges of 21st century science. To address this challenge, over the past decade, numerous technologies have been developed for interrogating microbial communities, of which some are amenable to exploratory work (e.g., high-throughput sequencing and phenotypic screening) and others depend on reference genes or genomes (e.g., phylogenetic and functional gene arrays). Here, we provide a critical review and synthesis of the most commonly applied "open-format" and "closed-format" detection technologies. We discuss their characteristics, advantages, and disadvantages within the context of environmental applications and focus on analysis of complex microbial systems, such as those in soils, in which diversity is high and reference genomes are few. In addition, we discuss crucial issues and considerations associated with applying complementary high-throughput molecular technologies to address important ecological questions. Copyright © 2015 Zhou et al.

  19. Differential Expression and Functional Analysis of High-Throughput -Omics Data Using Open Source Tools.

    Science.gov (United States)

    Kebschull, Moritz; Fittler, Melanie Julia; Demmer, Ryan T; Papapanou, Panos N

    2017-01-01

    Today, -omics analyses, including the systematic cataloging of messenger RNA and microRNA sequences or DNA methylation patterns in a cell population, organ, or tissue sample, allow for an unbiased, comprehensive genome-level analysis of complex diseases, offering a large advantage over earlier "candidate" gene or pathway analyses. A primary goal in the analysis of these high-throughput assays is the detection of those features among several thousand that differ between different groups of samples. In the context of oral biology, our group has successfully utilized -omics technology to identify key molecules and pathways in different diagnostic entities of periodontal disease.A major issue when inferring biological information from high-throughput -omics studies is the fact that the sheer volume of high-dimensional data generated by contemporary technology is not appropriately analyzed using common statistical methods employed in the biomedical sciences.In this chapter, we outline a robust and well-accepted bioinformatics workflow for the initial analysis of -omics data generated using microarrays or next-generation sequencing technology using open-source tools. Starting with quality control measures and necessary preprocessing steps for data originating from different -omics technologies, we next outline a differential expression analysis pipeline that can be used for data from both microarray and sequencing experiments, and offers the possibility to account for random or fixed effects. Finally, we present an overview of the possibilities for a functional analysis of the obtained data.

  20. High Throughput PBTK: Open-Source Data and Tools for ...

    Science.gov (United States)

    Presentation on High Throughput PBTK at the PBK Modelling in Risk Assessment meeting in Ispra, Italy Presentation on High Throughput PBTK at the PBK Modelling in Risk Assessment meeting in Ispra, Italy

  1. An integrated tool to study MHC region: accurate SNV detection and HLA genes typing in human MHC region using targeted high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Hongzhi Cao

    Full Text Available The major histocompatibility complex (MHC is one of the most variable and gene-dense regions of the human genome. Most studies of the MHC, and associated regions, focus on minor variants and HLA typing, many of which have been demonstrated to be associated with human disease susceptibility and metabolic pathways. However, the detection of variants in the MHC region, and diagnostic HLA typing, still lacks a coherent, standardized, cost effective and high coverage protocol of clinical quality and reliability. In this paper, we presented such a method for the accurate detection of minor variants and HLA types in the human MHC region, using high-throughput, high-coverage sequencing of target regions. A probe set was designed to template upon the 8 annotated human MHC haplotypes, and to encompass the 5 megabases (Mb of the extended MHC region. We deployed our probes upon three, genetically diverse human samples for probe set evaluation, and sequencing data show that ∼97% of the MHC region, and over 99% of the genes in MHC region, are covered with sufficient depth and good evenness. 98% of genotypes called by this capture sequencing prove consistent with established HapMap genotypes. We have concurrently developed a one-step pipeline for calling any HLA type referenced in the IMGT/HLA database from this target capture sequencing data, which shows over 96% typing accuracy when deployed at 4 digital resolution. This cost-effective and highly accurate approach for variant detection and HLA typing in the MHC region may lend further insight into immune-mediated diseases studies, and may find clinical utility in transplantation medicine research. This one-step pipeline is released for general evaluation and use by the scientific community.

  2. The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing

    DEFF Research Database (Denmark)

    Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P

    2007-01-01

    BACKGROUND: The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine...... primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regards to the distribution...

  3. Strong selective sweeps associated with ampliconic regions in great ape X chromosomes

    DEFF Research Database (Denmark)

    Nam, Kiwoong; Munch, Kasper; Hobolth, Asger

    2014-01-01

    The unique inheritance pattern of X chromosomes makes them preferential targets of adaptive evolution. We here investigate natural selection on the X chromosome in all species of great apes. We find that diversity is more strongly reduced around genes on the X compared with autosomes...... with ampliconic sequences we propose that intra-genomic conflict between the X and the Y chromosomes is a major driver of X chromosome evolution....

  4. Distribution and Diversity of Bacteria and Fungi Colonization in Stone Monuments Analyzed by High-Throughput Sequencing.

    Science.gov (United States)

    Li, Qiang; Zhang, Bingjian; He, Zhang; Yang, Xiaoru

    The historical and cultural heritage of Qingxing palace and Lingyin and Kaihua temple, located in Hangzhou of China, include a large number of exquisite Buddhist statues and ancient stone sculptures which date back to the Northern Song (960-1219 A.D.) and Qing dynasties (1636-1912 A.D.) and are considered to be some of the best examples of ancient stone sculpting techniques. They were added to the World Heritage List in 2011 because of their unique craftsmanship and importance to the study of ancient Chinese Buddhist culture. However, biodeterioration of the surface of the ancient Buddhist statues and white marble pillars not only severely impairs their aesthetic value but also alters their material structure and thermo-hygric properties. In this study, high-throughput sequencing was utilized to identify the microbial communities colonizing the stone monuments. The diversity and distribution of the microbial communities in six samples collected from three different environmental conditions with signs of deterioration were analyzed by means of bioinformatics software and diversity indices. In addition, the impact of environmental factors, including temperature, light intensity, air humidity, and the concentration of NO2 and SO2, on the microbial communities' diversity and distribution was evaluated. The results indicate that the presence of predominantly phototrophic microorganisms was correlated with light and humidity, while nitrifying bacteria and Thiobacillus were associated with NO2 and SO2 from air pollution.

  5. High-throughput sequencing approach uncovers the miRNome of peritoneal endometriotic lesions and adjacent healthy tissues.

    Directory of Open Access Journals (Sweden)

    Merli Saare

    Full Text Available Accumulating data have shown the involvement of microRNAs (miRNAs in endometriosis pathogenesis. In this study, we used a novel approach to determine the endometriotic lesion-specific miRNAs by high-throughput small RNA sequencing of paired samples of peritoneal endometriotic lesions and matched healthy surrounding tissues together with eutopic endometria of the same patients. We found five miRNAs specific to epithelial cells--miR-34c, miR-449a, miR-200a, miR-200b and miR-141 showing significantly higher expression in peritoneal endometriotic lesions compared to healthy peritoneal tissues. We also determined the expression levels of miR-200 family target genes E-cadherin, ZEB1 and ZEB2 and found that the expression level of E-cadherin was significantly higher in endometriotic lesions compared to healthy tissues. Further evaluation verified that studied miRNAs could be used as diagnostic markers for confirming the presence of endometrial cells in endometriotic lesion biopsy samples. Furthermore, we demonstrated that the miRNA profile of peritoneal endometriotic lesion biopsies is largely masked by the surrounding peritoneal tissue, challenging the discovery of an accurate lesion-specific miRNA profile. Taken together, our findings indicate that only particular miRNAs with a significantly higher expression in endometriotic cells can be detected from lesion biopsies, and can serve as diagnostic markers for endometriosis.

  6. Metabolomic and high-throughput sequencing analysis – modern approach for the assessment of biodeterioration of materials from historic buildings

    Directory of Open Access Journals (Sweden)

    Beata eGutarowska

    2015-09-01

    Full Text Available Preservation of cultural heritage is of paramount importance worldwide. Microbial colonization of construction materials, such as wood, brick, mortar and stone in historic buildings can lead to severe deterioration. The aim of the present study was to give modern insight into the phylogenetic diversity and activated metabolic pathways of microbial communities colonized historic objects located in the former Auschwitz II-Birkenau concentration and extermination camp in Oświęcim, Poland. For this purpose we combined molecular, microscopic and chemical methods. Selected specimens were examined using Field Emission Scanning Electron Microscopy (FESEM, metabolomic analysis and high-throughput Illumina sequencing. FESEM imaging revealed the presence of complex microbial communities comprising diatoms, fungi and bacteria, mainly cyanobacteria and actinobacteria, on sample surfaces. Microbial diversity of brick specimens appeared higher than that of the wood and was dominated by algae and cyanobacteria, while wood was mainly colonized by fungi. DNA sequences documented the presence of 15 bacterial phyla representing 99 genera including Halomonas, Halorhodospira, Salinisphaera, Salinibacterium, Rubrobacter, Streptomyces, Arthrobacter and 9 fungal classes represented by 113 genera including Cladosporium, Acremonium, Alternaria, Engyodontium, Penicillium, Rhizopus and Aureobasidium. Most of the identified sequences were characteristic of organisms implicated in deterioration of wood and brick. Metabolomic data indicated the activation of numerous metabolic pathways, including those regulating the production of primary and secondary metabolites, for example, metabolites associated with the production of antibiotics, organic acids and deterioration of organic compounds. The study demonstrated that a combination of electron microscopy imaging with metabolomic and genomic techniques allows to link the phylogenetic information and metabolic profiles of

  7. High throughput screening method for assessing heterogeneity of microorganisms

    NARCIS (Netherlands)

    Ingham, C.J.; Sprenkels, A.J.; van Hylckama Vlieg, J.E.T.; Bomer, Johan G.; de Vos, W.M.; van den Berg, Albert

    2006-01-01

    The invention relates to the field of microbiology. Provided is a method which is particularly powerful for High Throughput Screening (HTS) purposes. More specific a high throughput method for determining heterogeneity or interactions of microorganisms is provided.

  8. 16S rRNA Amplicon Sequencing Demonstrates that Indoor-Reared Bumblebees (Bombus terrestris Harbor a Core Subset of Bacteria Normally Associated with the Wild Host.

    Directory of Open Access Journals (Sweden)

    Ivan Meeus

    Full Text Available A MiSeq multiplexed 16S rRNA amplicon sequencing of the gut microbiota of wild and indoor-reared Bombus terrestris (bumblebees confirmed the presence of a core set of bacteria, which consisted of Neisseriaceae (Snodgrassella, Orbaceae (Gilliamella, Lactobacillaceae (Lactobacillus, and Bifidobacteriaceae (Bifidobacterium. In wild B. terrestris we detected several non-core bacteria having a more variable prevalence. Although Enterobacteriaceae are unreported by non next-generation sequencing studies, it can become a dominant gut resident. Furthermore the presence of some non-core lactobacilli were associated with the relative abundance of bifidobacteria. This association was not observed in indoor-reared bumblebees lacking the non-core bacteria, but having a more standardized microbiota compared to their wild counterparts. The impact of the bottleneck microbiota of indoor-reared bumblebees when they are used in the field for pollination purpose is discussed.

  9. An efficient and high fidelity method for amplification, cloning and sequencing of complete tospovirus genomic RNA segments

    Science.gov (United States)

    Amplification and sequencing of the complete M- and S-RNA segments of Tomato spotted wilt virus and Impatiens necrotic spot virus as a single fragment is useful for whole genome sequencing of tospoviruses co-infecting a single host plant. It avoids issues associated with overlapping amplicon-based ...

  10. High-Throughput Analysis With 96-Capillary Array Electrophoresis and Integrated Sample Preparation for DNA Sequencing Based on Laser Induced Fluorescence Detection

    Energy Technology Data Exchange (ETDEWEB)

    Xue, Gang [Iowa State Univ., Ames, IA (United States)

    2001-01-01

    The purpose of this research was to improve the fluorescence detection for the multiplexed capillary array electrophoresis, extend its use beyond the genomic analysis, and to develop an integrated micro-sample preparation system for high-throughput DNA sequencing. The authors first demonstrated multiplexed capillary zone electrophoresis (CZE) and micellar electrokinetic chromatography (MEKC) separations in a 96-capillary array system with laser-induced fluorescence detection. Migration times of four kinds of fluoresceins and six polyaromatic hydrocarbons (PAHs) are normalized to one of the capillaries using two internal standards. The relative standard deviations (RSD) after normalization are 0.6-1.4% for the fluoresceins and 0.1-1.5% for the PAHs. Quantitative calibration of the separations based on peak areas is also performed, again with substantial improvement over the raw data. This opens up the possibility of performing massively parallel separations for high-throughput chemical analysis for process monitoring, combinatorial synthesis, and clinical diagnosis. The authors further improved the fluorescence detection by step laser scanning. A computer-controlled galvanometer scanner is adapted for scanning a focused laser beam across a 96-capillary array for laser-induced fluorescence detection. The signal at a single photomultiplier tube is temporally sorted to distinguish among the capillaries. The limit of detection for fluorescein is 3 x 10-11 M (S/N = 3) for 5-mW of total laser power scanned at 4 Hz. The observed cross-talk among capillaries is 0.2%. Advantages include the efficient utilization of light due to the high duty-cycle of step scan, good detection performance due to the reduction of stray light, ruggedness due to the small mass of the galvanometer mirror, low cost due to the simplicity of components, and flexibility due to the independent paths for excitation and emission.

  11. FRESCO: Referential compression of highly similar sequences.

    Science.gov (United States)

    Wandelt, Sebastian; Leser, Ulf

    2013-01-01

    In many applications, sets of similar texts or sequences are of high importance. Prominent examples are revision histories of documents or genomic sequences. Modern high-throughput sequencing technologies are able to generate DNA sequences at an ever-increasing rate. In parallel to the decreasing experimental time and cost necessary to produce DNA sequences, computational requirements for analysis and storage of the sequences are steeply increasing. Compression is a key technology to deal with this challenge. Recently, referential compression schemes, storing only the differences between a to-be-compressed input and a known reference sequence, gained a lot of interest in this field. In this paper, we propose a general open-source framework to compress large amounts of biological sequence data called Framework for REferential Sequence COmpression (FRESCO). Our basic compression algorithm is shown to be one to two orders of magnitudes faster than comparable related work, while achieving similar compression ratios. We also propose several techniques to further increase compression ratios, while still retaining the advantage in speed: 1) selecting a good reference sequence; and 2) rewriting a reference sequence to allow for better compression. In addition,we propose a new way of further boosting the compression ratios by applying referential compression to already referentially compressed files (second-order compression). This technique allows for compression ratios way beyond state of the art, for instance,4,000:1 and higher for human genomes. We evaluate our algorithms on a large data set from three different species (more than 1,000 genomes, more than 3 TB) and on a collection of versions of Wikipedia pages. Our results show that real-time compression of highly similar sequences at high compression ratios is possible on modern hardware.

  12. High-throughput sample adaptive offset hardware architecture for high-efficiency video coding

    Science.gov (United States)

    Zhou, Wei; Yan, Chang; Zhang, Jingzhi; Zhou, Xin

    2018-03-01

    A high-throughput hardware architecture for a sample adaptive offset (SAO) filter in the high-efficiency video coding video coding standard is presented. First, an implementation-friendly and simplified bitrate estimation method of rate-distortion cost calculation is proposed to reduce the computational complexity in the mode decision of SAO. Then, a high-throughput VLSI architecture for SAO is presented based on the proposed bitrate estimation method. Furthermore, multiparallel VLSI architecture for in-loop filters, which integrates both deblocking filter and SAO filter, is proposed. Six parallel strategies are applied in the proposed in-loop filters architecture to improve the system throughput and filtering speed. Experimental results show that the proposed in-loop filters architecture can achieve up to 48% higher throughput in comparison with prior work. The proposed architecture can reach a high-operating clock frequency of 297 MHz with TSMC 65-nm library and meet the real-time requirement of the in-loop filters for 8 K × 4 K video format at 132 fps.

  13. Not all are free-living: high-throughput DNA metabarcoding reveals a diverse community of protists parasitizing soil metazoa

    NARCIS (Netherlands)

    Geisen, S.; Laros, I.; Vizcaino, A.; Bonkowski, M.; Groot, de G.A.

    2015-01-01

    Protists, the most diverse eukaryotes, are largely considered to be free-living bacterivores, but vast numbers of taxa are known to parasitize plants or animals. High-throughput sequencing (HTS) approaches now commonly replace cultivation-based approaches in studying soil protists, but insights into

  14. Raman-Activated Droplet Sorting (RADS) for Label-Free High-Throughput Screening of Microalgal Single-Cells.

    Science.gov (United States)

    Wang, Xixian; Ren, Lihui; Su, Yetian; Ji, Yuetong; Liu, Yaoping; Li, Chunyu; Li, Xunrong; Zhang, Yi; Wang, Wei; Hu, Qiang; Han, Danxiang; Xu, Jian; Ma, Bo

    2017-11-21

    Raman-activated cell sorting (RACS) has attracted increasing interest, yet throughput remains one major factor limiting its broader application. Here we present an integrated Raman-activated droplet sorting (RADS) microfluidic system for functional screening of live cells in a label-free and high-throughput manner, by employing AXT-synthetic industrial microalga Haematococcus pluvialis (H. pluvialis) as a model. Raman microspectroscopy analysis of individual cells is carried out prior to their microdroplet encapsulation, which is then directly coupled to DEP-based droplet sorting. To validate the system, H. pluvialis cells containing different levels of AXT were mixed and underwent RADS. Those AXT-hyperproducing cells were sorted with an accuracy of 98.3%, an enrichment ratio of eight folds, and a throughput of ∼260 cells/min. Of the RADS-sorted cells, 92.7% remained alive and able to proliferate, which is equivalent to the unsorted cells. Thus, the RADS achieves a much higher throughput than existing RACS systems, preserves the vitality of cells, and facilitates seamless coupling with downstream manipulations such as single-cell sequencing and cultivation.

  15. Function of the EGR-1/TIS8 radiation inducible promoter in a minimal HSV-1 amplicon system

    International Nuclear Information System (INIS)

    Spear, M.A.; Sakamoto, K.M.; Herrlinger, U.; Pechan, P.; Breakefield, X.O.

    1997-01-01

    Purpose: To evaluate function of the EGR-1/TIS8 promoter region in minimal HSV-1 amplicon system in order to determine the feasibility of using the system to regulate vector replication with radiation. Materials and Methods: A 600-base pair 5' upstream region of the EGR-1 promoter linked to chloramphenicol acetyltransferase (CAT) was recombined into a minimal HSV-1 amplicon vector system (pONEC). pONEC or a control plasmid was transfected into U87 glioma cells using the Lipofectamine method. Thirty-six hours later one aliquot of cells from each transfection was irradiated to a dose of 20 Gy and another identical aliquot served as a control. CAT activity was assayed 8 hours after irradiation. Results: pONEC transfected cells irradiated with 20 Gy demonstrated 2.0 fold increase in CAT activity compared to non-irradiated cells. Cells transfected with control plasmid showed no change in CAT activity. Unirradiated pONEC cells had CAT activity 1.3 times those transfected with control plasmid. Conclusion: We have previously created HSV-1 gene therapy amplicon vector systems which allow virus-amplicon interdependent replication, with the intent of regulating replication. The above data demonstrates that a minimal amplicon system will allow radiation dependent regulation by the EGR-1 promoter, thus indicating the possibility of using this system to regulate onsite, spatially and temporally regulated vector production. Baseline CAT activity was higher and relative induction lower than other reported expression constructs, which raises concern for the ability of the system to produce a differential in transcription levels sufficient for this purpose. This is possibly the result of residual promoter/enhancer elements remaining in the HSV-1 sequences. We are attempting to create constructs lacking these elements. Addition of secondary promoter sequences may also be of use. We are also currently evaluating the efficacy of the putative IEX-1 radiation inducible promoter region in

  16. High Throughput Determinations of Critical Dosing Parameters (IVIVE workshop)

    Science.gov (United States)

    High throughput toxicokinetics (HTTK) is an approach that allows for rapid estimations of TK for hundreds of environmental chemicals. HTTK-based reverse dosimetry (i.e, reverse toxicokinetics or RTK) is used in order to convert high throughput in vitro toxicity screening (HTS) da...

  17. Viral metagenomics: Analysis of begomoviruses by illumina high-throughput sequencing

    KAUST Repository

    Idris, Ali

    2014-03-12

    Traditional DNA sequencing methods are inefficient, lack the ability to discern the least abundant viral sequences, and ineffective for determining the extent of variability in viral populations. Here, populations of single-stranded DNA plant begomoviral genomes and their associated beta- and alpha-satellite molecules (virus-satellite complexes) (genus, Begomovirus; family, Geminiviridae) were enriched from total nucleic acids isolated from symptomatic, field-infected plants, using rolling circle amplification (RCA). Enriched virus-satellite complexes were subjected to Illumina-Next Generation Sequencing (NGS). CASAVA and SeqMan NGen programs were implemented, respectively, for quality control and for de novo and reference-guided contig assembly of viral-satellite sequences. The authenticity of the begomoviral sequences, and the reproducibility of the Illumina-NGS approach for begomoviral deep sequencing projects, were validated by comparing NGS results with those obtained using traditional molecular cloning and Sanger sequencing of viral components and satellite DNAs, also enriched by RCA or amplified by polymerase chain reaction. As the use of NGS approaches, together with advances in software development, make possible deep sequence coverage at a lower cost; the approach described herein will streamline the exploration of begomovirus diversity and population structure from naturally infected plants, irrespective of viral abundance. This is the first report of the implementation of Illumina-NGS to explore the diversity and identify begomoviral-satellite SNPs directly from plants naturally-infected with begomoviruses under field conditions. 2014 by the authors; licensee MDPI, Basel, Switzerland.

  18. Viral Metagenomics: Analysis of Begomoviruses by Illumina High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Ali Idris

    2014-03-01

    Full Text Available Traditional DNA sequencing methods are inefficient, lack the ability to discern the least abundant viral sequences, and ineffective for determining the extent of variability in viral populations. Here, populations of single-stranded DNA plant begomoviral genomes and their associated beta- and alpha-satellite molecules (virus-satellite complexes (genus, Begomovirus; family, Geminiviridae were enriched from total nucleic acids isolated from symptomatic, field-infected plants, using rolling circle amplification (RCA. Enriched virus-satellite complexes were subjected to Illumina-Next Generation Sequencing (NGS. CASAVA and SeqMan NGen programs were implemented, respectively, for quality control and for de novo and reference-guided contig assembly of viral-satellite sequences. The authenticity of the begomoviral sequences, and the reproducibility of the Illumina-NGS approach for begomoviral deep sequencing projects, were validated by comparing NGS results with those obtained using traditional molecular cloning and Sanger sequencing of viral components and satellite DNAs, also enriched by RCA or amplified by polymerase chain reaction. As the use of NGS approaches, together with advances in software development, make possible deep sequence coverage at a lower cost; the approach described herein will streamline the exploration of begomovirus diversity and population structure from naturally infected plants, irrespective of viral abundance. This is the first report of the implementation of Illumina-NGS to explore the diversity and identify begomoviral-satellite SNPs directly from plants naturally-infected with begomoviruses under field conditions.

  19. Identifying driver mutations in sequenced cancer genomes

    DEFF Research Database (Denmark)

    Raphael, Benjamin J; Dobson, Jason R; Oesper, Layla

    2014-01-01

    High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, nois...... patterns of mutual exclusivity. These techniques, coupled with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis and treatment of cancer....

  20. High-throughput sequencing and analysis of the gill tissue transcriptome from the deep-sea hydrothermal vent mussel Bathymodiolus azoricus

    Directory of Open Access Journals (Sweden)

    Gomes Paula

    2010-10-01

    Full Text Available Abstract Background Bathymodiolus azoricus is a deep-sea hydrothermal vent mussel found in association with large faunal communities living in chemosynthetic environments at the bottom of the sea floor near the Azores Islands. Investigation of the exceptional physiological reactions that vent mussels have adopted in their habitat, including responses to environmental microbes, remains a difficult challenge for deep-sea biologists. In an attempt to reveal genes potentially involved in the deep-sea mussel innate immunity we carried out a high-throughput sequence analysis of freshly collected B. azoricus transcriptome using gills tissues as the primary source of immune transcripts given its strategic role in filtering the surrounding waterborne potentially infectious microorganisms. Additionally, a substantial EST data set was produced and from which a comprehensive collection of genes coding for putative proteins was organized in a dedicated database, "DeepSeaVent" the first deep-sea vent animal transcriptome database based on the 454 pyrosequencing technology. Results A normalized cDNA library from gills tissue was sequenced in a full 454 GS-FLX run, producing 778,996 sequencing reads. Assembly of the high quality reads resulted in 75,407 contigs of which 3,071 were singletons. A total of 39,425 transcripts were conceptually translated into amino-sequences of which 22,023 matched known proteins in the NCBI non-redundant protein database, 15,839 revealed conserved protein domains through InterPro functional classification and 9,584 were assigned with Gene Ontology terms. Queries conducted within the database enabled the identification of genes putatively involved in immune and inflammatory reactions which had not been previously evidenced in the vent mussel. Their physical counterpart was confirmed by semi-quantitative quantitative Reverse-Transcription-Polymerase Chain Reactions (RT-PCR and their RNA transcription level by quantitative PCR (q

  1. Denoising PCR-amplified metagenome data

    Directory of Open Access Journals (Sweden)

    Rosen Michael J

    2012-10-01

    Full Text Available Abstract Background PCR amplification and high-throughput sequencing theoretically enable the characterization of the finest-scale diversity in natural microbial and viral populations, but each of these methods introduces random errors that are difficult to distinguish from genuine biological diversity. Several approaches have been proposed to denoise these data but lack either speed or accuracy. Results We introduce a new denoising algorithm that we call DADA (Divisive Amplicon Denoising Algorithm. Without training data, DADA infers both the sample genotypes and error parameters that produced a metagenome data set. We demonstrate performance on control data sequenced on Roche’s 454 platform, and compare the results to the most accurate denoising software currently available, AmpliconNoise. Conclusions DADA is more accurate and over an order of magnitude faster than AmpliconNoise. It eliminates the need for training data to establish error parameters, fully utilizes sequence-abundance information, and enables inclusion of context-dependent PCR error rates. It should be readily extensible to other sequencing platforms such as Illumina.

  2. Applications of ambient mass spectrometry in high-throughput screening.

    Science.gov (United States)

    Li, Li-Ping; Feng, Bao-Sheng; Yang, Jian-Wang; Chang, Cui-Lan; Bai, Yu; Liu, Hu-Wei

    2013-06-07

    The development of rapid screening and identification techniques is of great importance for drug discovery, doping control, forensic identification, food safety and quality control. Ambient mass spectrometry (AMS) allows rapid and direct analysis of various samples in open air with little sample preparation. Recently, its applications in high-throughput screening have been in rapid progress. During the past decade, various ambient ionization techniques have been developed and applied in high-throughput screening. This review discusses typical applications of AMS, including DESI (desorption electrospray ionization), DART (direct analysis in real time), EESI (extractive electrospray ionization), etc., in high-throughput screening (HTS).

  3. GlycoExtractor: a web-based interface for high throughput processing of HPLC-glycan data.

    Science.gov (United States)

    Artemenko, Natalia V; Campbell, Matthew P; Rudd, Pauline M

    2010-04-05

    Recently, an automated high-throughput HPLC platform has been developed that can be used to fully sequence and quantify low concentrations of N-linked sugars released from glycoproteins, supported by an experimental database (GlycoBase) and analytical tools (autoGU). However, commercial packages that support the operation of HPLC instruments and data storage lack platforms for the extraction of large volumes of data. The lack of resources and agreed formats in glycomics is now a major limiting factor that restricts the development of bioinformatic tools and automated workflows for high-throughput HPLC data analysis. GlycoExtractor is a web-based tool that interfaces with a commercial HPLC database/software solution to facilitate the extraction of large volumes of processed glycan profile data (peak number, peak areas, and glucose unit values). The tool allows the user to export a series of sample sets to a set of file formats (XML, JSON, and CSV) rather than a collection of disconnected files. This approach not only reduces the amount of manual refinement required to export data into a suitable format for data analysis but also opens the field to new approaches for high-throughput data interpretation and storage, including biomarker discovery and validation and monitoring of online bioprocessing conditions for next generation biotherapeutics.

  4. The keystone species of Precambrian deep bedrock biosphere belong to Burkholderiales and Clostridiales

    OpenAIRE

    L. Purkamo; M. Bomberg; R. Kietäväinen; H. Salavirta; M. Nyyssönen; M. Nuppunen-Puputti; L. Ahonen; I. Kukkonen; M. Itävaara

    2015-01-01

    The bacterial and archaeal community composition and the possible carbon assimilation processes and energy sources of microbial communities in oligotrophic, deep, crystalline bedrock fractures is yet to be resolved. In this study, intrinsic microbial communities from six fracture zones from 180–2300 m depths in Outokumpu bedrock were characterized using high-throughput amplicon sequencing and metagenomic prediction. Comamonadaceae-, ...

  5. Generalized schemes for high throughput manipulation of the Desulfovibrio vulgaris Hildenborough genome

    Energy Technology Data Exchange (ETDEWEB)

    Chhabra, S.R.; Butland, G.; Elias, D.; Chandonia, J.-M.; Fok, V.; Juba, T.; Gorur, A.; Allen, S.; Leung, C.-M.; Keller, K.; Reveco, S.; Zane, G.; Semkiw, E.; Prathapam, R.; Gold, B.; Singer, M.; Ouellet, M.; Sazakal, E.; Jorgens, D.; Price, M.; Witkowska, E.; Beller, H.; Hazen, T.C.; Biggin, M.; Auer, M.; Wall, J.; Keasling, J.

    2011-07-15

    The ability to conduct advanced functional genomic studies of the thousands of sequenced bacteria has been hampered by the lack of available tools for making high- throughput chromosomal manipulations in a systematic manner that can be applied across diverse species. In this work, we highlight the use of synthetic biological tools to assemble custom suicide vectors with reusable and interchangeable DNA “parts” to facilitate chromosomal modification at designated loci. These constructs enable an array of downstream applications including gene replacement and creation of gene fusions with affinity purification or localization tags. We employed this approach to engineer chromosomal modifications in a bacterium that has previously proven difficult to manipulate genetically, Desulfovibrio vulgaris Hildenborough, to generate a library of over 700 strains. Furthermore, we demonstrate how these modifications can be used for examining metabolic pathways, protein-protein interactions, and protein localization. The ubiquity of suicide constructs in gene replacement throughout biology suggests that this approach can be applied to engineer a broad range of species for a diverse array of systems biological applications and is amenable to high-throughput implementation.

  6. Conservative fragments in bacterial 16S rRNA genes and primer design for 16S ribosomal DNA amplicons in metagenomic studies

    KAUST Repository

    Wang, Yong

    2009-10-09

    Bacterial 16S ribosomal DNA (rDNA) amplicons have been widely used in the classification of uncultured bacteria inhabiting environmental niches. Primers targeting conservative regions of the rDNAs are used to generate amplicons of variant regions that are informative in taxonomic assignment. One problem is that the percentage coverage and application scope of the primers used in previous studies are largely unknown. In this study, conservative fragments of available rDNA sequences were first mined and then used to search for candidate primers within the fragments by measuring the coverage rate defined as the percentage of bacterial sequences containing the target. Thirty predicted primers with a high coverage rate (>90%) were identified, which were basically located in the same conservative regions as known primers in previous reports, whereas 30% of the known primers were associated with a coverage rate of <90%. The application scope of the primers was also examined by calculating the percentages of failed detections in bacterial phyla. Primers A519-539, E969- 983, E1063-1081, U515 and E517, are highly recommended because of their high coverage in almost all phyla. As expected, the three predominant phyla, Firmicutes, Gemmatimonadetes and Proteobacteria, are best covered by the predicted primers. The primers recommended in this report shall facilitate a comprehensive and reliable survey of bacterial diversity in metagenomic studies. © 2009 Wang, Qian.

  7. Detection of a Usp-like gene in Calotropis procera plant from the de novo assembled genome contigs of the high-throughput sequencing dataset

    KAUST Repository

    Shokry, Ahmed M.

    2014-02-01

    The wild plant species Calotropis procera (C. procera) has many potential applications and beneficial uses in medicine, industry and ornamental field. It also represents an excellent source of genes for drought and salt tolerance. Genes encoding proteins that contain the conserved universal stress protein (USP) domain are known to provide organisms like bacteria, archaea, fungi, protozoa and plants with the ability to respond to a plethora of environmental stresses. However, information on the possible occurrence of Usp in C. procera is not available. In this study, we uncovered and characterized a one-class A Usp-like (UspA-like, NCBI accession No. KC954274) gene in this medicinal plant from the de novo assembled genome contigs of the high-throughput sequencing dataset. A number of GenBank accessions for Usp sequences were blasted with the recovered de novo assembled contigs. Homology modelling of the deduced amino acids (NCBI accession No. AGT02387) was further carried out using Swiss-Model, accessible via the EXPASY. Superimposition of C. procera USPA-like full sequence model on Thermus thermophilus USP UniProt protein (PDB accession No. Q5SJV7) was constructed using RasMol and Deep-View programs. The functional domains of the novel USPA-like amino acids sequence were identified from the NCBI conserved domain database (CDD) that provide insights into sequence structure/function relationships, as well as domain models imported from a number of external source databases (Pfam, SMART, COG, PRK, TIGRFAM). © 2014 Académie des sciences.

  8. High Throughput Sample Preparation and Analysis for DNA Sequencing, PCR and Combinatorial Screening of Catalysis Based on Capillary Array Technique

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Yonghua [Iowa State Univ., Ames, IA (United States)

    2000-01-01

    Sample preparation has been one of the major bottlenecks for many high throughput analyses. The purpose of this research was to develop new sample preparation and integration approach for DNA sequencing, PCR based DNA analysis and combinatorial screening of homogeneous catalysis based on multiplexed capillary electrophoresis with laser induced fluorescence or imaging UV absorption detection. The author first introduced a method to integrate the front-end tasks to DNA capillary-array sequencers. protocols for directly sequencing the plasmids from a single bacterial colony in fused-silica capillaries were developed. After the colony was picked, lysis was accomplished in situ in the plastic sample tube using either a thermocycler or heating block. Upon heating, the plasmids were released while chromsomal DNA and membrane proteins were denatured and precipitated to the bottom of the tube. After adding enzyme and Sanger reagents, the resulting solution was aspirated into the reaction capillaries by a syringe pump, and cycle sequencing was initiated. No deleterious effect upon the reaction efficiency, the on-line purification system, or the capillary electrophoresis separation was observed, even though the crude lysate was used as the template. Multiplexed on-line DNA sequencing data from 8 parallel channels allowed base calling up to 620 bp with an accuracy of 98%. The entire system can be automatically regenerated for repeated operation. For PCR based DNA analysis, they demonstrated that capillary electrophoresis with UV detection can be used for DNA analysis starting from clinical sample without purification. After PCR reaction using cheek cell, blood or HIV-1 gag DNA, the reaction mixtures was injected into the capillary either on-line or off-line by base stacking. The protocol was also applied to capillary array electrophoresis. The use of cheaper detection, and the elimination of purification of DNA sample before or after PCR reaction, will make this approach an

  9. Highly multiplexed targeted DNA sequencing from single nuclei.

    Science.gov (United States)

    Leung, Marco L; Wang, Yong; Kim, Charissa; Gao, Ruli; Jiang, Jerry; Sei, Emi; Navin, Nicholas E

    2016-02-01

    Single-cell DNA sequencing methods are challenged by poor physical coverage, high technical error rates and low throughput. To address these issues, we developed a single-cell DNA sequencing protocol that combines flow-sorting of single nuclei, time-limited multiple-displacement amplification (MDA), low-input library preparation, DNA barcoding, targeted capture and next-generation sequencing (NGS). This approach represents a major improvement over our previous single nucleus sequencing (SNS) Nature Protocols paper in terms of generating higher-coverage data (>90%), thereby enabling the detection of genome-wide variants in single mammalian cells at base-pair resolution. Furthermore, by pooling 48-96 single-cell libraries together for targeted capture, this approach can be used to sequence many single-cell libraries in parallel in a single reaction. This protocol greatly reduces the cost of single-cell DNA sequencing, and it can be completed in 5-6 d by advanced users. This single-cell DNA sequencing protocol has broad applications for studying rare cells and complex populations in diverse fields of biological research and medicine.

  10. Characterization of Intestinal Microbiomes of Hirschsprung's Disease Patients with or without Enterocolitis Using Illumina-MiSeq High-Throughput Sequencing.

    Directory of Open Access Journals (Sweden)

    Yuqing Li

    Full Text Available Hirschsprung-associated enterocolitis (HAEC is a life-threatening complication of Hirschsprung's disease (HD. Although the pathological mechanisms are still unclear, studies have shown that HAEC has a close relationship with the disturbance of intestinal microbiota. This study aimed to investigate the characteristics of the intestinal microbiome of HD patients with or without enterocolitis. During routine or emergency surgery, we collected 35 intestinal content samples from five patients with HAEC and eight HD patients, including three HD patients with a history of enterocolitis who were in a HAEC remission (HAEC-R phase. Using Illumina-MiSeq high-throughput sequencing, we sequenced the V4 region of bacterial 16S rRNA, and operational taxonomic units (OTUs were defined by 97% sequence similarity. Principal coordinate analysis (PCoA of weighted UniFrac distances was performed to evaluate the diversity of each intestinal microbiome sample. The microbiota differed significantly between the HD patients (characterized by the prevalence of Bacteroidetes and HAEC patients (characterized by the prevalence of Proteobacteria, while the microbiota of the HAEC-R patients was more similar to that of the HAEC patients. We also observed that the specimens from different intestinal sites of each HD patient differed significantly, while the specimens from different intestinal sites of each HAEC and HAEC-R patient were more similar. In conclusion, the microbiome pattern of the HAEC-R patients was more similar to that of the HAEC patients than to that of the HD patients. The HD patients had a relatively distinct, more stable community than the HAEC and HAEC-R patients, suggesting that enterocolitis may either be caused by or result in a disruption of the patient's uniquely adapted intestinal flora. The intestinal microbiota associated with enterocolitis may persist following symptom resolution and can be implicated in the symptom recurrence.

  11. High-throughput screening (HTS) and modeling of the retinoid ...

    Science.gov (United States)

    Presentation at the Retinoids Review 2nd workshop in Brussels, Belgium on the application of high throughput screening and model to the retinoid system Presentation at the Retinoids Review 2nd workshop in Brussels, Belgium on the application of high throughput screening and model to the retinoid system

  12. Similar diversity of Alphaproteobacteria and nitrogenase gene amplicons on two related Sphagnum mosses

    Directory of Open Access Journals (Sweden)

    Anastasia eBragina

    2012-01-01

    Full Text Available Sphagnum mosses represent a main component in ombrotrophic wetlands. They harbor a specific and diverse microbial community with essential functions for the host. To understand extend and degree of host specificity, Sphagnum fallax and S. angustifolium, two phylogenetically closely related species, which show distinct habitat preference with respect to the nutrient level, were analyzed by a multifaceted approach. Microbial fingerprints obtained by PCR-SSCP (single-strand conformation polymorphism using universal, group-specific and functional primers were highly similar. Similarity was confirmed for colonization patterns obtained by fluorescence in situ hybridization (FISH coupled with confocal laser scanning microscopy (CLSM: Alphaproteobacteria were the main colonizers inside the hyaline cells of Sphagnum leaves. A deeper survey of Alphaproteobacteria by 16S rRNA gene amplicon sequencing reveals a high diversity with Acidocella, Acidisphaera, Rhodopila and Phenylobacterium as major genera for both mosses. Pathogen defense and nitrogen fixation are important functions of Sphagnum-associated bacteria, which are fulfilled by microbial communities of both Sphagna in a similar way. NifH libraries of Sphagnum-associated microbial communities were characterized by high diversity and abundance of Alphaproteobacteria but contained also diverse amplicons of other taxa, e.g. Cyanobacteria, Geobacter and Spirochaeta. Statistically significant differences between the microbial communities of both Sphagnum species could not be discovered in any of the experimental approach. Our results show that the same close relationship, which exists between the physical, morphological and chemical characteristics of Sphagnum mosses and the ecology and function of bog ecosystems, also connects moss plantlets with their associated bacterial communities.

  13. Distribution and Diversity of Bacteria and Fungi Colonization in Stone Monuments Analyzed by High-Throughput Sequencing.

    Directory of Open Access Journals (Sweden)

    Qiang Li

    Full Text Available The historical and cultural heritage of Qingxing palace and Lingyin and Kaihua temple, located in Hangzhou of China, include a large number of exquisite Buddhist statues and ancient stone sculptures which date back to the Northern Song (960-1219 A.D. and Qing dynasties (1636-1912 A.D. and are considered to be some of the best examples of ancient stone sculpting techniques. They were added to the World Heritage List in 2011 because of their unique craftsmanship and importance to the study of ancient Chinese Buddhist culture. However, biodeterioration of the surface of the ancient Buddhist statues and white marble pillars not only severely impairs their aesthetic value but also alters their material structure and thermo-hygric properties. In this study, high-throughput sequencing was utilized to identify the microbial communities colonizing the stone monuments. The diversity and distribution of the microbial communities in six samples collected from three different environmental conditions with signs of deterioration were analyzed by means of bioinformatics software and diversity indices. In addition, the impact of environmental factors, including temperature, light intensity, air humidity, and the concentration of NO2 and SO2, on the microbial communities' diversity and distribution was evaluated. The results indicate that the presence of predominantly phototrophic microorganisms was correlated with light and humidity, while nitrifying bacteria and Thiobacillus were associated with NO2 and SO2 from air pollution.

  14. Single-nucleotide polymorphism discovery by high-throughput sequencing in sorghum

    Directory of Open Access Journals (Sweden)

    White Frank F

    2011-07-01

    Full Text Available Abstract Background Eight diverse sorghum (Sorghum bicolor L. Moench accessions were subjected to short-read genome sequencing to characterize the distribution of single-nucleotide polymorphisms (SNPs. Two strategies were used for DNA library preparation. Missing SNP genotype data were imputed by local haplotype comparison. The effect of library type and genomic diversity on SNP discovery and imputation are evaluated. Results Alignment of eight genome equivalents (6 Gb to the public reference genome revealed 283,000 SNPs at ≥82% confirmation probability. Sequencing from libraries constructed to limit sequencing to start at defined restriction sites led to genotyping 10-fold more SNPs in all 8 accessions, and correctly imputing 11% more missing data, than from semirandom libraries. The SNP yield advantage of the reduced-representation method was less than expected, since up to one fifth of reads started at noncanonical restriction sites and up to one third of restriction sites predicted in silico to yield unique alignments were not sampled at near-saturation. For imputation accuracy, the availability of a genomically similar accession in the germplasm panel was more important than panel size or sequencing coverage. Conclusions A sequence quantity of 3 million 50-base reads per accession using a BsrFI library would conservatively provide satisfactory genotyping of 96,000 sorghum SNPs. For most reliable SNP-genotype imputation in shallowly sequenced genomes, germplasm panels should consist of pairs or groups of genomically similar entries. These results may help in designing strategies for economical genotyping-by-sequencing of large numbers of plant accessions.

  15. High-throughput scoring of seed germination

    NARCIS (Netherlands)

    Ligterink, Wilco; Hilhorst, Henk W.M.

    2017-01-01

    High-throughput analysis of seed germination for phenotyping large genetic populations or mutant collections is very labor intensive and would highly benefit from an automated setup. Although very often used, the total germination percentage after a nominated period of time is not very

  16. Identification and Analysis of Red Sea Mangrove (Avicennia marina) microRNAs by High-Throughput Sequencing and Their Association with Stress Responses

    KAUST Repository

    Khraiwesh, Basel; Pugalenthi, Ganesan; Fedoroff, Nina V.

    2013-01-01

    Although RNA silencing has been studied primarily in model plants, advances in high-throughput sequencing technologies have enabled profiling of the small RNA components of many more plant species, providing insights into the ubiquity and conservatism of some miRNA-based regulatory mechanisms. Small RNAs of 20 to 24 nucleotides (nt) are important regulators of gene transcript levels by either transcriptional or by posttranscriptional gene silencing, contributing to genome maintenance and controlling a variety of developmental and physiological processes. Here, we used deep sequencing and molecular methods to create an inventory of the small RNAs in the mangrove species, Avicennia marina. We identified 26 novel mangrove miRNAs and 193 conserved miRNAs belonging to 36 families. We determined that 2 of the novel miRNAs were produced from known miRNA precursors and 4 were likely to be species-specific by the criterion that we found no homologs in other plant species. We used qRT-PCR to analyze the expression of miRNAs and their target genes in different tissue sets and some demonstrated tissue-specific expression. Furthermore, we predicted potential targets of these putative miRNAs based on a sequence homology and experimentally validated through endonucleolytic cleavage assays. Our results suggested that expression profiles of miRNAs and their predicted targets could be useful in exploring the significance of the conservation patterns of plants, particularly in response to abiotic stress. Because of their well-developed abilities in this regard, mangroves and other extremophiles are excellent models for such exploration. © 2013 Khraiwesh et al.

  17. Identification and analysis of red sea mangrove (Avicennia marina microRNAs by high-throughput sequencing and their association with stress responses.

    Directory of Open Access Journals (Sweden)

    Basel Khraiwesh

    Full Text Available Although RNA silencing has been studied primarily in model plants, advances in high-throughput sequencing technologies have enabled profiling of the small RNA components of many more plant species, providing insights into the ubiquity and conservatism of some miRNA-based regulatory mechanisms. Small RNAs of 20 to 24 nucleotides (nt are important regulators of gene transcript levels by either transcriptional or by posttranscriptional gene silencing, contributing to genome maintenance and controlling a variety of developmental and physiological processes. Here, we used deep sequencing and molecular methods to create an inventory of the small RNAs in the mangrove species, Avicennia marina. We identified 26 novel mangrove miRNAs and 193 conserved miRNAs belonging to 36 families. We determined that 2 of the novel miRNAs were produced from known miRNA precursors and 4 were likely to be species-specific by the criterion that we found no homologs in other plant species. We used qRT-PCR to analyze the expression of miRNAs and their target genes in different tissue sets and some demonstrated tissue-specific expression. Furthermore, we predicted potential targets of these putative miRNAs based on a sequence homology and experimentally validated through endonucleolytic cleavage assays. Our results suggested that expression profiles of miRNAs and their predicted targets could be useful in exploring the significance of the conservation patterns of plants, particularly in response to abiotic stress. Because of their well-developed abilities in this regard, mangroves and other extremophiles are excellent models for such exploration.

  18. Identification and Analysis of Red Sea Mangrove (Avicennia marina) microRNAs by High-Throughput Sequencing and Their Association with Stress Responses

    KAUST Repository

    Khraiwesh, Basel

    2013-04-08

    Although RNA silencing has been studied primarily in model plants, advances in high-throughput sequencing technologies have enabled profiling of the small RNA components of many more plant species, providing insights into the ubiquity and conservatism of some miRNA-based regulatory mechanisms. Small RNAs of 20 to 24 nucleotides (nt) are important regulators of gene transcript levels by either transcriptional or by posttranscriptional gene silencing, contributing to genome maintenance and controlling a variety of developmental and physiological processes. Here, we used deep sequencing and molecular methods to create an inventory of the small RNAs in the mangrove species, Avicennia marina. We identified 26 novel mangrove miRNAs and 193 conserved miRNAs belonging to 36 families. We determined that 2 of the novel miRNAs were produced from known miRNA precursors and 4 were likely to be species-specific by the criterion that we found no homologs in other plant species. We used qRT-PCR to analyze the expression of miRNAs and their target genes in different tissue sets and some demonstrated tissue-specific expression. Furthermore, we predicted potential targets of these putative miRNAs based on a sequence homology and experimentally validated through endonucleolytic cleavage assays. Our results suggested that expression profiles of miRNAs and their predicted targets could be useful in exploring the significance of the conservation patterns of plants, particularly in response to abiotic stress. Because of their well-developed abilities in this regard, mangroves and other extremophiles are excellent models for such exploration. © 2013 Khraiwesh et al.

  19. PRIMEGENSw3: a web-based tool for high-throughput primer and probe design.

    Science.gov (United States)

    Kushwaha, Garima; Srivastava, Gyan Prakash; Xu, Dong

    2015-01-01

    Highly specific and efficient primer and probe design has been a major hurdle in many high-throughput techniques. Successful implementation of any PCR or probe hybridization technique depends on the quality of primers and probes used in terms of their specificity and cross-hybridization. Here we describe PRIMEGENSw3, a set of web-based utilities for high-throughput primer and probe design. These utilities allow users to select genomic regions and to design primer/probe for selected regions in an interactive, user-friendly, and automatic fashion. The system runs the PRIMEGENS algorithm in the back-end on the high-performance server with the stored genomic database or user-provided custom database for cross-hybridization check. Cross-hybridization is checked not only using BLAST but also by checking mismatch positions and energy calculation of potential hybridization hits. The results can be visualized online and also can be downloaded. The average success rate of primer design using PRIMEGENSw3 is ~90 %. The web server also supports primer design for methylated sequences, which is used in epigenetic studies. Stand-alone version of the software is also available for download at the website.

  20. metaBIT, an integrative and automated metagenomic pipeline for analysing microbial profiles from high-throughput sequencing shotgun data

    DEFF Research Database (Denmark)

    Louvel, Guillaume; Der Sarkissian, Clio; Hanghøj, Kristian Ebbesen

    2016-01-01

    -throughput DNA sequencing (HTS). Here, we develop metaBIT, an open-source computational pipeline automatizing routine microbial profiling of shotgun HTS data. Customizable by the user at different stringency levels, it performs robust taxonomy-based assignment and relative abundance calculation of microbial taxa......, as well as cross-sample statistical analyses of microbial diversity distributions. We demonstrate the versatility of metaBIT within a range of published HTS data sets sampled from the environment (soil and seawater) and the human body (skin and gut), but also from archaeological specimens. We present......-friendly profiling of the microbial DNA present in HTS shotgun data sets. The applications of metaBIT are vast, from monitoring of laboratory errors and contaminations, to the reconstruction of past and present microbiota, and the detection of candidate species, including pathogens....

  1. Viral metagenomics: Analysis of begomoviruses by illumina high-throughput sequencing

    KAUST Repository

    Idris, Ali; Al-Saleh, Mohammed; Piatek, Marek J.; Al-Shahwan, Ibrahim; Ali, Shahjahan; Brown, Judith K.

    2014-01-01

    Traditional DNA sequencing methods are inefficient, lack the ability to discern the least abundant viral sequences, and ineffective for determining the extent of variability in viral populations. Here, populations of single-stranded DNA plant

  2. LSGermOPA, a custom OPA of 384 EST-derived SNPs for high-throughput lettuce (Lactuca sativa L.) germplasm fingerprinting

    Science.gov (United States)

    We assessed the genetic diversity and population structure among 148 cultivated lettuce (Lactuca sativa L.) accessions using the high-throughput GoldenGate assay and 384 EST (Expressed Sequence Tag)-derived SNP (single nucleotide polymorphism) markers. A custom OPA (Oligo Pool All), LSGermOPA was fo...

  3. ESPRIT-Forest: Parallel clustering of massive amplicon sequence data in subquadratic time.

    Science.gov (United States)

    Cai, Yunpeng; Zheng, Wei; Yao, Jin; Yang, Yujie; Mai, Volker; Mao, Qi; Sun, Yijun

    2017-04-01

    The rapid development of sequencing technology has led to an explosive accumulation of genomic sequence data. Clustering is often the first step to perform in sequence analysis, and hierarchical clustering is one of the most commonly used approaches for this purpose. However, it is currently computationally expensive to perform hierarchical clustering of extremely large sequence datasets due to its quadratic time and space complexities. In this paper we developed a new algorithm called ESPRIT-Forest for parallel hierarchical clustering of sequences. The algorithm achieves subquadratic time and space complexity and maintains a high clustering accuracy comparable to the standard method. The basic idea is to organize sequences into a pseudo-metric based partitioning tree for sub-linear time searching of nearest neighbors, and then use a new multiple-pair merging criterion to construct clusters in parallel using multiple threads. The new algorithm was tested on the human microbiome project (HMP) dataset, currently one of the largest published microbial 16S rRNA sequence dataset. Our experiment demonstrated that with the power of parallel computing it is now compu- tationally feasible to perform hierarchical clustering analysis of tens of millions of sequences. The software is available at http://www.acsu.buffalo.edu/∼yijunsun/lab/ESPRIT-Forest.html.

  4. 20180311 - High Throughput Transcriptomics: From screening to pathways (SOT 2018)

    Science.gov (United States)

    The EPA ToxCast effort has screened thousands of chemicals across hundreds of high-throughput in vitro screening assays. The project is now leveraging high-throughput transcriptomic (HTTr) technologies to substantially expand its coverage of biological pathways. The first HTTr sc...

  5. High throughput label-free platform for statistical bio-molecular sensing

    DEFF Research Database (Denmark)

    Bosco, Filippo; Hwu, En-Te; Chen, Ching-Hsiu

    2011-01-01

    Sensors are crucial in many daily operations including security, environmental control, human diagnostics and patient monitoring. Screening and online monitoring require reliable and high-throughput sensing. We report on the demonstration of a high-throughput label-free sensor platform utilizing...

  6. Sequence assembly

    DEFF Research Database (Denmark)

    Scheibye-Alsing, Karsten; Hoffmann, S.; Frankel, Annett Maria

    2009-01-01

    Despite the rapidly increasing number of sequenced and re-sequenced genomes, many issues regarding the computational assembly of large-scale sequencing data have remain unresolved. Computational assembly is crucial in large genome projects as well for the evolving high-throughput technologies and...... in genomic DNA, highly expressed genes and alternative transcripts in EST sequences. We summarize existing comparisons of different assemblers and provide a detailed descriptions and directions for download of assembly programs at: http://genome.ku.dk/resources/assembly/methods.html....

  7. Theory and implementation of a very high throughput true random number generator in field programmable gate array

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Yonggang, E-mail: wangyg@ustc.edu.cn; Hui, Cong; Liu, Chong; Xu, Chao [Department of Modern Physics, University of Science and Technology of China, Hefei 230026 (China)

    2016-04-15

    The contribution of this paper is proposing a new entropy extraction mechanism based on sampling phase jitter in ring oscillators to make a high throughput true random number generator in a field programmable gate array (FPGA) practical. Starting from experimental observation and analysis of the entropy source in FPGA, a multi-phase sampling method is exploited to harvest the clock jitter with a maximum entropy and fast sampling speed. This parametrized design is implemented in a Xilinx Artix-7 FPGA, where the carry chains in the FPGA are explored to realize the precise phase shifting. The generator circuit is simple and resource-saving, so that multiple generation channels can run in parallel to scale the output throughput for specific applications. The prototype integrates 64 circuit units in the FPGA to provide a total output throughput of 7.68 Gbps, which meets the requirement of current high-speed quantum key distribution systems. The randomness evaluation, as well as its robustness to ambient temperature, confirms that the new method in a purely digital fashion can provide high-speed high-quality random bit sequences for a variety of embedded applications.

  8. XMRF: an R package to fit Markov Networks to high-throughput genetics data.

    Science.gov (United States)

    Wan, Ying-Wooi; Allen, Genevera I; Baker, Yulia; Yang, Eunho; Ravikumar, Pradeep; Anderson, Matthew; Liu, Zhandong

    2016-08-26

    Technological advances in medicine have led to a rapid proliferation of high-throughput "omics" data. Tools to mine this data and discover disrupted disease networks are needed as they hold the key to understanding complicated interactions between genes, mutations and aberrations, and epi-genetic markers. We developed an R software package, XMRF, that can be used to fit Markov Networks to various types of high-throughput genomics data. Encoding the models and estimation techniques of the recently proposed exponential family Markov Random Fields (Yang et al., 2012), our software can be used to learn genetic networks from RNA-sequencing data (counts via Poisson graphical models), mutation and copy number variation data (categorical via Ising models), and methylation data (continuous via Gaussian graphical models). XMRF is the only tool that allows network structure learning using the native distribution of the data instead of the standard Gaussian. Moreover, the parallelization feature of the implemented algorithms computes the large-scale biological networks efficiently. XMRF is available from CRAN and Github ( https://github.com/zhandong/XMRF ).

  9. Swarm: robust and fast clustering method for amplicon-based studies

    Science.gov (United States)

    Rognes, Torbjørn; Quince, Christopher; de Vargas, Colomban; Dunthorn, Micah

    2014-01-01

    Popular de novo amplicon clustering methods suffer from two fundamental flaws: arbitrary global clustering thresholds, and input-order dependency induced by centroid selection. Swarm was developed to address these issues by first clustering nearly identical amplicons iteratively using a local threshold, and then by using clusters’ internal structure and amplicon abundances to refine its results. This fast, scalable, and input-order independent approach reduces the influence of clustering parameters and produces robust operational taxonomic units. PMID:25276506

  10. Swarm: robust and fast clustering method for amplicon-based studies

    Directory of Open Access Journals (Sweden)

    Frédéric Mahé

    2014-09-01

    Full Text Available Popular de novo amplicon clustering methods suffer from two fundamental flaws: arbitrary global clustering thresholds, and input-order dependency induced by centroid selection. Swarm was developed to address these issues by first clustering nearly identical amplicons iteratively using a local threshold, and then by using clusters’ internal structure and amplicon abundances to refine its results. This fast, scalable, and input-order independent approach reduces the influence of clustering parameters and produces robust operational taxonomic units.

  11. High-throughput determination of RNA structure by proximity ligation.

    Science.gov (United States)

    Ramani, Vijay; Qiu, Ruolan; Shendure, Jay

    2015-09-01

    We present an unbiased method to globally resolve RNA structures through pairwise contact measurements between interacting regions. RNA proximity ligation (RPL) uses proximity ligation of native RNA followed by deep sequencing to yield chimeric reads with ligation junctions in the vicinity of structurally proximate bases. We apply RPL in both baker's yeast (Saccharomyces cerevisiae) and human cells and generate contact probability maps for ribosomal and other abundant RNAs, including yeast snoRNAs, the RNA subunit of the signal recognition particle and the yeast U2 spliceosomal RNA homolog. RPL measurements correlate with established secondary structures for these RNA molecules, including stem-loop structures and long-range pseudoknots. We anticipate that RPL will complement the current repertoire of computational and experimental approaches in enabling the high-throughput determination of secondary and tertiary RNA structures.

  12. High Throughput Sequencing to Detect Differences in Methanotrophic Methylococcaceae and Methylocystaceae in Surface Peat, Forest Soil, and Sphagnum Moss in Cranesville Swamp Preserve, West Virginia, USA

    Science.gov (United States)

    Lau, Evan; Nolan, Edward J.; Dillard, Zachary W.; Dague, Ryan D.; Semple, Amanda L.; Wentzell, Wendi L.

    2015-01-01

    Northern temperate forest soils and Sphagnum-dominated peatlands are a major source and sink of methane. In these ecosystems, methane is mainly oxidized by aerobic methanotrophic bacteria, which are typically found in aerated forest soils, surface peat, and Sphagnum moss. We contrasted methanotrophic bacterial diversity and abundances from the (i) organic horizon of forest soil; (ii) surface peat; and (iii) submerged Sphagnum moss from Cranesville Swamp Preserve, West Virginia, using multiplex sequencing of bacterial 16S rRNA (V3 region) gene amplicons. From ~1 million reads, >50,000 unique OTUs (Operational Taxonomic Units), 29 and 34 unique sequences were detected in the Methylococcaceae and Methylocystaceae, respectively, and 24 potential methanotrophs in the Beijerinckiaceae were also identified. Methylacidiphilum-like methanotrophs were not detected. Proteobacterial methanotrophic bacteria constitute Sphagnum moss) or co-occurred in both Sphagnum moss and peat. This study provides insights into the structure of methanotrophic communities in relationship to habitat type, and suggests that peat and Sphagnum moss can influence methanotroph community structure and biogeography. PMID:27682082

  13. Data from: Not all are free-living: high-throughput DNA metabarcoding reveals a diverse community of protists parasitizing soil metazoa

    NARCIS (Netherlands)

    Geisen, Stefan; Laros, I.; Vizcaino, A.; Bonkowski, M.; Groot, de G.A.

    2015-01-01

    Protists, the most diverse eukaryotes, are largely considered to be free-living bacterivores, but vast numbers of taxa are known to parasitize plants or animals. High-throughput sequencing (HTS) approaches now commonly replace cultivation-based approaches in studying soil protists, but insights into

  14. Genome-wide LORE1 retrotransposon mutagenesis and high-throughput insertion detection in Lotus japonicus

    DEFF Research Database (Denmark)

    Urbanski, Dorian Fabian; Malolepszy, Anna; Stougaard, Jens

    2012-01-01

    Insertion mutants facilitate functional analysis of genes, but for most plant species it has been difficult to identify a suitable mutagen and to establish large populations for reverse genetics. The main challenge is developing efficient high-throughput procedures for both mutagenesis and insert......Insertion mutants facilitate functional analysis of genes, but for most plant species it has been difficult to identify a suitable mutagen and to establish large populations for reverse genetics. The main challenge is developing efficient high-throughput procedures for both mutagenesis...... plants. The identified insertions showed that the endogenous LORE1 retrotransposon is well suited for insertion mutagenesis due to its homogenous gene targeting and exonic insertion preference. Since LORE1 transposition occurs in the germline, harvesting seeds from a single founder line and cultivating...... progeny generates a complete mutant population. This ease of LORE1 mutagenesis combined with the efficient FSTpoolit protocol, which exploits 2D pooling, Illumina sequencing, and automated data analysis, allows highly cost-efficient development of a comprehensive reverse genetic resource....

  15. A high-throughput microfluidic dental plaque biofilm system to visualize and quantify the effect of antimicrobials

    Science.gov (United States)

    Nance, William C.; Dowd, Scot E.; Samarian, Derek; Chludzinski, Jeffrey; Delli, Joseph; Battista, John; Rickard, Alexander H.

    2013-01-01

    Objectives Few model systems are amenable to developing multi-species biofilms in parallel under environmentally germane conditions. This is a problem when evaluating the potential real-world effectiveness of antimicrobials in the laboratory. One such antimicrobial is cetylpyridinium chloride (CPC), which is used in numerous over-the-counter oral healthcare products. The aim of this work was to develop a high-throughput microfluidic system that is combined with a confocal laser scanning microscope (CLSM) to quantitatively evaluate the effectiveness of CPC against oral multi-species biofilms grown in human saliva. Methods Twenty-four-channel BioFlux microfluidic plates were inoculated with pooled human saliva and fed filter-sterilized saliva for 20 h at 37°C. The bacterial diversity of the biofilms was evaluated by bacterial tag-encoded FLX amplicon pyrosequencing (bTEFAP). The antimicrobial/anti-biofilm effect of CPC (0.5%–0.001% w/v) was examined using Live/Dead stain, CLSM and 3D imaging software. Results The analysis of biofilms by bTEFAP demonstrated that they contained genera typically found in human dental plaque. These included Aggregatibacter, Fusobacterium, Neisseria, Porphyromonas, Streptococcus and Veillonella. Using Live/Dead stain, clear gradations in killing were observed when the biofilms were treated with CPC between 0.5% and 0.001% w/v. At 0.5% (w/v) CPC, 90% of the total signal was from dead/damaged cells. Below this concentration range, less killing was observed. In the 0.5%–0.05% (w/v) range CPC penetration/killing was greatest and biofilm thickness was significantly reduced. Conclusions This work demonstrates the utility of a high-throughput microfluidic–CLSM system to grow multi-species oral biofilms, which are compositionally similar to naturally occurring biofilms, to assess the effectiveness of antimicrobials. PMID:23800904

  16. Peptide Pattern Recognition for high-throughput protein sequence analysis and clustering

    DEFF Research Database (Denmark)

    Busk, Peter Kamp

    2017-01-01

    Large collections of protein sequences with divergent sequences are tedious to analyze for understanding their phylogenetic or structure-function relation. Peptide Pattern Recognition is an algorithm that was developed to facilitate this task but the previous version does only allow a limited...... number of sequences as input. I implemented Peptide Pattern Recognition as a multithread software designed to handle large numbers of sequences and perform analysis in a reasonable time frame. Benchmarking showed that the new implementation of Peptide Pattern Recognition is twenty times faster than...... the previous implementation on a small protein collection with 673 MAP kinase sequences. In addition, the new implementation could analyze a large protein collection with 48,570 Glycosyl Transferase family 20 sequences without reaching its upper limit on a desktop computer. Peptide Pattern Recognition...

  17. Identification of miRNAs and their targets through high-throughput sequencing and degradome analysis in male and female Asparagus officinalis.

    Science.gov (United States)

    Chen, Jingli; Zheng, Yi; Qin, Li; Wang, Yan; Chen, Lifei; He, Yanjun; Fei, Zhangjun; Lu, Gang

    2016-04-12

    MicroRNAs (miRNAs), a class of non-coding small RNAs (sRNAs), regulate various biological processes. Although miRNAs have been identified and characterized in several plant species, miRNAs in Asparagus officinalis have not been reported. As a dioecious plant with homomorphic sex chromosomes, asparagus is regarded as an important model system for studying mechanisms of plant sex determination. Two independent sRNA libraries from male and female asparagus plants were sequenced with Illumina sequencing, thereby generating 4.13 and 5.88 million final clean reads, respectively. Both libraries predominantly contained 24-nt sRNAs, followed by 21-nt sRNAs. Further analysis identified 154 conserved miRNAs, which belong to 26 families, and 39 novel miRNA candidates seemed to be specific to asparagus. Comparative profiling revealed that 63 miRNAs exhibited significant differential expression between male and female plants, which was confirmed by real-time quantitative PCR analysis. Among them, 37 miRNAs were significantly up-regulated in the female library, whereas the others were preferentially expressed in the male library. Furthermore, 40 target mRNAs representing 44 conserved and seven novel miRNAs were identified in asparagus through high-throughput degradome sequencing. Functional annotation showed that these target mRNAs were involved in a wide range of developmental and metabolic processes. We identified a large set of conserved and specific miRNAs and compared their expression levels between male and female asparagus plants. Several asparagus miRNAs, which belong to the miR159, miR167, and miR172 families involved in reproductive organ development, were differentially expressed between male and female plants, as well as during flower development. Consistently, several predicted targets of asparagus miRNAs were associated with floral organ development. These findings suggest the potential roles of miRNAs in sex determination and reproductive developmental processes in

  18. High-throughput sequencing of nematode communities from total soil DNA extractions

    DEFF Research Database (Denmark)

    Sapkota, Rumakanta; Nicolaisen, Mogens

    2015-01-01

    nematodes without the need for enrichment was developed. Using this strategy on DNA templates from a set of 22 agricultural soils, we obtained 64.4% sequences of nematode origin in total, whereas the remaining sequences were almost entirely from other metazoans. The nematode sequences were derived from...... in previous sequence-based studies are not nematode specific but also amplify other groups of organisms such as fungi and plantae, and thus require a nematode enrichment step that may introduce biases. Results: In this study an amplification strategy which selectively amplifies a fragment of the SSU from...... a broad taxonomic range and most sequences were from nematode taxa that have previously been found to be abundant in soil such as Tylenchida, Rhabditida, Dorylaimida, Triplonchida and Araeolaimida. Conclusions: Our amplification and sequencing strategy for assessing nematode diversity was able to collect...

  19. High throughput imaging cytometer with acoustic focussing.

    Science.gov (United States)

    Zmijan, Robert; Jonnalagadda, Umesh S; Carugo, Dario; Kochi, Yu; Lemm, Elizabeth; Packham, Graham; Hill, Martyn; Glynne-Jones, Peter

    2015-10-31

    We demonstrate an imaging flow cytometer that uses acoustic levitation to assemble cells and other particles into a sheet structure. This technique enables a high resolution, low noise CMOS camera to capture images of thousands of cells with each frame. While ultrasonic focussing has previously been demonstrated for 1D cytometry systems, extending the technology to a planar, much higher throughput format and integrating imaging is non-trivial, and represents a significant jump forward in capability, leading to diagnostic possibilities not achievable with current systems. A galvo mirror is used to track the images of the moving cells permitting exposure times of 10 ms at frame rates of 50 fps with motion blur of only a few pixels. At 80 fps, we demonstrate a throughput of 208 000 beads per second. We investigate the factors affecting motion blur and throughput, and demonstrate the system with fluorescent beads, leukaemia cells and a chondrocyte cell line. Cells require more time to reach the acoustic focus than beads, resulting in lower throughputs; however a longer device would remove this constraint.

  20. High-throughput GPU-based LDPC decoding

    Science.gov (United States)

    Chang, Yang-Lang; Chang, Cheng-Chun; Huang, Min-Yu; Huang, Bormin

    2010-08-01

    Low-density parity-check (LDPC) code is a linear block code known to approach the Shannon limit via the iterative sum-product algorithm. LDPC codes have been adopted in most current communication systems such as DVB-S2, WiMAX, WI-FI and 10GBASE-T. LDPC for the needs of reliable and flexible communication links for a wide variety of communication standards and configurations have inspired the demand for high-performance and flexibility computing. Accordingly, finding a fast and reconfigurable developing platform for designing the high-throughput LDPC decoder has become important especially for rapidly changing communication standards and configurations. In this paper, a new graphic-processing-unit (GPU) LDPC decoding platform with the asynchronous data transfer is proposed to realize this practical implementation. Experimental results showed that the proposed GPU-based decoder achieved 271x speedup compared to its CPU-based counterpart. It can serve as a high-throughput LDPC decoder.

  1. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis.

    Science.gov (United States)

    Du, Yushen; Wu, Nicholas C; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting; Sun, Ren

    2016-11-01

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. Current methods are highly dependent on evolutionary sequence conservation, which is

  2. Evaluating High Throughput Toxicokinetics and Toxicodynamics for IVIVE (WC10)

    Science.gov (United States)

    High-throughput screening (HTS) generates in vitro data for characterizing potential chemical hazard. TK models are needed to allow in vitro to in vivo extrapolation (IVIVE) to real world situations. The U.S. EPA has created a public tool (R package “httk” for high throughput tox...

  3. MACRO: a combined microchip-PCR and microarray system for high-throughput monitoring of genetically modified organisms.

    Science.gov (United States)

    Shao, Ning; Jiang, Shi-Meng; Zhang, Miao; Wang, Jing; Guo, Shu-Juan; Li, Yang; Jiang, He-Wei; Liu, Cheng-Xi; Zhang, Da-Bing; Yang, Li-Tao; Tao, Sheng-Ce

    2014-01-21

    The monitoring of genetically modified organisms (GMOs) is a primary step of GMO regulation. However, there is presently a lack of effective and high-throughput methodologies for specifically and sensitively monitoring most of the commercialized GMOs. Herein, we developed a multiplex amplification on a chip with readout on an oligo microarray (MACRO) system specifically for convenient GMO monitoring. This system is composed of a microchip for multiplex amplification and an oligo microarray for the readout of multiple amplicons, containing a total of 91 targets (18 universal elements, 20 exogenous genes, 45 events, and 8 endogenous reference genes) that covers 97.1% of all GM events that have been commercialized up to 2012. We demonstrate that the specificity of MACRO is ~100%, with a limit of detection (LOD) that is suitable for real-world applications. Moreover, the results obtained of simulated complex samples and blind samples with MACRO were 100% consistent with expectations and the results of independently performed real-time PCRs, respectively. Thus, we believe MACRO is the first system that can be applied for effectively monitoring the majority of the commercialized GMOs in a single test.

  4. High Throughput T Epitope Mapping and Vaccine Development

    Directory of Open Access Journals (Sweden)

    Giuseppina Li Pira

    2010-01-01

    Full Text Available Mapping of antigenic peptide sequences from proteins of relevant pathogens recognized by T helper (Th and by cytolytic T lymphocytes (CTL is crucial for vaccine development. In fact, mapping of T-cell epitopes provides useful information for the design of peptide-based vaccines and of peptide libraries to monitor specific cellular immunity in protected individuals, patients and vaccinees. Nevertheless, epitope mapping is a challenging task. In fact, large panels of overlapping peptides need to be tested with lymphocytes to identify the sequences that induce a T-cell response. Since numerous peptide panels from antigenic proteins are to be screened, lymphocytes available from human subjects are a limiting factor. To overcome this limitation, high throughput (HTP approaches based on miniaturization and automation of T-cell assays are needed. Here we consider the most recent applications of the HTP approach to T epitope mapping. The alternative or complementary use of in silico prediction and experimental epitope definition is discussed in the context of the recent literature. The currently used methods are described with special reference to the possibility of applying the HTP concept to make epitope mapping an easier procedure in terms of time, workload, reagents, cells and overall cost.

  5. Genetic analysis and gene mapping of a low stigma exposed mutant gene by high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Xiao Ma

    Full Text Available Rice is one of the main food crops and several studies have examined the molecular mechanism of the exposure of the rice plant stigma. The improvement in the exposure of the stigma in female parent hybrid combinations can enhance the efficiency of hybrid breeding. In the present study, a mutant plant with low exposed stigma (lesr was discovered among the descendants of the indica thermo-sensitive sterile line 115S. The ES% rate of the mutant decreased by 70.64% compared with the wild type variety. The F2 population was established by genetic analysis considering the mutant as the female parent and the restorer line 93S as the male parent. The results indicated a normal F1 population, while a clear division was noted for the high and low exposed stigma groups, respectively. This process was possible only by a ES of 25% in the F2 population. This was in agreement with the ratio of 3:1, which indicated that the mutant was controlled by a recessive main-effect QTL locus, temporarily named as LESR. Genome-wide comparison of the SNP profiles between the early, high and low production bulks were constructed from F2 plants using bulked segregant analysis in combination with high-throughput sequencing technology. The results demonstrated that the candidate loci was located on the chromosome 10 of the rice. Following screening of the recombinant rice plants with newly developed molecular markers, the genetic region was narrowed down to 0.25 Mb. This region was flanked by InDel-2 and InDel-2 at the physical location from 13.69 to 13.94 Mb. Within this region, 7 genes indicated base differences between parents. A total of 2 genes exhibited differences at the coding region and upstream of the coding region, respectively. The present study aimed to further clone the LESR gene, verify its function and identify the stigma variation.

  6. Optimal Activation of Isopsoralen To Prevent Amplicon Carryover

    OpenAIRE

    Fahle, Gary A.; Gill, Vee J.; Fischer, Steven H.

    1999-01-01

    We compared the efficiencies of activation of the photochemical isopsoralen compound 10 and its resulting amplicon neutralizations under conditions with a UV transilluminator box at room temperature (RT) and a HRI-300 UV photothermal reaction chamber at RT and at 5°C. Our data suggest that use of the HRI-300 reaction chamber at 5°C results in a statistically significantly higher degree of amplicon neutralization.

  7. Evaluating multiplexed next-generation sequencing as a method in palynology for mixed pollen samples.

    Science.gov (United States)

    Keller, A; Danner, N; Grimmer, G; Ankenbrand, M; von der Ohe, K; von der Ohe, W; Rost, S; Härtel, S; Steffan-Dewenter, I

    2015-03-01

    The identification of pollen plays an important role in ecology, palaeo-climatology, honey quality control and other areas. Currently, expert knowledge and reference collections are essential to identify pollen origin through light microscopy. Pollen identification through molecular sequencing and DNA barcoding has been proposed as an alternative approach, but the assessment of mixed pollen samples originating from multiple plant species is still a tedious and error-prone task. Next-generation sequencing has been proposed to avoid this hindrance. In this study we assessed mixed pollen probes through next-generation sequencing of amplicons from the highly variable, species-specific internal transcribed spacer 2 region of nuclear ribosomal DNA. Further, we developed a bioinformatic workflow to analyse these high-throughput data with a newly created reference database. To evaluate the feasibility, we compared results from classical identification based on light microscopy from the same samples with our sequencing results. We assessed in total 16 mixed pollen samples, 14 originated from honeybee colonies and two from solitary bee nests. The sequencing technique resulted in higher taxon richness (deeper assignments and more identified taxa) compared to light microscopy. Abundance estimations from sequencing data were significantly correlated with counted abundances through light microscopy. Simulation analyses of taxon specificity and sensitivity indicate that 96% of taxa present in the database are correctly identifiable at the genus level and 70% at the species level. Next-generation sequencing thus presents a useful and efficient workflow to identify pollen at the genus and species level without requiring specialised palynological expert knowledge. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.

  8. On the use of high-throughput sequencing for the study of cyanobacterial diversity in Antarctic aquatic mats.

    Science.gov (United States)

    Pessi, Igor Stelmach; Maalouf, Pedro De Carvalho; Laughinghouse, Haywood Dail; Baurain, Denis; Wilmotte, Annick

    2016-06-01

    The study of Antarctic cyanobacterial diversity has been mostly limited to morphological identification and traditional molecular techniques. High-throughput sequencing (HTS) allows a much better understanding of microbial distribution in the environment, but its application is hampered by several methodological and analytical challenges. In this work, we explored the use of HTS as a tool for the study of cyanobacterial diversity in Antarctic aquatic mats. Our results highlight the importance of using artificial communities to validate the parameters of the bioinformatics procedure used to analyze natural communities, since pipeline-dependent biases had a strong effect on the observed community structures. Analysis of microbial mats from five Antarctic lakes and an aquatic biofilm from the Sub-Antarctic showed that HTS is a valuable tool for the assessment of cyanobacterial diversity. The majority of the operational taxonomic units retrieved were related to filamentous taxa such as Leptolyngbya and Phormidium, which are common genera in Antarctic lacustrine microbial mats. However, other phylotypes related to different taxa such as Geitlerinema, Pseudanabaena, Synechococcus, Chamaesiphon, Calothrix, and Coleodesmium were also found. Results revealed a much higher diversity than what had been reported using traditional methods and also highlighted remarkable differences between the cyanobacterial communities of the studied lakes. The aquatic biofilm from the Sub-Antarctic had a distinct cyanobacterial community from the Antarctic lakes, which in turn displayed a salinity-dependent community structure at the phylotype level. © 2016 Phycological Society of America.

  9. Nitrogen Cycle Evaluation (NiCE) Chip for the Simultaneous Analysis of Multiple N-Cycle Associated Genes.

    Science.gov (United States)

    Oshiki, Mamoru; Segawa, Takahiro; Ishii, Satoshi

    2018-02-02

    Various microorganisms play key roles in the Nitrogen (N) cycle. Quantitative PCR (qPCR) and PCR-amplicon sequencing of the N cycle functional genes allow us to analyze the abundance and diversity of microbes responsible in the N transforming reactions in various environmental samples. However, analysis of multiple target genes can be cumbersome and expensive. PCR-independent analysis, such as metagenomics and metatranscriptomics, is useful but expensive especially when we analyze multiple samples and try to detect N cycle functional genes present at relatively low abundance. Here, we present the application of microfluidic qPCR chip technology to simultaneously quantify and prepare amplicon sequence libraries for multiple N cycle functional genes as well as taxon-specific 16S rRNA gene markers for many samples. This approach, named as N cycle evaluation (NiCE) chip, was evaluated by using DNA from pure and artificially mixed bacterial cultures and by comparing the results with those obtained by conventional qPCR and amplicon sequencing methods. Quantitative results obtained by the NiCE chip were comparable to those obtained by conventional qPCR. In addition, the NiCE chip was successfully applied to examine abundance and diversity of N cycle functional genes in wastewater samples. Although non-specific amplification was detected on the NiCE chip, this could be overcome by optimizing the primer sequences in the future. As the NiCE chip can provide high-throughput format to quantify and prepare sequence libraries for multiple N cycle functional genes, this tool should advance our ability to explore N cycling in various samples. Importance. We report a novel approach, namely Nitrogen Cycle Evaluation (NiCE) chip by using microfluidic qPCR chip technology. By sequencing the amplicons recovered from the NiCE chip, we can assess diversities of the N cycle functional genes. The NiCE chip technology is applicable to analyze the temporal dynamics of the N cycle gene

  10. Comparative Genomics in Switchgrass Using 61,585 High-Quality Expressed Sequence Tags

    Directory of Open Access Journals (Sweden)

    Christian M. Tobias

    2008-11-01

    Full Text Available The development of genomic resources for switchgrass ( L., a perennial NAD-malic enzyme type C grass, is required to enable molecular breeding and biotechnological approaches for improving its value as a forage and bioenergy crop. Expressed sequence tag (EST sequencing is one method that can quickly sample gene inventories and produce data suitable for marker development or analysis of tissue-specific patterns of expression. Toward this goal, three cDNA libraries from callus, crown, and seedling tissues of ‘Kanlow’ switchgrass were end-sequenced to generate a total of 61,585 high-quality ESTs from 36,565 separate clones. Seventy-three percent of the assembled consensus sequences could be aligned with the sorghum [ (L. Moench] genome at a -value of <1 × 10, indicating a high degree of similarity. Sixty-five percent of the ESTs matched with gene ontology molecular terms, and 3.3% of the sequences were matched with genes that play potential roles in cell-wall biogenesis. The representation in the three libraries of gene families known to be associated with C photosynthesis, cellulose and β-glucan synthesis, phenylpropanoid biosynthesis, and peroxidase activity indicated likely roles for individual family members. Pairwise comparisons of synonymous codon substitutions were used to assess genome sequence diversity and indicated an overall similarity between the two genome copies present in the tetraploid. Identification of EST–simple sequence repeat markers and amplification on two individual parents of a mapping population yielded an average of 2.18 amplicons per individual, and 35% of the markers produced fragment length polymorphisms.

  11. A simple, high throughput method to locate single copy sequences from Bacterial Artificial Chromosome (BAC libraries using High Resolution Melt analysis

    Directory of Open Access Journals (Sweden)

    Caligari Peter DS

    2010-05-01

    Full Text Available Abstract Background The high-throughput anchoring of genetic markers into contigs is required for many ongoing physical mapping projects. Multidimentional BAC pooling strategies for PCR-based screening of large insert libraries is a widely used alternative to high density filter hybridisation of bacterial colonies. To date, concerns over reliability have led most if not all groups engaged in high throughput physical mapping projects to favour BAC DNA isolation prior to amplification by conventional PCR. Results Here, we report the first combined use of Multiplex Tandem PCR (MT-PCR and High Resolution Melt (HRM analysis on bacterial stocks of BAC library superpools as a means of rapidly anchoring markers to BAC colonies and thereby to integrate genetic and physical maps. We exemplify the approach using a BAC library of the model plant Arabidopsis thaliana. Super pools of twenty five 384-well plates and two-dimension matrix pools of the BAC library were prepared for marker screening. The entire procedure only requires around 3 h to anchor one marker. Conclusions A pre-amplification step during MT-PCR allows high multiplexing and increases the sensitivity and reliability of subsequent HRM discrimination. This simple gel-free protocol is more reliable, faster and far less costly than conventional PCR screening. The option to screen in parallel 3 genetic markers in one MT-PCR-HRM reaction using templates from directly pooled bacterial stocks of BAC-containing bacteria further reduces time for anchoring markers in physical maps of species with large genomes.

  12. Improved inference of taxonomic richness from environmental DNA.

    Directory of Open Access Journals (Sweden)

    Matthew J Morgan

    Full Text Available Accurate estimation of biological diversity in environmental DNA samples using high-throughput amplicon pyrosequencing must account for errors generated by PCR and sequencing. We describe a novel approach to distinguish the underlying sequence diversity in environmental DNA samples from errors that uses information on the abundance distribution of similar sequences across independent samples, as well as the frequency and diversity of sequences within individual samples. We have further refined this approach into a bioinformatics pipeline, Amplicon Pyrosequence Denoising Program (APDP that is able to process raw sequence datasets into a set of validated sequences in formats compatible with commonly used downstream analyses packages. We demonstrate, by sequencing complex environmental samples and mock communities, that APDP is effective for removing errors from deeply sequenced datasets comprising biological and technical replicates, and can efficiently denoise single-sample datasets. APDP provides more conservative diversity estimates for complex datasets than other approaches; however, for some applications this may provide a more accurate and appropriate level of resolution, and result in greater confidence that returned sequences reflect the diversity of the underlying sample.

  13. "First generation" automated DNA sequencing technology.

    Science.gov (United States)

    Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

    2011-10-01

    Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.

  14. High-throughput automated microfluidic sample preparation for accurate microbial genomics.

    Science.gov (United States)

    Kim, Soohong; De Jonghe, Joachim; Kulesa, Anthony B; Feldman, David; Vatanen, Tommi; Bhattacharyya, Roby P; Berdy, Brittany; Gomez, James; Nolan, Jill; Epstein, Slava; Blainey, Paul C

    2017-01-27

    Low-cost shotgun DNA sequencing is transforming the microbial sciences. Sequencing instruments are so effective that sample preparation is now the key limiting factor. Here, we introduce a microfluidic sample preparation platform that integrates the key steps in cells to sequence library sample preparation for up to 96 samples and reduces DNA input requirements 100-fold while maintaining or improving data quality. The general-purpose microarchitecture we demonstrate supports workflows with arbitrary numbers of reaction and clean-up or capture steps. By reducing the sample quantity requirements, we enabled low-input (∼10,000 cells) whole-genome shotgun (WGS) sequencing of Mycobacterium tuberculosis and soil micro-colonies with superior results. We also leveraged the enhanced throughput to sequence ∼400 clinical Pseudomonas aeruginosa libraries and demonstrate excellent single-nucleotide polymorphism detection performance that explained phenotypically observed antibiotic resistance. Fully-integrated lab-on-chip sample preparation overcomes technical barriers to enable broader deployment of genomics across many basic research and translational applications.

  15. High-throughput optical system for HDES hyperspectral imager

    Science.gov (United States)

    Václavík, Jan; Melich, Radek; Pintr, Pavel; Pleštil, Jan

    2015-01-01

    Affordable, long-wave infrared hyperspectral imaging calls for use of an uncooled FPA with high-throughput optics. This paper describes the design of the optical part of a stationary hyperspectral imager in a spectral range of 7-14 um with a field of view of 20°×10°. The imager employs a push-broom method made by a scanning mirror. High throughput and a demand for simplicity and rigidity led to a fully refractive design with highly aspheric surfaces and off-axis positioning of the detector array. The design was optimized to exploit the machinability of infrared materials by the SPDT method and a simple assemblage.

  16. Why barcode? High-throughput multiplex sequencing of mitochondrial genomes for molecular systematics.

    Science.gov (United States)

    Timmermans, M J T N; Dodsworth, S; Culverwell, C L; Bocak, L; Ahrens, D; Littlewood, D T J; Pons, J; Vogler, A P

    2010-11-01

    Mitochondrial genome sequences are important markers for phylogenetics but taxon sampling remains sporadic because of the great effort and cost required to acquire full-length sequences. Here, we demonstrate a simple, cost-effective way to sequence the full complement of protein coding mitochondrial genes from pooled samples using the 454/Roche platform. Multiplexing was achieved without the need for expensive indexing tags ('barcodes'). The method was trialled with a set of long-range polymerase chain reaction (PCR) fragments from 30 species of Coleoptera (beetles) sequenced in a 1/16th sector of a sequencing plate. Long contigs were produced from the pooled sequences with sequencing depths ranging from ∼10 to 100× per contig. Species identity of individual contigs was established via three 'bait' sequences matching disparate parts of the mitochondrial genome obtained by conventional PCR and Sanger sequencing. This proved that assembly of contigs from the sequencing pool was correct. Our study produced sequences for 21 nearly complete and seven partial sets of protein coding mitochondrial genes. Combined with existing sequences for 25 taxa, an improved estimate of basal relationships in Coleoptera was obtained. The procedure could be employed routinely for mitochondrial genome sequencing at the species level, to provide improved species 'barcodes' that currently use the cox1 gene only.

  17. Understanding regulation of microRNAs on intestine regeneration in the sea cucumber Apostichopus japonicus using high-throughput sequencing.

    Science.gov (United States)

    Sun, Lina; Sun, Jingchun; Li, Xiaoni; Zhang, Libin; Yang, Hongsheng; Wang, Qing

    2017-06-01

    The sea cucumber, as a member of the Echinodermata, has the capacity to restore damaged organs and body parts, which has always been a key scientific issue. MicroRNAs (miRNAs), a class of short noncoding RNAs, play important roles in regulating gene expression. In the present study, we applied high-throughput sequencing to investigate alterations of miRNA expression in regenerative intestine compared to normal intestine. A total of 73 differentially expressed miRNAs were obtained, including 59 up-regulated miRNAs and 14 down-regulated miRNAs. Among these molecules, Aja-miR-1715-5p, Aja-miR-153, Aja-miR-252a, Aja-miR-153-5p, Aja-miR-252b, Aja-miR-2001, Aja-miR-64d-3p, and Aja-miR-252-5p were differentially expressed over 10-fold at 3days post-evisceration (dpe). Notably, real-time PCR revealed that Aja-miR-1715-5p was up-regulated 1390-fold at 3dpe. Moreover, putative target gene co-expression analyses, gene ontology, and pathway analyses suggest that these miRNAs play important roles in specific cellular events (cell proliferation, migration, and apoptosis), metabolic regulation, and energy redistribution. These results will provide a basis for future studies of miRNA regulation in sea cucumber regeneration. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. Rapid development of microsatellite markers for Callosobruchus chinensis using Illumina paired-end sequencing.

    Directory of Open Access Journals (Sweden)

    Can-Xing Duan

    Full Text Available BACKGROUND: The adzuki bean weevil, Callosobruchus chinensis L., is one of the most destructive pests of stored legume seeds such as mungbean, cowpea, and adzuki bean, which usually cause considerable loss in the quantity and quality of stored seeds during transportation and storage. However, a lack of genetic information of this pest results in a series of genetic questions remain largely unknown, including population genetic structure, kinship, biotype abundance, and so on. Co-dominant microsatellite markers offer a great resolving power to determine these events. Here, we report rapid microsatellite isolation from C. chinensis via high-throughput sequencing. PRINCIPAL FINDINGS: In this study, 94,560,852 quality-filtered and trimmed reads were obtained for the assembly of genome using Illumina paired-end sequencing technology. In total, the genome with total length of 497,124,785 bp, comprising 403,113 high quality contigs was generated with de novo assembly. More than 6800 SSR loci were detected and a suit of 6303 primer pair sequences were designed and 500 of them were randomly selected for validation. Of these, 196 pair of primers, i.e. 39.2%, produced reproducible amplicons that were polymorphic among 8 C. chinensis genotypes collected from different geographical regions. Twenty out of 196 polymorphic SSR markers were used to analyze the genetic diversity of 18 C. chinensis populations. The results showed the twenty SSR loci were highly polymorphic among these populations. CONCLUSIONS: This study presents a first report of genome sequencing and de novo assembly for C. chinensis and demonstrates the feasibility of generating a large scale of sequence information and SSR loci isolation by Illumina paired-end sequencing. Our results provide a valuable resource for C. chinensis research. These novel markers are valuable for future genetic mapping, trait association, genetic structure and kinship among C. chinensis.

  19. High Performance Computing Modernization Program Kerberos Throughput Test Report

    Science.gov (United States)

    2017-10-26

    Naval Research Laboratory Washington, DC 20375-5320 NRL/MR/5524--17-9751 High Performance Computing Modernization Program Kerberos Throughput Test ...NUMBER 5d. PROJECT NUMBER 5e. TASK NUMBER 5f. WORK UNIT NUMBER 2. REPORT TYPE1. REPORT DATE (DD-MM-YYYY) 4. TITLE AND SUBTITLE 6. AUTHOR(S) 8. PERFORMING...PAGE 18. NUMBER OF PAGES 17. LIMITATION OF ABSTRACT High Performance Computing Modernization Program Kerberos Throughput Test Report Daniel G. Gdula* and

  20. The use of high-throughput sequencing to investigate an outbreak of glycopeptide-resistant Enterococcus faecium with a novel quinupristin-dalfopristin resistance mechanism.

    Science.gov (United States)

    Shaw, Timothy D; Fairley, D J; Schneiders, T; Pathiraja, M; Hill, R L R; Werner, G; Elborn, J S; McMullan, R

    2018-02-24

    High-throughput sequencing (HTS) has successfully identified novel resistance genes in enterococci and determined clonal relatedness in outbreak analysis. We report the use of HTS to investigate two concurrent outbreaks of glycopeptide-resistant Enterococcus faecium (GRE) with an uncharacterised resistance mechanism to quinupristin-dalfopristin (QD). Seven QD-resistant and five QD-susceptible GRE isolates from a two-centre outbreak were studied. HTS was performed to identify genes or predicted proteins that were associated with the QD-resistant phenotype. MLST and SNP typing on HTS data was used to determine clonal relatedness. Comparative genomic analysis confirmed this GRE outbreak involved two distinct clones (ST80 and ST192). HTS confirmed the absence of known QD resistance genes, suggesting a novel mechanism was conferring resistance. Genomic analysis identified two significant genetic determinants with explanatory power for the high level of QD resistance in the ST80 QD-resistant clone: an additional 56aa leader sequence at the N-terminus of the lsaE gene and a transposon containing seven genes encoding proteins with possible drug or drug-target modification activities. However, HTS was unable to conclusively determine the QD resistance mechanism and did not reveal any genetic basis for QD resistance in the ST192 clone. This study highlights the usefulness of HTS in deciphering the degree of relatedness in two concurrent GRE outbreaks. Although HTS was able to reveal some genetic candidates for uncharacterised QD resistance, this study demonstrates the limitations of HTS as a tool for identifying putative determinants of resistance to QD.

  1. An improved high throughput sequencing method for studying oomycete communities

    DEFF Research Database (Denmark)

    Sapkota, Rumakanta; Nicolaisen, Mogens

    2015-01-01

    the usefulness of the method not only in soil DNA but also in a plant DNA background. In conclusion, we demonstrate a successful approach for pyrosequencing of oomycete communities using ITS1 as the barcode sequence with well-known primers for oomycete DNA amplification....... communities. Thewell-known primer sets ITS4, ITS6 and ITS7were used in the study in a semi-nested PCR approach to target the internal transcribed spacer (ITS) 1 of ribosomal DNA in a next generation sequencing protocol. These primers have been used in similar studies before, butwith limited success.......Wewere able to increase the proportion of retrieved oomycete sequences dramaticallymainly by increasing the annealing temperature during PCR. The optimized protocol was validated using three mock communities and the method was further evaluated using total DNA from 26 soil samples collected from different...

  2. Testing genotyping strategies for ultra-deep sequencing of a co-amplifying gene family: MHC class I in a passerine bird.

    Science.gov (United States)

    Biedrzycka, Aleksandra; Sebastian, Alvaro; Migalska, Magdalena; Westerdahl, Helena; Radwan, Jacek

    2017-07-01

    Characterization of highly duplicated genes, such as genes of the major histocompatibility complex (MHC), where multiple loci often co-amplify, has until recently been hindered by insufficient read depths per amplicon. Here, we used ultra-deep Illumina sequencing to resolve genotypes at exon 3 of MHC class I genes in the sedge warbler (Acrocephalus schoenobaenus). We sequenced 24 individuals in two replicates and used this data, as well as a simulated data set, to test the effect of amplicon coverage (range: 500-20 000 reads per amplicon) on the repeatability of genotyping using four different genotyping approaches. A third replicate employed unique barcoding to assess the extent of tag jumping, that is swapping of individual tag identifiers, which may confound genotyping. The reliability of MHC genotyping increased with coverage and approached or exceeded 90% within-method repeatability of allele calling at coverages of >5000 reads per amplicon. We found generally high agreement between genotyping methods, especially at high coverages. High reliability of the tested genotyping approaches was further supported by our analysis of the simulated data set, although the genotyping approach relying primarily on replication of variants in independent amplicons proved sensitive to repeatable errors. According to the most repeatable genotyping method, the number of co-amplifying variants per individual ranged from 19 to 42. Tag jumping was detectable, but at such low frequencies that it did not affect the reliability of genotyping. We thus demonstrate that gene families with many co-amplifying genes can be reliably genotyped using HTS, provided that there is sufficient per amplicon coverage. © 2016 John Wiley & Sons Ltd.

  3. High-throughput sequencing of black pepper root transcriptome

    Science.gov (United States)

    2012-01-01

    Background Black pepper (Piper nigrum L.) is one of the most popular spices in the world. It is used in cooking and the preservation of food and even has medicinal properties. Losses in production from disease are a major limitation in the culture of this crop. The major diseases are root rot and foot rot, which are results of root infection by Fusarium solani and Phytophtora capsici, respectively. Understanding the molecular interaction between the pathogens and the host’s root region is important for obtaining resistant cultivars by biotechnological breeding. Genetic and molecular data for this species, though, are limited. In this paper, RNA-Seq technology has been employed, for the first time, to describe the root transcriptome of black pepper. Results The root transcriptome of black pepper was sequenced by the NGS SOLiD platform and assembled using the multiple-k method. Blast2Go and orthoMCL methods were used to annotate 10338 unigenes. The 4472 predicted proteins showed about 52% homology with the Arabidopsis proteome. Two root proteomes identified 615 proteins, which seem to define the plant’s root pattern. Simple-sequence repeats were identified that may be useful in studies of genetic diversity and may have applications in biotechnology and ecology. Conclusions This dataset of 10338 unigenes is crucially important for the biotechnological breeding of black pepper and the ecogenomics of the Magnoliids, a major group of basal angiosperms. PMID:22984782

  4. High-throughput sequencing of black pepper root transcriptome

    Directory of Open Access Journals (Sweden)

    Gordo Sheila MC

    2012-09-01

    Full Text Available Abstract Background Black pepper (Piper nigrum L. is one of the most popular spices in the world. It is used in cooking and the preservation of food and even has medicinal properties. Losses in production from disease are a major limitation in the culture of this crop. The major diseases are root rot and foot rot, which are results of root infection by Fusarium solani and Phytophtora capsici, respectively. Understanding the molecular interaction between the pathogens and the host’s root region is important for obtaining resistant cultivars by biotechnological breeding. Genetic and molecular data for this species, though, are limited. In this paper, RNA-Seq technology has been employed, for the first time, to describe the root transcriptome of black pepper. Results The root transcriptome of black pepper was sequenced by the NGS SOLiD platform and assembled using the multiple-k method. Blast2Go and orthoMCL methods were used to annotate 10338 unigenes. The 4472 predicted proteins showed about 52% homology with the Arabidopsis proteome. Two root proteomes identified 615 proteins, which seem to define the plant’s root pattern. Simple-sequence repeats were identified that may be useful in studies of genetic diversity and may have applications in biotechnology and ecology. Conclusions This dataset of 10338 unigenes is crucially important for the biotechnological breeding of black pepper and the ecogenomics of the Magnoliids, a major group of basal angiosperms.

  5. High-throughput theoretical design of lithium battery materials

    International Nuclear Information System (INIS)

    Ling Shi-Gang; Gao Jian; Xiao Rui-Juan; Chen Li-Quan

    2016-01-01

    The rapid evolution of high-throughput theoretical design schemes to discover new lithium battery materials is reviewed, including high-capacity cathodes, low-strain cathodes, anodes, solid state electrolytes, and electrolyte additives. With the development of efficient theoretical methods and inexpensive computers, high-throughput theoretical calculations have played an increasingly important role in the discovery of new materials. With the help of automatic simulation flow, many types of materials can be screened, optimized and designed from a structural database according to specific search criteria. In advanced cell technology, new materials for next generation lithium batteries are of great significance to achieve performance, and some representative criteria are: higher energy density, better safety, and faster charge/discharge speed. (topical review)

  6. Nitrogenase gene amplicons from global marine surface waters are dominated by genes of non-cyanobacteria.

    Directory of Open Access Journals (Sweden)

    Hanna Farnelid

    Full Text Available Cyanobacteria are thought to be the main N(2-fixing organisms (diazotrophs in marine pelagic waters, but recent molecular analyses indicate that non-cyanobacterial diazotrophs are also present and active. Existing data are, however, restricted geographically and by limited sequencing depths. Our analysis of 79,090 nitrogenase (nifH PCR amplicons encoding 7,468 unique proteins from surface samples (ten DNA samples and two RNA samples collected at ten marine locations world-wide provides the first in-depth survey of a functional bacterial gene and yield insights into the composition and diversity of the nifH gene pool in marine waters. Great divergence in nifH composition was observed between sites. Cyanobacteria-like genes were most frequent among amplicons from the warmest waters, but overall the data set was dominated by nifH sequences most closely related to non-cyanobacteria. Clusters related to Alpha-, Beta-, Gamma-, and Delta-Proteobacteria were most common and showed distinct geographic distributions. Sequences related to anaerobic bacteria (nifH Cluster III were generally rare, but preponderant in cold waters, especially in the Arctic. Although the two transcript samples were dominated by unicellular cyanobacteria, 42% of the identified non-cyanobacterial nifH clusters from the corresponding DNA samples were also detected in cDNA. The study indicates that non-cyanobacteria account for a substantial part of the nifH gene pool in marine surface waters and that these genes are at least occasionally expressed. The contribution of non-cyanobacterial diazotrophs to the global N(2 fixation budget cannot be inferred from sequence data alone, but the prevalence of non-cyanobacterial nifH genes and transcripts suggest that these bacteria are ecologically significant.

  7. High-Throughput Scoring of Seed Germination.

    Science.gov (United States)

    Ligterink, Wilco; Hilhorst, Henk W M

    2017-01-01

    High-throughput analysis of seed germination for phenotyping large genetic populations or mutant collections is very labor intensive and would highly benefit from an automated setup. Although very often used, the total germination percentage after a nominated period of time is not very informative as it lacks information about start, rate, and uniformity of germination, which are highly indicative of such traits as dormancy, stress tolerance, and seed longevity. The calculation of cumulative germination curves requires information about germination percentage at various time points. We developed the GERMINATOR package: a simple, highly cost-efficient, and flexible procedure for high-throughput automatic scoring and evaluation of germination that can be implemented without the use of complex robotics. The GERMINATOR package contains three modules: (I) design of experimental setup with various options to replicate and randomize samples; (II) automatic scoring of germination based on the color contrast between the protruding radicle and seed coat on a single image; and (III) curve fitting of cumulative germination data and the extraction, recap, and visualization of the various germination parameters. GERMINATOR is a freely available package that allows the monitoring and analysis of several thousands of germination tests, several times a day by a single person.

  8. High throughput salt separation from uranium deposits

    Energy Technology Data Exchange (ETDEWEB)

    Kwon, S.W.; Park, K.M.; Kim, J.G.; Kim, I.T.; Park, S.B., E-mail: swkwon@kaeri.re.kr [Korea Atomic Energy Research Inst. (Korea, Republic of)

    2014-07-01

    It is very important to increase the throughput of the salt separation system owing to the high uranium content of spent nuclear fuel and high salt fraction of uranium dendrites in pyroprocessing. Multilayer porous crucible system was proposed to increase a throughput of the salt distiller in this study. An integrated sieve-crucible assembly was also investigated for the practical use of the porous crucible system. The salt evaporation behaviors were compared between the conventional nonporous crucible and the porous crucible. Two step weight reductions took place in the porous crucible, whereas the salt weight reduced only at high temperature by distillation in a nonporous crucible. The first weight reduction in the porous crucible was caused by the liquid salt penetrated out through the perforated crucible during the temperature elevation until the distillation temperature. Multilayer porous crucibles have a benefit to expand the evaporation surface area. (author)

  9. Unravelling the complexity of microRNA-mediated gene regulation in black pepper (Piper nigrum L.) using high-throughput small RNA profiling.

    Science.gov (United States)

    Asha, Srinivasan; Sreekumar, Sweda; Soniya, E V

    2016-01-01

    Analysis of high-throughput small RNA deep sequencing data, in combination with black pepper transcriptome sequences revealed microRNA-mediated gene regulation in black pepper ( Piper nigrum L.). Black pepper is an important spice crop and its berries are used worldwide as a natural food additive that contributes unique flavour to foods. In the present study to characterize microRNAs from black pepper, we generated a small RNA library from black pepper leaf and sequenced it by Illumina high-throughput sequencing technology. MicroRNAs belonging to a total of 303 conserved miRNA families were identified from the sRNAome data. Subsequent analysis from recently sequenced black pepper transcriptome confirmed precursor sequences of 50 conserved miRNAs and four potential novel miRNA candidates. Stem-loop qRT-PCR experiments demonstrated differential expression of eight conserved miRNAs in black pepper. Computational analysis of targets of the miRNAs showed 223 potential black pepper unigene targets that encode diverse transcription factors and enzymes involved in plant development, disease resistance, metabolic and signalling pathways. RLM-RACE experiments further mapped miRNA-mediated cleavage at five of the mRNA targets. In addition, miRNA isoforms corresponding to 18 miRNA families were also identified from black pepper. This study presents the first large-scale identification of microRNAs from black pepper and provides the foundation for the future studies of miRNA-mediated gene regulation of stress responses and diverse metabolic processes in black pepper.

  10. High throughput, low set-up time reconfigurable linear feedback shift registers

    NARCIS (Netherlands)

    Nas, R.J.M.; Berkel, van C.H.

    2010-01-01

    This paper presents a hardware design for a scalable, high throughput, configurable LFSR. High throughput is achieved by producing L consecutive outputs per clock cycle with a clock cycle period that, for practical cases, increases only logarithmically with the block size L and the length of the

  11. High-throughput epitope identification for snakebite antivenom

    DEFF Research Database (Denmark)

    Engmark, Mikael; De Masi, Federico; Laustsen, Andreas Hougaard

    Insight into the epitopic recognition pattern for polyclonal antivenoms is a strong tool for accurate prediction of antivenom cross-reactivity and provides a basis for design of novel antivenoms. In this work, a high-throughput approach was applied to characterize linear epitopes in 966 individua...... toxins from pit vipers (Crotalidae) using the ICP Crotalidae antivenom. Due to an abundance of snake venom metalloproteinases and phospholipase A2s in the venoms used for production of the investigated antivenom, this study focuses on these toxin families.......Insight into the epitopic recognition pattern for polyclonal antivenoms is a strong tool for accurate prediction of antivenom cross-reactivity and provides a basis for design of novel antivenoms. In this work, a high-throughput approach was applied to characterize linear epitopes in 966 individual...

  12. High-throughput physical map anchoring via BAC-pool sequencing

    Czech Academy of Sciences Publication Activity Database

    Cviková, Kateřina; Cattonaro, F.; Alaux, M.; Stein, N.; Mayer, K.F.X.; Doležel, Jaroslav; Bartoš, Jan

    2015-01-01

    Roč. 15, APR 11 (2015) ISSN 1471-2229 R&D Projects: GA ČR GA13-08786S; GA MŠk(CZ) LO1204 Institutional support: RVO:61389030 Keywords : Physical map * Contig anchoring * Next generation sequencing Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 3.631, year: 2015

  13. Towards a high throughput droplet-based agglutination assay

    KAUST Repository

    Kodzius, Rimantas; Castro, David; Foulds, Ian G.

    2013-01-01

    This work demonstrates the detection method for a high throughput droplet based agglutination assay system. Using simple hydrodynamic forces to mix and aggregate functionalized microbeads we avoid the need to use magnetic assistance or mixing structures. The concentration of our target molecules was estimated by agglutination strength, obtained through optical image analysis. Agglutination in droplets was performed with flow rates of 150 µl/min and occurred in under a minute, with potential to perform high-throughput measurements. The lowest target concentration detected in droplet microfluidics was 0.17 nM, which is three orders of magnitude more sensitive than a conventional card based agglutination assay.

  14. Towards a high throughput droplet-based agglutination assay

    KAUST Repository

    Kodzius, Rimantas

    2013-10-22

    This work demonstrates the detection method for a high throughput droplet based agglutination assay system. Using simple hydrodynamic forces to mix and aggregate functionalized microbeads we avoid the need to use magnetic assistance or mixing structures. The concentration of our target molecules was estimated by agglutination strength, obtained through optical image analysis. Agglutination in droplets was performed with flow rates of 150 µl/min and occurred in under a minute, with potential to perform high-throughput measurements. The lowest target concentration detected in droplet microfluidics was 0.17 nM, which is three orders of magnitude more sensitive than a conventional card based agglutination assay.

  15. High-resolution phylogenetic microbial community profiling

    Energy Technology Data Exchange (ETDEWEB)

    Singer, Esther; Coleman-Derr, Devin; Bowman, Brett; Schwientek, Patrick; Clum, Alicia; Copeland, Alex; Ciobanu, Doina; Cheng, Jan-Fang; Gies, Esther; Hallam, Steve; Tringe, Susannah; Woyke, Tanja

    2014-03-17

    The representation of bacterial and archaeal genome sequences is strongly biased towards cultivated organisms, which belong to merely four phylogenetic groups. Functional information and inter-phylum level relationships are still largely underexplored for candidate phyla, which are often referred to as microbial dark matter. Furthermore, a large portion of the 16S rRNA gene records in the GenBank database are labeled as environmental samples and unclassified, which is in part due to low read accuracy, potential chimeric sequences produced during PCR amplifications and the low resolution of short amplicons. In order to improve the phylogenetic classification of novel species and advance our knowledge of the ecosystem function of uncultivated microorganisms, high-throughput full length 16S rRNA gene sequencing methodologies with reduced biases are needed. We evaluated the performance of PacBio single-molecule real-time (SMRT) sequencing in high-resolution phylogenetic microbial community profiling. For this purpose, we compared PacBio and Illumina metagenomic shotgun and 16S rRNA gene sequencing of a mock community as well as of an environmental sample from Sakinaw Lake, British Columbia. Sakinaw Lake is known to contain a large age of microbial species from candidate phyla. Sequencing results show that community structure based on PacBio shotgun and 16S rRNA gene sequences is highly similar in both the mock and the environmental communities. Resolution power and community representation accuracy from SMRT sequencing data appeared to be independent of GC content of microbial genomes and was higher when compared to Illumina-based metagenome shotgun and 16S rRNA gene (iTag) sequences, e.g. full-length sequencing resolved all 23 OTUs in the mock community, while iTags did not resolve closely related species. SMRT sequencing hence offers various potential benefits when characterizing uncharted microbial communities.

  16. Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

    Science.gov (United States)

    Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

    2017-07-01

    PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.

  17. High throughput deep degradome sequencing reveals microRNAs and their targets in response to drought stress in mulberry (Morus alba).

    Science.gov (United States)

    Li, Ruixue; Chen, Dandan; Wang, Taichu; Wan, Yizhen; Li, Rongfang; Fang, Rongjun; Wang, Yuting; Hu, Fei; Zhou, Hong; Li, Long; Zhao, Weiguo

    2017-01-01

    MicroRNAs (miRNAs) play important regulatory roles by targeting mRNAs for cleavage or translational repression. Identification of miRNA targets is essential to better understanding the roles of miRNAs. miRNA targets have not been well characterized in mulberry (Morus alba). To anatomize miRNA guided gene regulation under drought stress, transcriptome-wide high throughput degradome sequencing was used in this study to directly detect drought stress responsive miRNA targets in mulberry. A drought library (DL) and a contrast library (CL) were constructed to capture the cleaved mRNAs for sequencing. In CL, 409 target genes of 30 conserved miRNA families and 990 target genes of 199 novel miRNAs were identified. In DL, 373 target genes of 30 conserved miRNA families and 950 target genes of 195 novel miRNAs were identified. Of the conserved miRNA families in DL, mno-miR156, mno-miR172, and mno-miR396 had the highest number of targets with 54, 52 and 41 transcripts, respectively, indicating that these three miRNA families and their target genes might play important functions in response to drought stress in mulberry. Additionally, we found that many of the target genes were transcription factors. By analyzing the miRNA-target molecular network, we found that the DL independent networks consisted of 838 miRNA-mRNA pairs (63.34%). The expression patterns of 11 target genes and 12 correspondent miRNAs were detected using qRT-PCR. Six miRNA targets were further verified by RNA ligase-mediated 5' rapid amplification of cDNA ends (RLM-5' RACE). Gene Ontology (GO) annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that these target transcripts were implicated in a broad range of biological processes and various metabolic pathways. This is the first study to comprehensively characterize target genes and their associated miRNAs in response to drought stress by degradome sequencing in mulberry. This study provides a framework for understanding

  18. High-throughput screening of small molecule libraries using SAMDI mass spectrometry.

    Science.gov (United States)

    Gurard-Levin, Zachary A; Scholle, Michael D; Eisenberg, Adam H; Mrksich, Milan

    2011-07-11

    High-throughput screening is a common strategy used to identify compounds that modulate biochemical activities, but many approaches depend on cumbersome fluorescent reporters or antibodies and often produce false-positive hits. The development of "label-free" assays addresses many of these limitations, but current approaches still lack the throughput needed for applications in drug discovery. This paper describes a high-throughput, label-free assay that combines self-assembled monolayers with mass spectrometry, in a technique called SAMDI, as a tool for screening libraries of 100,000 compounds in one day. This method is fast, has high discrimination, and is amenable to a broad range of chemical and biological applications.

  19. A comparison of 454 sequencing and clonal sequencing for the characterization of hepatitis C virus NS3 variants

    NARCIS (Netherlands)

    Ho, Cynthia K. Y.; Welkers, Matthijs R. A.; Thomas, Xiomara V.; Sullivan, James C.; Kieffer, Tara L.; Reesink, Henk W.; Rebers, Sjoerd P. H.; de Jong, Menno D.; Schinkel, Janke; Molenkamp, Richard

    2015-01-01

    We compared 454 amplicon sequencing with clonal sequencing for the characterization of intra-host hepatitis C virus (HCV) NS3 variants. Clonal and 454 sequences were obtained from 12 patients enrolled in a clinical phase I study for telaprevir, an NS3-4a protease inhibitor. Thirty-nine datasets were

  20. Impact of Roadway Stormwater Runoff on Microbial Contamination in the Receiving Stream.

    Science.gov (United States)

    Wyckoff, Kristen N; Chen, Si; Steinman, Andrew J; He, Qiang

    2017-09-01

    Stormwater runoff from roadways has increasingly become a regulatory concern for water pollution control. Recent work has suggested roadway stormwater runoff as a potential source of microbial pollutants. The objective of this study was to determine the impact of roadway runoff on the microbiological quality of receiving streams. Microbiological quality of roadway stormwater runoff and the receiving stream was monitored during storm events with both cultivation-dependent fecal bacteria enumeration and cultivation-independent high-throughput sequencing techniques. Enumeration of total coliforms as a measure of fecal microbial pollution found consistently lower total coliform counts in roadway runoff than those in the stream water, suggesting that roadway runoff was not a major contributor of microbial pollutants to the receiving stream. Further characterization of the microbial community in the stormwater samples by 16S ribosomal RNA gene-based high-throughput amplicon sequencing revealed significant differences in the microbial composition of stormwater runoff from the roadways and the receiving stream. The differences in microbial composition between the roadway runoff and stream water demonstrate that roadway runoff did not appear to have a major influence on the stream in terms of microbiological quality. Thus, results from both fecal bacteria enumeration and high-throughput amplicon sequencing techniques were consistent that roadway stormwater runoff was not the primary contributor of microbial loading to the stream. Further studies of additional watersheds with distinct characteristics are needed to validate these findings. Understanding gained in this study could support the development of more effective strategies for stormwater management in sensitive watersheds. Copyright © by the American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America, Inc.

  1. Accurate clinical genetic testing for autoinflammatory diseases using the next-generation sequencing platform MiSeq.

    Science.gov (United States)

    Nakayama, Manabu; Oda, Hirotsugu; Nakagawa, Kenji; Yasumi, Takahiro; Kawai, Tomoki; Izawa, Kazushi; Nishikomori, Ryuta; Heike, Toshio; Ohara, Osamu

    2017-03-01

    Autoinflammatory diseases occupy one of a group of primary immunodeficiency diseases that are generally thought to be caused by mutation of genes responsible for innate immunity, rather than by acquired immunity. Mutations related to autoinflammatory diseases occur in 12 genes. For example, low-level somatic mosaic NLRP3 mutations underlie chronic infantile neurologic, cutaneous, articular syndrome (CINCA), also known as neonatal-onset multisystem inflammatory disease (NOMID). In current clinical practice, clinical genetic testing plays an important role in providing patients with quick, definite diagnoses. To increase the availability of such testing, low-cost high-throughput gene-analysis systems are required, ones that not only have the sensitivity to detect even low-level somatic mosaic mutations, but also can operate simply in a clinical setting. To this end, we developed a simple method that employs two-step tailed PCR and an NGS system, MiSeq platform, to detect mutations in all coding exons of the 12 genes responsible for autoinflammatory diseases. Using this amplicon sequencing system, we amplified a total of 234 amplicons derived from the 12 genes with multiplex PCR. This was done simultaneously and in one test tube. Each sample was distinguished by an index sequence of second PCR primers following PCR amplification. With our procedure and tips for reducing PCR amplification bias, we were able to analyze 12 genes from 25 clinical samples in one MiSeq run. Moreover, with the certified primers designed by our short program-which detects and avoids common SNPs in gene-specific PCR primers-we used this system for routine genetic testing. Our optimized procedure uses a simple protocol, which can easily be followed by virtually any office medical staff. Because of the small PCR amplification bias, we can analyze simultaneously several clinical DNA samples with low cost and can obtain sufficient read numbers to detect a low level of somatic mosaic mutations.

  2. Robust Sub-nanomolar Library Preparation for High Throughput Next Generation Sequencing.

    Science.gov (United States)

    Wu, Wells W; Phue, Je-Nie; Lee, Chun-Ting; Lin, Changyi; Xu, Lai; Wang, Rong; Zhang, Yaqin; Shen, Rong-Fong

    2018-05-04

    Current library preparation protocols for Illumina HiSeq and MiSeq DNA sequencers require ≥2 nM initial library for subsequent loading of denatured cDNA onto flow cells. Such amounts are not always attainable from samples having a relatively low DNA or RNA input; or those for which a limited number of PCR amplification cycles is preferred (less PCR bias and/or more even coverage). A well-tested sub-nanomolar library preparation protocol for Illumina sequencers has however not been reported. The aim of this study is to provide a much needed working protocol for sub-nanomolar libraries to achieve outcomes as informative as those obtained with the higher library input (≥ 2 nM) recommended by Illumina's protocols. Extensive studies were conducted to validate a robust sub-nanomolar (initial library of 100 pM) protocol using PhiX DNA (as a control), genomic DNA (Bordetella bronchiseptica and microbial mock community B for 16S rRNA gene sequencing), messenger RNA, microRNA, and other small noncoding RNA samples. The utility of our protocol was further explored for PhiX library concentrations as low as 25 pM, which generated only slightly fewer than 50% of the reads achieved under the standard Illumina protocol starting with > 2 nM. A sub-nanomolar library preparation protocol (100 pM) could generate next generation sequencing (NGS) results as robust as the standard Illumina protocol. Following the sub-nanomolar protocol, libraries with initial concentrations as low as 25 pM could also be sequenced to yield satisfactory and reproducible sequencing results.

  3. High-throughput transformation of Saccharomyces cerevisiae using liquid handling robots.

    Directory of Open Access Journals (Sweden)

    Guangbo Liu

    Full Text Available Saccharomyces cerevisiae (budding yeast is a powerful eukaryotic model organism ideally suited to high-throughput genetic analyses, which time and again has yielded insights that further our understanding of cell biology processes conserved in humans. Lithium Acetate (LiAc transformation of yeast with DNA for the purposes of exogenous protein expression (e.g., plasmids or genome mutation (e.g., gene mutation, deletion, epitope tagging is a useful and long established method. However, a reliable and optimized high throughput transformation protocol that runs almost no risk of human error has not been described in the literature. Here, we describe such a method that is broadly transferable to most liquid handling high-throughput robotic platforms, which are now commonplace in academic and industry settings. Using our optimized method, we are able to comfortably transform approximately 1200 individual strains per day, allowing complete transformation of typical genomic yeast libraries within 6 days. In addition, use of our protocol for gene knockout purposes also provides a potentially quicker, easier and more cost-effective approach to generating collections of double mutants than the popular and elegant synthetic genetic array methodology. In summary, our methodology will be of significant use to anyone interested in high throughput molecular and/or genetic analysis of yeast.

  4. SNP high-throughput screening in grapevine using the SNPlex™ genotyping system

    Directory of Open Access Journals (Sweden)

    Velasco Riccardo

    2008-01-01

    Full Text Available Abstract Background Until recently, only a small number of low- and mid-throughput methods have been used for single nucleotide polymorphism (SNP discovery and genotyping in grapevine (Vitis vinifera L.. However, following completion of the sequence of the highly heterozygous genome of Pinot Noir, it has been possible to identify millions of electronic SNPs (eSNPs thus providing a valuable source for high-throughput genotyping methods. Results Herein we report the first application of the SNPlex™ genotyping system in grapevine aiming at the anchoring of an eukaryotic genome. This approach combines robust SNP detection with automated assay readout and data analysis. 813 candidate eSNPs were developed from non-repetitive contigs of the assembled genome of Pinot Noir and tested in 90 progeny of Syrah × Pinot Noir cross. 563 new SNP-based markers were obtained and mapped. The efficiency rate of 69% was enhanced to 80% when multiple displacement amplification (MDA methods were used for preparation of genomic DNA for the SNPlex assay. Conclusion Unlike other SNP genotyping methods used to investigate thousands of SNPs in a few genotypes, or a few SNPs in around a thousand genotypes, the SNPlex genotyping system represents a good compromise to investigate several hundred SNPs in a hundred or more samples simultaneously. Therefore, the use of the SNPlex assay, coupled with whole genome amplification (WGA, is a good solution for future applications in well-equipped laboratories.

  5. Comprehensive molecular diagnosis of Epstein-Barr virus-associated lymphoproliferative diseases using next-generation sequencing.

    Science.gov (United States)

    Ono, Shintaro; Nakayama, Manabu; Kanegane, Hirokazu; Hoshino, Akihiro; Shimodera, Saeko; Shibata, Hirofumi; Fujino, Hisanori; Fujino, Takahiro; Yunomae, Yuta; Okano, Tsubasa; Yamashita, Motoi; Yasumi, Takahiro; Izawa, Kazushi; Takagi, Masatoshi; Imai, Kohsuke; Zhang, Kejian; Marsh, Rebecca; Picard, Capucine; Latour, Sylvain; Ohara, Osamu; Morio, Tomohiro

    2018-05-18

    Epstein-Barr virus (EBV) is associated with several life-threatening diseases, such as lymphoproliferative disease (LPD), particularly in immunocompromised hosts. Some categories of primary immunodeficiency diseases (PIDs) including X-linked lymphoproliferative syndrome (XLP), are characterized by susceptibility and vulnerability to EBV infection. The number of genetically defined PIDs is rapidly increasing, and clinical genetic testing plays an important role in establishing a definitive diagnosis. Whole-exome sequencing is performed for diagnosing rare genetic diseases, but is both expensive and time-consuming. Low-cost, high-throughput gene analysis systems are thus necessary. We developed a comprehensive molecular diagnostic method using a two-step tailed polymerase chain reaction (PCR) and a next-generation sequencing (NGS) platform to detect mutations in 23 candidate genes responsible for XLP or XLP-like diseases. Samples from 19 patients suspected of having EBV-associated LPD were used in this comprehensive molecular diagnosis. Causative gene mutations (involving PRF1 and SH2D1A) were detected in two of the 19 patients studied. This comprehensive diagnosis method effectively detected mutations in all coding exons of 23 genes with sufficient read numbers for each amplicon. This comprehensive molecular diagnostic method using PCR and NGS provides a rapid, accurate, low-cost diagnosis for patients with XLP or XLP-like diseases.

  6. Engineering a vitamin B12 high-throughput screening system by riboswitch sensor in Sinorhizobium meliloti.

    Science.gov (United States)

    Cai, Yingying; Xia, Miaomiao; Dong, Huina; Qian, Yuan; Zhang, Tongcun; Zhu, Beiwei; Wu, Jinchuan; Zhang, Dawei

    2018-05-11

    As a very important coenzyme in the cell metabolism, Vitamin B 12 (cobalamin, VB 12 ) has been widely used in food and medicine fields. The complete biosynthesis of VB 12 requires approximately 30 genes, but overexpression of these genes did not result in expected increase of VB 12 production. High-yield VB 12 -producing strains are usually obtained by mutagenesis treatments, thus developing an efficient screening approach is urgently needed. By the help of engineered strains with varied capacities of VB 12 production, a riboswitch library was constructed and screened, and the btuB element from Salmonella typhimurium was identified as the best regulatory device. A flow cytometry high-throughput screening system was developed based on the btuB riboswitch with high efficiency to identify positive mutants. Mutation of Sinorhizobium meliloti (S. meliloti) was optimized using the novel mutation technique of atmospheric and room temperature plasma (ARTP). Finally, the mutant S. meliloti MC5-2 was obtained and considered as a candidate for industrial applications. After 7 d's cultivation on a rotary shaker at 30 °C, the VB 12 titer of S. meliloti MC5-2 reached 156 ± 4.2 mg/L, which was 21.9% higher than that of the wild type strain S. meliloti 320 (128 ± 3.2 mg/L). The genome of S. meliloti MC5-2 was sequenced, and gene mutations were identified and analyzed. To our knowledge, it is the first time that a riboswitch element was used in S. meliloti. The flow cytometry high-throughput screening system was successfully developed and a high-yield VB 12 producing strain was obtained. The identified and analyzed gene mutations gave useful information for developing high-yield strains by metabolic engineering. Overall, this work provides a useful high-throughput screening method for developing high VB 12 -yield strains.

  7. Novel strategy for protein exploration: high-throughput screening assisted with fuzzy neural network.

    Science.gov (United States)

    Kato, Ryuji; Nakano, Hideo; Konishi, Hiroyuki; Kato, Katsuya; Koga, Yuchi; Yamane, Tsuneo; Kobayashi, Takeshi; Honda, Hiroyuki

    2005-08-19

    To engineer proteins with desirable characteristics from a naturally occurring protein, high-throughput screening (HTS) combined with directed evolutional approach is the essential technology. However, most HTS techniques are simple positive screenings. The information obtained from the positive candidates is used only as results but rarely as clues for understanding the structural rules, which may explain the protein activity. In here, we have attempted to establish a novel strategy for exploring functional proteins associated with computational analysis. As a model case, we explored lipases with inverted enantioselectivity for a substrate p-nitrophenyl 3-phenylbutyrate from the wild-type lipase of Burkhorderia cepacia KWI-56, which is originally selective for (S)-configuration of the substrate. Data from our previous work on (R)-enantioselective lipase screening were applied to fuzzy neural network (FNN), bioinformatic algorithm, to extract guidelines for screening and engineering processes to be followed. FNN has an advantageous feature of extracting hidden rules that lie between sequences of variants and their enzyme activity to gain high prediction accuracy. Without any prior knowledge, FNN predicted a rule indicating that "size at position L167," among four positions (L17, F119, L167, and L266) in the substrate binding core region, is the most influential factor for obtaining lipase with inverted (R)-enantioselectivity. Based on the guidelines obtained, newly engineered novel variants, which were not found in the actual screening, were experimentally proven to gain high (R)-enantioselectivity by engineering the size at position L167. We also designed and assayed two novel variants, namely FIGV (L17F, F119I, L167G, and L266V) and FFGI (L17F, L167G, and L266I), which were compatible with the guideline obtained from FNN analysis, and confirmed that these designed lipases could acquire high inverted enantioselectivity. The results have shown that with the aid of

  8. XplorSeq: a software environment for integrated management and phylogenetic analysis of metagenomic sequence data.

    Science.gov (United States)

    Frank, Daniel N

    2008-10-07

    Advances in automated DNA sequencing technology have accelerated the generation of metagenomic DNA sequences, especially environmental ribosomal RNA gene (rDNA) sequences. As the scale of rDNA-based studies of microbial ecology has expanded, need has arisen for software that is capable of managing, annotating, and analyzing the plethora of diverse data accumulated in these projects. XplorSeq is a software package that facilitates the compilation, management and phylogenetic analysis of DNA sequences. XplorSeq was developed for, but is not limited to, high-throughput analysis of environmental rRNA gene sequences. XplorSeq integrates and extends several commonly used UNIX-based analysis tools by use of a Macintosh OS-X-based graphical user interface (GUI). Through this GUI, users may perform basic sequence import and assembly steps (base-calling, vector/primer trimming, contig assembly), perform BLAST (Basic Local Alignment and Search Tool; 123) searches of NCBI and local databases, create multiple sequence alignments, build phylogenetic trees, assemble Operational Taxonomic Units, estimate biodiversity indices, and summarize data in a variety of formats. Furthermore, sequences may be annotated with user-specified meta-data, which then can be used to sort data and organize analyses and reports. A document-based architecture permits parallel analysis of sequence data from multiple clones or amplicons, with sequences and other data stored in a single file. XplorSeq should benefit researchers who are engaged in analyses of environmental sequence data, especially those with little experience using bioinformatics software. Although XplorSeq was developed for management of rDNA sequence data, it can be applied to most any sequencing project. The application is available free of charge for non-commercial use at http://vent.colorado.edu/phyloware.

  9. Analysis of JC virus DNA replication using a quantitative and high-throughput assay

    International Nuclear Information System (INIS)

    Shin, Jong; Phelan, Paul J.; Chhum, Panharith; Bashkenova, Nazym; Yim, Sung; Parker, Robert; Gagnon, David; Gjoerup, Ole; Archambault, Jacques; Bullock, Peter A.

    2014-01-01

    Progressive Multifocal Leukoencephalopathy (PML) is caused by lytic replication of JC virus (JCV) in specific cells of the central nervous system. Like other polyomaviruses, JCV encodes a large T-antigen helicase needed for replication of the viral DNA. Here, we report the development of a luciferase-based, quantitative and high-throughput assay of JCV DNA replication in C33A cells, which, unlike the glial cell lines Hs 683 and U87, accumulate high levels of nuclear T-ag needed for robust replication. Using this assay, we investigated the requirement for different domains of T-ag, and for specific sequences within and flanking the viral origin, in JCV DNA replication. Beyond providing validation of the assay, these studies revealed an important stimulatory role of the transcription factor NF1 in JCV DNA replication. Finally, we show that the assay can be used for inhibitor testing, highlighting its value for the identification of antiviral drugs targeting JCV DNA replication. - Highlights: • Development of a high-throughput screening assay for JCV DNA replication using C33A cells. • Evidence that T-ag fails to accumulate in the nuclei of established glioma cell lines. • Evidence that NF-1 directly promotes JCV DNA replication in C33A cells. • Proof-of-concept that the HTS assay can be used to identify pharmacological inhibitor of JCV DNA replication

  10. Analysis of JC virus DNA replication using a quantitative and high-throughput assay

    Energy Technology Data Exchange (ETDEWEB)

    Shin, Jong; Phelan, Paul J.; Chhum, Panharith; Bashkenova, Nazym; Yim, Sung; Parker, Robert [Department of Developmental, Molecular and Chemical Biology, Tufts University School of Medicine, Boston, MA 02111 (United States); Gagnon, David [Institut de Recherches Cliniques de Montreal (IRCM), 110 Pine Avenue West, Montreal, Quebec, Canada H2W 1R7 (Canada); Department of Biochemistry and Molecular Medicine, Université de Montréal, Montréal, Quebec (Canada); Gjoerup, Ole [Molecular Oncology Research Institute, Tufts Medical Center, Boston, MA 02111 (United States); Archambault, Jacques [Institut de Recherches Cliniques de Montreal (IRCM), 110 Pine Avenue West, Montreal, Quebec, Canada H2W 1R7 (Canada); Department of Biochemistry and Molecular Medicine, Université de Montréal, Montréal, Quebec (Canada); Bullock, Peter A., E-mail: Peter.Bullock@tufts.edu [Department of Developmental, Molecular and Chemical Biology, Tufts University School of Medicine, Boston, MA 02111 (United States)

    2014-11-15

    Progressive Multifocal Leukoencephalopathy (PML) is caused by lytic replication of JC virus (JCV) in specific cells of the central nervous system. Like other polyomaviruses, JCV encodes a large T-antigen helicase needed for replication of the viral DNA. Here, we report the development of a luciferase-based, quantitative and high-throughput assay of JCV DNA replication in C33A cells, which, unlike the glial cell lines Hs 683 and U87, accumulate high levels of nuclear T-ag needed for robust replication. Using this assay, we investigated the requirement for different domains of T-ag, and for specific sequences within and flanking the viral origin, in JCV DNA replication. Beyond providing validation of the assay, these studies revealed an important stimulatory role of the transcription factor NF1 in JCV DNA replication. Finally, we show that the assay can be used for inhibitor testing, highlighting its value for the identification of antiviral drugs targeting JCV DNA replication. - Highlights: • Development of a high-throughput screening assay for JCV DNA replication using C33A cells. • Evidence that T-ag fails to accumulate in the nuclei of established glioma cell lines. • Evidence that NF-1 directly promotes JCV DNA replication in C33A cells. • Proof-of-concept that the HTS assay can be used to identify pharmacological inhibitor of JCV DNA replication.

  11. EZH2 and CD79B mutational status over time in B-cell non-Hodgkin lymphomas detected by high-throughput sequencing using minimal samples

    Science.gov (United States)

    Saieg, Mauro Ajaj; Geddie, William R; Boerner, Scott L; Bailey, Denis; Crump, Michael; da Cunha Santos, Gilda

    2013-01-01

    BACKGROUND: Numerous genomic abnormalities in B-cell non-Hodgkin lymphomas (NHLs) have been revealed by novel high-throughput technologies, including recurrent mutations in EZH2 (enhancer of zeste homolog 2) and CD79B (B cell antigen receptor complex-associated protein beta chain) genes. This study sought to determine the evolution of the mutational status of EZH2 and CD79B over time in different samples from the same patient in a cohort of B-cell NHLs, through use of a customized multiplex mutation assay. METHODS: DNA that was extracted from cytological material stored on FTA cards as well as from additional specimens, including archived frozen and formalin-fixed histological specimens, archived stained smears, and cytospin preparations, were submitted to a multiplex mutation assay specifically designed for the detection of point mutations involving EZH2 and CD79B, using MassARRAY spectrometry followed by Sanger sequencing. RESULTS: All 121 samples from 80 B-cell NHL cases were successfully analyzed. Mutations in EZH2 (Y646) and CD79B (Y196) were detected in 13.2% and 8% of the samples, respectively, almost exclusively in follicular lymphomas and diffuse large B-cell lymphomas. In one-third of the positive cases, a wild type was detected in a different sample from the same patient during follow-up. CONCLUSIONS: Testing multiple minimal tissue samples using a high-throughput multiplex platform exponentially increases tissue availability for molecular analysis and might facilitate future studies of tumor progression and the related molecular events. Mutational status of EZH2 and CD79B may vary in B-cell NHL samples over time and support the concept that individualized therapy should be based on molecular findings at the time of treatment, rather than on results obtained from previous specimens. Cancer (Cancer Cytopathol) 2013;121:377–386. © 2013 American Cancer Society. PMID:23361872

  12. Improved sensitivity of circulating tumor DNA measurement using short PCR amplicons

    DEFF Research Database (Denmark)

    Andersen, Rikke Fredslund; Spindler, Karen-Lise Garm; Brandslund, Ivan

    2015-01-01

    , however, presents a number of challenges that require attention. The amount of DNA is low and highly fragmented and analyses need to be optimized accordingly. KRAS ARMS-qPCR assays with amplicon lengths of 120 and 85 base pairs, respectively, were compared using positive control material (PCR fragments...

  13. Developing de novo human artificial chromosomes in embryonic stem cells using HSV-1 amplicon technology.

    Science.gov (United States)

    Moralli, Daniela; Monaco, Zoia L

    2015-02-01

    De novo artificial chromosomes expressing genes have been generated in human embryonic stem cells (hESc) and are maintained following differentiation into other cell types. Human artificial chromosomes (HAC) are small, functional, extrachromosomal elements, which behave as normal chromosomes in human cells. De novo HAC are generated following delivery of alpha satellite DNA into target cells. HAC are characterized by high levels of mitotic stability and are used as models to study centromere formation and chromosome organisation. They are successful and effective as gene expression vectors since they remain autonomous and can accommodate larger genes and regulatory regions for long-term expression studies in cells unlike other viral gene delivery vectors currently used. Transferring the essential DNA sequences for HAC formation intact across the cell membrane has been challenging for a number of years. A highly efficient delivery system based on HSV-1 amplicons has been used to target DNA directly to the ES cell nucleus and HAC stably generated in human embryonic stem cells (hESc) at high frequency. HAC were detected using an improved protocol for hESc chromosome harvesting, which consistently produced high-quality metaphase spreads that could routinely detect HAC in hESc. In tumour cells, the input DNA often integrated in the host chromosomes, but in the host ES genome, it remained intact. The hESc containing the HAC formed embryoid bodies, generated teratoma in mice, and differentiated into neuronal cells where the HAC were maintained. The HAC structure and chromatin composition was similar to the endogenous hESc chromosomes. This review will discuss the technological advances in HAC vector delivery using HSV-1 amplicons and the improvements in the identification of de novo HAC in hESc.

  14. High Throughput Analysis of Photocatalytic Water Purification

    NARCIS (Netherlands)

    Sobral Romao, J.I.; Baiao Barata, David; Habibovic, Pamela; Mul, Guido; Baltrusaitis, Jonas

    2014-01-01

    We present a novel high throughput photocatalyst efficiency assessment method based on 96-well microplates and UV-Vis spectroscopy. We demonstrate the reproducibility of the method using methyl orange (MO) decomposition, and compare kinetic data obtained with those provided in the literature for

  15. Accurate RNA consensus sequencing for high-fidelity detection of transcriptional mutagenesis-induced epimutations.

    Science.gov (United States)

    Reid-Bayliss, Kate S; Loeb, Lawrence A

    2017-08-29

    Transcriptional mutagenesis (TM) due to misincorporation during RNA transcription can result in mutant RNAs, or epimutations, that generate proteins with altered properties. TM has long been hypothesized to play a role in aging, cancer, and viral and bacterial evolution. However, inadequate methodologies have limited progress in elucidating a causal association. We present a high-throughput, highly accurate RNA sequencing method to measure epimutations with single-molecule sensitivity. Accurate RNA consensus sequencing (ARC-seq) uniquely combines RNA barcoding and generation of multiple cDNA copies per RNA molecule to eliminate errors introduced during cDNA synthesis, PCR, and sequencing. The stringency of ARC-seq can be scaled to accommodate the quality of input RNAs. We apply ARC-seq to directly assess transcriptome-wide epimutations resulting from RNA polymerase mutants and oxidative stress.

  16. High Throughput Synthesis and Screening for Agents Inhibiting Androgen Receptor Mediated Gene Transcription

    National Research Council Canada - National Science Library

    Boger, Dale L

    2005-01-01

    .... This entails the high throughput synthesis of DNA binding agents related to distamycin, their screening for binding to androgen response elements using a new high throughput DNA binding screen...

  17. High Throughput Synthesis and Screening for Agents Inhibiting Androgen Receptor Mediated Gene Transcription

    National Research Council Canada - National Science Library

    Boger, Dale

    2004-01-01

    .... This entails the high throughput synthesis of DNA binding agents related to distamycin, their screening for binding to androgen response elements using a new high throughput DNA binding screen...

  18. Genome Sequence of a Novel Archaeal Rudivirus Recovered from a Mexican Hot Spring

    DEFF Research Database (Denmark)

    Servín-Garcidueñas, L; Peng, X; Garrett, R

    2013-01-01

    We report the consensus genome sequence of a novel GC-rich rudivirus, designated SMR1 (Sulfolobales Mexican rudivirus 1), assembled from a high-throughput sequenced environmental sample from a hot spring in Los Azufres National Park in western Mexico.......We report the consensus genome sequence of a novel GC-rich rudivirus, designated SMR1 (Sulfolobales Mexican rudivirus 1), assembled from a high-throughput sequenced environmental sample from a hot spring in Los Azufres National Park in western Mexico....

  19. A Dual-Mode Large-Arrayed CMOS ISFET Sensor for Accurate and High-Throughput pH Sensing in Biomedical Diagnosis.

    Science.gov (United States)

    Huang, Xiwei; Yu, Hao; Liu, Xu; Jiang, Yu; Yan, Mei; Wu, Dongping

    2015-09-01

    The existing ISFET-based DNA sequencing detects hydrogen ions released during the polymerization of DNA strands on microbeads, which are scattered into microwell array above the ISFET sensor with unknown distribution. However, false pH detection happens at empty microwells due to crosstalk from neighboring microbeads. In this paper, a dual-mode CMOS ISFET sensor is proposed to have accurate pH detection toward DNA sequencing. Dual-mode sensing, optical and chemical modes, is realized by integrating a CMOS image sensor (CIS) with ISFET pH sensor, and is fabricated in a standard 0.18-μm CIS process. With accurate determination of microbead physical locations with CIS pixel by contact imaging, the dual-mode sensor can correlate local pH for one DNA slice at one location-determined microbead, which can result in improved pH detection accuracy. Moreover, toward a high-throughput DNA sequencing, a correlated-double-sampling readout that supports large array for both modes is deployed to reduce pixel-to-pixel nonuniformity such as threshold voltage mismatch. The proposed CMOS dual-mode sensor is experimentally examined to show a well correlated pH map and optical image for microbeads with a pH sensitivity of 26.2 mV/pH, a fixed pattern noise (FPN) reduction from 4% to 0.3%, and a readout speed of 1200 frames/s. A dual-mode CMOS ISFET sensor with suppressed FPN for accurate large-arrayed pH sensing is proposed and demonstrated with state-of-the-art measured results toward accurate and high-throughput DNA sequencing. The developed dual-mode CMOS ISFET sensor has great potential for future personal genome diagnostics with high accuracy and low cost.

  20. High throughput route selection in multi-rate wireless mesh networks

    Institute of Scientific and Technical Information of China (English)

    WEI Yi-fei; GUO Xiang-li; SONG Mei; SONG Jun-de

    2008-01-01

    Most existing Ad-hoc routing protocols use the shortest path algorithm with a hop count metric to select paths. It is appropriate in single-rate wireless networks, but has a tendency to select paths containing long-distance links that have low data rates and reduced reliability in multi-rate networks. This article introduces a high throughput routing algorithm utilizing the multi-rate capability and some mesh characteristics in wireless fidelity (WiFi) mesh networks. It uses the medium access control (MAC) transmission time as the routing metric, which is estimated by the information passed up from the physical layer. When the proposed algorithm is adopted, the Ad-hoc on-demand distance vector (AODV) routing can be improved as high throughput AODV (HT-AODV). Simulation results show that HT-AODV is capable of establishing a route that has high data-rate, short end-to-end delay and great network throughput.

  1. Alginate Immobilization of Metabolic Enzymes (AIME) for High-Throughput Screening Assays (SOT)

    Science.gov (United States)

    Alginate Immobilization of Metabolic Enzymes (AIME) for High-Throughput Screening Assays DE DeGroot, RS Thomas, and SO SimmonsNational Center for Computational Toxicology, US EPA, Research Triangle Park, NC USAThe EPA’s ToxCast program utilizes a wide variety of high-throughput s...

  2. High-throughput phenotyping and genomic selection: the frontiers of crop breeding converge.

    Science.gov (United States)

    Cabrera-Bosquet, Llorenç; Crossa, José; von Zitzewitz, Jarislav; Serret, María Dolors; Araus, José Luis

    2012-05-01

    Genomic selection (GS) and high-throughput phenotyping have recently been captivating the interest of the crop breeding community from both the public and private sectors world-wide. Both approaches promise to revolutionize the prediction of complex traits, including growth, yield and adaptation to stress. Whereas high-throughput phenotyping may help to improve understanding of crop physiology, most powerful techniques for high-throughput field phenotyping are empirical rather than analytical and comparable to genomic selection. Despite the fact that the two methodological approaches represent the extremes of what is understood as the breeding process (phenotype versus genome), they both consider the targeted traits (e.g. grain yield, growth, phenology, plant adaptation to stress) as a black box instead of dissecting them as a set of secondary traits (i.e. physiological) putatively related to the target trait. Both GS and high-throughput phenotyping have in common their empirical approach enabling breeders to use genome profile or phenotype without understanding the underlying biology. This short review discusses the main aspects of both approaches and focuses on the case of genomic selection of maize flowering traits and near-infrared spectroscopy (NIRS) and plant spectral reflectance as high-throughput field phenotyping methods for complex traits such as crop growth and yield. © 2012 Institute of Botany, Chinese Academy of Sciences.

  3. High-throughput single nucleotide polymorphism genotyping using nanofluidic Dynamic Arrays

    Directory of Open Access Journals (Sweden)

    Crenshaw Andrew

    2009-01-01

    Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs have emerged as the genetic marker of choice for mapping disease loci and candidate gene association studies, because of their high density and relatively even distribution in the human genomes. There is a need for systems allowing medium multiplexing (ten to hundreds of SNPs with high throughput, which can efficiently and cost-effectively generate genotypes for a very large sample set (thousands of individuals. Methods that are flexible, fast, accurate and cost-effective are urgently needed. This is also important for those who work on high throughput genotyping in non-model systems where off-the-shelf assays are not available and a flexible platform is needed. Results We demonstrate the use of a nanofluidic Integrated Fluidic Circuit (IFC - based genotyping system for medium-throughput multiplexing known as the Dynamic Array, by genotyping 994 individual human DNA samples on 47 different SNP assays, using nanoliter volumes of reagents. Call rates of greater than 99.5% and call accuracies of greater than 99.8% were achieved from our study, which demonstrates that this is a formidable genotyping platform. The experimental set up is very simple, with a time-to-result for each sample of about 3 hours. Conclusion Our results demonstrate that the Dynamic Array is an excellent genotyping system for medium-throughput multiplexing (30-300 SNPs, which is simple to use and combines rapid throughput with excellent call rates, high concordance and low cost. The exceptional call rates and call accuracy obtained may be of particular interest to those working on validation and replication of genome- wide- association (GWA studies.

  4. High throughput, multiplexed pathogen detection authenticates plague waves in medieval Venice, Italy.

    Science.gov (United States)

    Tran, Thi-Nguyen-Ny; Signoli, Michel; Fozzati, Luigi; Aboudharam, Gérard; Raoult, Didier; Drancourt, Michel

    2011-03-10

    Historical records suggest that multiple burial sites from the 14th-16th centuries in Venice, Italy, were used during the Black Death and subsequent plague epidemics. High throughput, multiplexed real-time PCR detected DNA of seven highly transmissible pathogens in 173 dental pulp specimens collected from 46 graves. Bartonella quintana DNA was identified in five (2.9%) samples, including three from the 16th century and two from the 15th century, and Yersinia pestis DNA was detected in three (1.7%) samples, including two from the 14th century and one from the 16th century. Partial glpD gene sequencing indicated that the detected Y. pestis was the Orientalis biotype. These data document for the first time successive plague epidemics in the medieval European city where quarantine was first instituted in the 14th century.

  5. HTTK: R Package for High-Throughput Toxicokinetics

    Science.gov (United States)

    Thousands of chemicals have been profiled by high-throughput screening programs such as ToxCast and Tox21; these chemicals are tested in part because most of them have limited or no data on hazard, exposure, or toxicokinetics. Toxicokinetic models aid in predicting tissue concent...

  6. A high throughput system for the preparation of single stranded templates grown in microculture.

    Science.gov (United States)

    Kolner, D E; Guilfoyle, R A; Smith, L M

    1994-01-01

    A high throughput system for the preparation of single stranded M13 sequencing templates is described. Supernatants from clones grown in 48-well plates are treated with a chaotropic agent to dissociate the phage coat protein. Using a semi-automated cell harvester, the free nucleic acid is bound to a glass fiber filter in the presence of chaotrope and then washed with ethanol by aspiration. Individual glass fiber discs are punched out on the cell harvester and dried briefly. The DNA samples are then eluted in water by centrifugation. The processing time from 96 microcultures to sequence quality templates is approximately 1 hr. Assuming the ability to sequence 400 bases per clone, a 0.5 megabase per day genome sequencing facility will require 6250 purified templates a week. Toward accomplishing this goal we have developed a procedure which is a modification of a method that uses a chaotropic agent and glass fiber filter (Kristensen et al., 1987). By exploiting the ability of a cell harvester to uniformly aspirate and wash 96 samples, a rapid system for high quality template preparation has been developed. Other semi-automated systems for template preparation have been developed using commercially available robotic workstations like the Biomek (Mardis and Roe, 1989). Although minimal human intervention is required, processing time is at least twice as long. Custom systems based on paramagnetic beads (Hawkins et al., 1992) produce DNA in insufficient quantity for direct sequencing and therefore require cycle sequencing. These systems require custom programing, have a fairly high initial cost and have not proven to be as fast as the method reported here.

  7. The pig gut microbial diversity: Understanding the pig gut microbial ecology through the next generation high throughput sequencing.

    Science.gov (United States)

    Kim, Hyeun Bum; Isaacson, Richard E

    2015-06-12

    The importance of the gut microbiota of animals is widely acknowledged because of its pivotal roles in the health and well being of animals. The genetic diversity of the gut microbiota contributes to the overall development and metabolic needs of the animal, and provides the host with many beneficial functions including production of volatile fatty acids, re-cycling of bile salts, production of vitamin K, cellulose digestion, and development of immune system. Thus the intestinal microbiota of animals has been the subject of study for many decades. Although most of the older studies have used culture dependent methods, the recent advent of high throughput sequencing of 16S rRNA genes has facilitated in depth studies exploring microbial populations and their dynamics in the animal gut. These culture independent DNA based studies generate large amounts of data and as a result contribute to a more detailed understanding of the microbiota dynamics in the gut and the ecology of the microbial populations. Of equal importance, is being able to identify and quantify microbes that are difficult to grow or that have not been grown in the laboratory. Interpreting the data obtained from this type of study requires using basic principles of microbial diversity to understand importance of the composition of microbial populations. In this review, we summarize the literature on culture independent studies of the pig gut microbiota with an emphasis on its succession and alterations caused by diverse factors. Copyright © 2015 Elsevier B.V. All rights reserved.

  8. A high-throughput, multi-channel photon-counting detector with picosecond timing

    CERN Document Server

    Lapington, J S; Miller, G M; Ashton, T J R; Jarron, P; Despeisse, M; Powolny, F; Howorth, J; Milnes, J

    2009-01-01

    High-throughput photon counting with high time resolution is a niche application area where vacuum tubes can still outperform solid-state devices. Applications in the life sciences utilizing time-resolved spectroscopies, particularly in the growing field of proteomics, will benefit greatly from performance enhancements in event timing and detector throughput. The HiContent project is a collaboration between the University of Leicester Space Research Centre, the Microelectronics Group at CERN, Photek Ltd., and end-users at the Gray Cancer Institute and the University of Manchester. The goal is to develop a detector system specifically designed for optical proteomics, capable of high content (multi-parametric) analysis at high throughput. The HiContent detector system is being developed to exploit this niche market. It combines multi-channel, high time resolution photon counting in a single miniaturized detector system with integrated electronics. The combination of enabling technologies; small pore microchanne...

  9. Alignment of time-resolved data from high throughput experiments.

    Science.gov (United States)

    Abidi, Nada; Franke, Raimo; Findeisen, Peter; Klawonn, Frank

    2016-12-01

    To better understand the dynamics of the underlying processes in cells, it is necessary to take measurements over a time course. Modern high-throughput technologies are often used for this purpose to measure the behavior of cell products like metabolites, peptides, proteins, [Formula: see text]RNA or mRNA at different points in time. Compared to classical time series, the number of time points is usually very limited and the measurements are taken at irregular time intervals. The main reasons for this are the costs of the experiments and the fact that the dynamic behavior usually shows a strong reaction and fast changes shortly after a stimulus and then slowly converges to a certain stable state. Another reason might simply be missing values. It is common to repeat the experiments and to have replicates in order to carry out a more reliable analysis. The ideal assumptions that the initial stimulus really started exactly at the same time for all replicates and that the replicates are perfectly synchronized are seldom satisfied. Therefore, there is a need to first adjust or align the time-resolved data before further analysis is carried out. Dynamic time warping (DTW) is considered as one of the common alignment techniques for time series data with equidistant time points. In this paper, we modified the DTW algorithm so that it can align sequences with measurements at different, non-equidistant time points with large gaps in between. This type of data is usually known as time-resolved data characterized by irregular time intervals between measurements as well as non-identical time points for different replicates. This new algorithm can be easily used to align time-resolved data from high-throughput experiments and to come across existing problems such as time scarcity and existing noise in the measurements. We propose a modified method of DTW to adapt requirements imposed by time-resolved data by use of monotone cubic interpolation splines. Our presented approach

  10. Space Link Extension Protocol Emulation for High-Throughput, High-Latency Network Connections

    Science.gov (United States)

    Tchorowski, Nicole; Murawski, Robert

    2014-01-01

    New space missions require higher data rates and new protocols to meet these requirements. These high data rate space communication links push the limitations of not only the space communication links, but of the ground communication networks and protocols which forward user data to remote ground stations (GS) for transmission. The Consultative Committee for Space Data Systems, (CCSDS) Space Link Extension (SLE) standard protocol is one protocol that has been proposed for use by the NASA Space Network (SN) Ground Segment Sustainment (SGSS) program. New protocol implementations must be carefully tested to ensure that they provide the required functionality, especially because of the remote nature of spacecraft. The SLE protocol standard has been tested in the NASA Glenn Research Center's SCENIC Emulation Lab in order to observe its operation under realistic network delay conditions. More specifically, the delay between then NASA Integrated Services Network (NISN) and spacecraft has been emulated. The round trip time (RTT) delay for the continental NISN network has been shown to be up to 120ms; as such the SLE protocol was tested with network delays ranging from 0ms to 200ms. Both a base network condition and an SLE connection were tested with these RTT delays, and the reaction of both network tests to the delay conditions were recorded. Throughput for both of these links was set at 1.2Gbps. The results will show that, in the presence of realistic network delay, the SLE link throughput is significantly reduced while the base network throughput however remained at the 1.2Gbps specification. The decrease in SLE throughput has been attributed to the implementation's use of blocking calls. The decrease in throughput is not acceptable for high data rate links, as the link requires constant data a flow in order for spacecraft and ground radios to stay synchronized, unless significant data is queued a the ground station. In cases where queuing the data is not an option

  11. Capturing the Alternative Cleavage and Polyadenylation Sites of 14 NAC Genes in Populus Using a Combination of 3′-RACE and High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Haoran Wang

    2018-03-01

    Full Text Available Detection of complex splice sites (SSs and polyadenylation sites (PASs of eukaryotic genes is essential for the elucidation of gene regulatory mechanisms. Transcriptome-wide studies using high-throughput sequencing (HTS have revealed prevalent alternative splicing (AS and alternative polyadenylation (APA in plants. However, small-scale and high-depth HTS aimed at detecting genes or gene families are very few and limited. We explored a convenient and flexible method for profiling SSs and PASs, which combines rapid amplification of 3′-cDNA ends (3′-RACE and HTS. Fourteen NAC (NAM, ATAF1/2, CUC2 transcription factor genes of Populus trichocarpa were analyzed by 3′-RACE-seq. Based on experimental reproducibility, boundary sequence analysis and reverse transcription PCR (RT-PCR verification, only canonical SSs were considered to be authentic. Based on stringent criteria, candidate PASs without any internal priming features were chosen as authentic PASs and assumed to be PAS-rich markers. Thirty-four novel canonical SSs, six intronic/internal exons and thirty 3′-UTR PAS-rich markers were revealed by 3′-RACE-seq. Using 3′-RACE and real-time PCR, we confirmed that three APA transcripts ending in/around PAS-rich markers were differentially regulated in response to plant hormones. Our results indicate that 3′-RACE-seq is a robust and cost-effective method to discover SSs and label active regions subjected to APA for genes or gene families. The method is suitable for small-scale AS and APA research in the initial stage.

  12. High-Throughput Identification of Antimicrobial Peptides from Amphibious Mudskippers

    Directory of Open Access Journals (Sweden)

    Yunhai Yi

    2017-11-01

    Full Text Available Widespread existence of antimicrobial peptides (AMPs has been reported in various animals with comprehensive biological activities, which is consistent with the important roles of AMPs as the first line of host defense system. However, no big-data-based analysis on AMPs from any fish species is available. In this study, we identified 507 AMP transcripts on the basis of our previously reported genomes and transcriptomes of two representative amphibious mudskippers, Boleophthalmus pectinirostris (BP and Periophthalmus magnuspinnatus (PM. The former is predominantly aquatic with less time out of water, while the latter is primarily terrestrial with extended periods of time on land. Within these identified AMPs, 449 sequences are novel; 15 were reported in BP previously; 48 are identically overlapped between BP and PM; 94 were validated by mass spectrometry. Moreover, most AMPs presented differential tissue transcription patterns in the two mudskippers. Interestingly, we discovered two AMPs, hemoglobin β1 and amylin, with high inhibitions on Micrococcus luteus. In conclusion, our high-throughput screening strategy based on genomic and transcriptomic data opens an efficient pathway to discover new antimicrobial peptides for ongoing development of marine drugs.

  13. High-Throughput Identification of Antimicrobial Peptides from Amphibious Mudskippers.

    Science.gov (United States)

    Yi, Yunhai; You, Xinxin; Bian, Chao; Chen, Shixi; Lv, Zhao; Qiu, Limei; Shi, Qiong

    2017-11-22

    Widespread existence of antimicrobial peptides (AMPs) has been reported in various animals with comprehensive biological activities, which is consistent with the important roles of AMPs as the first line of host defense system. However, no big-data-based analysis on AMPs from any fish species is available. In this study, we identified 507 AMP transcripts on the basis of our previously reported genomes and transcriptomes of two representative amphibious mudskippers, Boleophthalmus pectinirostris (BP) and Periophthalmus magnuspinnatus (PM). The former is predominantly aquatic with less time out of water, while the latter is primarily terrestrial with extended periods of time on land. Within these identified AMPs, 449 sequences are novel; 15 were reported in BP previously; 48 are identically overlapped between BP and PM; 94 were validated by mass spectrometry. Moreover, most AMPs presented differential tissue transcription patterns in the two mudskippers. Interestingly, we discovered two AMPs, hemoglobin β1 and amylin, with high inhibitions on Micrococcus luteus . In conclusion, our high-throughput screening strategy based on genomic and transcriptomic data opens an efficient pathway to discover new antimicrobial peptides for ongoing development of marine drugs.

  14. High-throughput sequencing reveals microbial communities in drinking water treatment sludge from six geographically distributed plants, including potentially toxic cyanobacteria and pathogens.

    Science.gov (United States)

    Xu, Hangzhou; Pei, Haiyan; Jin, Yan; Ma, Chunxia; Wang, Yuting; Sun, Jiongming; Li, Hongmin

    2018-04-10

    The microbial community structures of drinking water treatment sludge (DWTS) generated for raw water (RW) from different locations and with different source types - including river water, lake water and reservoir water -were investigated using high-throughput sequencing. Because the unit operations in the six DWTPs were similar, community composition in fresh sludge may be determined by microbial community in the corresponding RW. Although Proteobacteria, Cyanobacteria, Bacteroidetes, Firmicutes, Verrucomicrobia, and Planctomycetes were the dominant phyla among the six DWTS samples, no single phylum exhibited similar abundance across all the samples, owing to differences in total phosphorus, chemical oxygen demand, Al, Fe, and chloride in RW. Three genera of potentially toxic cyanobacteria (Planktothrix, Microcystis and Cylindrospermopsis), and four potential pathogens (Escherichia coli, Bacteroides ovatus, Prevotella copri and Rickettsia) were found in sludge samples. Because proliferation of potentially toxic cyanobacteria and Rickettsia in RW was mainly affected by nutrients, while growth of Escherichia coli, Bacteroides ovatus and Prevotella copri in RW may be influenced by Fe, control of nutrients and Fe in RW is essential to decrease toxic cyanobacteria and pathogens in DWTS. Copyright © 2018 Elsevier B.V. All rights reserved.

  15. TIMPs of parasitic helminths - a large-scale analysis of high-throughput sequence datasets.

    Science.gov (United States)

    Cantacessi, Cinzia; Hofmann, Andreas; Pickering, Darren; Navarro, Severine; Mitreva, Makedonka; Loukas, Alex

    2013-05-30

    Tissue inhibitors of metalloproteases (TIMPs) are a multifunctional family of proteins that orchestrate extracellular matrix turnover, tissue remodelling and other cellular processes. In parasitic helminths, such as hookworms, TIMPs have been proposed to play key roles in the host-parasite interplay, including invasion of and establishment in the vertebrate animal hosts. Currently, knowledge of helminth TIMPs is limited to a small number of studies on canine hookworms, whereas no information is available on the occurrence of TIMPs in other parasitic helminths causing neglected diseases. In the present study, we conducted a large-scale investigation of TIMP proteins of a range of neglected human parasites including the hookworm Necator americanus, the roundworm Ascaris suum, the liver flukes Clonorchis sinensis and Opisthorchis viverrini, as well as the schistosome blood flukes. This entailed mining available transcriptomic and/or genomic sequence datasets for the presence of homologues of known TIMPs, predicting secondary structures of defined protein sequences, systematic phylogenetic analyses and assessment of differential expression of genes encoding putative TIMPs in the developmental stages of A. suum, N. americanus and Schistosoma haematobium which infect the mammalian hosts. A total of 15 protein sequences with high homology to known eukaryotic TIMPs were predicted from the complement of sequence data available for parasitic helminths and subjected to in-depth bioinformatic analyses. Supported by the availability of gene manipulation technologies such as RNA interference and/or transgenesis, this work provides a basis for future functional explorations of helminth TIMPs and, in particular, of their role/s in fundamental biological pathways linked to long-term establishment in the vertebrate hosts, with a view towards the development of novel approaches for the control of neglected helminthiases.

  16. High-throughput screening to identify inhibitors of lysine demethylases.

    Science.gov (United States)

    Gale, Molly; Yan, Qin

    2015-01-01

    Lysine demethylases (KDMs) are epigenetic regulators whose dysfunction is implicated in the pathology of many human diseases including various types of cancer, inflammation and X-linked intellectual disability. Particular demethylases have been identified as promising therapeutic targets, and tremendous efforts are being devoted toward developing suitable small-molecule inhibitors for clinical and research use. Several High-throughput screening strategies have been developed to screen for small-molecule inhibitors of KDMs, each with advantages and disadvantages in terms of time, cost, effort, reliability and sensitivity. In this Special Report, we review and evaluate the High-throughput screening methods utilized for discovery of novel small-molecule KDM inhibitors.

  17. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis

    Directory of Open Access Journals (Sweden)

    Yushen Du

    2016-11-01

    Full Text Available Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp, we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available.

  18. High throughput production of mouse monoclonal antibodies using antigen microarrays

    DEFF Research Database (Denmark)

    De Masi, Federico; Chiarella, P.; Wilhelm, H.

    2005-01-01

    Recent advances in proteomics research underscore the increasing need for high-affinity monoclonal antibodies, which are still generated with lengthy, low-throughput antibody production techniques. Here we present a semi-automated, high-throughput method of hybridoma generation and identification....... Monoclonal antibodies were raised to different targets in single batch runs of 6-10 wk using multiplexed immunisations, automated fusion and cell-culture, and a novel antigen-coated microarray-screening assay. In a large-scale experiment, where eight mice were immunized with ten antigens each, we generated...

  19. High throughput electrospinning of high-quality nanofibers via an aluminum disk spinneret

    Science.gov (United States)

    Zheng, Guokuo

    In this work, a simple and efficient needleless high throughput electrospinning process using an aluminum disk spinneret with 24 holes is described. Electrospun mats produced by this setup consisted of fine fibers (nano-sized) of the highest quality while the productivity (yield) was many times that obtained from conventional single-needle electrospinning. The goal was to produce scaled-up amounts of the same or better quality nanofibers under variable concentration, voltage, and the working distance than those produced with the single needle lab setting. The fiber mats produced were either polymer or ceramic (such as molybdenum trioxide nanofibers). Through experimentation the optimum process conditions were defined to be: 24 kilovolt, a distance to collector of 15cm. More diluted solutions resulted in smaller diameter fibers. Comparing the morphologies of the nanofibers of MoO3 produced by both the traditional and the high throughput set up it was found that they were very similar. Moreover, the nanofibers production rate is nearly 10 times than that of traditional needle electrospinning. Thus, the high throughput process has the potential to become an industrial nanomanufacturing process and the materials processed by it may be used as filtration devices, in tissue engineering, and as sensors.

  20. Laterally orienting C. elegans using geometry at microscale for high-throughput visual screens in neurodegeneration and neuronal development studies.

    Directory of Open Access Journals (Sweden)

    Ivan de Carlos Cáceres

    Full Text Available C. elegans is an excellent model system for studying neuroscience using genetics because of its relatively simple nervous system, sequenced genome, and the availability of a large number of transgenic and mutant strains. Recently, microfluidic devices have been used for high-throughput genetic screens, replacing traditional methods of manually handling C. elegans. However, the orientation of nematodes within microfluidic devices is random and often not conducive to inspection, hindering visual analysis and overall throughput. In addition, while previous studies have utilized methods to bias head and tail orientation, none of the existing techniques allow for orientation along the dorso-ventral body axis. Here, we present the design of a simple and robust method for passively orienting worms into lateral body positions in microfluidic devices to facilitate inspection of morphological features with specific dorso-ventral alignments. Using this technique, we can position animals into lateral orientations with up to 84% efficiency, compared to 21% using existing methods. We isolated six mutants with neuronal development or neurodegenerative defects, showing that our technology can be used for on-chip analysis and high-throughput visual screens.

  1. Solion ion source for high-efficiency, high-throughput solar cell manufacturing

    Energy Technology Data Exchange (ETDEWEB)

    Koo, John, E-mail: john-koo@amat.com; Binns, Brant; Miller, Timothy; Krause, Stephen; Skinner, Wesley; Mullin, James [Applied Materials, Inc., Varian Semiconductor Equipment Business Unit, 35 Dory Road, Gloucester, Massachusetts 01930 (United States)

    2014-02-15

    In this paper, we introduce the Solion ion source for high-throughput solar cell doping. As the source power is increased to enable higher throughput, negative effects degrade the lifetime of the plasma chamber and the extraction electrodes. In order to improve efficiency, we have explored a wide range of electron energies and determined the conditions which best suit production. To extend the lifetime of the source we have developed an in situ cleaning method using only existing hardware. With these combinations, source life-times of >200 h for phosphorous and >100 h for boron ion beams have been achieved while maintaining 1100 cell-per-hour production.

  2. High throughput integrated thermal characterization with non-contact optical calorimetry

    Science.gov (United States)

    Hou, Sichao; Huo, Ruiqing; Su, Ming

    2017-10-01

    Commonly used thermal analysis tools such as calorimeter and thermal conductivity meter are separated instruments and limited by low throughput, where only one sample is examined each time. This work reports an infrared based optical calorimetry with its theoretical foundation, which is able to provide an integrated solution to characterize thermal properties of materials with high throughput. By taking time domain temperature information of spatially distributed samples, this method allows a single device (infrared camera) to determine the thermal properties of both phase change systems (melting temperature and latent heat of fusion) and non-phase change systems (thermal conductivity and heat capacity). This method further allows these thermal properties of multiple samples to be determined rapidly, remotely, and simultaneously. In this proof-of-concept experiment, the thermal properties of a panel of 16 samples including melting temperatures, latent heats of fusion, heat capacities, and thermal conductivities have been determined in 2 min with high accuracy. Given the high thermal, spatial, and temporal resolutions of the advanced infrared camera, this method has the potential to revolutionize the thermal characterization of materials by providing an integrated solution with high throughput, high sensitivity, and short analysis time.

  3. Opera: reconstructing optimal genomic scaffolds with high-throughput paired-end sequences.

    Science.gov (United States)

    Gao, Song; Sung, Wing-Kin; Nagarajan, Niranjan

    2011-11-01

    Scaffolding, the problem of ordering and orienting contigs, typically using paired-end reads, is a crucial step in the assembly of high-quality draft genomes. Even as sequencing technologies and mate-pair protocols have improved significantly, scaffolding programs still rely on heuristics, with no guarantees on the quality of the solution. In this work, we explored the feasibility of an exact solution for scaffolding and present a first tractable solution for this problem (Opera). We also describe a graph contraction procedure that allows the solution to scale to large scaffolding problems and demonstrate this by scaffolding several large real and synthetic datasets. In comparisons with existing scaffolders, Opera simultaneously produced longer and more accurate scaffolds demonstrating the utility of an exact approach. Opera also incorporates an exact quadratic programming formulation to precisely compute gap sizes (Availability: http://sourceforge.net/projects/operasf/ ).

  4. High-Throughput Sequencing Based Methods of RNA Structure Investigation

    DEFF Research Database (Denmark)

    Kielpinski, Lukasz Jan

    In this thesis we describe the development of four related methods for RNA structure probing that utilize massive parallel sequencing. Using them, we were able to gather structural data for multiple, long molecules simultaneously. First, we have established an easy to follow experimental...... and computational protocol for detecting the reverse transcription termination sites (RTTS-Seq). This protocol was subsequently applied to hydroxyl radical footprinting of three dimensional RNA structures to give a probing signal that correlates well with the RNA backbone solvent accessibility. Moreover, we applied...

  5. Comprehensive processing of high-throughput small RNA sequencing data including quality checking, normalization, and differential expression analysis using the UEA sRNA Workbench.

    Science.gov (United States)

    Beckers, Matthew; Mohorianu, Irina; Stocks, Matthew; Applegate, Christopher; Dalmay, Tamas; Moulton, Vincent

    2017-06-01

    Recently, high-throughput sequencing (HTS) has revealed compelling details about the small RNA (sRNA) population in eukaryotes. These 20 to 25 nt noncoding RNAs can influence gene expression by acting as guides for the sequence-specific regulatory mechanism known as RNA silencing. The increase in sequencing depth and number of samples per project enables a better understanding of the role sRNAs play by facilitating the study of expression patterns. However, the intricacy of the biological hypotheses coupled with a lack of appropriate tools often leads to inadequate mining of the available data and thus, an incomplete description of the biological mechanisms involved. To enable a comprehensive study of differential expression in sRNA data sets, we present a new interactive pipeline that guides researchers through the various stages of data preprocessing and analysis. This includes various tools, some of which we specifically developed for sRNA analysis, for quality checking and normalization of sRNA samples as well as tools for the detection of differentially expressed sRNAs and identification of the resulting expression patterns. The pipeline is available within the UEA sRNA Workbench, a user-friendly software package for the processing of sRNA data sets. We demonstrate the use of the pipeline on a H. sapiens data set; additional examples on a B. terrestris data set and on an A. thaliana data set are described in the Supplemental Information A comparison with existing approaches is also included, which exemplifies some of the issues that need to be addressed for sRNA analysis and how the new pipeline may be used to do this. © 2017 Beckers et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  6. High Throughput, Multiplexed Pathogen Detection Authenticates Plague Waves in Medieval Venice, Italy

    Science.gov (United States)

    Tran, Thi-Nguyen-Ny; Signoli, Michel; Fozzati, Luigi; Aboudharam, Gérard; Raoult, Didier; Drancourt, Michel

    2011-01-01

    Background Historical records suggest that multiple burial sites from the 14th–16th centuries in Venice, Italy, were used during the Black Death and subsequent plague epidemics. Methodology/Principal Findings High throughput, multiplexed real-time PCR detected DNA of seven highly transmissible pathogens in 173 dental pulp specimens collected from 46 graves. Bartonella quintana DNA was identified in five (2.9%) samples, including three from the 16th century and two from the 15th century, and Yersinia pestis DNA was detected in three (1.7%) samples, including two from the 14th century and one from the 16th century. Partial glpD gene sequencing indicated that the detected Y. pestis was the Orientalis biotype. Conclusions These data document for the first time successive plague epidemics in the medieval European city where quarantine was first instituted in the 14th century. PMID:21423736

  7. CoLIde: A bioinformatics tool for CO-expression based small RNA Loci Identification using high-throughput sequencing data

    OpenAIRE

    Mohorianu, Irina; Stocks, Matthew Benedict; Wood, John; Dalmay, Tamas; Moulton, Vincent

    2013-01-01

    Small RNAs (sRNAs) are 20–25 nt non-coding RNAs that act as guides for the highly sequence-specific regulatory mechanism known as RNA silencing. Due to the recent increase in sequencing depth, a highly complex and diverse population of sRNAs in both plants and animals has been revealed. However, the exponential increase in sequencing data has also made the identification of individual sRNA transcripts corresponding to biological units (sRNA loci) more challenging when based exclusively on the...

  8. Fully Automated Electro Membrane Extraction Autosampler for LC-MS Systems Allowing Soft Extractions for High-Throughput Applications

    DEFF Research Database (Denmark)

    Fuchs, David; Pedersen-Bjergaard, Stig; Jensen, Henrik

    2016-01-01

    was optimized for soft extraction of analytes and high sample throughput. Further, it was demonstrated that by flushing the EME-syringe with acidic wash buffer and reverting the applied electric potential, carry-over between samples can be reduced to below 1%. Performance of the system was characterized (RSD......, a complete analytical workflow of purification, separation, and analysis of sample could be achieved within only 5.5 min. With the developed system large sequences of samples could be analyzed in a completely automated manner. This high degree of automation makes the developed EME-autosampler a powerful tool...

  9. Improving Hierarchical Models Using Historical Data with Applications in High-Throughput Genomics Data Analysis.

    Science.gov (United States)

    Li, Ben; Li, Yunxiao; Qin, Zhaohui S

    2017-06-01

    Modern high-throughput biotechnologies such as microarray and next generation sequencing produce a massive amount of information for each sample assayed. However, in a typical high-throughput experiment, only limited amount of data are observed for each individual feature, thus the classical 'large p , small n ' problem. Bayesian hierarchical model, capable of borrowing strength across features within the same dataset, has been recognized as an effective tool in analyzing such data. However, the shrinkage effect, the most prominent feature of hierarchical features, can lead to undesirable over-correction for some features. In this work, we discuss possible causes of the over-correction problem and propose several alternative solutions. Our strategy is rooted in the fact that in the Big Data era, large amount of historical data are available which should be taken advantage of. Our strategy presents a new framework to enhance the Bayesian hierarchical model. Through simulation and real data analysis, we demonstrated superior performance of the proposed strategy. Our new strategy also enables borrowing information across different platforms which could be extremely useful with emergence of new technologies and accumulation of data from different platforms in the Big Data era. Our method has been implemented in R package "adaptiveHM", which is freely available from https://github.com/benliemory/adaptiveHM.

  10. MIPHENO: Data normalization for high throughput metabolic analysis.

    Science.gov (United States)

    High throughput methodologies such as microarrays, mass spectrometry and plate-based small molecule screens are increasingly used to facilitate discoveries from gene function to drug candidate identification. These large-scale experiments are typically carried out over the course...

  11. Cancer panomics: computational methods and infrastructure for integrative analysis of cancer high-throughput "omics" data

    DEFF Research Database (Denmark)

    Brunak, Søren; De La Vega, Francisco M.; Rätsch, Gunnar

    2014-01-01

    Targeted cancer treatment is becoming the goal of newly developed oncology medicines and has already shown promise in some spectacular cases such as the case of BRAF kinase inhibitors in BRAF-mutant (e.g. V600E) melanoma. These developments are driven by the advent of high-throughput sequencing......, which continues to drop in cost, and that has enabled the sequencing of the genome, transcriptome, and epigenome of the tumors of a large number of cancer patients in order to discover the molecular aberrations that drive the oncogenesis of several types of cancer. Applying these technologies...... in the clinic promises to transform cancer treatment by identifying therapeutic vulnerabilities of each patient's tumor. These approaches will need to address the panomics of cancer--the integration of the complex combination of patient-specific characteristics that drive the development of each person's tumor...

  12. New developments of RNAi in Paracoccidioides brasiliensis: prospects for high-throughput, genome-wide, functional genomics.

    Directory of Open Access Journals (Sweden)

    Tercio Goes

    2014-10-01

    Full Text Available The Fungal Genome Initiative of the Broad Institute, in partnership with the Paracoccidioides research community, has recently sequenced the genome of representative isolates of this human-pathogen dimorphic fungus: Pb18 (S1, Pb03 (PS2 and Pb01. The accomplishment of future high-throughput, genome-wide, functional genomics will rely upon appropriate molecular tools and straightforward techniques to streamline the generation of stable loss-of-function phenotypes. In the past decades, RNAi has emerged as the most robust genetic technique to modulate or to suppress gene expression in diverse eukaryotes, including fungi. These molecular tools and techniques, adapted for RNAi, were up until now unavailable for P. brasiliensis.In this paper, we report Agrobacterium tumefaciens mediated transformation of yeast cells for high-throughput applications with which higher transformation frequencies of 150±24 yeast cell transformants per 1×106 viable yeast cells were obtained. Our approach is based on a bifunctional selective marker fusion protein consisted of the Streptoalloteichus hindustanus bleomycin-resistance gene (Shble and the intrinsically fluorescent monomeric protein mCherry which was codon-optimized for heterologous expression in P. brasiliensis. We also report successful GP43 gene knock-down through the expression of intron-containing hairpin RNA (ihpRNA from a Gateway-adapted cassette (cALf which was purpose-built for gene silencing in a high-throughput manner. Gp43 transcript levels were reduced by 73.1±22.9% with this approach.We have a firm conviction that the genetic transformation technique and the molecular tools herein described will have a relevant contribution in future Paracoccidioides spp. functional genomics research.

  13. High-Throughput Thermodynamic Modeling and Uncertainty Quantification for ICME

    Science.gov (United States)

    Otis, Richard A.; Liu, Zi-Kui

    2017-05-01

    One foundational component of the integrated computational materials engineering (ICME) and Materials Genome Initiative is the computational thermodynamics based on the calculation of phase diagrams (CALPHAD) method. The CALPHAD method pioneered by Kaufman has enabled the development of thermodynamic, atomic mobility, and molar volume databases of individual phases in the full space of temperature, composition, and sometimes pressure for technologically important multicomponent engineering materials, along with sophisticated computational tools for using the databases. In this article, our recent efforts will be presented in terms of developing new computational tools for high-throughput modeling and uncertainty quantification based on high-throughput, first-principles calculations and the CALPHAD method along with their potential propagations to downstream ICME modeling and simulations.

  14. Controlling high-throughput manufacturing at the nano-scale

    Science.gov (United States)

    Cooper, Khershed P.

    2013-09-01

    Interest in nano-scale manufacturing research and development is growing. The reason is to accelerate the translation of discoveries and inventions of nanoscience and nanotechnology into products that would benefit industry, economy and society. Ongoing research in nanomanufacturing is focused primarily on developing novel nanofabrication techniques for a variety of applications—materials, energy, electronics, photonics, biomedical, etc. Our goal is to foster the development of high-throughput methods of fabricating nano-enabled products. Large-area parallel processing and highspeed continuous processing are high-throughput means for mass production. An example of large-area processing is step-and-repeat nanoimprinting, by which nanostructures are reproduced again and again over a large area, such as a 12 in wafer. Roll-to-roll processing is an example of continuous processing, by which it is possible to print and imprint multi-level nanostructures and nanodevices on a moving flexible substrate. The big pay-off is high-volume production and low unit cost. However, the anticipated cost benefits can only be realized if the increased production rate is accompanied by high yields of high quality products. To ensure product quality, we need to design and construct manufacturing systems such that the processes can be closely monitored and controlled. One approach is to bring cyber-physical systems (CPS) concepts to nanomanufacturing. CPS involves the control of a physical system such as manufacturing through modeling, computation, communication and control. Such a closely coupled system will involve in-situ metrology and closed-loop control of the physical processes guided by physics-based models and driven by appropriate instrumentation, sensing and actuation. This paper will discuss these ideas in the context of controlling high-throughput manufacturing at the nano-scale.

  15. A family of E. coli expression vectors for laboratory scale and high throughput soluble protein production

    Directory of Open Access Journals (Sweden)

    Bottomley Stephen P

    2006-03-01

    Full Text Available Abstract Background In the past few years, both automated and manual high-throughput protein expression and purification has become an accessible means to rapidly screen and produce soluble proteins for structural and functional studies. However, many of the commercial vectors encoding different solubility tags require different cloning and purification steps for each vector, considerably slowing down expression screening. We have developed a set of E. coli expression vectors with different solubility tags that allow for parallel cloning from a single PCR product and can be purified using the same protocol. Results The set of E. coli expression vectors, encode for either a hexa-histidine tag or the three most commonly used solubility tags (GST, MBP, NusA and all with an N-terminal hexa-histidine sequence. The result is two-fold: the His-tag facilitates purification by immobilised metal affinity chromatography, whilst the fusion domains act primarily as solubility aids during expression, in addition to providing an optional purification step. We have also incorporated a TEV recognition sequence following the solubility tag domain, which allows for highly specific cleavage (using TEV protease of the fusion protein to yield native protein. These vectors are also designed for ligation-independent cloning and they possess a high-level expressing T7 promoter, which is suitable for auto-induction. To validate our vector system, we have cloned four different genes and also one gene into all four vectors and used small-scale expression and purification techniques. We demonstrate that the vectors are capable of high levels of expression and that efficient screening of new proteins can be readily achieved at the laboratory level. Conclusion The result is a set of four rationally designed vectors, which can be used for streamlined cloning, expression and purification of target proteins in the laboratory and have the potential for being adaptable to a high-throughput

  16. High-Throughput Analysis and Automation for Glycomics Studies

    NARCIS (Netherlands)

    Shubhakar, A.; Reiding, K.R.; Gardner, R.A.; Spencer, D.I.R.; Fernandes, D.L.; Wuhrer, M.

    2015-01-01

    This review covers advances in analytical technologies for high-throughput (HTP) glycomics. Our focus is on structural studies of glycoprotein glycosylation to support biopharmaceutical realization and the discovery of glycan biomarkers for human disease. For biopharmaceuticals, there is increasing

  17. High-Throughput Cloning and Expression Library Creation for Functional Proteomics

    Science.gov (United States)

    Festa, Fernanda; Steel, Jason; Bian, Xiaofang; Labaer, Joshua

    2013-01-01

    The study of protein function usually requires the use of a cloned version of the gene for protein expression and functional assays. This strategy is particular important when the information available regarding function is limited. The functional characterization of the thousands of newly identified proteins revealed by genomics requires faster methods than traditional single gene experiments, creating the need for fast, flexible and reliable cloning systems. These collections of open reading frame (ORF) clones can be coupled with high-throughput proteomics platforms, such as protein microarrays and cell-based assays, to answer biological questions. In this tutorial we provide the background for DNA cloning, discuss the major high-throughput cloning systems (Gateway® Technology, Flexi® Vector Systems, and Creator™ DNA Cloning System) and compare them side-by-side. We also report an example of high-throughput cloning study and its application in functional proteomics. This Tutorial is part of the International Proteomics Tutorial Programme (IPTP12). Details can be found at http://www.proteomicstutorials.org. PMID:23457047

  18. High-throughput bioinformatics with the Cyrille2 pipeline system

    Directory of Open Access Journals (Sweden)

    de Groot Joost CW

    2008-02-01

    Full Text Available Abstract Background Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses are often interdependent and chained together to form complex workflows or pipelines. Given the volume of the data used and the multitude of computational resources available, specialized pipeline software is required to make high-throughput analysis of large-scale omics datasets feasible. Results We have developed a generic pipeline system called Cyrille2. The system is modular in design and consists of three functionally distinct parts: 1 a web based, graphical user interface (GUI that enables a pipeline operator to manage the system; 2 the Scheduler, which forms the functional core of the system and which tracks what data enters the system and determines what jobs must be scheduled for execution, and; 3 the Executor, which searches for scheduled jobs and executes these on a compute cluster. Conclusion The Cyrille2 system is an extensible, modular system, implementing the stated requirements. Cyrille2 enables easy creation and execution of high throughput, flexible bioinformatics pipelines.

  19. RNA-Based Amplicon Sequencing Reveals Microbiota Development during Ripening of Artisanal versus Industrial Lard d'Arnad.

    Science.gov (United States)

    Ferrocino, Ilario; Bellio, Alberto; Romano, Angelo; Macori, Guerrino; Rantsiou, Kalliopi; Decastelli, Lucia; Cocolin, Luca

    2017-08-15

    Valle d'Aosta Lard d'Arnad is a protected designation of origin (PDO) product produced from fat of the shoulder and back of heavy pigs. Its manufacturing process can be very diverse, especially regarding the maturation temperature and the NaCl concentration used for the brine; thereby, the main goal of this study was to investigate the impact of those parameters on the microbiota developed during curing and ripening. Three farms producing Lard d'Arnad were selected. Two plants, reflecting the industrial process characterized either by low maturation temperature (plant A [10% NaCl, 2°C]) or by using a low NaCl concentration (plant B [2.5% NaCl, 4°C]), were selected, while the third was characterized by an artisanal process (plant C [30% NaCl, 8°C]). Lard samples were obtained at time 0 and after 7, 15, 30, 60, and 90 days of maturation. From each plant, 3 independent lots were analyzed. The diversity of live microbiota was evaluated by using classical plate counts and amplicon target sequencing of small subunit (SSU) rRNA. The main taxa identified by sequencing were Acinetobacter johnsonii , Psychrobacter , Staphylococcus equorum , Staphylococcus sciuri , Pseudomonas fragi , Brochothrix , Halomonas , and Vibrio , and differences in their relative abundances distinguished samples from the individual plants. The composition of the microbiota was more similar among plants A and B, and it was characterized by the higher presence of taxa recognized as undesired bacteria in food-processing environments. Oligotype analysis of Halomonas and Acinetobacter revealed the presence of several characteristic oligotypes associated with A and B samples. IMPORTANCE Changes in the food production process can drastically affect the microbial community structure, with a possible impact on the final characteristics of the products. The industrial processes of Lard d'Arnad production are characterized by a reduction in the salt concentration in the brines to address a consumer demand

  20. Transcriptome-Wide Analysis of Botrytis elliptica Responsive microRNAs and Their Targets in Lilium Regale Wilson by High-Throughput Sequencing and Degradome Analysis

    Directory of Open Access Journals (Sweden)

    Xue Gao

    2017-05-01

    Full Text Available MicroRNAs, as master regulators of gene expression, have been widely identified and play crucial roles in plant-pathogen interactions. A fatal pathogen, Botrytis elliptica, causes the serious folia disease of lily, which reduces production because of the high susceptibility of most cultivated species. However, the miRNAs related to Botrytis infection of lily, and the miRNA-mediated gene regulatory networks providing resistance to B. elliptica in lily remain largely unexplored. To systematically dissect B. elliptica-responsive miRNAs and their target genes, three small RNA libraries were constructed from the leaves of Lilium regale, a promising Chinese wild Lilium species, which had been subjected to mock B. elliptica treatment or B. elliptica infection for 6 and 24 h. By high-throughput sequencing, 71 known miRNAs belonging to 47 conserved families and 24 novel miRNA were identified, of which 18 miRNAs were downreguleted and 13 were upregulated in response to B. elliptica. Moreover, based on the lily mRNA transcriptome, 22 targets for 9 known and 1 novel miRNAs were identified by the degradome sequencing approach. Most target genes for elliptica-responsive miRNAs were involved in metabolic processes, few encoding different transcription factors, including ELONGATION FACTOR 1 ALPHA (EF1a and TEOSINTE BRANCHED1/CYCLOIDEA/PROLIFERATING CELL FACTOR 2 (TCP2. Furthermore, the expression patterns of a set of elliptica-responsive miRNAs and their targets were validated by quantitative real-time PCR. This study represents the first transcriptome-based analysis of miRNAs responsive to B. elliptica and their targets in lily. The results reveal the possible regulatory roles of miRNAs and their targets in B. elliptica interaction, which will extend our understanding of the mechanisms of this disease in lily.