WorldWideScience

Sample records for high-throughput sequence analysis

  1. Targeted DNA Methylation Analysis by High Throughput Sequencing in Porcine Peri-attachment Embryos

    OpenAIRE

    MORRILL, Benson H.; COX, Lindsay; WARD, Anika; HEYWOOD, Sierra; PRATHER, Randall S.; ISOM, S. Clay

    2013-01-01

    Abstract The purpose of this experiment was to implement and evaluate the effectiveness of a next-generation sequencing-based method for DNA methylation analysis in porcine embryonic samples. Fourteen discrete genomic regions were amplified by PCR using bisulfite-converted genomic DNA derived from day 14 in vivo-derived (IVV) and parthenogenetic (PA) porcine embryos as template DNA. Resulting PCR products were subjected to high-throughput sequencing using the Illumina Genome Analyzer IIx plat...

  2. A priori Considerations When Conducting High-Throughput Amplicon-Based Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Aditi Sengupta

    2016-03-01

    Full Text Available Amplicon-based sequencing strategies that include 16S rRNA and functional genes, alongside “meta-omics” analyses of communities of microorganisms, have allowed researchers to pose questions and find answers to “who” is present in the environment and “what” they are doing. Next-generation sequencing approaches that aid microbial ecology studies of agricultural systems are fast gaining popularity among agronomy, crop, soil, and environmental science researchers. Given the rapid development of these high-throughput sequencing techniques, researchers with no prior experience will desire information about the best practices that can be used before actually starting high-throughput amplicon-based sequence analyses. We have outlined items that need to be carefully considered in experimental design, sampling, basic bioinformatics, sequencing of mock communities and negative controls, acquisition of metadata, and in standardization of reaction conditions as per experimental requirements. Not all considerations mentioned here may pertain to a particular study. The overall goal is to inform researchers about considerations that must be taken into account when conducting high-throughput microbial DNA sequencing and sequences analysis.

  3. Improvements and impacts of GRCh38 human reference on high throughput sequencing data analysis.

    Science.gov (United States)

    Guo, Yan; Dai, Yulin; Yu, Hui; Zhao, Shilin; Samuels, David C; Shyr, Yu

    2017-03-01

    Analyses of high throughput sequencing data starts with alignment against a reference genome, which is the foundation for all re-sequencing data analyses. Each new release of the human reference genome has been augmented with improved accuracy and completeness. It is presumed that the latest release of human reference genome, GRCh38 will contribute more to high throughput sequencing data analysis by providing more accuracy. But the amount of improvement has not yet been quantified. We conducted a study to compare the genomic analysis results between the GRCh38 reference and its predecessor GRCh37. Through analyses of alignment, single nucleotide polymorphisms, small insertion/deletions, copy number and structural variants, we show that GRCh38 offers overall more accurate analysis of human sequencing data. More importantly, GRCh38 produced fewer false positive structural variants. In conclusion, GRCh38 is an improvement over GRCh37 not only from the genome assembly aspect, but also yields more reliable genomic analysis results. Copyright © 2017. Published by Elsevier Inc.

  4. HTSstation: a web application and open-access libraries for high-throughput sequencing data analysis.

    Science.gov (United States)

    David, Fabrice P A; Delafontaine, Julien; Carat, Solenne; Ross, Frederick J; Lefebvre, Gregory; Jarosz, Yohan; Sinclair, Lucas; Noordermeer, Daan; Rougemont, Jacques; Leleu, Marion

    2014-01-01

    The HTSstation analysis portal is a suite of simple web forms coupled to modular analysis pipelines for various applications of High-Throughput Sequencing including ChIP-seq, RNA-seq, 4C-seq and re-sequencing. HTSstation offers biologists the possibility to rapidly investigate their HTS data using an intuitive web application with heuristically pre-defined parameters. A number of open-source software components have been implemented and can be used to build, configure and run HTS analysis pipelines reactively. Besides, our programming framework empowers developers with the possibility to design their own workflows and integrate additional third-party software. The HTSstation web application is accessible at http://htsstation.epfl.ch.

  5. High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

    Directory of Open Access Journals (Sweden)

    Soichi Inagaki

    Full Text Available Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  6. High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

    Science.gov (United States)

    Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca

    2015-01-01

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  7. Galaxy Workflows for Web-based Bioinformatics Analysis of Aptamer High-throughput Sequencing Data

    Directory of Open Access Journals (Sweden)

    William H Thiel

    2016-01-01

    Full Text Available Development of RNA and DNA aptamers for diagnostic and therapeutic applications is a rapidly growing field. Aptamers are identified through iterative rounds of selection in a process termed SELEX (Systematic Evolution of Ligands by EXponential enrichment. High-throughput sequencing (HTS revolutionized the modern SELEX process by identifying millions of aptamer sequences across multiple rounds of aptamer selection. However, these vast aptamer HTS datasets necessitated bioinformatics techniques. Herein, we describe a semiautomated approach to analyze aptamer HTS datasets using the Galaxy Project, a web-based open source collection of bioinformatics tools that were originally developed to analyze genome, exome, and transcriptome HTS data. Using a series of Workflows created in the Galaxy webserver, we demonstrate efficient processing of aptamer HTS data and compilation of a database of unique aptamer sequences. Additional Workflows were created to characterize the abundance and persistence of aptamer sequences within a selection and to filter sequences based on these parameters. A key advantage of this approach is that the online nature of the Galaxy webserver and its graphical interface allow for the analysis of HTS data without the need to compile code or install multiple programs.

  8. Pyicos: a versatile toolkit for the analysis of high-throughput sequencing data.

    Science.gov (United States)

    Althammer, Sonja; González-Vallinas, Juan; Ballaré, Cecilia; Beato, Miguel; Eyras, Eduardo

    2011-12-15

    High-throughput sequencing (HTS) has revolutionized gene regulation studies and is now fundamental for the detection of protein-DNA and protein-RNA binding, as well as for measuring RNA expression. With increasing variety and sequencing depth of HTS datasets, the need for more flexible and memory-efficient tools to analyse them is growing. We describe Pyicos, a powerful toolkit for the analysis of mapped reads from diverse HTS experiments: ChIP-Seq, either punctuated or broad signals, CLIP-Seq and RNA-Seq. We prove the effectiveness of Pyicos to select for significant signals and show that its accuracy is comparable and sometimes superior to that of methods specifically designed for each particular type of experiment. Pyicos facilitates the analysis of a variety of HTS datatypes through its flexibility and memory efficiency, providing a useful framework for data integration into models of regulatory genomics. Open-source software, with tutorials and protocol files, is available at http://regulatorygenomics.upf.edu/pyicos or as a Galaxy server at http://regulatorygenomics.upf.edu/galaxy eduardo.eyras@upf.edu Supplementary data are available at Bioinformatics online.

  9. Identification of microRNAs from Eugenia uniflora by high-throughput sequencing and bioinformatics analysis.

    Science.gov (United States)

    Guzman, Frank; Almerão, Mauricio P; Körbes, Ana P; Loss-Morais, Guilherme; Margis, Rogerio

    2012-01-01

    microRNAs or miRNAs are small non-coding regulatory RNAs that play important functions in the regulation of gene expression at the post-transcriptional level by targeting mRNAs for degradation or inhibiting protein translation. Eugenia uniflora is a plant native to tropical America with pharmacological and ecological importance, and there have been no previous studies concerning its gene expression and regulation. To date, no miRNAs have been reported in Myrtaceae species. Small RNA and RNA-seq libraries were constructed to identify miRNAs and pre-miRNAs in Eugenia uniflora. Solexa technology was used to perform high throughput sequencing of the library, and the data obtained were analyzed using bioinformatics tools. From 14,489,131 small RNA clean reads, we obtained 1,852,722 mature miRNA sequences representing 45 conserved families that have been identified in other plant species. Further analysis using contigs assembled from RNA-seq allowed the prediction of secondary structures of 25 known and 17 novel pre-miRNAs. The expression of twenty-seven identified miRNAs was also validated using RT-PCR assays. Potential targets were predicted for the most abundant mature miRNAs in the identified pre-miRNAs based on sequence homology. This study is the first large scale identification of miRNAs and their potential targets from a species of the Myrtaceae family without genomic sequence resources. Our study provides more information about the evolutionary conservation of the regulatory network of miRNAs in plants and highlights species-specific miRNAs.

  10. Fine grained compositional analysis of Port Everglades Inlet microbiome using high throughput DNA sequencing.

    Science.gov (United States)

    O'Connell, Lauren; Gao, Song; McCorquodale, Donald; Fleisher, Jay; Lopez, Jose V

    2018-01-01

    Similar to natural rivers, manmade inlets connect inland runoff to the ocean. Port Everglades Inlet (PEI) is a busy cargo and cruise ship port in South Florida, which can act as a source of pollution to surrounding beaches and offshore coral reefs. Understanding the composition and fluctuations of bacterioplankton communities ("microbiomes") in major port inlets is important due to potential impacts on surrounding environments. We hypothesize seasonal microbial fluctuations, which were profiled by high throughput 16S rRNA amplicon sequencing and analysis. Surface water samples were collected every week for one year. A total of four samples per month, two from each sampling location, were used for statistical analysis creating a high sampling frequency and finer sampling scale than previous inlet microbiome studies. We observed significant differences in community alpha diversity between months and seasons. Analysis of composition of microbiomes (ANCOM) tests were run in QIIME 2 at genus level taxonomic classification to determine which genera were differentially abundant between seasons and months. Beta diversity results yielded significant differences in PEI community composition in regard to month, season, water temperature, and salinity. Analysis of potentially pathogenic genera showed presence of Staphylococcus and Streptococcus . However, statistical analysis indicated that these organisms were not present in significantly high abundances throughout the year or between seasons. Significant differences in alpha diversity were observed when comparing microbial communities with respect to time. This observation stems from the high community evenness and low community richness in August. This indicates that only a few organisms dominated the community during this month. August had lower than average rainfall levels for a wet season, which may have contributed to less runoff, and fewer bacterial groups introduced into the port surface waters. Bacterioplankton beta

  11. Fine grained compositional analysis of Port Everglades Inlet microbiome using high throughput DNA sequencing

    Directory of Open Access Journals (Sweden)

    Lauren O’Connell

    2018-05-01

    Full Text Available Background Similar to natural rivers, manmade inlets connect inland runoff to the ocean. Port Everglades Inlet (PEI is a busy cargo and cruise ship port in South Florida, which can act as a source of pollution to surrounding beaches and offshore coral reefs. Understanding the composition and fluctuations of bacterioplankton communities (“microbiomes” in major port inlets is important due to potential impacts on surrounding environments. We hypothesize seasonal microbial fluctuations, which were profiled by high throughput 16S rRNA amplicon sequencing and analysis. Methods & Results Surface water samples were collected every week for one year. A total of four samples per month, two from each sampling location, were used for statistical analysis creating a high sampling frequency and finer sampling scale than previous inlet microbiome studies. We observed significant differences in community alpha diversity between months and seasons. Analysis of composition of microbiomes (ANCOM tests were run in QIIME 2 at genus level taxonomic classification to determine which genera were differentially abundant between seasons and months. Beta diversity results yielded significant differences in PEI community composition in regard to month, season, water temperature, and salinity. Analysis of potentially pathogenic genera showed presence of Staphylococcus and Streptococcus. However, statistical analysis indicated that these organisms were not present in significantly high abundances throughout the year or between seasons. Discussion Significant differences in alpha diversity were observed when comparing microbial communities with respect to time. This observation stems from the high community evenness and low community richness in August. This indicates that only a few organisms dominated the community during this month. August had lower than average rainfall levels for a wet season, which may have contributed to less runoff, and fewer bacterial groups

  12. CSReport: A New Computational Tool Designed for Automatic Analysis of Class Switch Recombination Junctions Sequenced by High-Throughput Sequencing.

    Science.gov (United States)

    Boyer, François; Boutouil, Hend; Dalloul, Iman; Dalloul, Zeinab; Cook-Moreau, Jeanne; Aldigier, Jean-Claude; Carrion, Claire; Herve, Bastien; Scaon, Erwan; Cogné, Michel; Péron, Sophie

    2017-05-15

    B cells ensure humoral immune responses due to the production of Ag-specific memory B cells and Ab-secreting plasma cells. In secondary lymphoid organs, Ag-driven B cell activation induces terminal maturation and Ig isotype class switch (class switch recombination [CSR]). CSR creates a virtually unique IgH locus in every B cell clone by intrachromosomal recombination between two switch (S) regions upstream of each C region gene. Amount and structural features of CSR junctions reveal valuable information about the CSR mechanism, and analysis of CSR junctions is useful in basic and clinical research studies of B cell functions. To provide an automated tool able to analyze large data sets of CSR junction sequences produced by high-throughput sequencing (HTS), we designed CSReport, a software program dedicated to support analysis of CSR recombination junctions sequenced with a HTS-based protocol (Ion Torrent technology). CSReport was assessed using simulated data sets of CSR junctions and then used for analysis of Sμ-Sα and Sμ-Sγ1 junctions from CH12F3 cells and primary murine B cells, respectively. CSReport identifies junction segment breakpoints on reference sequences and junction structure (blunt-ended junctions or junctions with insertions or microhomology). Besides the ability to analyze unprecedentedly large libraries of junction sequences, CSReport will provide a unified framework for CSR junction studies. Our results show that CSReport is an accurate tool for analysis of sequences from our HTS-based protocol for CSR junctions, thereby facilitating and accelerating their study. Copyright © 2017 by The American Association of Immunologists, Inc.

  13. Analysis of high-throughput sequencing and annotation strategies for phage genomes.

    Directory of Open Access Journals (Sweden)

    Matthew R Henn

    Full Text Available BACKGROUND: Bacterial viruses (phages play a critical role in shaping microbial populations as they influence both host mortality and horizontal gene transfer. As such, they have a significant impact on local and global ecosystem function and human health. Despite their importance, little is known about the genomic diversity harbored in phages, as methods to capture complete phage genomes have been hampered by the lack of knowledge about the target genomes, and difficulties in generating sufficient quantities of genomic DNA for sequencing. Of the approximately 550 phage genomes currently available in the public domain, fewer than 5% are marine phage. METHODOLOGY/PRINCIPAL FINDINGS: To advance the study of phage biology through comparative genomic approaches we used marine cyanophage as a model system. We compared DNA preparation methodologies (DNA extraction directly from either phage lysates or CsCl purified phage particles, and sequencing strategies that utilize either Sanger sequencing of a linker amplification shotgun library (LASL or of a whole genome shotgun library (WGSL, or 454 pyrosequencing methods. We demonstrate that genomic DNA sample preparation directly from a phage lysate, combined with 454 pyrosequencing, is best suited for phage genome sequencing at scale, as this method is capable of capturing complete continuous genomes with high accuracy. In addition, we describe an automated annotation informatics pipeline that delivers high-quality annotation and yields few false positives and negatives in ORF calling. CONCLUSIONS/SIGNIFICANCE: These DNA preparation, sequencing and annotation strategies enable a high-throughput approach to the burgeoning field of phage genomics.

  14. Perchlorate reduction by hydrogen autotrophic bacteria and microbial community analysis using high-throughput sequencing.

    Science.gov (United States)

    Wan, Dongjin; Liu, Yongde; Niu, Zhenhua; Xiao, Shuhu; Li, Daorong

    2016-02-01

    Hydrogen autotrophic reduction of perchlorate have advantages of high removal efficiency and harmless to drinking water. But so far the reported information about the microbial community structure was comparatively limited, changes in the biodiversity and the dominant bacteria during acclimation process required detailed study. In this study, perchlorate-reducing hydrogen autotrophic bacteria were acclimated by hydrogen aeration from activated sludge. For the first time, high-throughput sequencing was applied to analyze changes in biodiversity and the dominant bacteria during acclimation process. The Michaelis-Menten model described the perchlorate reduction kinetics well. Model parameters q(max) and K(s) were 2.521-3.245 (mg ClO4(-)/gVSS h) and 5.44-8.23 (mg/l), respectively. Microbial perchlorate reduction occurred across at pH range 5.0-11.0; removal was highest at pH 9.0. The enriched mixed bacteria could use perchlorate, nitrate and sulfate as electron accepter, and the sequence of preference was: NO3(-) > ClO4(-) > SO4(2-). Compared to the feed culture, biodiversity decreased greatly during acclimation process, the microbial community structure gradually stabilized after 9 acclimation cycles. The Thauera genus related to Rhodocyclales was the dominated perchlorate reducing bacteria (PRB) in the mixed culture.

  15. Transcriptomic analysis of Petunia hybrida in response to salt stress using high throughput RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Gonzalo H Villarino

    Full Text Available Salinity and drought stress are the primary cause of crop losses worldwide. In sodic saline soils sodium chloride (NaCl disrupts normal plant growth and development. The complex interactions of plant systems with abiotic stress have made RNA sequencing a more holistic and appealing approach to study transcriptome level responses in a single cell and/or tissue. In this work, we determined the Petunia transcriptome response to NaCl stress by sequencing leaf samples and assembling 196 million Illumina reads with Trinity software. Using our reference transcriptome we identified more than 7,000 genes that were differentially expressed within 24 h of acute NaCl stress. The proposed transcriptome can also be used as an excellent tool for biological and bioinformatics in the absence of an available Petunia genome and it is available at the SOL Genomics Network (SGN http://solgenomics.net. Genes related to regulation of reactive oxygen species, transport, and signal transductions as well as novel and undescribed transcripts were among those differentially expressed in response to salt stress. The candidate genes identified in this study can be applied as markers for breeding or to genetically engineer plants to enhance salt tolerance. Gene Ontology analyses indicated that most of the NaCl damage happened at 24 h inducing genotoxicity, affecting transport and organelles due to the high concentration of Na+ ions. Finally, we report a modification to the library preparation protocol whereby cDNA samples were bar-coded with non-HPLC purified primers, without affecting the quality and quantity of the RNA-seq data. The methodological improvement presented here could substantially reduce the cost of sample preparation for future high-throughput RNA sequencing experiments.

  16. Transcriptomic analysis of Petunia hybrida in response to salt stress using high throughput RNA sequencing.

    Science.gov (United States)

    Villarino, Gonzalo H; Bombarely, Aureliano; Giovannoni, James J; Scanlon, Michael J; Mattson, Neil S

    2014-01-01

    Salinity and drought stress are the primary cause of crop losses worldwide. In sodic saline soils sodium chloride (NaCl) disrupts normal plant growth and development. The complex interactions of plant systems with abiotic stress have made RNA sequencing a more holistic and appealing approach to study transcriptome level responses in a single cell and/or tissue. In this work, we determined the Petunia transcriptome response to NaCl stress by sequencing leaf samples and assembling 196 million Illumina reads with Trinity software. Using our reference transcriptome we identified more than 7,000 genes that were differentially expressed within 24 h of acute NaCl stress. The proposed transcriptome can also be used as an excellent tool for biological and bioinformatics in the absence of an available Petunia genome and it is available at the SOL Genomics Network (SGN) http://solgenomics.net. Genes related to regulation of reactive oxygen species, transport, and signal transductions as well as novel and undescribed transcripts were among those differentially expressed in response to salt stress. The candidate genes identified in this study can be applied as markers for breeding or to genetically engineer plants to enhance salt tolerance. Gene Ontology analyses indicated that most of the NaCl damage happened at 24 h inducing genotoxicity, affecting transport and organelles due to the high concentration of Na+ ions. Finally, we report a modification to the library preparation protocol whereby cDNA samples were bar-coded with non-HPLC purified primers, without affecting the quality and quantity of the RNA-seq data. The methodological improvement presented here could substantially reduce the cost of sample preparation for future high-throughput RNA sequencing experiments.

  17. Bacterial Pathogens and Community Composition in Advanced Sewage Treatment Systems Revealed by Metagenomics Analysis Based on High-Throughput Sequencing

    Science.gov (United States)

    Lu, Xin; Zhang, Xu-Xiang; Wang, Zhu; Huang, Kailong; Wang, Yuan; Liang, Weigang; Tan, Yunfei; Liu, Bo; Tang, Junying

    2015-01-01

    This study used 454 pyrosequencing, Illumina high-throughput sequencing and metagenomic analysis to investigate bacterial pathogens and their potential virulence in a sewage treatment plant (STP) applying both conventional and advanced treatment processes. Pyrosequencing and Illumina sequencing consistently demonstrated that Arcobacter genus occupied over 43.42% of total abundance of potential pathogens in the STP. At species level, potential pathogens Arcobacter butzleri, Aeromonas hydrophila and Klebsiella pneumonia dominated in raw sewage, which was also confirmed by quantitative real time PCR. Illumina sequencing also revealed prevalence of various types of pathogenicity islands and virulence proteins in the STP. Most of the potential pathogens and virulence factors were eliminated in the STP, and the removal efficiency mainly depended on oxidation ditch. Compared with sand filtration, magnetic resin seemed to have higher removals in most of the potential pathogens and virulence factors. However, presence of the residual A. butzleri in the final effluent still deserves more concerns. The findings indicate that sewage acts as an important source of environmental pathogens, but STPs can effectively control their spread in the environment. Joint use of the high-throughput sequencing technologies is considered a reliable method for deep and comprehensive overview of environmental bacterial virulence. PMID:25938416

  18. A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection.

    Science.gov (United States)

    Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike; Khan, Arifa S

    2018-01-01

    Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2, with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publically available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank. IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have

  19. Peptide Pattern Recognition for high-throughput protein sequence analysis and clustering

    DEFF Research Database (Denmark)

    Busk, Peter Kamp

    2017-01-01

    Large collections of protein sequences with divergent sequences are tedious to analyze for understanding their phylogenetic or structure-function relation. Peptide Pattern Recognition is an algorithm that was developed to facilitate this task but the previous version does only allow a limited...... number of sequences as input. I implemented Peptide Pattern Recognition as a multithread software designed to handle large numbers of sequences and perform analysis in a reasonable time frame. Benchmarking showed that the new implementation of Peptide Pattern Recognition is twenty times faster than...... the previous implementation on a small protein collection with 673 MAP kinase sequences. In addition, the new implementation could analyze a large protein collection with 48,570 Glycosyl Transferase family 20 sequences without reaching its upper limit on a desktop computer. Peptide Pattern Recognition...

  20. Micropathogen Community Analysis in Hyalomma rufipes via High-Throughput Sequencing of Small RNAs

    Science.gov (United States)

    Luo, Jin; Liu, Min-Xuan; Ren, Qiao-Yun; Chen, Ze; Tian, Zhan-Cheng; Hao, Jia-Wei; Wu, Feng; Liu, Xiao-Cui; Luo, Jian-Xun; Yin, Hong; Wang, Hui; Liu, Guang-Yuan

    2017-01-01

    Ticks are important vectors in the transmission of a broad range of micropathogens to vertebrates, including humans. Because of the role of ticks in disease transmission, identifying and characterizing the micropathogen profiles of tick populations have become increasingly important. The objective of this study was to survey the micropathogens of Hyalomma rufipes ticks. Illumina HiSeq2000 technology was utilized to perform deep sequencing of small RNAs (sRNAs) extracted from field-collected H. rufipes ticks in Gansu Province, China. The resultant sRNA library data revealed that the surveyed tick populations produced reads that were homologous to St. Croix River Virus (SCRV) sequences. We also observed many reads that were homologous to microbial and/or pathogenic isolates, including bacteria, protozoa, and fungi. As part of this analysis, a phylogenetic tree was constructed to display the relationships among the homologous sequences that were identified. The study offered a unique opportunity to gain insight into the micropathogens of H. rufipes ticks. The effective control of arthropod vectors in the future will require knowledge of the micropathogen composition of vectors harboring infectious agents. Understanding the ecological factors that regulate vector propagation in association with the prevalence and persistence of micropathogen lineages is also imperative. These interactions may affect the evolution of micropathogen lineages, especially if the micropathogens rely on the vector or host for dispersal. The sRNA deep-sequencing approach used in this analysis provides an intuitive method to survey micropathogen prevalence in ticks and other vector species. PMID:28861401

  1. The simple fool's guide to population genomics via RNA-Seq: An introduction to high-throughput sequencing data analysis

    DEFF Research Database (Denmark)

    De Wit, P.; Pespeni, M.H.; Ladner, J.T.

    2012-01-01

    to Population Genomics via RNA-seq' (SFG), a document intended to serve as an easy-to-follow protocol, walking a user through one example of high-throughput sequencing data analysis of nonmodel organisms. It is by no means an exhaustive protocol, but rather serves as an introduction to the bioinformatic methods...... used in population genomics, enabling a user to gain familiarity with basic analysis steps. The SFG consists of two parts. This document summarizes the steps needed and lays out the basic themes for each and a simple approach to follow. The second document is the full SFG, publicly available at http://sfg.......stanford.edu, that includes detailed protocols for data processing and analysis, along with a repository of custom-made scripts and sample files. Steps included in the SFG range from tissue collection to de novo assembly, blast annotation, alignment, gene expression, functional enrichment, SNP detection, principal components...

  2. Designing small universal k-mer hitting sets for improved analysis of high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Yaron Orenstein

    2017-10-01

    Full Text Available With the rapidly increasing volume of deep sequencing data, more efficient algorithms and data structures are needed. Minimizers are a central recent paradigm that has improved various sequence analysis tasks, including hashing for faster read overlap detection, sparse suffix arrays for creating smaller indexes, and Bloom filters for speeding up sequence search. Here, we propose an alternative paradigm that can lead to substantial further improvement in these and other tasks. For integers k and L > k, we say that a set of k-mers is a universal hitting set (UHS if every possible L-long sequence must contain a k-mer from the set. We develop a heuristic called DOCKS to find a compact UHS, which works in two phases: The first phase is solved optimally, and for the second we propose several efficient heuristics, trading set size for speed and memory. The use of heuristics is motivated by showing the NP-hardness of a closely related problem. We show that DOCKS works well in practice and produces UHSs that are very close to a theoretical lower bound. We present results for various values of k and L and by applying them to real genomes show that UHSs indeed improve over minimizers. In particular, DOCKS uses less than 30% of the 10-mers needed to span the human genome compared to minimizers. The software and computed UHSs are freely available at github.com/Shamir-Lab/DOCKS/ and acgt.cs.tau.ac.il/docks/, respectively.

  3. Secure and robust cloud computing for high-throughput forensic microsatellite sequence analysis and databasing.

    Science.gov (United States)

    Bailey, Sarah F; Scheible, Melissa K; Williams, Christopher; Silva, Deborah S B S; Hoggan, Marina; Eichman, Christopher; Faith, Seth A

    2017-11-01

    Next-generation Sequencing (NGS) is a rapidly evolving technology with demonstrated benefits for forensic genetic applications, and the strategies to analyze and manage the massive NGS datasets are currently in development. Here, the computing, data storage, connectivity, and security resources of the Cloud were evaluated as a model for forensic laboratory systems that produce NGS data. A complete front-to-end Cloud system was developed to upload, process, and interpret raw NGS data using a web browser dashboard. The system was extensible, demonstrating analysis capabilities of autosomal and Y-STRs from a variety of NGS instrumentation (Illumina MiniSeq and MiSeq, and Oxford Nanopore MinION). NGS data for STRs were concordant with standard reference materials previously characterized with capillary electrophoresis and Sanger sequencing. The computing power of the Cloud was implemented with on-demand auto-scaling to allow multiple file analysis in tandem. The system was designed to store resulting data in a relational database, amenable to downstream sample interpretations and databasing applications following the most recent guidelines in nomenclature for sequenced alleles. Lastly, a multi-layered Cloud security architecture was tested and showed that industry standards for securing data and computing resources were readily applied to the NGS system without disadvantageous effects for bioinformatic analysis, connectivity or data storage/retrieval. The results of this study demonstrate the feasibility of using Cloud-based systems for secured NGS data analysis, storage, databasing, and multi-user distributed connectivity. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. Metagenomic analysis and functional characterization of the biogas microbiome using high throughput shotgun sequencing and a novel binning strategy.

    Science.gov (United States)

    Campanaro, Stefano; Treu, Laura; Kougias, Panagiotis G; De Francisci, Davide; Valle, Giorgio; Angelidaki, Irini

    2016-01-01

    Biogas production is an economically attractive technology that has gained momentum worldwide over the past years. Biogas is produced by a biologically mediated process, widely known as "anaerobic digestion." This process is performed by a specialized and complex microbial community, in which different members have distinct roles in the establishment of a collective organization. Deciphering the complex microbial community engaged in this process is interesting both for unraveling the network of bacterial interactions and for applicability potential to the derived knowledge. In this study, we dissect the bioma involved in anaerobic digestion by means of high throughput Illumina sequencing (~51 gigabases of sequence data), disclosing nearly one million genes and extracting 106 microbial genomes by a novel strategy combining two binning processes. Microbial phylogeny and putative taxonomy performed using >400 proteins revealed that the biogas community is a trove of new species. A new approach based on functional properties as per network representation was developed to assign roles to the microbial species. The organization of the anaerobic digestion microbiome is resembled by a funnel concept, in which the microbial consortium presents a progressive functional specialization while reaching the final step of the process (i.e., methanogenesis). Key microbial genomes encoding enzymes involved in specific metabolic pathways, such as carbohydrates utilization, fatty acids degradation, amino acids fermentation, and syntrophic acetate oxidation, were identified. Additionally, the analysis identified a new uncultured archaeon that was putatively related to Methanomassiliicoccales but surprisingly having a methylotrophic methanogenic pathway. This study is a pioneer research on the phylogenetic and functional characterization of the microbial community populating biogas reactors. By applying for the first time high-throughput sequencing and a novel binning strategy, the

  5. PipeCraft: Flexible open-source toolkit for bioinformatics analysis of custom high-throughput amplicon sequencing data.

    Science.gov (United States)

    Anslan, Sten; Bahram, Mohammad; Hiiesalu, Indrek; Tedersoo, Leho

    2017-11-01

    High-throughput sequencing methods have become a routine analysis tool in environmental sciences as well as in public and private sector. These methods provide vast amount of data, which need to be analysed in several steps. Although the bioinformatics may be applied using several public tools, many analytical pipelines allow too few options for the optimal analysis for more complicated or customized designs. Here, we introduce PipeCraft, a flexible and handy bioinformatics pipeline with a user-friendly graphical interface that links several public tools for analysing amplicon sequencing data. Users are able to customize the pipeline by selecting the most suitable tools and options to process raw sequences from Illumina, Pacific Biosciences, Ion Torrent and Roche 454 sequencing platforms. We described the design and options of PipeCraft and evaluated its performance by analysing the data sets from three different sequencing platforms. We demonstrated that PipeCraft is able to process large data sets within 24 hr. The graphical user interface and the automated links between various bioinformatics tools enable easy customization of the workflow. All analytical steps and options are recorded in log files and are easily traceable. © 2017 John Wiley & Sons Ltd.

  6. Viral metagenomics: Analysis of begomoviruses by illumina high-throughput sequencing

    KAUST Repository

    Idris, Ali; Al-Saleh, Mohammed; Piatek, Marek J.; Al-Shahwan, Ibrahim; Ali, Shahjahan; Brown, Judith K.

    2014-01-01

    Traditional DNA sequencing methods are inefficient, lack the ability to discern the least abundant viral sequences, and ineffective for determining the extent of variability in viral populations. Here, populations of single-stranded DNA plant

  7. Bulk segregant analysis by high-throughput sequencing reveals a novel xylose utilization gene from Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Jared W Wenger

    2010-05-01

    Full Text Available Fermentation of xylose is a fundamental requirement for the efficient production of ethanol from lignocellulosic biomass sources. Although they aggressively ferment hexoses, it has long been thought that native Saccharomyces cerevisiae strains cannot grow fermentatively or non-fermentatively on xylose. Population surveys have uncovered a few naturally occurring strains that are weakly xylose-positive, and some S. cerevisiae have been genetically engineered to ferment xylose, but no strain, either natural or engineered, has yet been reported to ferment xylose as efficiently as glucose. Here, we used a medium-throughput screen to identify Saccharomyces strains that can increase in optical density when xylose is presented as the sole carbon source. We identified 38 strains that have this xylose utilization phenotype, including strains of S. cerevisiae, other sensu stricto members, and hybrids between them. All the S. cerevisiae xylose-utilizing strains we identified are wine yeasts, and for those that could produce meiotic progeny, the xylose phenotype segregates as a single gene trait. We mapped this gene by Bulk Segregant Analysis (BSA using tiling microarrays and high-throughput sequencing. The gene is a putative xylitol dehydrogenase, which we name XDH1, and is located in the subtelomeric region of the right end of chromosome XV in a region not present in the S288c reference genome. We further characterized the xylose phenotype by performing gene expression microarrays and by genetically dissecting the endogenous Saccharomyces xylose pathway. We have demonstrated that natural S. cerevisiae yeasts are capable of utilizing xylose as the sole carbon source, characterized the genetic basis for this trait as well as the endogenous xylose utilization pathway, and demonstrated the feasibility of BSA using high-throughput sequencing.

  8. Viral metagenomics: Analysis of begomoviruses by illumina high-throughput sequencing

    KAUST Repository

    Idris, Ali

    2014-03-12

    Traditional DNA sequencing methods are inefficient, lack the ability to discern the least abundant viral sequences, and ineffective for determining the extent of variability in viral populations. Here, populations of single-stranded DNA plant begomoviral genomes and their associated beta- and alpha-satellite molecules (virus-satellite complexes) (genus, Begomovirus; family, Geminiviridae) were enriched from total nucleic acids isolated from symptomatic, field-infected plants, using rolling circle amplification (RCA). Enriched virus-satellite complexes were subjected to Illumina-Next Generation Sequencing (NGS). CASAVA and SeqMan NGen programs were implemented, respectively, for quality control and for de novo and reference-guided contig assembly of viral-satellite sequences. The authenticity of the begomoviral sequences, and the reproducibility of the Illumina-NGS approach for begomoviral deep sequencing projects, were validated by comparing NGS results with those obtained using traditional molecular cloning and Sanger sequencing of viral components and satellite DNAs, also enriched by RCA or amplified by polymerase chain reaction. As the use of NGS approaches, together with advances in software development, make possible deep sequence coverage at a lower cost; the approach described herein will streamline the exploration of begomovirus diversity and population structure from naturally infected plants, irrespective of viral abundance. This is the first report of the implementation of Illumina-NGS to explore the diversity and identify begomoviral-satellite SNPs directly from plants naturally-infected with begomoviruses under field conditions. 2014 by the authors; licensee MDPI, Basel, Switzerland.

  9. Viral Metagenomics: Analysis of Begomoviruses by Illumina High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Ali Idris

    2014-03-01

    Full Text Available Traditional DNA sequencing methods are inefficient, lack the ability to discern the least abundant viral sequences, and ineffective for determining the extent of variability in viral populations. Here, populations of single-stranded DNA plant begomoviral genomes and their associated beta- and alpha-satellite molecules (virus-satellite complexes (genus, Begomovirus; family, Geminiviridae were enriched from total nucleic acids isolated from symptomatic, field-infected plants, using rolling circle amplification (RCA. Enriched virus-satellite complexes were subjected to Illumina-Next Generation Sequencing (NGS. CASAVA and SeqMan NGen programs were implemented, respectively, for quality control and for de novo and reference-guided contig assembly of viral-satellite sequences. The authenticity of the begomoviral sequences, and the reproducibility of the Illumina-NGS approach for begomoviral deep sequencing projects, were validated by comparing NGS results with those obtained using traditional molecular cloning and Sanger sequencing of viral components and satellite DNAs, also enriched by RCA or amplified by polymerase chain reaction. As the use of NGS approaches, together with advances in software development, make possible deep sequence coverage at a lower cost; the approach described herein will streamline the exploration of begomovirus diversity and population structure from naturally infected plants, irrespective of viral abundance. This is the first report of the implementation of Illumina-NGS to explore the diversity and identify begomoviral-satellite SNPs directly from plants naturally-infected with begomoviruses under field conditions.

  10. Molecular diet analysis of two African free-tailed bats (Molossidae) using high throughput sequencing

    DEFF Research Database (Denmark)

    Bohmann, Kristine; Monadjem, Ara; Noer, Christina Lehmkuhl

    2011-01-01

    Given the diversity of prey consumed by insectivorous bats, it is difficult to discern the composition of their diet using morphological or conventional PCR-based analyses of their faeces. We demonstrate the use of a powerful alternate tool, the use of the Roche FLX sequencing platform to deep......-sequence uniquely 5′ tagged insect-generic barcode cytochrome c oxidase I (COI) fragments, that were PCR amplified from faecal pellets of two free-tailed bat species Chaerephon pumilus and Mops condylurus (family: Molossidae). Although the analyses were challenged by the paucity of southern African insect COI...

  11. High Throughput Sample Preparation and Analysis for DNA Sequencing, PCR and Combinatorial Screening of Catalysis Based on Capillary Array Technique

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Yonghua [Iowa State Univ., Ames, IA (United States)

    2000-01-01

    Sample preparation has been one of the major bottlenecks for many high throughput analyses. The purpose of this research was to develop new sample preparation and integration approach for DNA sequencing, PCR based DNA analysis and combinatorial screening of homogeneous catalysis based on multiplexed capillary electrophoresis with laser induced fluorescence or imaging UV absorption detection. The author first introduced a method to integrate the front-end tasks to DNA capillary-array sequencers. protocols for directly sequencing the plasmids from a single bacterial colony in fused-silica capillaries were developed. After the colony was picked, lysis was accomplished in situ in the plastic sample tube using either a thermocycler or heating block. Upon heating, the plasmids were released while chromsomal DNA and membrane proteins were denatured and precipitated to the bottom of the tube. After adding enzyme and Sanger reagents, the resulting solution was aspirated into the reaction capillaries by a syringe pump, and cycle sequencing was initiated. No deleterious effect upon the reaction efficiency, the on-line purification system, or the capillary electrophoresis separation was observed, even though the crude lysate was used as the template. Multiplexed on-line DNA sequencing data from 8 parallel channels allowed base calling up to 620 bp with an accuracy of 98%. The entire system can be automatically regenerated for repeated operation. For PCR based DNA analysis, they demonstrated that capillary electrophoresis with UV detection can be used for DNA analysis starting from clinical sample without purification. After PCR reaction using cheek cell, blood or HIV-1 gag DNA, the reaction mixtures was injected into the capillary either on-line or off-line by base stacking. The protocol was also applied to capillary array electrophoresis. The use of cheaper detection, and the elimination of purification of DNA sample before or after PCR reaction, will make this approach an

  12. Analysis of petunia hybrida in response to salt stress using high throughput RNA sequencing

    Science.gov (United States)

    Salt and drought are among the greatest challenges to crop and native plants in meeting their yield and reproductive potentials. DNA sequencing-enabled transcriptome profiling provides a means of assessing what genes are responding to salt or drought stress so as to better understand the molecular ...

  13. TIMPs of parasitic helminths - a large-scale analysis of high-throughput sequence datasets.

    Science.gov (United States)

    Cantacessi, Cinzia; Hofmann, Andreas; Pickering, Darren; Navarro, Severine; Mitreva, Makedonka; Loukas, Alex

    2013-05-30

    Tissue inhibitors of metalloproteases (TIMPs) are a multifunctional family of proteins that orchestrate extracellular matrix turnover, tissue remodelling and other cellular processes. In parasitic helminths, such as hookworms, TIMPs have been proposed to play key roles in the host-parasite interplay, including invasion of and establishment in the vertebrate animal hosts. Currently, knowledge of helminth TIMPs is limited to a small number of studies on canine hookworms, whereas no information is available on the occurrence of TIMPs in other parasitic helminths causing neglected diseases. In the present study, we conducted a large-scale investigation of TIMP proteins of a range of neglected human parasites including the hookworm Necator americanus, the roundworm Ascaris suum, the liver flukes Clonorchis sinensis and Opisthorchis viverrini, as well as the schistosome blood flukes. This entailed mining available transcriptomic and/or genomic sequence datasets for the presence of homologues of known TIMPs, predicting secondary structures of defined protein sequences, systematic phylogenetic analyses and assessment of differential expression of genes encoding putative TIMPs in the developmental stages of A. suum, N. americanus and Schistosoma haematobium which infect the mammalian hosts. A total of 15 protein sequences with high homology to known eukaryotic TIMPs were predicted from the complement of sequence data available for parasitic helminths and subjected to in-depth bioinformatic analyses. Supported by the availability of gene manipulation technologies such as RNA interference and/or transgenesis, this work provides a basis for future functional explorations of helminth TIMPs and, in particular, of their role/s in fundamental biological pathways linked to long-term establishment in the vertebrate hosts, with a view towards the development of novel approaches for the control of neglected helminthiases.

  14. ImmuneDB: a system for the analysis and exploration of high-throughput adaptive immune receptor sequencing data.

    Science.gov (United States)

    Rosenfeld, Aaron M; Meng, Wenzhao; Luning Prak, Eline T; Hershberg, Uri

    2017-01-15

    As high-throughput sequencing of B cells becomes more common, the need for tools to analyze the large quantity of data also increases. This article introduces ImmuneDB, a system for analyzing vast amounts of heavy chain variable region sequences and exploring the resulting data. It can take as input raw FASTA/FASTQ data, identify genes, determine clones, construct lineages, as well as provide information such as selection pressure and mutation analysis. It uses an industry leading database, MySQL, to provide fast analysis and avoid the complexities of using error prone flat-files. ImmuneDB is freely available at http://immunedb.comA demo of the ImmuneDB web interface is available at: http://immunedb.com/demo CONTACT: Uh25@drexel.eduSupplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  15. BioVLAB-MMIA-NGS: microRNA-mRNA integrated analysis using high-throughput sequencing data.

    Science.gov (United States)

    Chae, Heejoon; Rhee, Sungmin; Nephew, Kenneth P; Kim, Sun

    2015-01-15

    It is now well established that microRNAs (miRNAs) play a critical role in regulating gene expression in a sequence-specific manner, and genome-wide efforts are underway to predict known and novel miRNA targets. However, the integrated miRNA-mRNA analysis remains a major computational challenge, requiring powerful informatics systems and bioinformatics expertise. The objective of this study was to modify our widely recognized Web server for the integrated mRNA-miRNA analysis (MMIA) and its subsequent deployment on the Amazon cloud (BioVLAB-MMIA) to be compatible with high-throughput platforms, including next-generation sequencing (NGS) data (e.g. RNA-seq). We developed a new version called the BioVLAB-MMIA-NGS, deployed on both Amazon cloud and on a high-performance publicly available server called MAHA. By using NGS data and integrating various bioinformatics tools and databases, BioVLAB-MMIA-NGS offers several advantages. First, sequencing data is more accurate than array-based methods for determining miRNA expression levels. Second, potential novel miRNAs can be detected by using various computational methods for characterizing miRNAs. Third, because miRNA-mediated gene regulation is due to hybridization of an miRNA to its target mRNA, sequencing data can be used to identify many-to-many relationship between miRNAs and target genes with high accuracy. http://epigenomics.snu.ac.kr/biovlab_mmia_ngs/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  16. High-throughput sequence alignment using Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Trapnell Cole

    2007-12-01

    Full Text Available Abstract Background The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. Results This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. Conclusion MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU.

  17. GxGrare: gene-gene interaction analysis method for rare variants from high-throughput sequencing data.

    Science.gov (United States)

    Kwon, Minseok; Leem, Sangseob; Yoon, Joon; Park, Taesung

    2018-03-19

    With the rapid advancement of array-based genotyping techniques, genome-wide association studies (GWAS) have successfully identified common genetic variants associated with common complex diseases. However, it has been shown that only a small proportion of the genetic etiology of complex diseases could be explained by the genetic factors identified from GWAS. This missing heritability could possibly be explained by gene-gene interaction (epistasis) and rare variants. There has been an exponential growth of gene-gene interaction analysis for common variants in terms of methodological developments and practical applications. Also, the recent advancement of high-throughput sequencing technologies makes it possible to conduct rare variant analysis. However, little progress has been made in gene-gene interaction analysis for rare variants. Here, we propose GxGrare which is a new gene-gene interaction method for the rare variants in the framework of the multifactor dimensionality reduction (MDR) analysis. The proposed method consists of three steps; 1) collapsing the rare variants, 2) MDR analysis for the collapsed rare variants, and 3) detect top candidate interaction pairs. GxGrare can be used for the detection of not only gene-gene interactions, but also interactions within a single gene. The proposed method is illustrated with 1080 whole exome sequencing data of the Korean population in order to identify causal gene-gene interaction for rare variants for type 2 diabetes. The proposed GxGrare performs well for gene-gene interaction detection with collapsing of rare variants. GxGrare is available at http://bibs.snu.ac.kr/software/gxgrare which contains simulation data and documentation. Supported operating systems include Linux and OS X.

  18. HTSSIP: An R package for analysis of high throughput sequencing data from nucleic acid stable isotope probing (SIP experiments.

    Directory of Open Access Journals (Sweden)

    Nicholas D Youngblut

    Full Text Available Combining high throughput sequencing with stable isotope probing (HTS-SIP is a powerful method for mapping in situ metabolic processes to thousands of microbial taxa. However, accurately mapping metabolic processes to taxa is complex and challenging. Multiple HTS-SIP data analysis methods have been developed, including high-resolution stable isotope probing (HR-SIP, multi-window high-resolution stable isotope probing (MW-HR-SIP, quantitative stable isotope probing (qSIP, and ΔBD. Currently, there is no publicly available software designed specifically for analyzing HTS-SIP data. To address this shortfall, we have developed the HTSSIP R package, an open-source, cross-platform toolset for conducting HTS-SIP analyses in a straightforward and easily reproducible manner. The HTSSIP package, along with full documentation and examples, is available from CRAN at https://cran.r-project.org/web/packages/HTSSIP/index.html and Github at https://github.com/buckleylab/HTSSIP.

  19. Metabolomic and high-throughput sequencing analysis – modern approach for the assessment of biodeterioration of materials from historic buildings

    Directory of Open Access Journals (Sweden)

    Beata eGutarowska

    2015-09-01

    Full Text Available Preservation of cultural heritage is of paramount importance worldwide. Microbial colonization of construction materials, such as wood, brick, mortar and stone in historic buildings can lead to severe deterioration. The aim of the present study was to give modern insight into the phylogenetic diversity and activated metabolic pathways of microbial communities colonized historic objects located in the former Auschwitz II-Birkenau concentration and extermination camp in Oświęcim, Poland. For this purpose we combined molecular, microscopic and chemical methods. Selected specimens were examined using Field Emission Scanning Electron Microscopy (FESEM, metabolomic analysis and high-throughput Illumina sequencing. FESEM imaging revealed the presence of complex microbial communities comprising diatoms, fungi and bacteria, mainly cyanobacteria and actinobacteria, on sample surfaces. Microbial diversity of brick specimens appeared higher than that of the wood and was dominated by algae and cyanobacteria, while wood was mainly colonized by fungi. DNA sequences documented the presence of 15 bacterial phyla representing 99 genera including Halomonas, Halorhodospira, Salinisphaera, Salinibacterium, Rubrobacter, Streptomyces, Arthrobacter and 9 fungal classes represented by 113 genera including Cladosporium, Acremonium, Alternaria, Engyodontium, Penicillium, Rhizopus and Aureobasidium. Most of the identified sequences were characteristic of organisms implicated in deterioration of wood and brick. Metabolomic data indicated the activation of numerous metabolic pathways, including those regulating the production of primary and secondary metabolites, for example, metabolites associated with the production of antibiotics, organic acids and deterioration of organic compounds. The study demonstrated that a combination of electron microscopy imaging with metabolomic and genomic techniques allows to link the phylogenetic information and metabolic profiles of

  20. Metabolomic and high-throughput sequencing analysis-modern approach for the assessment of biodeterioration of materials from historic buildings.

    Science.gov (United States)

    Gutarowska, Beata; Celikkol-Aydin, Sukriye; Bonifay, Vincent; Otlewska, Anna; Aydin, Egemen; Oldham, Athenia L; Brauer, Jonathan I; Duncan, Kathleen E; Adamiak, Justyna; Sunner, Jan A; Beech, Iwona B

    2015-01-01

    Preservation of cultural heritage is of paramount importance worldwide. Microbial colonization of construction materials, such as wood, brick, mortar, and stone in historic buildings can lead to severe deterioration. The aim of the present study was to give modern insight into the phylogenetic diversity and activated metabolic pathways of microbial communities colonized historic objects located in the former Auschwitz II-Birkenau concentration and extermination camp in Oświecim, Poland. For this purpose we combined molecular, microscopic and chemical methods. Selected specimens were examined using Field Emission Scanning Electron Microscopy (FESEM), metabolomic analysis and high-throughput Illumina sequencing. FESEM imaging revealed the presence of complex microbial communities comprising diatoms, fungi and bacteria, mainly cyanobacteria and actinobacteria, on sample surfaces. Microbial diversity of brick specimens appeared higher than that of the wood and was dominated by algae and cyanobacteria, while wood was mainly colonized by fungi. DNA sequences documented the presence of 15 bacterial phyla representing 99 genera including Halomonas, Halorhodospira, Salinisphaera, Salinibacterium, Rubrobacter, Streptomyces, Arthrobacter and nine fungal classes represented by 113 genera including Cladosporium, Acremonium, Alternaria, Engyodontium, Penicillium, Rhizopus, and Aureobasidium. Most of the identified sequences were characteristic of organisms implicated in deterioration of wood and brick. Metabolomic data indicated the activation of numerous metabolic pathways, including those regulating the production of primary and secondary metabolites, for example, metabolites associated with the production of antibiotics, organic acids and deterioration of organic compounds. The study demonstrated that a combination of electron microscopy imaging with metabolomic and genomic techniques allows to link the phylogenetic information and metabolic profiles of microbial communities

  1. High throughput 16S rRNA gene amplicon sequencing

    DEFF Research Database (Denmark)

    Nierychlo, Marta; Larsen, Poul; Jørgensen, Mads Koustrup

    S rRNA gene amplicon sequencing has been developed over the past few years and is now ready to use for more comprehensive studies related to plant operation and optimization thanks to short analysis time, low cost, high throughput, and high taxonomic resolution. In this study we show how 16S r......RNA gene amplicon sequencing can be used to reveal factors of importance for the operation of full-scale nutrient removal plants related to settling problems and floc properties. Using optimized DNA extraction protocols, indexed primers and our in-house Illumina platform, we prepared multiple samples...... be correlated to the presence of the species that are regarded as “strong” and “weak” floc formers. In conclusion, 16S rRNA gene amplicon sequencing provides a high throughput approach for a rapid and cheap community profiling of activated sludge that in combination with multivariate statistics can be used...

  2. Genetic analysis and gene mapping of a low stigma exposed mutant gene by high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Xiao Ma

    Full Text Available Rice is one of the main food crops and several studies have examined the molecular mechanism of the exposure of the rice plant stigma. The improvement in the exposure of the stigma in female parent hybrid combinations can enhance the efficiency of hybrid breeding. In the present study, a mutant plant with low exposed stigma (lesr was discovered among the descendants of the indica thermo-sensitive sterile line 115S. The ES% rate of the mutant decreased by 70.64% compared with the wild type variety. The F2 population was established by genetic analysis considering the mutant as the female parent and the restorer line 93S as the male parent. The results indicated a normal F1 population, while a clear division was noted for the high and low exposed stigma groups, respectively. This process was possible only by a ES of 25% in the F2 population. This was in agreement with the ratio of 3:1, which indicated that the mutant was controlled by a recessive main-effect QTL locus, temporarily named as LESR. Genome-wide comparison of the SNP profiles between the early, high and low production bulks were constructed from F2 plants using bulked segregant analysis in combination with high-throughput sequencing technology. The results demonstrated that the candidate loci was located on the chromosome 10 of the rice. Following screening of the recombinant rice plants with newly developed molecular markers, the genetic region was narrowed down to 0.25 Mb. This region was flanked by InDel-2 and InDel-2 at the physical location from 13.69 to 13.94 Mb. Within this region, 7 genes indicated base differences between parents. A total of 2 genes exhibited differences at the coding region and upstream of the coding region, respectively. The present study aimed to further clone the LESR gene, verify its function and identify the stigma variation.

  3. eRNA: a graphic user interface-based tool optimized for large data analysis from high-throughput RNA sequencing.

    Science.gov (United States)

    Yuan, Tiezheng; Huang, Xiaoyi; Dittmar, Rachel L; Du, Meijun; Kohli, Manish; Boardman, Lisa; Thibodeau, Stephen N; Wang, Liang

    2014-03-05

    RNA sequencing (RNA-seq) is emerging as a critical approach in biological research. However, its high-throughput advantage is significantly limited by the capacity of bioinformatics tools. The research community urgently needs user-friendly tools to efficiently analyze the complicated data generated by high throughput sequencers. We developed a standalone tool with graphic user interface (GUI)-based analytic modules, known as eRNA. The capacity of performing parallel processing and sample management facilitates large data analyses by maximizing hardware usage and freeing users from tediously handling sequencing data. The module miRNA identification" includes GUIs for raw data reading, adapter removal, sequence alignment, and read counting. The module "mRNA identification" includes GUIs for reference sequences, genome mapping, transcript assembling, and differential expression. The module "Target screening" provides expression profiling analyses and graphic visualization. The module "Self-testing" offers the directory setups, sample management, and a check for third-party package dependency. Integration of other GUIs including Bowtie, miRDeep2, and miRspring extend the program's functionality. eRNA focuses on the common tools required for the mapping and quantification analysis of miRNA-seq and mRNA-seq data. The software package provides an additional choice for scientists who require a user-friendly computing environment and high-throughput capacity for large data analysis. eRNA is available for free download at https://sourceforge.net/projects/erna/?source=directory.

  4. Metagenomic analysis and functional characterization of the biogas microbiome using high throughput shotgun sequencing and a novel binning strategy

    DEFF Research Database (Denmark)

    Campanaro, Stefano; Treu, Laura; Kougias, Panagiotis

    2016-01-01

    Biogas production is an economically attractive technology that has gained momentum worldwide over the past years. Biogas is produced by a biologically mediated process, widely known as "anaerobic digestion." This process is performed by a specialized and complex microbial community, in which...... performed using >400 proteins revealed that the biogas community is a trove of new species. A new approach based on functional properties as per network representation was developed to assign roles to the microbial species. The organization of the anaerobic digestion microbiome is resembled by a funnel...... on the phylogenetic and functional characterization of the microbial community populating biogas reactors. By applying for the first time high-throughput sequencing and a novel binning strategy, the identified genes were anchored to single genomes providing a clear understanding of their metabolic pathways...

  5. Applications of High Throughput Nucleotide Sequencing

    DEFF Research Database (Denmark)

    Waage, Johannes Eichler

    equally large demands in data handling, analysis and interpretation, perhaps defining the modern challenge of the computational biologist of the post-genomic era. The first part of this thesis consists of a general introduction to the history, common terms and challenges of next generation sequencing......-sequencing, a study of the effects on alternative RNA splicing of KO of the nonsense mediated RNA decay system in Mus, using digital gene expression and a custom-built exon-exon junction mapping pipeline is presented (article I). Evolved from this work, a Bioconductor package, spliceR, for classifying alternative...

  6. miRanalyzer: an update on the detection and analysis of microRNAs in high-throughput sequencing experiments

    Science.gov (United States)

    Hackenberg, Michael; Rodríguez-Ezpeleta, Naiara; Aransay, Ana M.

    2011-01-01

    We present a new version of miRanalyzer, a web server and stand-alone tool for the detection of known and prediction of new microRNAs in high-throughput sequencing experiments. The new version has been notably improved regarding speed, scope and available features. Alignments are now based on the ultrafast short-read aligner Bowtie (granting also colour space support, allowing mismatches and improving speed) and 31 genomes, including 6 plant genomes, can now be analysed (previous version contained only 7). Differences between plant and animal microRNAs have been taken into account for the prediction models and differential expression of both, known and predicted microRNAs, between two conditions can be calculated. Additionally, consensus sequences of predicted mature and precursor microRNAs can be obtained from multiple samples, which increases the reliability of the predicted microRNAs. Finally, a stand-alone version of the miRanalyzer that is based on a local and easily customized database is also available; this allows the user to have more control on certain parameters as well as to use specific data such as unpublished assemblies or other libraries that are not available in the web server. miRanalyzer is available at http://bioinfo2.ugr.es/miRanalyzer/miRanalyzer.php. PMID:21515631

  7. Temporal dynamics of soil microbial communities under different moisture regimes: high-throughput sequencing and bioinformatics analysis

    Science.gov (United States)

    Semenov, Mikhail; Zhuravleva, Anna; Semenov, Vyacheslav; Yevdokimov, Ilya; Larionova, Alla

    2017-04-01

    Recent climate scenarios predict not only continued global warming but also an increased frequency and intensity of extreme climatic events such as strong changes in temperature and precipitation regimes. Microorganisms are well known to be more sensitive to changes in environmental conditions than to other soil chemical and physical parameters. In this study, we determined the shifts in soil microbial community structure as well as indicative taxa in soils under three moisture regimes using high-throughput Illumina sequencing and range of bioinformatics approaches for the assessment of sequence data. Incubation experiments were performed in soil-filled (Greyic Phaeozems Albic) rhizoboxes with maize and without plants. Three contrasting moisture regimes were being simulated: 1) optimal wetting (OW), a watering 2-3 times per week to maintain soil moisture of 20-25% by weight; 2) periodic wetting (PW), with alternating periods of wetting and drought; and 3) constant insufficient wetting (IW), while soil moisture of 12% by weight was permanently maintained. Sampled fresh soils were homogenized, and the total DNA of three replicates was extracted using the FastDNA® SPIN kit for Soil. DNA replicates were combined in a pooled sample and the DNA was used for PCR with specific primers for the 16S V3 and V4 regions. In order to compare variability between different samples and replicates within a single sample, some DNA replicates treated separately. The products were purified and submitted to Illumina MiSeq sequencing. Sequence data were evaluated by alpha-diversity (Chao1 and Shannon H' diversity indexes), beta-diversity (UniFrac and Bray-Curtis dissimilarity), heatmap, tagcloud, and plot-bar analyses using the MiSeq Reporter Metagenomics Workflow and R packages (phyloseq, vegan, tagcloud). Shannon index varied in a rather narrow range (4.4-4.9) with the lowest values for microbial communities under PW treatment. Chao1 index varied from 385 to 480, being a more flexible

  8. Management of High-Throughput DNA Sequencing Projects: Alpheus.

    Science.gov (United States)

    Miller, Neil A; Kingsmore, Stephen F; Farmer, Andrew; Langley, Raymond J; Mudge, Joann; Crow, John A; Gonzalez, Alvaro J; Schilkey, Faye D; Kim, Ryan J; van Velkinburgh, Jennifer; May, Gregory D; Black, C Forrest; Myers, M Kathy; Utsey, John P; Frost, Nicholas S; Sugarbaker, David J; Bueno, Raphael; Gullans, Stephen R; Baxter, Susan M; Day, Steve W; Retzel, Ernest F

    2008-12-26

    High-throughput DNA sequencing has enabled systems biology to begin to address areas in health, agricultural and basic biological research. Concomitant with the opportunities is an absolute necessity to manage significant volumes of high-dimensional and inter-related data and analysis. Alpheus is an analysis pipeline, database and visualization software for use with massively parallel DNA sequencing technologies that feature multi-gigabase throughput characterized by relatively short reads, such as Illumina-Solexa (sequencing-by-synthesis), Roche-454 (pyrosequencing) and Applied Biosystem's SOLiD (sequencing-by-ligation). Alpheus enables alignment to reference sequence(s), detection of variants and enumeration of sequence abundance, including expression levels in transcriptome sequence. Alpheus is able to detect several types of variants, including non-synonymous and synonymous single nucleotide polymorphisms (SNPs), insertions/deletions (indels), premature stop codons, and splice isoforms. Variant detection is aided by the ability to filter variant calls based on consistency, expected allele frequency, sequence quality, coverage, and variant type in order to minimize false positives while maximizing the identification of true positives. Alpheus also enables comparisons of genes with variants between cases and controls or bulk segregant pools. Sequence-based differential expression comparisons can be developed, with data export to SAS JMP Genomics for statistical analysis.

  9. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)-A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes.

    Science.gov (United States)

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare . However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop

  10. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

    Directory of Open Access Journals (Sweden)

    Karolina Chwialkowska

    2017-11-01

    Full Text Available Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq. We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation

  11. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

    Science.gov (United States)

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop

  12. Leveraging the Power of High Performance Computing for Next Generation Sequencing Data Analysis: Tricks and Twists from a High Throughput Exome Workflow

    Science.gov (United States)

    Wonczak, Stephan; Thiele, Holger; Nieroda, Lech; Jabbari, Kamel; Borowski, Stefan; Sinha, Vishal; Gunia, Wilfried; Lang, Ulrich; Achter, Viktor; Nürnberg, Peter

    2015-01-01

    Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. In order to run these analyses fast, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files. PMID:25942438

  13. Application of high-throughput sequencing in understanding human oral microbiome related with health and disease

    OpenAIRE

    Chen, Hui; Jiang, Wen

    2014-01-01

    The oral microbiome is one of most diversity habitat in the human body and they are closely related with oral health and disease. As the technique developing,, high throughput sequencing has become a popular approach applied for oral microbial analysis. Oral bacterial profiles have been studied to explore the relationship between microbial diversity and oral diseases such as caries and periodontal disease. This review describes the application of high-throughput sequencing for characterizati...

  14. High-Throughput Block Optical DNA Sequence Identification.

    Science.gov (United States)

    Sagar, Dodderi Manjunatha; Korshoj, Lee Erik; Hanson, Katrina Bethany; Chowdhury, Partha Pratim; Otoupal, Peter Britton; Chatterjee, Anushree; Nagpal, Prashant

    2018-01-01

    Optical techniques for molecular diagnostics or DNA sequencing generally rely on small molecule fluorescent labels, which utilize light with a wavelength of several hundred nanometers for detection. Developing a label-free optical DNA sequencing technique will require nanoscale focusing of light, a high-throughput and multiplexed identification method, and a data compression technique to rapidly identify sequences and analyze genomic heterogeneity for big datasets. Such a method should identify characteristic molecular vibrations using optical spectroscopy, especially in the "fingerprinting region" from ≈400-1400 cm -1 . Here, surface-enhanced Raman spectroscopy is used to demonstrate label-free identification of DNA nucleobases with multiplexed 3D plasmonic nanofocusing. While nanometer-scale mode volumes prevent identification of single nucleobases within a DNA sequence, the block optical technique can identify A, T, G, and C content in DNA k-mers. The content of each nucleotide in a DNA block can be a unique and high-throughput method for identifying sequences, genes, and other biomarkers as an alternative to single-letter sequencing. Additionally, coupling two complementary vibrational spectroscopy techniques (infrared and Raman) can improve block characterization. These results pave the way for developing a novel, high-throughput block optical sequencing method with lossy genomic data compression using k-mer identification from multiplexed optical data acquisition. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  15. Uncovering leaf rust responsive miRNAs in wheat (Triticum aestivum L.) using high-throughput sequencing and prediction of their targets through degradome analysis.

    Science.gov (United States)

    Kumar, Dhananjay; Dutta, Summi; Singh, Dharmendra; Prabhu, Kumble Vinod; Kumar, Manish; Mukhopadhyay, Kunal

    2017-01-01

    Deep sequencing identified 497 conserved and 559 novel miRNAs in wheat, while degradome analysis revealed 701 targets genes. QRT-PCR demonstrated differential expression of miRNAs during stages of leaf rust progression. Bread wheat (Triticum aestivum L.) is an important cereal food crop feeding 30 % of the world population. Major threat to wheat production is the rust epidemics. This study was targeted towards identification and functional characterizations of micro(mi)RNAs and their target genes in wheat in response to leaf rust ingression. High-throughput sequencing was used for transcriptome-wide identification of miRNAs and their expression profiling in retort to leaf rust using mock and pathogen-inoculated resistant and susceptible near-isogenic wheat plants. A total of 1056 mature miRNAs were identified, of which 497 miRNAs were conserved and 559 miRNAs were novel. The pathogen-inoculated resistant plants manifested more miRNAs compared with the pathogen infected susceptible plants. The miRNA counts increased in susceptible isoline due to leaf rust, conversely, the counts decreased in the resistant isoline in response to pathogenesis illustrating precise spatial tuning of miRNAs during compatible and incompatible interaction. Stem-loop quantitative real-time PCR was used to profile 10 highly differentially expressed miRNAs obtained from high-throughput sequencing data. The spatio-temporal profiling validated the differential expression of miRNAs between the isolines as well as in retort to pathogen infection. Degradome analysis provided 701 predicted target genes associated with defense response, signal transduction, development, metabolism, and transcriptional regulation. The obtained results indicate that wheat isolines employ diverse arrays of miRNAs that modulate their target genes during compatible and incompatible interaction. Our findings contribute to increase knowledge on roles of microRNA in wheat-leaf rust interactions and could help in rust

  16. Network analysis of the microorganism in 25 Danish wastewater treatment plants over 7 years using high-throughput amplicon sequencing

    DEFF Research Database (Denmark)

    Albertsen, Mads; Larsen, Poul; Saunders, Aaron Marc

    to link sludge and floc properties to the microbial communities. All data was subjected to extensive network analysis and multivariate statistics through R. The 16S amplicon results confirmed the findings of relatively few core groups of organism shared by all the wastewater treatment plants......Wastewater treatment is the world’s largest biotechnological processes and a perfect model system for microbial ecology as the habitat is well defined and replicated all over the world. Extensive investigations on Danish wastewater treatment plants using fluorescent in situ hybridization have...... a year, totaling over 400 samples. All samples were subjected to 16S rDNA amplicon sequencing using V13 primers on the Illumina MiSeq platform (2x300bp) to a depth of at least 20.000 quality filtered reads per sample. The OTUs were assigned taxonomy based on a manually curated version of the greengenes...

  17. Comparative analysis of transcriptomes in aerial stems and roots of Ephedra sinica based on high-throughput mRNA sequencing

    Directory of Open Access Journals (Sweden)

    Taketo Okada

    2016-12-01

    Full Text Available Ephedra plants are taxonomically classified as gymnosperms, and are medicinally important as the botanical origin of crude drugs and as bioresources that contain pharmacologically active chemicals. Here we show a comparative analysis of the transcriptomes of aerial stems and roots of Ephedra sinica based on high-throughput mRNA sequencing by RNA-Seq. De novo assembly of short cDNA sequence reads generated 23,358, 13,373, and 28,579 contigs longer than 200 bases from aerial stems, roots, or both aerial stems and roots, respectively. The presumed functions encoded by these contig sequences were annotated by BLAST (blastx. Subsequently, these contigs were classified based on gene ontology slims, Enzyme Commission numbers, and the InterPro database. Furthermore, comparative gene expression analysis was performed between aerial stems and roots. These transcriptome analyses revealed differences and similarities between the transcriptomes of aerial stems and roots in E. sinica. Deep transcriptome sequencing of Ephedra should open the door to molecular biological studies based on the entire transcriptome, tissue- or organ-specific transcriptomes, or targeted genes of interest.

  18. Application of high-throughput DNA sequencing in phytopathology.

    Science.gov (United States)

    Studholme, David J; Glover, Rachel H; Boonham, Neil

    2011-01-01

    The new sequencing technologies are already making a big impact in academic research on medically important microbes and may soon revolutionize diagnostics, epidemiology, and infection control. Plant pathology also stands to gain from exploiting these opportunities. This manuscript reviews some applications of these high-throughput sequencing methods that are relevant to phytopathology, with emphasis on the associated computational and bioinformatics challenges and their solutions. Second-generation sequencing technologies have recently been exploited in genomics of both prokaryotic and eukaryotic plant pathogens. They are also proving to be useful in diagnostics, especially with respect to viruses. Copyright © 2011 by Annual Reviews. All rights reserved.

  19. Quack: A quality assurance tool for high throughput sequence data.

    Science.gov (United States)

    Thrash, Adam; Arick, Mark; Peterson, Daniel G

    2018-05-01

    The quality of data generated by high-throughput DNA sequencing tools must be rapidly assessed in order to determine how useful the data may be in making biological discoveries; higher quality data leads to more confident results and conclusions. Due to the ever-increasing size of data sets and the importance of rapid quality assessment, tools that analyze sequencing data should quickly produce easily interpretable graphics. Quack addresses these issues by generating information-dense visualizations from FASTQ files at a speed far surpassing other publicly available quality assurance tools in a manner independent of sequencing technology. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  20. High-Throughput Analysis of Enzyme Activities

    Energy Technology Data Exchange (ETDEWEB)

    Lu, Guoxin [Iowa State Univ., Ames, IA (United States)

    2007-01-01

    High-throughput screening (HTS) techniques have been applied to many research fields nowadays. Robot microarray printing technique and automation microtiter handling technique allows HTS performing in both heterogeneous and homogeneous formats, with minimal sample required for each assay element. In this dissertation, new HTS techniques for enzyme activity analysis were developed. First, patterns of immobilized enzyme on nylon screen were detected by multiplexed capillary system. The imaging resolution is limited by the outer diameter of the capillaries. In order to get finer images, capillaries with smaller outer diameters can be used to form the imaging probe. Application of capillary electrophoresis allows separation of the product from the substrate in the reaction mixture, so that the product doesn't have to have different optical properties with the substrate. UV absorption detection allows almost universal detection for organic molecules. Thus, no modifications of either the substrate or the product molecules are necessary. This technique has the potential to be used in screening of local distribution variations of specific bio-molecules in a tissue or in screening of multiple immobilized catalysts. Another high-throughput screening technique is developed by directly monitoring the light intensity of the immobilized-catalyst surface using a scientific charge-coupled device (CCD). Briefly, the surface of enzyme microarray is focused onto a scientific CCD using an objective lens. By carefully choosing the detection wavelength, generation of product on an enzyme spot can be seen by the CCD. Analyzing the light intensity change over time on an enzyme spot can give information of reaction rate. The same microarray can be used for many times. Thus, high-throughput kinetic studies of hundreds of catalytic reactions are made possible. At last, we studied the fluorescence emission spectra of ADP and obtained the detection limits for ADP under three different

  1. Integrated analysis of RNA-binding protein complexes using in vitro selection and high-throughput sequencing and sequence specificity landscapes (SEQRS).

    Science.gov (United States)

    Lou, Tzu-Fang; Weidmann, Chase A; Killingsworth, Jordan; Tanaka Hall, Traci M; Goldstrohm, Aaron C; Campbell, Zachary T

    2017-04-15

    RNA-binding proteins (RBPs) collaborate to control virtually every aspect of RNA function. Tremendous progress has been made in the area of global assessment of RBP specificity using next-generation sequencing approaches both in vivo and in vitro. Understanding how protein-protein interactions enable precise combinatorial regulation of RNA remains a significant problem. Addressing this challenge requires tools that can quantitatively determine the specificities of both individual proteins and multimeric complexes in an unbiased and comprehensive way. One approach utilizes in vitro selection, high-throughput sequencing, and sequence-specificity landscapes (SEQRS). We outline a SEQRS experiment focused on obtaining the specificity of a multi-protein complex between Drosophila RBPs Pumilio (Pum) and Nanos (Nos). We discuss the necessary controls in this type of experiment and examine how the resulting data can be complemented with structural and cell-based reporter assays. Additionally, SEQRS data can be integrated with functional genomics data to uncover biological function. Finally, we propose extensions of the technique that will enhance our understanding of multi-protein regulatory complexes assembled onto RNA. Copyright © 2016 Elsevier Inc. All rights reserved.

  2. High-Throughput Analysis With 96-Capillary Array Electrophoresis and Integrated Sample Preparation for DNA Sequencing Based on Laser Induced Fluorescence Detection

    Energy Technology Data Exchange (ETDEWEB)

    Xue, Gang [Iowa State Univ., Ames, IA (United States)

    2001-01-01

    The purpose of this research was to improve the fluorescence detection for the multiplexed capillary array electrophoresis, extend its use beyond the genomic analysis, and to develop an integrated micro-sample preparation system for high-throughput DNA sequencing. The authors first demonstrated multiplexed capillary zone electrophoresis (CZE) and micellar electrokinetic chromatography (MEKC) separations in a 96-capillary array system with laser-induced fluorescence detection. Migration times of four kinds of fluoresceins and six polyaromatic hydrocarbons (PAHs) are normalized to one of the capillaries using two internal standards. The relative standard deviations (RSD) after normalization are 0.6-1.4% for the fluoresceins and 0.1-1.5% for the PAHs. Quantitative calibration of the separations based on peak areas is also performed, again with substantial improvement over the raw data. This opens up the possibility of performing massively parallel separations for high-throughput chemical analysis for process monitoring, combinatorial synthesis, and clinical diagnosis. The authors further improved the fluorescence detection by step laser scanning. A computer-controlled galvanometer scanner is adapted for scanning a focused laser beam across a 96-capillary array for laser-induced fluorescence detection. The signal at a single photomultiplier tube is temporally sorted to distinguish among the capillaries. The limit of detection for fluorescein is 3 x 10-11 M (S/N = 3) for 5-mW of total laser power scanned at 4 Hz. The observed cross-talk among capillaries is 0.2%. Advantages include the efficient utilization of light due to the high duty-cycle of step scan, good detection performance due to the reduction of stray light, ruggedness due to the small mass of the galvanometer mirror, low cost due to the simplicity of components, and flexibility due to the independent paths for excitation and emission.

  3. Transcriptome-Wide Analysis of Botrytis elliptica Responsive microRNAs and Their Targets in Lilium Regale Wilson by High-Throughput Sequencing and Degradome Analysis

    Directory of Open Access Journals (Sweden)

    Xue Gao

    2017-05-01

    Full Text Available MicroRNAs, as master regulators of gene expression, have been widely identified and play crucial roles in plant-pathogen interactions. A fatal pathogen, Botrytis elliptica, causes the serious folia disease of lily, which reduces production because of the high susceptibility of most cultivated species. However, the miRNAs related to Botrytis infection of lily, and the miRNA-mediated gene regulatory networks providing resistance to B. elliptica in lily remain largely unexplored. To systematically dissect B. elliptica-responsive miRNAs and their target genes, three small RNA libraries were constructed from the leaves of Lilium regale, a promising Chinese wild Lilium species, which had been subjected to mock B. elliptica treatment or B. elliptica infection for 6 and 24 h. By high-throughput sequencing, 71 known miRNAs belonging to 47 conserved families and 24 novel miRNA were identified, of which 18 miRNAs were downreguleted and 13 were upregulated in response to B. elliptica. Moreover, based on the lily mRNA transcriptome, 22 targets for 9 known and 1 novel miRNAs were identified by the degradome sequencing approach. Most target genes for elliptica-responsive miRNAs were involved in metabolic processes, few encoding different transcription factors, including ELONGATION FACTOR 1 ALPHA (EF1a and TEOSINTE BRANCHED1/CYCLOIDEA/PROLIFERATING CELL FACTOR 2 (TCP2. Furthermore, the expression patterns of a set of elliptica-responsive miRNAs and their targets were validated by quantitative real-time PCR. This study represents the first transcriptome-based analysis of miRNAs responsive to B. elliptica and their targets in lily. The results reveal the possible regulatory roles of miRNAs and their targets in B. elliptica interaction, which will extend our understanding of the mechanisms of this disease in lily.

  4. BOOGIE: Predicting Blood Groups from High Throughput Sequencing Data.

    Science.gov (United States)

    Giollo, Manuel; Minervini, Giovanni; Scalzotto, Marta; Leonardi, Emanuela; Ferrari, Carlo; Tosatto, Silvio C E

    2015-01-01

    Over the last decade, we have witnessed an incredible growth in the amount of available genotype data due to high throughput sequencing (HTS) techniques. This information may be used to predict phenotypes of medical relevance, and pave the way towards personalized medicine. Blood phenotypes (e.g. ABO and Rh) are a purely genetic trait that has been extensively studied for decades, with currently over thirty known blood groups. Given the public availability of blood group data, it is of interest to predict these phenotypes from HTS data which may translate into more accurate blood typing in clinical practice. Here we propose BOOGIE, a fast predictor for the inference of blood groups from single nucleotide variant (SNV) databases. We focus on the prediction of thirty blood groups ranging from the well known ABO and Rh, to the less studied Junior or Diego. BOOGIE correctly predicted the blood group with 94% accuracy for the Personal Genome Project whole genome profiles where good quality SNV annotation was available. Additionally, our tool produces a high quality haplotype phase, which is of interest in the context of ethnicity-specific polymorphisms or traits. The versatility and simplicity of the analysis make it easily interpretable and allow easy extension of the protocol towards other phenotypes. BOOGIE can be downloaded from URL http://protein.bio.unipd.it/download/.

  5. High-throughput sequencing and pathway analysis reveal alteration of the pituitary transcriptome by 17α-ethynylestradiol (EE2) in female coho salmon, Oncorhynchus kisutch

    Energy Technology Data Exchange (ETDEWEB)

    Harding, Louisa B. [School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA 98195 (United States); Schultz, Irvin R. [Battelle, Marine Sciences Laboratory – Pacific Northwest National Laboratory, 1529 West Sequim Bay Road, Sequim, WA 98382 (United States); Goetz, Giles W. [School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA 98195 (United States); Luckenbach, J. Adam [Northwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, 2725 Montlake Blvd E, Seattle, WA 98112 (United States); Center for Reproductive Biology, Washington State University, Pullman, WA 98164 (United States); Young, Graham [School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA 98195 (United States); Center for Reproductive Biology, Washington State University, Pullman, WA 98164 (United States); Goetz, Frederick W. [Northwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Manchester Research Station, P.O. Box 130, Manchester, WA 98353 (United States); Swanson, Penny, E-mail: penny.swanson@noaa.gov [Northwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, 2725 Montlake Blvd E, Seattle, WA 98112 (United States); Center for Reproductive Biology, Washington State University, Pullman, WA 98164 (United States)

    2013-10-15

    Highlights: •Studied impacts of ethynylestradiol (EE2) exposure on salmon pituitary transcriptome. •High-throughput sequencing, RNAseq, and pathway analysis were performed. •EE2 altered mRNAs for genes in circadian rhythm, GnRH, and TGFβ signaling pathways. •LH and FSH beta subunit mRNAs were most highly up- and down-regulated by EE2, respectively. •Estrogens may alter processes associated with reproductive timing in salmon. -- Abstract: Considerable research has been done on the effects of endocrine disrupting chemicals (EDCs) on reproduction and gene expression in the brain, liver and gonads of teleost fish, but information on impacts to the pituitary gland are still limited despite its central role in regulating reproduction. The aim of this study was to further our understanding of the potential effects of natural and synthetic estrogens on the brain–pituitary–gonad axis in fish by determining the effects of 17α-ethynylestradiol (EE2) on the pituitary transcriptome. We exposed sub-adult coho salmon (Oncorhynchus kisutch) to 0 or 12 ng EE2/L for up to 6 weeks and effects on the pituitary transcriptome of females were assessed using high-throughput Illumina{sup ®} sequencing, RNA-Seq and pathway analysis. After 1 or 6 weeks, 218 and 670 contiguous sequences (contigs) respectively, were differentially expressed in pituitaries of EE2-exposed fish relative to control. Two of the most highly up- and down-regulated contigs were luteinizing hormone β subunit (241-fold and 395-fold at 1 and 6 weeks, respectively) and follicle-stimulating hormone β subunit (−3.4-fold at 6 weeks). Additional contigs related to gonadotropin synthesis and release were differentially expressed in EE2-exposed fish relative to controls. These included contigs involved in gonadotropin releasing hormone (GNRH) and transforming growth factor-β signaling. There was an over-representation of significantly affected contigs in 33 and 18 canonical pathways at 1 and 6 weeks

  6. High-throughput sequencing and pathway analysis reveal alteration of the pituitary transcriptome by 17α-ethynylestradiol (EE2) in female coho salmon, Oncorhynchus kisutch

    International Nuclear Information System (INIS)

    Harding, Louisa B.; Schultz, Irvin R.; Goetz, Giles W.; Luckenbach, J. Adam; Young, Graham; Goetz, Frederick W.; Swanson, Penny

    2013-01-01

    Highlights: •Studied impacts of ethynylestradiol (EE2) exposure on salmon pituitary transcriptome. •High-throughput sequencing, RNAseq, and pathway analysis were performed. •EE2 altered mRNAs for genes in circadian rhythm, GnRH, and TGFβ signaling pathways. •LH and FSH beta subunit mRNAs were most highly up- and down-regulated by EE2, respectively. •Estrogens may alter processes associated with reproductive timing in salmon. -- Abstract: Considerable research has been done on the effects of endocrine disrupting chemicals (EDCs) on reproduction and gene expression in the brain, liver and gonads of teleost fish, but information on impacts to the pituitary gland are still limited despite its central role in regulating reproduction. The aim of this study was to further our understanding of the potential effects of natural and synthetic estrogens on the brain–pituitary–gonad axis in fish by determining the effects of 17α-ethynylestradiol (EE2) on the pituitary transcriptome. We exposed sub-adult coho salmon (Oncorhynchus kisutch) to 0 or 12 ng EE2/L for up to 6 weeks and effects on the pituitary transcriptome of females were assessed using high-throughput Illumina ® sequencing, RNA-Seq and pathway analysis. After 1 or 6 weeks, 218 and 670 contiguous sequences (contigs) respectively, were differentially expressed in pituitaries of EE2-exposed fish relative to control. Two of the most highly up- and down-regulated contigs were luteinizing hormone β subunit (241-fold and 395-fold at 1 and 6 weeks, respectively) and follicle-stimulating hormone β subunit (−3.4-fold at 6 weeks). Additional contigs related to gonadotropin synthesis and release were differentially expressed in EE2-exposed fish relative to controls. These included contigs involved in gonadotropin releasing hormone (GNRH) and transforming growth factor-β signaling. There was an over-representation of significantly affected contigs in 33 and 18 canonical pathways at 1 and 6 weeks

  7. Identification and characterization of microRNAs related to salt stress in broccoli, using high-throughput sequencing and bioinformatics analysis.

    Science.gov (United States)

    Tian, Yunhong; Tian, Yunming; Luo, Xiaojun; Zhou, Tao; Huang, Zuoping; Liu, Ying; Qiu, Yihan; Hou, Bing; Sun, Dan; Deng, Hongyu; Qian, Shen; Yao, Kaitai

    2014-09-03

    MicroRNAs (miRNAs) are a new class of endogenous regulators of a broad range of physiological processes, which act by regulating gene expression post-transcriptionally. The brassica vegetable, broccoli (Brassica oleracea var. italica), is very popular with a wide range of consumers, but environmental stresses such as salinity are a problem worldwide in restricting its growth and yield. Little is known about the role of miRNAs in the response of broccoli to salt stress. In this study, broccoli subjected to salt stress and broccoli grown under control conditions were analyzed by high-throughput sequencing. Differential miRNA expression was confirmed by real-time reverse transcription polymerase chain reaction (RT-PCR). The prediction of miRNA targets was undertaken using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology (KO) database and Gene Ontology (GO)-enrichment analyses. Two libraries of small (or short) RNAs (sRNAs) were constructed and sequenced by high-throughput Solexa sequencing. A total of 24,511,963 and 21,034,728 clean reads, representing 9,861,236 (40.23%) and 8,574,665 (40.76%) unique reads, were obtained for control and salt-stressed broccoli, respectively. Furthermore, 42 putative known and 39 putative candidate miRNAs that were differentially expressed between control and salt-stressed broccoli were revealed by their read counts and confirmed by the use of stem-loop real-time RT-PCR. Amongst these, the putative conserved miRNAs, miR393 and miR855, and two putative candidate miRNAs, miR3 and miR34, were the most strongly down-regulated when broccoli was salt-stressed, whereas the putative conserved miRNA, miR396a, and the putative candidate miRNA, miR37, were the most up-regulated. Finally, analysis of the predicted gene targets of miRNAs using the GO and KO databases indicated that a range of metabolic and other cellular functions known to be associated with salt stress were up-regulated in broccoli treated with salt. A comprehensive

  8. Systematic Analysis of the Association between Gut Flora and Obesity through High-Throughput Sequencing and Bioinformatics Approaches

    Directory of Open Access Journals (Sweden)

    Chih-Min Chiu

    2014-01-01

    Full Text Available Eighty-one stool samples from Taiwanese were collected for analysis of the association between the gut flora and obesity. The supervised analysis showed that the most, abundant genera of bacteria in normal samples (from people with a body mass index (BMI ≤ 24 were Bacteroides (27.7%, Prevotella (19.4%, Escherichia (12%, Phascolarctobacterium (3.9%, and Eubacterium (3.5%. The most abundant genera of bacteria in case samples (with a BMI ≥ 27 were Bacteroides (29%, Prevotella (21%, Escherichia (7.4%, Megamonas (5.1%, and Phascolarctobacterium (3.8%. A principal coordinate analysis (PCoA demonstrated that normal samples were clustered more compactly than case samples. An unsupervised analysis demonstrated that bacterial communities in the gut were clustered into two main groups: N-like and OB-like groups. Remarkably, most normal samples (78% were clustered in the N-like group, and most case samples (81% were clustered in the OB-like group (Fisher’s P  value=1.61E-07. The results showed that bacterial communities in the gut were highly associated with obesity. This is the first study in Taiwan to investigate the association between human gut flora and obesity, and the results provide new insights into the correlation of bacteria with the rising trend in obesity.

  9. Comprehensive processing of high-throughput small RNA sequencing data including quality checking, normalization, and differential expression analysis using the UEA sRNA Workbench.

    Science.gov (United States)

    Beckers, Matthew; Mohorianu, Irina; Stocks, Matthew; Applegate, Christopher; Dalmay, Tamas; Moulton, Vincent

    2017-06-01

    Recently, high-throughput sequencing (HTS) has revealed compelling details about the small RNA (sRNA) population in eukaryotes. These 20 to 25 nt noncoding RNAs can influence gene expression by acting as guides for the sequence-specific regulatory mechanism known as RNA silencing. The increase in sequencing depth and number of samples per project enables a better understanding of the role sRNAs play by facilitating the study of expression patterns. However, the intricacy of the biological hypotheses coupled with a lack of appropriate tools often leads to inadequate mining of the available data and thus, an incomplete description of the biological mechanisms involved. To enable a comprehensive study of differential expression in sRNA data sets, we present a new interactive pipeline that guides researchers through the various stages of data preprocessing and analysis. This includes various tools, some of which we specifically developed for sRNA analysis, for quality checking and normalization of sRNA samples as well as tools for the detection of differentially expressed sRNAs and identification of the resulting expression patterns. The pipeline is available within the UEA sRNA Workbench, a user-friendly software package for the processing of sRNA data sets. We demonstrate the use of the pipeline on a H. sapiens data set; additional examples on a B. terrestris data set and on an A. thaliana data set are described in the Supplemental Information A comparison with existing approaches is also included, which exemplifies some of the issues that need to be addressed for sRNA analysis and how the new pipeline may be used to do this. © 2017 Beckers et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  10. MetaGenSense: A web-application for analysis and exploration of high throughput sequencing metagenomic data [version 3; referees: 1 approved, 2 approved with reservations

    Directory of Open Access Journals (Sweden)

    Damien Correia

    2016-12-01

    Full Text Available The detection and characterization of emerging infectious agents has been a continuing public health concern. High Throughput Sequencing (HTS or Next-Generation Sequencing (NGS technologies have proven to be promising approaches for efficient and unbiased detection of pathogens in complex biological samples, providing access to comprehensive analyses. As NGS approaches typically yield millions of putatively representative reads per sample, efficient data management and visualization resources have become mandatory. Most usually, those resources are implemented through a dedicated Laboratory Information Management System (LIMS, solely to provide perspective regarding the available information. We developed an easily deployable web-interface, facilitating management and bioinformatics analysis of metagenomics data-samples. It was engineered to run associated and dedicated Galaxy workflows for the detection and eventually classification of pathogens. The web application allows easy interaction with existing Galaxy metagenomic workflows, facilitates the organization, exploration and aggregation of the most relevant sample-specific sequences among millions of genomic sequences, allowing them to determine their relative abundance, and associate them to the most closely related organism or pathogen. The user-friendly Django-Based interface, associates the users’ input data and its metadata through a bio-IT provided set of resources (a Galaxy instance, and both sufficient storage and grid computing power. Galaxy is used to handle and analyze the user’s input data from loading, indexing, mapping, assembly and DB-searches. Interaction between our application and Galaxy is ensured by the BioBlend library, which gives API-based access to Galaxy’s main features. Metadata about samples, runs, as well as the workflow results are stored in the LIMS. For metagenomic classification and exploration purposes, we show, as a proof of concept, that integration

  11. Isolation of Exosome-Like Nanoparticles and Analysis of MicroRNAs Derived from Coconut Water Based on Small RNA High-Throughput Sequencing.

    Science.gov (United States)

    Zhao, Zhehao; Yu, Siran; Li, Min; Gui, Xin; Li, Ping

    2018-03-21

    In this study, the presence of microRNAs in coconut water was identified by real-time polymerase chain reaction (PCR) based on the results of high-throughput small RNA sequencing. In addition, the differences in microRNA content between immature and mature coconut water were compared. A total of 47 known microRNAs belonging to 25 families and 14 new microRNAs were identified in coconut endosperm. Through analysis using a target gene prediction software, potential microRNA target genes were identified in the human genome. Real-time PCR showed that the level of most microRNAs was higher in mature coconut water than in immature coconut water. Then, exosome-like nanoparticles were isolated from coconut water. After ultracentrifugation, some particle structures were seen in coconut water samples using 1,1'-dioctadecyl-3,3,3',3'-tetramethylindocarbocyanine perchlorate fluorescence staining. Subsequent scanning electron microscopy observation and dynamic light scattering analysis also revealed some exosome-like nanoparticles in coconut water, and the mean diameters of the particles detected by the two methods were 13.16 and 59.72 nm, respectively. In conclusion, there are extracellular microRNAs in coconut water, and their levels are higher in mature coconut water than in immature coconut water. Some exosome-like nanoparticles were isolated from coconut water, and the diameter of these particles was smaller than that of animal-derived exosomes.

  12. Genome-wide identification and comparative analysis of grafting-responsive mRNA in watermelon grafted onto bottle gourd and squash rootstocks by high-throughput sequencing.

    Science.gov (United States)

    Liu, Na; Yang, Jinghua; Fu, Xinxing; Zhang, Li; Tang, Kai; Guy, Kateta Malangisha; Hu, Zhongyuan; Guo, Shaogui; Xu, Yong; Zhang, Mingfang

    2016-04-01

    Grafting is an important agricultural technique widely used to improve plant growth, yield, and adaptation to either biotic or abiotic stresses. However, the molecular mechanisms underlying grafting-induced physiological processes remain unclear. Watermelon (Citrullus lanatus L.) is an important horticultural crop worldwide. Grafting technique is commonly used in watermelon production for improving its tolerance to stresses, especially to the soil-borne fusarium wilt disease. In the present study, we used high-throughput sequencing to perform a genome-wide transcript analysis of scions from watermelon grafted onto bottle gourd and squash rootstocks. Our transcriptome and digital gene expression (DGE) profiling data provided insights into the molecular aspects of gene regulation in grafted watermelon. Compared with self-grafted watermelon, there were 787 and 3485 genes differentially expressed in watermelon grafted onto bottle gourd and squash rootstocks, respectively. These genes were associated with primary and secondary metabolism, hormone signaling, transcription factors, transporters, and response to stimuli. Grafting led to changes in expression of these genes, suggesting that they may play important roles in mediating the physiological processes of grafted seedlings. The potential roles of the grafting-responsive mRNAs in diverse biological and metabolic processes were discussed. Obviously, the data obtained in this study provide an excellent resource for unraveling the mechanisms of candidate genes function in diverse biological processes and in environmental adaptation in a graft system.

  13. High Throughput Analysis of Photocatalytic Water Purification

    NARCIS (Netherlands)

    Sobral Romao, J.I.; Baiao Barata, David; Habibovic, Pamela; Mul, Guido; Baltrusaitis, Jonas

    2014-01-01

    We present a novel high throughput photocatalyst efficiency assessment method based on 96-well microplates and UV-Vis spectroscopy. We demonstrate the reproducibility of the method using methyl orange (MO) decomposition, and compare kinetic data obtained with those provided in the literature for

  14. Probabilistic Methods for Processing High-Throughput Sequencing Signals

    DEFF Research Database (Denmark)

    Sørensen, Lasse Maretty

    High-throughput sequencing has the potential to answer many of the big questions in biology and medicine. It can be used to determine the ancestry of species, to chart complex ecosystems and to understand and diagnose disease. However, going from raw sequencing data to biological or medical insig....... By estimating the genotypes on a set of candidate variants obtained from both a standard mapping-based approach as well as de novo assemblies, we are able to find considerably more structural variation than previous studies...... for reconstructing transcript sequences from RNA sequencing data. The method is based on a novel sparse prior distribution over transcript abundances and is markedly more accurate than existing approaches. The second chapter describes a new method for calling genotypes from a fixed set of candidate variants....... The method queries the reads using a graph representation of the variants and hereby mitigates the reference-bias that characterise standard genotyping methods. In the last chapter, we apply this method to call the genotypes of 50 deeply sequencing parent-offspring trios from the GenomeDenmark project...

  15. High-Throughput Next-Generation Sequencing of Polioviruses

    Science.gov (United States)

    Montmayeur, Anna M.; Schmidt, Alexander; Zhao, Kun; Magaña, Laura; Iber, Jane; Castro, Christina J.; Chen, Qi; Henderson, Elizabeth; Ramos, Edward; Shaw, Jing; Tatusov, Roman L.; Dybdahl-Sissoko, Naomi; Endegue-Zanga, Marie Claire; Adeniji, Johnson A.; Oberste, M. Steven; Burns, Cara C.

    2016-01-01

    ABSTRACT The poliovirus (PV) is currently targeted for worldwide eradication and containment. Sanger-based sequencing of the viral protein 1 (VP1) capsid region is currently the standard method for PV surveillance. However, the whole-genome sequence is sometimes needed for higher resolution global surveillance. In this study, we optimized whole-genome sequencing protocols for poliovirus isolates and FTA cards using next-generation sequencing (NGS), aiming for high sequence coverage, efficiency, and throughput. We found that DNase treatment of poliovirus RNA followed by random reverse transcription (RT), amplification, and the use of the Nextera XT DNA library preparation kit produced significantly better results than other preparations. The average viral reads per total reads, a measurement of efficiency, was as high as 84.2% ± 15.6%. PV genomes covering >99 to 100% of the reference length were obtained and validated with Sanger sequencing. A total of 52 PV genomes were generated, multiplexing as many as 64 samples in a single Illumina MiSeq run. This high-throughput, sequence-independent NGS approach facilitated the detection of a diverse range of PVs, especially for those in vaccine-derived polioviruses (VDPV), circulating VDPV, or immunodeficiency-related VDPV. In contrast to results from previous studies on other viruses, our results showed that filtration and nuclease treatment did not discernibly increase the sequencing efficiency of PV isolates. However, DNase treatment after nucleic acid extraction to remove host DNA significantly improved the sequencing results. This NGS method has been successfully implemented to generate PV genomes for molecular epidemiology of the most recent PV isolates. Additionally, the ability to obtain full PV genomes from FTA cards will aid in facilitating global poliovirus surveillance. PMID:27927929

  16. Identification of miRNAs and their targets through high-throughput sequencing and degradome analysis in male and female Asparagus officinalis.

    Science.gov (United States)

    Chen, Jingli; Zheng, Yi; Qin, Li; Wang, Yan; Chen, Lifei; He, Yanjun; Fei, Zhangjun; Lu, Gang

    2016-04-12

    MicroRNAs (miRNAs), a class of non-coding small RNAs (sRNAs), regulate various biological processes. Although miRNAs have been identified and characterized in several plant species, miRNAs in Asparagus officinalis have not been reported. As a dioecious plant with homomorphic sex chromosomes, asparagus is regarded as an important model system for studying mechanisms of plant sex determination. Two independent sRNA libraries from male and female asparagus plants were sequenced with Illumina sequencing, thereby generating 4.13 and 5.88 million final clean reads, respectively. Both libraries predominantly contained 24-nt sRNAs, followed by 21-nt sRNAs. Further analysis identified 154 conserved miRNAs, which belong to 26 families, and 39 novel miRNA candidates seemed to be specific to asparagus. Comparative profiling revealed that 63 miRNAs exhibited significant differential expression between male and female plants, which was confirmed by real-time quantitative PCR analysis. Among them, 37 miRNAs were significantly up-regulated in the female library, whereas the others were preferentially expressed in the male library. Furthermore, 40 target mRNAs representing 44 conserved and seven novel miRNAs were identified in asparagus through high-throughput degradome sequencing. Functional annotation showed that these target mRNAs were involved in a wide range of developmental and metabolic processes. We identified a large set of conserved and specific miRNAs and compared their expression levels between male and female asparagus plants. Several asparagus miRNAs, which belong to the miR159, miR167, and miR172 families involved in reproductive organ development, were differentially expressed between male and female plants, as well as during flower development. Consistently, several predicted targets of asparagus miRNAs were associated with floral organ development. These findings suggest the potential roles of miRNAs in sex determination and reproductive developmental processes in

  17. A comprehensive analysis of in vitro and in vivo genetic fitness of Pseudomonas aeruginosa using high-throughput sequencing of transposon libraries.

    Directory of Open Access Journals (Sweden)

    David Skurnik

    Full Text Available High-throughput sequencing of transposon (Tn libraries created within entire genomes identifies and quantifies the contribution of individual genes and operons to the fitness of organisms in different environments. We used insertion-sequencing (INSeq to analyze the contribution to fitness of all non-essential genes in the chromosome of Pseudomonas aeruginosa strain PA14 based on a library of ∼300,000 individual Tn insertions. In vitro growth in LB provided a baseline for comparison with the survival of the Tn insertion strains following 6 days of colonization of the murine gastrointestinal tract as well as a comparison with Tn-inserts subsequently able to systemically disseminate to the spleen following induction of neutropenia. Sequencing was performed following DNA extraction from the recovered bacteria, digestion with the MmeI restriction enzyme that hydrolyzes DNA 16 bp away from the end of the Tn insert, and fractionation into oligonucleotides of 1,200-1,500 bp that were prepared for high-throughput sequencing. Changes in frequency of Tn inserts into the P. aeruginosa genome were used to quantify in vivo fitness resulting from loss of a gene. 636 genes had <10 sequencing reads in LB, thus defined as unable to grow in this medium. During in vivo infection there were major losses of strains with Tn inserts in almost all known virulence factors, as well as respiration, energy utilization, ion pumps, nutritional genes and prophages. Many new candidates for virulence factors were also identified. There were consistent changes in the recovery of Tn inserts in genes within most operons and Tn insertions into some genes enhanced in vivo fitness. Strikingly, 90% of the non-essential genes were required for in vivo survival following systemic dissemination during neutropenia. These experiments resulted in the identification of the P. aeruginosa strain PA14 genes necessary for optimal survival in the mucosal and systemic environments of a mammalian

  18. Identification of microRNAs in Caragana intermedia by high-throughput sequencing and expression analysis of 12 microRNAs and their targets under salt stress.

    Science.gov (United States)

    Zhu, Jianfeng; Li, Wanfeng; Yang, Wenhua; Qi, Liwang; Han, Suying

    2013-09-01

    142 miRNAs were identified and 38 miRNA targets were predicted, 4 of which were validated, in C. intermedia . The expression of 12 miRNAs in salt-stressed leaves was assessed by qRT-PCR. MicroRNAs (miRNAs) are endogenous small RNAs that play important roles in various biological and metabolic processes in plants. Caragana intermedia is an important ecological and economic tree species prominent in the desert environment of west and northwest China. To date, no investigation into C. intermedia miRNAs has been reported. In this study, high-throughput sequencing of small RNAs and analysis of transcriptome data were performed to identify both conserved and novel miRNAs, and also their target mRNA genes in C. intermedia. Based on sequence similarity and hairpin structure prediction, 132 putative conserved miRNAs (12 of which were confirmed to form hairpin precursors) belonging to 31 known miRNA families were identified. Ten novel miRNAs (including the miRNA* sequences of three novel miRNAs) were also discovered. Furthermore, 36 potential target genes of 17 known miRNA families and 2 potential target genes of 1 novel miRNA were predicted; 4 of these were validated by 5' RACE. The expression of 12 miRNAs was validated in different tissues, and these and five target mRNAs were assessed by qRT-PCR after salt treatment. The expression levels of seven miRNAs (cin-miR157a, cin-miR159a, cin-miR165a, cin-miR167b, cin-miR172b, cin-miR390a and cin-miR396a) were upregulated, while cin-miR398a expression was downregulated after salt treatment. The targets of cin-miR157a, cin-miR165a, cin-miR172b and cin-miR396a were downregulated and showed an approximately negative correlation with their corresponding miRNAs under salt treatment. These results would help further understanding of miRNA regulation in response to abiotic stress in C. intermedia.

  19. High Throughput Sequencing for Detection of Foodborne Pathogens

    Directory of Open Access Journals (Sweden)

    Camilla Sekse

    2017-10-01

    Full Text Available High-throughput sequencing (HTS is becoming the state-of-the-art technology for typing of microbial isolates, especially in clinical samples. Yet, its application is still in its infancy for monitoring and outbreak investigations of foods. Here we review the published literature, covering not only bacterial but also viral and Eukaryote food pathogens, to assess the status and potential of HTS implementation to inform stakeholders, improve food safety and reduce outbreak impacts. The developments in sequencing technology and bioinformatics have outpaced the capacity to analyze and interpret the sequence data. The influence of sample processing, nucleic acid extraction and purification, harmonized protocols for generation and interpretation of data, and properly annotated and curated reference databases including non-pathogenic “natural” strains are other major obstacles to the realization of the full potential of HTS in analytical food surveillance, epidemiological and outbreak investigations, and in complementing preventive approaches for the control and management of foodborne pathogens. Despite significant obstacles, the achieved progress in capacity and broadening of the application range over the last decade is impressive and unprecedented, as illustrated with the chosen examples from the literature. Large consortia, often with broad international participation, are making coordinated efforts to cope with many of the mentioned obstacles. Further rapid progress can therefore be prospected for the next decade.

  20. Using high-throughput barcode sequencing to efficiently map connectomes.

    Science.gov (United States)

    Peikon, Ian D; Kebschull, Justus M; Vagin, Vasily V; Ravens, Diana I; Sun, Yu-Chi; Brouzes, Eric; Corrêa, Ivan R; Bressan, Dario; Zador, Anthony M

    2017-07-07

    The function of a neural circuit is determined by the details of its synaptic connections. At present, the only available method for determining a neural wiring diagram with single synapse precision-a 'connectome'-is based on imaging methods that are slow, labor-intensive and expensive. Here, we present SYNseq, a method for converting the connectome into a form that can exploit the speed and low cost of modern high-throughput DNA sequencing. In SYNseq, each neuron is labeled with a unique random nucleotide sequence-an RNA 'barcode'-which is targeted to the synapse using engineered proteins. Barcodes in pre- and postsynaptic neurons are then associated through protein-protein crosslinking across the synapse, extracted from the tissue, and joined into a form suitable for sequencing. Although our failure to develop an efficient barcode joining scheme precludes the widespread application of this approach, we expect that with further development SYNseq will enable tracing of complex circuits at high speed and low cost. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. High-throughput sequencing and analysis of the gill tissue transcriptome from the deep-sea hydrothermal vent mussel Bathymodiolus azoricus

    Directory of Open Access Journals (Sweden)

    Gomes Paula

    2010-10-01

    Full Text Available Abstract Background Bathymodiolus azoricus is a deep-sea hydrothermal vent mussel found in association with large faunal communities living in chemosynthetic environments at the bottom of the sea floor near the Azores Islands. Investigation of the exceptional physiological reactions that vent mussels have adopted in their habitat, including responses to environmental microbes, remains a difficult challenge for deep-sea biologists. In an attempt to reveal genes potentially involved in the deep-sea mussel innate immunity we carried out a high-throughput sequence analysis of freshly collected B. azoricus transcriptome using gills tissues as the primary source of immune transcripts given its strategic role in filtering the surrounding waterborne potentially infectious microorganisms. Additionally, a substantial EST data set was produced and from which a comprehensive collection of genes coding for putative proteins was organized in a dedicated database, "DeepSeaVent" the first deep-sea vent animal transcriptome database based on the 454 pyrosequencing technology. Results A normalized cDNA library from gills tissue was sequenced in a full 454 GS-FLX run, producing 778,996 sequencing reads. Assembly of the high quality reads resulted in 75,407 contigs of which 3,071 were singletons. A total of 39,425 transcripts were conceptually translated into amino-sequences of which 22,023 matched known proteins in the NCBI non-redundant protein database, 15,839 revealed conserved protein domains through InterPro functional classification and 9,584 were assigned with Gene Ontology terms. Queries conducted within the database enabled the identification of genes putatively involved in immune and inflammatory reactions which had not been previously evidenced in the vent mussel. Their physical counterpart was confirmed by semi-quantitative quantitative Reverse-Transcription-Polymerase Chain Reactions (RT-PCR and their RNA transcription level by quantitative PCR (q

  2. Comparative analysis of miRNAs of two rapeseed genotypes in response to acetohydroxyacid synthase-inhibiting herbicides by high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Maolong Hu

    Full Text Available Acetohydroxyacid synthase (AHAS, also called acetolactate synthase, is a key enzyme involved in the first step of the biosynthesis of the branched-chain amino acids valine, isoleucine and leucine. Acetohydroxyacid synthase-inhibiting herbicides (AHAS herbicides are five chemical families of herbicides that inhibit AHAS enzymes, including imidazolinones (IMI, sulfonylureas (SU, pyrimidinylthiobenzoates, triazolinones and triazolopyrimidines. Five AHAS genes have been identified in rapeseed, but little information is available regarding the role of miRNAs in response to AHAS herbicides. In this study, an AHAS herbicides tolerant genotype and a sensitive genotype were used for miRNA comparative analysis. A total of 20 small RNA libraries were obtained of these two genotypes at three time points (0h, 24 h and 48 h after spraying SU and IMI herbicides with two replicates. We identified 940 conserved miRNAs and 1515 novel candidate miRNAs in Brassica napus using high-throughput sequencing methods combined with computing analysis. A total of 3284 genes were predicted to be targets of these miRNAs, and their functions were shown using GO, KOG and KEGG annotations. The differentiation expression results of miRNAs showed almost twice as many differentiated miRNAs were found in tolerant genotype M342 (309 miRNAs after SU herbicide application than in sensitive genotype N131 (164 miRNAs. In additiond 177 and 296 miRNAs defined as differentiated in sensitive genotype and tolerant genotype in response to SU herbicides. The miR398 family was observed to be associated with AHAS herbicide tolerance because their expression increased in the tolerant genotype but decreased in the sensitive genotype. Moreover, 50 novel miRNAs from 39 precursors were predicted. There were 8 conserved miRNAs, 4 novel miRNAs and 3 target genes were validated by quantitative real-time PCR experiment. This study not only provides novel insights into the miRNA content of AHAS herbicides

  3. High-Throughput DNA sequencing of ancient wood.

    Science.gov (United States)

    Wagner, Stefanie; Lagane, Frédéric; Seguin-Orlando, Andaine; Schubert, Mikkel; Leroy, Thibault; Guichoux, Erwan; Chancerel, Emilie; Bech-Hebelstrup, Inger; Bernard, Vincent; Billard, Cyrille; Billaud, Yves; Bolliger, Matthias; Croutsch, Christophe; Čufar, Katarina; Eynaud, Frédérique; Heussner, Karl Uwe; Köninger, Joachim; Langenegger, Fabien; Leroy, Frédéric; Lima, Christine; Martinelli, Nicoletta; Momber, Garry; Billamboz, André; Nelle, Oliver; Palomo, Antoni; Piqué, Raquel; Ramstein, Marianne; Schweichel, Roswitha; Stäuble, Harald; Tegel, Willy; Terradas, Xavier; Verdin, Florence; Plomion, Christophe; Kremer, Antoine; Orlando, Ludovic

    2018-03-01

    Reconstructing the colonization and demographic dynamics that gave rise to extant forests is essential to forecasts of forest responses to environmental changes. Classical approaches to map how population of trees changed through space and time largely rely on pollen distribution patterns, with only a limited number of studies exploiting DNA molecules preserved in wooden tree archaeological and subfossil remains. Here, we advance such analyses by applying high-throughput (HTS) DNA sequencing to wood archaeological and subfossil material for the first time, using a comprehensive sample of 167 European white oak waterlogged remains spanning a large temporal (from 550 to 9,800 years) and geographical range across Europe. The successful characterization of the endogenous DNA and exogenous microbial DNA of 140 (~83%) samples helped the identification of environmental conditions favouring long-term DNA preservation in wood remains, and started to unveil the first trends in the DNA decay process in wood material. Additionally, the maternally inherited chloroplast haplotypes of 21 samples from three periods of forest human-induced use (Neolithic, Bronze Age and Middle Ages) were found to be consistent with those of modern populations growing in the same geographic areas. Our work paves the way for further studies aiming at using ancient DNA preserved in wood to reconstruct the micro-evolutionary response of trees to climate change and human forest management. © 2018 John Wiley & Sons Ltd.

  4. A simple, high throughput method to locate single copy sequences from Bacterial Artificial Chromosome (BAC libraries using High Resolution Melt analysis

    Directory of Open Access Journals (Sweden)

    Caligari Peter DS

    2010-05-01

    Full Text Available Abstract Background The high-throughput anchoring of genetic markers into contigs is required for many ongoing physical mapping projects. Multidimentional BAC pooling strategies for PCR-based screening of large insert libraries is a widely used alternative to high density filter hybridisation of bacterial colonies. To date, concerns over reliability have led most if not all groups engaged in high throughput physical mapping projects to favour BAC DNA isolation prior to amplification by conventional PCR. Results Here, we report the first combined use of Multiplex Tandem PCR (MT-PCR and High Resolution Melt (HRM analysis on bacterial stocks of BAC library superpools as a means of rapidly anchoring markers to BAC colonies and thereby to integrate genetic and physical maps. We exemplify the approach using a BAC library of the model plant Arabidopsis thaliana. Super pools of twenty five 384-well plates and two-dimension matrix pools of the BAC library were prepared for marker screening. The entire procedure only requires around 3 h to anchor one marker. Conclusions A pre-amplification step during MT-PCR allows high multiplexing and increases the sensitivity and reliability of subsequent HRM discrimination. This simple gel-free protocol is more reliable, faster and far less costly than conventional PCR screening. The option to screen in parallel 3 genetic markers in one MT-PCR-HRM reaction using templates from directly pooled bacterial stocks of BAC-containing bacteria further reduces time for anchoring markers in physical maps of species with large genomes.

  5. Association Study of Gut Flora in Coronary Heart Disease through High-Throughput Sequencing

    OpenAIRE

    Cui, Li; Zhao, Tingting; Hu, Haibing; Zhang, Wen; Hua, Xiuguo

    2017-01-01

    Objectives. We aimed to explore the impact of gut microbiota in coronary heart disease (CHD) patients through high-throughput sequencing. Methods. A total of 29 CHD in-hospital patients and 35 healthy volunteers as controls were included. Nucleic acids were extracted from fecal samples, followed by ? diversity and principal coordinate analysis (PCoA). Based on unweighted UniFrac distance matrices, unweighted-pair group method with arithmetic mean (UPGMA) trees were created. Results. After dat...

  6. Identification and Analysis of Red Sea Mangrove (Avicennia marina) microRNAs by High-Throughput Sequencing and Their Association with Stress Responses

    KAUST Repository

    Khraiwesh, Basel; Pugalenthi, Ganesan; Fedoroff, Nina V.

    2013-01-01

    Although RNA silencing has been studied primarily in model plants, advances in high-throughput sequencing technologies have enabled profiling of the small RNA components of many more plant species, providing insights into the ubiquity and conservatism of some miRNA-based regulatory mechanisms. Small RNAs of 20 to 24 nucleotides (nt) are important regulators of gene transcript levels by either transcriptional or by posttranscriptional gene silencing, contributing to genome maintenance and controlling a variety of developmental and physiological processes. Here, we used deep sequencing and molecular methods to create an inventory of the small RNAs in the mangrove species, Avicennia marina. We identified 26 novel mangrove miRNAs and 193 conserved miRNAs belonging to 36 families. We determined that 2 of the novel miRNAs were produced from known miRNA precursors and 4 were likely to be species-specific by the criterion that we found no homologs in other plant species. We used qRT-PCR to analyze the expression of miRNAs and their target genes in different tissue sets and some demonstrated tissue-specific expression. Furthermore, we predicted potential targets of these putative miRNAs based on a sequence homology and experimentally validated through endonucleolytic cleavage assays. Our results suggested that expression profiles of miRNAs and their predicted targets could be useful in exploring the significance of the conservation patterns of plants, particularly in response to abiotic stress. Because of their well-developed abilities in this regard, mangroves and other extremophiles are excellent models for such exploration. © 2013 Khraiwesh et al.

  7. Biphasic Study to Characterize Agricultural Biogas Plants by High-Throughput 16S rRNA Gene Amplicon Sequencing and Microscopic Analysis.

    Science.gov (United States)

    Maus, Irena; Kim, Yong Sung; Wibberg, Daniel; Stolze, Yvonne; Off, Sandra; Antonczyk, Sebastian; Pühler, Alfred; Scherer, Paul; Schlüter, Andreas

    2017-02-28

    Process surveillance within agricultural biogas plants (BGPs) was concurrently studied by high-throughput 16S rRNA gene amplicon sequencing and an optimized quantitative microscopic fingerprinting (QMF) technique. In contrast to 16S rRNA gene amplicons, digitalized microscopy is a rapid and cost-effective method that facilitates enumeration and morphological differentiation of the most significant groups of methanogens regarding their shape and characteristic autofluorescent factor 420. Moreover, the fluorescence signal mirrors cell vitality. In this study, four different BGPs were investigated. The results indicated stable process performance in the mesophilic BGPs and in the thermophilic reactor. Bacterial subcommunity characterization revealed significant differences between the four BGPs. Most remarkably, the genera Defluviitoga and Halocella dominated the thermophilic bacterial subcommunity, whereas members of another taxon, Syntrophaceticus , were found to be abundant in the mesophilic BGP. The domain Archaea was dominated by the genus Methanoculleus in all four BGPs, followed by Methanosaeta in BGP1 and BGP3. In contrast, Methanothermobacter members were highly abundant in the thermophilic BGP4. Furthermore, a high consistency between the sequencing approach and the QMF method was shown, especially for the thermophilic BGP. The differences elucidated that using this biphasic approach for mesophilic BGPs provided novel insights regarding disaggregated single cells of Methanosarcina and Methanosaeta species. Both dominated the archaeal subcommunity and replaced coccoid Methanoculleus members belonging to the same group of Methanomicrobiales that have been frequently observed in similar BGPs. This work demonstrates that combining QMF and 16S rRNA gene amplicon sequencing is a complementary strategy to describe archaeal community structures within biogas processes.

  8. Identification and analysis of red sea mangrove (Avicennia marina microRNAs by high-throughput sequencing and their association with stress responses.

    Directory of Open Access Journals (Sweden)

    Basel Khraiwesh

    Full Text Available Although RNA silencing has been studied primarily in model plants, advances in high-throughput sequencing technologies have enabled profiling of the small RNA components of many more plant species, providing insights into the ubiquity and conservatism of some miRNA-based regulatory mechanisms. Small RNAs of 20 to 24 nucleotides (nt are important regulators of gene transcript levels by either transcriptional or by posttranscriptional gene silencing, contributing to genome maintenance and controlling a variety of developmental and physiological processes. Here, we used deep sequencing and molecular methods to create an inventory of the small RNAs in the mangrove species, Avicennia marina. We identified 26 novel mangrove miRNAs and 193 conserved miRNAs belonging to 36 families. We determined that 2 of the novel miRNAs were produced from known miRNA precursors and 4 were likely to be species-specific by the criterion that we found no homologs in other plant species. We used qRT-PCR to analyze the expression of miRNAs and their target genes in different tissue sets and some demonstrated tissue-specific expression. Furthermore, we predicted potential targets of these putative miRNAs based on a sequence homology and experimentally validated through endonucleolytic cleavage assays. Our results suggested that expression profiles of miRNAs and their predicted targets could be useful in exploring the significance of the conservation patterns of plants, particularly in response to abiotic stress. Because of their well-developed abilities in this regard, mangroves and other extremophiles are excellent models for such exploration.

  9. Identification and Analysis of Red Sea Mangrove (Avicennia marina) microRNAs by High-Throughput Sequencing and Their Association with Stress Responses

    KAUST Repository

    Khraiwesh, Basel

    2013-04-08

    Although RNA silencing has been studied primarily in model plants, advances in high-throughput sequencing technologies have enabled profiling of the small RNA components of many more plant species, providing insights into the ubiquity and conservatism of some miRNA-based regulatory mechanisms. Small RNAs of 20 to 24 nucleotides (nt) are important regulators of gene transcript levels by either transcriptional or by posttranscriptional gene silencing, contributing to genome maintenance and controlling a variety of developmental and physiological processes. Here, we used deep sequencing and molecular methods to create an inventory of the small RNAs in the mangrove species, Avicennia marina. We identified 26 novel mangrove miRNAs and 193 conserved miRNAs belonging to 36 families. We determined that 2 of the novel miRNAs were produced from known miRNA precursors and 4 were likely to be species-specific by the criterion that we found no homologs in other plant species. We used qRT-PCR to analyze the expression of miRNAs and their target genes in different tissue sets and some demonstrated tissue-specific expression. Furthermore, we predicted potential targets of these putative miRNAs based on a sequence homology and experimentally validated through endonucleolytic cleavage assays. Our results suggested that expression profiles of miRNAs and their predicted targets could be useful in exploring the significance of the conservation patterns of plants, particularly in response to abiotic stress. Because of their well-developed abilities in this regard, mangroves and other extremophiles are excellent models for such exploration. © 2013 Khraiwesh et al.

  10. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes.

    Science.gov (United States)

    Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

    2012-07-15

    In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1 accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license.

  11. Scrutinizing virus genome termini by high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Shasha Li

    Full Text Available Analysis of genomic terminal sequences has been a major step in studies on viral DNA replication and packaging mechanisms. However, traditional methods to study genome termini are challenging due to the time-consuming protocols and their inefficiency where critical details are lost easily. Recent advances in next generation sequencing (NGS have enabled it to be a powerful tool to study genome termini. In this study, using NGS we sequenced one iridovirus genome and twenty phage genomes and confirmed for the first time that the high frequency sequences (HFSs found in the NGS reads are indeed the terminal sequences of viral genomes. Further, we established a criterion to distinguish the type of termini and the viral packaging mode. We also obtained additional terminal details such as terminal repeats, multi-termini, asymmetric termini. With this approach, we were able to simultaneously detect details of the genome termini as well as obtain the complete sequence of bacteriophage genomes. Theoretically, this application can be further extended to analyze larger and more complicated genomes of plant and animal viruses. This study proposed a novel and efficient method for research on viral replication, packaging, terminase activity, transcription regulation, and metabolism of the host cell.

  12. Library Design-Facilitated High-Throughput Sequencing of Synthetic Peptide Libraries.

    Science.gov (United States)

    Vinogradov, Alexander A; Gates, Zachary P; Zhang, Chi; Quartararo, Anthony J; Halloran, Kathryn H; Pentelute, Bradley L

    2017-11-13

    A methodology to achieve high-throughput de novo sequencing of synthetic peptide mixtures is reported. The approach leverages shotgun nanoliquid chromatography coupled with tandem mass spectrometry-based de novo sequencing of library mixtures (up to 2000 peptides) as well as automated data analysis protocols to filter away incorrect assignments, noise, and synthetic side-products. For increasing the confidence in the sequencing results, mass spectrometry-friendly library designs were developed that enabled unambiguous decoding of up to 600 peptide sequences per hour while maintaining greater than 85% sequence identification rates in most cases. The reliability of the reported decoding strategy was additionally confirmed by matching fragmentation spectra for select authentic peptides identified from library sequencing samples. The methods reported here are directly applicable to screening techniques that yield mixtures of active compounds, including particle sorting of one-bead one-compound libraries and affinity enrichment of synthetic library mixtures performed in solution.

  13. Applications of High-Throughput Nucleotide Sequencing (PhD)

    DEFF Research Database (Denmark)

    Waage, Johannes

    equally large demands in data handling, analysis and interpretation, perhaps defining the modern challenge of the computational biologist of the post-genomic era. The first part of this thesis consists of a general introduction to the history, common terms and challenges of next generation sequencing......-sequencing, a study of the effects on alternative RNA splicing of KO of the nonsense mediated RNA decay system in Mus, using digital gene expression and a custom-built exon-exon junction mapping pipeline is presented (article I). Evolved from this work, a Bioconductor package, spliceR, for classifying alternative...

  14. ISRNA: an integrative online toolkit for short reads from high-throughput sequencing data.

    Science.gov (United States)

    Luo, Guan-Zheng; Yang, Wei; Ma, Ying-Ke; Wang, Xiu-Jie

    2014-02-01

    Integrative Short Reads NAvigator (ISRNA) is an online toolkit for analyzing high-throughput small RNA sequencing data. Besides the high-speed genome mapping function, ISRNA provides statistics for genomic location, length distribution and nucleotide composition bias analysis of sequence reads. Number of reads mapped to known microRNAs and other classes of short non-coding RNAs, coverage of short reads on genes, expression abundance of sequence reads as well as some other analysis functions are also supported. The versatile search functions enable users to select sequence reads according to their sub-sequences, expression abundance, genomic location, relationship to genes, etc. A specialized genome browser is integrated to visualize the genomic distribution of short reads. ISRNA also supports management and comparison among multiple datasets. ISRNA is implemented in Java/C++/Perl/MySQL and can be freely accessed at http://omicslab.genetics.ac.cn/ISRNA/.

  15. Applications of high-throughput sequencing to chromatin structure and function in mammals

    OpenAIRE

    Dunham, Ian

    2009-01-01

    High-throughput DNA sequencing approaches have enabled direct interrogation of chromatin samples from mammalian cells. We are beginning to develop a genome-wide description of nuclear function during development, but further data collection, refinement, and integration are needed.

  16. High-Throughput Mapping of Single-Neuron Projections by Sequencing of Barcoded RNA.

    Science.gov (United States)

    Kebschull, Justus M; Garcia da Silva, Pedro; Reid, Ashlan P; Peikon, Ian D; Albeanu, Dinu F; Zador, Anthony M

    2016-09-07

    Neurons transmit information to distant brain regions via long-range axonal projections. In the mouse, area-to-area connections have only been systematically mapped using bulk labeling techniques, which obscure the diverse projections of intermingled single neurons. Here we describe MAPseq (Multiplexed Analysis of Projections by Sequencing), a technique that can map the projections of thousands or even millions of single neurons by labeling large sets of neurons with random RNA sequences ("barcodes"). Axons are filled with barcode mRNA, each putative projection area is dissected, and the barcode mRNA is extracted and sequenced. Applying MAPseq to the locus coeruleus (LC), we find that individual LC neurons have preferred cortical targets. By recasting neuroanatomy, which is traditionally viewed as a problem of microscopy, as a problem of sequencing, MAPseq harnesses advances in sequencing technology to permit high-throughput interrogation of brain circuits. Copyright © 2016 Elsevier Inc. All rights reserved.

  17. Automated cleaning and pre-processing of immunoglobulin gene sequences from high-throughput sequencing

    Directory of Open Access Journals (Sweden)

    Miri eMichaeli

    2012-12-01

    Full Text Available High throughput sequencing (HTS yields tens of thousands to millions of sequences that require a large amount of pre-processing work to clean various artifacts. Such cleaning cannot be performed manually. Existing programs are not suitable for immunoglobulin (Ig genes, which are variable and often highly mutated. This paper describes Ig-HTS-Cleaner (Ig High Throughput Sequencing Cleaner, a program containing a simple cleaning procedure that successfully deals with pre-processing of Ig sequences derived from HTS, and Ig-Indel-Identifier (Ig Insertion – Deletion Identifier, a program for identifying legitimate and artifact insertions and/or deletions (indels. Our programs were designed for analyzing Ig gene sequences obtained by 454 sequencing, but they are applicable to all types of sequences and sequencing platforms. Ig-HTS-Cleaner and Ig-Indel-Identifier have been implemented in Java and saved as executable JAR files, supported on Linux and MS Windows. No special requirements are needed in order to run the programs, except for correctly constructing the input files as explained in the text. The programs' performance has been tested and validated on real and simulated data sets.

  18. Cytosolic Glutamine Synthetase is Important for Photosynthetic Efficiency and Water Use Efficiency in Potato as Revealed by High Throughput Sequencing QTL analysis

    DEFF Research Database (Denmark)

    Kaminski, Kacper Piotr; Sørensen, Kirsten Kørup; Andersen, Mathias Neumann

    2015-01-01

    was observed. Two extreme WUE bulks of clones were identified and pools of genomic DNA from them as well as the parents were sequenced and mapped to reference potato genome. Following a novel data analysis approach, two highly resolved QTLs were found on chromosome 1 and 9. Interestingly, three genes encoding...

  19. Characterizing ncRNAs in human pathogenic protists using high-throughput sequencing technology

    Directory of Open Access Journals (Sweden)

    Lesley Joan Collins

    2011-12-01

    Full Text Available ncRNAs are key genes in many human diseases including cancer and viral infection, as well as providing critical functions in pathogenic organisms such as fungi, bacteria, viruses and protists. Until now the identification and characterization of ncRNAs associated with disease has been slow or inaccurate requiring many years of testing to understand complicated RNA and protein gene relationships. High-throughput sequencing now offers the opportunity to characterize miRNAs, siRNAs, snoRNAs and long ncRNAs on a genomic scale making it faster and easier to clarify how these ncRNAs contribute to the disease state. However, this technology is still relatively new, and ncRNA discovery is not an application of high priority for streamlined bioinformatics. Here we summarize background concepts and practical approaches for ncRNA analysis using high-throughput sequencing, and how it relates to understanding human disease. As a case study, we focus on the parasitic protists Giardia lamblia and Trichomonas vaginalis, where large evolutionary distance has meant difficulties in comparing ncRNAs with those from model eukaryotes. A combination of biological, computational and sequencing approaches has enabled easier classification of ncRNA classes such as snoRNAs, but has also aided the identification of novel classes. It is hoped that a higher level of understanding of ncRNA expression and interaction may aid in the development of less harsh treatment for protist-based diseases.

  20. Characterizing ncRNAs in Human Pathogenic Protists Using High-Throughput Sequencing Technology

    Science.gov (United States)

    Collins, Lesley Joan

    2011-01-01

    ncRNAs are key genes in many human diseases including cancer and viral infection, as well as providing critical functions in pathogenic organisms such as fungi, bacteria, viruses, and protists. Until now the identification and characterization of ncRNAs associated with disease has been slow or inaccurate requiring many years of testing to understand complicated RNA and protein gene relationships. High-throughput sequencing now offers the opportunity to characterize miRNAs, siRNAs, small nucleolar RNAs (snoRNAs), and long ncRNAs on a genomic scale, making it faster and easier to clarify how these ncRNAs contribute to the disease state. However, this technology is still relatively new, and ncRNA discovery is not an application of high priority for streamlined bioinformatics. Here we summarize background concepts and practical approaches for ncRNA analysis using high-throughput sequencing, and how it relates to understanding human disease. As a case study, we focus on the parasitic protists Giardia lamblia and Trichomonas vaginalis, where large evolutionary distance has meant difficulties in comparing ncRNAs with those from model eukaryotes. A combination of biological, computational, and sequencing approaches has enabled easier classification of ncRNA classes such as snoRNAs, but has also aided the identification of novel classes. It is hoped that a higher level of understanding of ncRNA expression and interaction may aid in the development of less harsh treatment for protist-based diseases. PMID:22303390

  1. SUGAR: graphical user interface-based data refiner for high-throughput DNA sequencing.

    Science.gov (United States)

    Sato, Yukuto; Kojima, Kaname; Nariai, Naoki; Yamaguchi-Kabata, Yumi; Kawai, Yosuke; Takahashi, Mamoru; Mimori, Takahiro; Nagasaki, Masao

    2014-08-08

    Next-generation sequencers (NGSs) have become one of the main tools for current biology. To obtain useful insights from the NGS data, it is essential to control low-quality portions of the data affected by technical errors such as air bubbles in sequencing fluidics. We develop a software SUGAR (subtile-based GUI-assisted refiner) which can handle ultra-high-throughput data with user-friendly graphical user interface (GUI) and interactive analysis capability. The SUGAR generates high-resolution quality heatmaps of the flowcell, enabling users to find possible signals of technical errors during the sequencing. The sequencing data generated from the error-affected regions of a flowcell can be selectively removed by automated analysis or GUI-assisted operations implemented in the SUGAR. The automated data-cleaning function based on sequence read quality (Phred) scores was applied to a public whole human genome sequencing data and we proved the overall mapping quality was improved. The detailed data evaluation and cleaning enabled by SUGAR would reduce technical problems in sequence read mapping, improving subsequent variant analysis that require high-quality sequence data and mapping results. Therefore, the software will be especially useful to control the quality of variant calls to the low population cells, e.g., cancers, in a sample with technical errors of sequencing procedures.

  2. Roche genome sequencer FLX based high-throughput sequencing of ancient DNA

    DEFF Research Database (Denmark)

    Alquezar-Planas, David E; Fordyce, Sarah Louise

    2012-01-01

    Since the development of so-called "next generation" high-throughput sequencing in 2005, this technology has been applied to a variety of fields. Such applications include disease studies, evolutionary investigations, and ancient DNA. Each application requires a specialized protocol to ensure...... that the data produced is optimal. Although much of the procedure can be followed directly from the manufacturer's protocols, the key differences lie in the library preparation steps. This chapter presents an optimized protocol for the sequencing of fossil remains and museum specimens, commonly referred...

  3. The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing.

    Directory of Open Access Journals (Sweden)

    Jonas Binladen

    2007-02-01

    Full Text Available The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources.We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences. Each DNA sequence is subsequently traced back to its individual source through 5'tag-analysis.We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (miss-assignment rate<0.4%. Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regards to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5'primer tagging is a useful method in which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products. The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial

  4. Automated analysis of high-throughput B-cell sequencing data reveals a high frequency of novel immunoglobulin V gene segment alleles.

    Science.gov (United States)

    Gadala-Maria, Daniel; Yaari, Gur; Uduman, Mohamed; Kleinstein, Steven H

    2015-02-24

    Individual variation in germline and expressed B-cell immunoglobulin (Ig) repertoires has been associated with aging, disease susceptibility, and differential response to infection and vaccination. Repertoire properties can now be studied at large-scale through next-generation sequencing of rearranged Ig genes. Accurate analysis of these repertoire-sequencing (Rep-Seq) data requires identifying the germline variable (V), diversity (D), and joining (J) gene segments used by each Ig sequence. Current V(D)J assignment methods work by aligning sequences to a database of known germline V(D)J segment alleles. However, existing databases are likely to be incomplete and novel polymorphisms are hard to differentiate from the frequent occurrence of somatic hypermutations in Ig sequences. Here we develop a Tool for Ig Genotype Elucidation via Rep-Seq (TIgGER). TIgGER analyzes mutation patterns in Rep-Seq data to identify novel V segment alleles, and also constructs a personalized germline database containing the specific set of alleles carried by a subject. This information is then used to improve the initial V segment assignments from existing tools, like IMGT/HighV-QUEST. The application of TIgGER to Rep-Seq data from seven subjects identified 11 novel V segment alleles, including at least one in every subject examined. These novel alleles constituted 13% of the total number of unique alleles in these subjects, and impacted 3% of V(D)J segment assignments. These results reinforce the highly polymorphic nature of human Ig V genes, and suggest that many novel alleles remain to be discovered. The integration of TIgGER into Rep-Seq processing pipelines will increase the accuracy of V segment assignments, thus improving B-cell repertoire analyses.

  5. High throughput sequencing identifies chilling responsive genes in sweetpotato (Ipomoea batatas Lam.) during storage.

    Science.gov (United States)

    Xie, Zeyi; Zhou, Zhilin; Li, Hongmin; Yu, Jingjing; Jiang, Jiaojiao; Tang, Zhonghou; Ma, Daifu; Zhang, Baohong; Han, Yonghua; Li, Zongyun

    2018-05-21

    Sweetpotato (Ipomoea batatas L.) is a globally important economic food crop. It belongs to Convolvulaceae family and origins in the tropics; however, sweetpotato is sensitive to cold stress during storage. In this study, we performed transcriptome sequencing to investigate the sweetpotato response to chilling stress during storage. A total of 110,110 unigenes were generated via high-throughput sequencing. Differentially expressed genes (DEGs) analysis showed that 18,681 genes were up-regulated and 21,983 genes were down-regulated in low temperature condition. Many DEGs were related to the cell membrane system, antioxidant enzymes, carbohydrate metabolism, and hormone metabolism, which are potentially associated with sweetpotato resistance to low temperature. The existence of DEGs suggests a molecular basis for the biochemical and physiological consequences of sweetpotato in low temperature storage conditions. Our analysis will provide a new target for enhancement of sweetpotato cold stress tolerance in postharvest storage through genetic manipulation. Copyright © 2018. Published by Elsevier Inc.

  6. An improved high throughput sequencing method for studying oomycete communities

    DEFF Research Database (Denmark)

    Sapkota, Rumakanta; Nicolaisen, Mogens

    2015-01-01

    the usefulness of the method not only in soil DNA but also in a plant DNA background. In conclusion, we demonstrate a successful approach for pyrosequencing of oomycete communities using ITS1 as the barcode sequence with well-known primers for oomycete DNA amplification....... communities. Thewell-known primer sets ITS4, ITS6 and ITS7were used in the study in a semi-nested PCR approach to target the internal transcribed spacer (ITS) 1 of ribosomal DNA in a next generation sequencing protocol. These primers have been used in similar studies before, butwith limited success.......Wewere able to increase the proportion of retrieved oomycete sequences dramaticallymainly by increasing the annealing temperature during PCR. The optimized protocol was validated using three mock communities and the method was further evaluated using total DNA from 26 soil samples collected from different...

  7. MUSCLE: multiple sequence alignment with high accuracy and high throughput.

    Science.gov (United States)

    Edgar, Robert C

    2004-01-01

    We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5. com/muscle.

  8. Algorithms for mapping high-throughput DNA sequences

    DEFF Research Database (Denmark)

    Frellsen, Jes; Menzel, Peter; Krogh, Anders

    2014-01-01

    of data generation, new bioinformatics approaches have been developed to cope with the large amount of sequencing reads obtained in these experiments. In this chapter, we first introduce HTS technologies and their usage in molecular biology and discuss the problem of mapping sequencing reads...... to their genomic origin. We then in detail describe two approaches that offer very fast heuristics to solve the mapping problem in a feasible runtime. In particular, we describe the BLAT algorithm, and we give an introduction to the Burrows-Wheeler Transform and the mapping algorithms based on this transformation....

  9. Centroid based clustering of high throughput sequencing reads based on n-mer counts.

    Science.gov (United States)

    Solovyov, Alexander; Lipkin, W Ian

    2013-09-08

    Many problems in computational biology require alignment-free sequence comparisons. One of the common tasks involving sequence comparison is sequence clustering. Here we apply methods of alignment-free comparison (in particular, comparison using sequence composition) to the challenge of sequence clustering. We study several centroid based algorithms for clustering sequences based on word counts. Study of their performance shows that using k-means algorithm with or without the data whitening is efficient from the computational point of view. A higher clustering accuracy can be achieved using the soft expectation maximization method, whereby each sequence is attributed to each cluster with a specific probability. We implement an open source tool for alignment-free clustering. It is publicly available from github: https://github.com/luscinius/afcluster. We show the utility of alignment-free sequence clustering for high throughput sequencing analysis despite its limitations. In particular, it allows one to perform assembly with reduced resources and a minimal loss of quality. The major factor affecting performance of alignment-free read clustering is the length of the read.

  10. glbase: a framework for combining, analyzing and displaying heterogeneous genomic and high-throughput sequencing data.

    Science.gov (United States)

    Hutchins, Andrew Paul; Jauch, Ralf; Dyla, Mateusz; Miranda-Saavedra, Diego

    2014-01-01

    Genomic datasets and the tools to analyze them have proliferated at an astonishing rate. However, such tools are often poorly integrated with each other: each program typically produces its own custom output in a variety of non-standard file formats. Here we present glbase, a framework that uses a flexible set of descriptors that can quickly parse non-binary data files. glbase includes many functions to intersect two lists of data, including operations on genomic interval data and support for the efficient random access to huge genomic data files. Many glbase functions can produce graphical outputs, including scatter plots, heatmaps, boxplots and other common analytical displays of high-throughput data such as RNA-seq, ChIP-seq and microarray expression data. glbase is designed to rapidly bring biological data into a Python-based analytical environment to facilitate analysis and data processing. In summary, glbase is a flexible and multifunctional toolkit that allows the combination and analysis of high-throughput data (especially next-generation sequencing and genome-wide data), and which has been instrumental in the analysis of complex data sets. glbase is freely available at http://bitbucket.org/oaxiom/glbase/.

  11. glbase: a framework for combining, analyzing and displaying heterogeneous genomic and high-throughput sequencing data

    Directory of Open Access Journals (Sweden)

    Andrew Paul Hutchins

    2014-01-01

    Full Text Available Genomic datasets and the tools to analyze them have proliferated at an astonishing rate. However, such tools are often poorly integrated with each other: each program typically produces its own custom output in a variety of non-standard file formats. Here we present glbase, a framework that uses a flexible set of descriptors that can quickly parse non-binary data files. glbase includes many functions to intersect two lists of data, including operations on genomic interval data and support for the efficient random access to huge genomic data files. Many glbase functions can produce graphical outputs, including scatter plots, heatmaps, boxplots and other common analytical displays of high-throughput data such as RNA-seq, ChIP-seq and microarray expression data. glbase is designed to rapidly bring biological data into a Python-based analytical environment to facilitate analysis and data processing. In summary, glbase is a flexible and multifunctional toolkit that allows the combination and analysis of high-throughput data (especially next-generation sequencing and genome-wide data, and which has been instrumental in the analysis of complex data sets. glbase is freely available at http://bitbucket.org/oaxiom/glbase/.

  12. High-throughput sequencing of forensic genetic samples using punches of FTA cards with buccal swabs

    DEFF Research Database (Denmark)

    Kampmann, Marie-Louise; Buchard, Anders; Børsting, Claus

    2016-01-01

    Here, we demonstrate that punches from buccal swab samples preserved on FTA cards can be used for high-throughput DNA sequencing, also known as massively parallel sequencing (MPS). We typed 44 reference samples with the HID-Ion AmpliSeq Identity Panel using washed 1.2 mm punches from FTA cards...

  13. High-Throughput Sequencing Based Methods of RNA Structure Investigation

    DEFF Research Database (Denmark)

    Kielpinski, Lukasz Jan

    In this thesis we describe the development of four related methods for RNA structure probing that utilize massive parallel sequencing. Using them, we were able to gather structural data for multiple, long molecules simultaneously. First, we have established an easy to follow experimental...... and computational protocol for detecting the reverse transcription termination sites (RTTS-Seq). This protocol was subsequently applied to hydroxyl radical footprinting of three dimensional RNA structures to give a probing signal that correlates well with the RNA backbone solvent accessibility. Moreover, we applied...

  14. High-throughput sequencing of black pepper root transcriptome

    Science.gov (United States)

    2012-01-01

    Background Black pepper (Piper nigrum L.) is one of the most popular spices in the world. It is used in cooking and the preservation of food and even has medicinal properties. Losses in production from disease are a major limitation in the culture of this crop. The major diseases are root rot and foot rot, which are results of root infection by Fusarium solani and Phytophtora capsici, respectively. Understanding the molecular interaction between the pathogens and the host’s root region is important for obtaining resistant cultivars by biotechnological breeding. Genetic and molecular data for this species, though, are limited. In this paper, RNA-Seq technology has been employed, for the first time, to describe the root transcriptome of black pepper. Results The root transcriptome of black pepper was sequenced by the NGS SOLiD platform and assembled using the multiple-k method. Blast2Go and orthoMCL methods were used to annotate 10338 unigenes. The 4472 predicted proteins showed about 52% homology with the Arabidopsis proteome. Two root proteomes identified 615 proteins, which seem to define the plant’s root pattern. Simple-sequence repeats were identified that may be useful in studies of genetic diversity and may have applications in biotechnology and ecology. Conclusions This dataset of 10338 unigenes is crucially important for the biotechnological breeding of black pepper and the ecogenomics of the Magnoliids, a major group of basal angiosperms. PMID:22984782

  15. High-throughput sequencing of black pepper root transcriptome

    Directory of Open Access Journals (Sweden)

    Gordo Sheila MC

    2012-09-01

    Full Text Available Abstract Background Black pepper (Piper nigrum L. is one of the most popular spices in the world. It is used in cooking and the preservation of food and even has medicinal properties. Losses in production from disease are a major limitation in the culture of this crop. The major diseases are root rot and foot rot, which are results of root infection by Fusarium solani and Phytophtora capsici, respectively. Understanding the molecular interaction between the pathogens and the host’s root region is important for obtaining resistant cultivars by biotechnological breeding. Genetic and molecular data for this species, though, are limited. In this paper, RNA-Seq technology has been employed, for the first time, to describe the root transcriptome of black pepper. Results The root transcriptome of black pepper was sequenced by the NGS SOLiD platform and assembled using the multiple-k method. Blast2Go and orthoMCL methods were used to annotate 10338 unigenes. The 4472 predicted proteins showed about 52% homology with the Arabidopsis proteome. Two root proteomes identified 615 proteins, which seem to define the plant’s root pattern. Simple-sequence repeats were identified that may be useful in studies of genetic diversity and may have applications in biotechnology and ecology. Conclusions This dataset of 10338 unigenes is crucially important for the biotechnological breeding of black pepper and the ecogenomics of the Magnoliids, a major group of basal angiosperms.

  16. The Microsoft Biology Foundation Applications for High-Throughput Sequencing

    Science.gov (United States)

    Mercer, S.

    2010-01-01

    w9-2 The need for reusable libraries of bioinformatics functions has been recognized for many years and a number of language-specific toolkits have been constructed. Such toolkits have served as valuable nucleation points for the community, promoting the sharing of code and establishing standards. The majority of DNA sequencing machines and many other standard pieces of lab equipment are controlled by PCs using Windows, and a Microsoft genomics toolkit would enable initial processing and quality control to happen closer to the instrumentation and provide opportunities for added-value services within core facilities. The Microsoft Biology Foundation (MBF) is an open source software library, freely available for both commercial and academic use, available as an early-stage betafrom mbf.codeplex.com. This presentation will describe the structure and goals of MBF and demonstrate some of its uses.

  17. Automated degenerate PCR primer design for high-throughput sequencing improves efficiency of viral sequencing

    Directory of Open Access Journals (Sweden)

    Li Kelvin

    2012-11-01

    Full Text Available Abstract Background In a high-throughput environment, to PCR amplify and sequence a large set of viral isolates from populations that are potentially heterogeneous and continuously evolving, the use of degenerate PCR primers is an important strategy. Degenerate primers allow for the PCR amplification of a wider range of viral isolates with only one set of pre-mixed primers, thus increasing amplification success rates and minimizing the necessity for genome finishing activities. To successfully select a large set of degenerate PCR primers necessary to tile across an entire viral genome and maximize their success, this process is best performed computationally. Results We have developed a fully automated degenerate PCR primer design system that plays a key role in the J. Craig Venter Institute’s (JCVI high-throughput viral sequencing pipeline. A consensus viral genome, or a set of consensus segment sequences in the case of a segmented virus, is specified using IUPAC ambiguity codes in the consensus template sequence to represent the allelic diversity of the target population. PCR primer pairs are then selected computationally to produce a minimal amplicon set capable of tiling across the full length of the specified target region. As part of the tiling process, primer pairs are computationally screened to meet the criteria for successful PCR with one of two described amplification protocols. The actual sequencing success rates for designed primers for measles virus, mumps virus, human parainfluenza virus 1 and 3, human respiratory syncytial virus A and B and human metapneumovirus are described, where >90% of designed primer pairs were able to consistently successfully amplify >75% of the isolates. Conclusions Augmenting our previously developed and published JCVI Primer Design Pipeline, we achieved similarly high sequencing success rates with only minor software modifications. The recommended methodology for the construction of the consensus

  18. Identification of miRNAs and Their Targets in Cotton Inoculated with Verticillium dahliae by High-Throughput Sequencing and Degradome Analysis

    Directory of Open Access Journals (Sweden)

    Yujuan Zhang

    2015-06-01

    Full Text Available MicroRNAs (miRNAs are a group of endogenous small non-coding RNAs that play important roles in plant growth, development, and stress response processes. Verticillium wilt is a vascular disease in plants mainly caused by Verticillium dahliae Kleb., the soil-borne fungal pathogen. However, the role of miRNAs in the regulation of Verticillium defense responses is mostly unknown. This study aimed to identify new miRNAs and their potential targets that are involved in the regulation of Verticillium defense responses. Four small RNA libraries and two degradome libraries from mock-infected and infected roots of cotton (both Gossypium hirsutum L. and Gossypium barbadense L. were constructed for deep sequencing. A total of 140 known miRNAs and 58 novel miRNAs were identified. Among the identified miRNAs, many were differentially expressed between libraries. Degradome analysis showed that a total of 83 and 24 genes were the targets of 31 known and 14 novel miRNA families, respectively. Gene Ontology analysis indicated that many of the identified miRNA targets may function in controlling root development and the regulation of Verticillium defense responses in cotton. Our findings provide an overview of potential miRNAs involved in the regulation of Verticillium defense responses in cotton and the interactions between miRNAs and their corresponding targets. The profiling of these miRNAs lays the foundation for further understanding of the function of small RNAs in regulating plant response to fungal infection and Verticillium wilt in particular.

  19. Fluorescent foci quantitation for high-throughput analysis

    Directory of Open Access Journals (Sweden)

    Elena Ledesma-Fernández

    2015-06-01

    Full Text Available A number of cellular proteins localize to discrete foci within cells, for example DNA repair proteins, microtubule organizing centers, P bodies or kinetochores. It is often possible to measure the fluorescence emission from tagged proteins within these foci as a surrogate for the concentration of that specific protein. We wished to develop tools that would allow quantitation of fluorescence foci intensities in high-throughput studies. As proof of principle we have examined the kinetochore, a large multi-subunit complex that is critical for the accurate segregation of chromosomes during cell division. Kinetochore perturbations lead to aneuploidy, which is a hallmark of cancer cells. Hence, understanding kinetochore homeostasis and regulation are important for a global understanding of cell division and genome integrity. The 16 budding yeast kinetochores colocalize within the nucleus to form a single focus. Here we have created a set of freely-available tools to allow high-throughput quantitation of kinetochore foci fluorescence. We use this ‘FociQuant’ tool to compare methods of kinetochore quantitation and we show proof of principle that FociQuant can be used to identify changes in kinetochore protein levels in a mutant that affects kinetochore function. This analysis can be applied to any protein that forms discrete foci in cells.

  20. The main challenges that remain in applying high-throughput sequencing to clinical diagnostics.

    Science.gov (United States)

    Loeffelholz, Michael; Fofanov, Yuriy

    2015-01-01

    Over the last 10 years, the quality, price and availability of high-throughput sequencing instruments have improved to the point that this technology may be close to becoming a routine tool in the diagnostic microbiology laboratory. Two groups of challenges, however, have to be resolved in order to move this powerful research technology into routine use in the clinical microbiology laboratory. The computational/bioinformatics challenges include data storage cost and privacy concerns, requiring analysis to be performed without access to cloud storage or expensive computational infrastructure. The logistical challenges include interpretation of complex results and acceptance and understanding of the advantages and limitations of this technology by the medical community. This article focuses on the approaches to address these challenges, such as file formats, algorithms, data collection, reporting and good laboratory practices.

  1. Barcoding the food chain: from Sanger to high-throughput sequencing.

    Science.gov (United States)

    Littlefair, Joanne E; Clare, Elizabeth L

    2016-11-01

    Society faces the complex challenge of supporting biodiversity and ecosystem functioning, while ensuring food security by providing safe traceable food through an ever-more-complex global food chain. The increase in human mobility brings the added threat of pests, parasites, and invaders that further complicate our agro-industrial efforts. DNA barcoding technologies allow researchers to identify both individual species, and, when combined with universal primers and high-throughput sequencing techniques, the diversity within mixed samples (metabarcoding). These tools are already being employed to detect market substitutions, trace pests through the forensic evaluation of trace "environmental DNA", and to track parasitic infections in livestock. The potential of DNA barcoding to contribute to increased security of the food chain is clear, but challenges remain in regulation and the need for validation of experimental analysis. Here, we present an overview of the current uses and challenges of applied DNA barcoding in agriculture, from agro-ecosystems within farmland to the kitchen table.

  2. Discovery of viruses and virus-like pathogens in pistachio using high-throughput sequencing

    Science.gov (United States)

    Pistachio (Pistacia vera L.) trees from the National Clonal Germplasm Repository (NCGR) and orchards in California were surveyed for viruses and virus-like agents by high-throughput sequencing (HTS). Analyses of 60 trees including clonal UCB-1 hybrid rootstock (P. atlantica × P. integerrima) identif...

  3. Alignment of high-throughput sequencing data inside in-memory databases.

    Science.gov (United States)

    Firnkorn, Daniel; Knaup-Gregori, Petra; Lorenzo Bermejo, Justo; Ganzinger, Matthias

    2014-01-01

    In times of high-throughput DNA sequencing techniques, performance-capable analysis of DNA sequences is of high importance. Computer supported DNA analysis is still an intensive time-consuming task. In this paper we explore the potential of a new In-Memory database technology by using SAP's High Performance Analytic Appliance (HANA). We focus on read alignment as one of the first steps in DNA sequence analysis. In particular, we examined the widely used Burrows-Wheeler Aligner (BWA) and implemented stored procedures in both, HANA and the free database system MySQL, to compare execution time and memory management. To ensure that the results are comparable, MySQL has been running in memory as well, utilizing its integrated memory engine for database table creation. We implemented stored procedures, containing exact and inexact searching of DNA reads within the reference genome GRCh37. Due to technical restrictions in SAP HANA concerning recursion, the inexact matching problem could not be implemented on this platform. Hence, performance analysis between HANA and MySQL was made by comparing the execution time of the exact search procedures. Here, HANA was approximately 27 times faster than MySQL which means, that there is a high potential within the new In-Memory concepts, leading to further developments of DNA analysis procedures in the future.

  4. Genome-wide transcriptome analysis between small-tail Han sheep and the Surabaya fur sheep using high-throughput RNA sequencing.

    Science.gov (United States)

    Miao, Xiangyang; Luo, Qingmiao

    2013-06-01

    The small-tail Han sheep and the Surabaya fur sheep are two local breeds in north China, which are characterized by high-fecundity and low-prolificacy breed respectively. Significant genetic differences between these two breeds have provided increasing interests in the identification and utilization of major prolificacy genes in these sheep. High prolificacy is a complex trait, and it is difficult to comprehensively identify the candidate genes related to this trait using the single molecular biology technique. To understand the molecular mechanisms of fecundity and provide more information about high prolificacy candidate genes in high- and low-fecundity sheep, we explored the utility of next-generation sequencing technology in this work. A total of 1.8 Gb sequencing reads were obtained and resulted in more than 20 000 contigs that averaged ∼300 bp in length. Ten differentially expressed genes were further verified by quantitative real-time RT-PCR to confirm the reliability of RNA-seq results. Our work will provide a basis for the future research of the sheep reproduction.

  5. Bioassessment of a Drinking Water Reservoir Using Plankton: High Throughput Sequencing vs. Traditional Morphological Method

    Directory of Open Access Journals (Sweden)

    Wanli Gao

    2018-01-01

    Full Text Available Drinking water safety is increasingly perceived as one of the top global environmental issues. Plankton has been commonly used as a bioindicator for water quality in lakes and reservoirs. Recently, DNA sequencing technology has been applied to bioassessment. In this study, we compared the effectiveness of the 16S and 18S rRNA high throughput sequencing method (HTS and the traditional optical microscopy method (TOM in the bioassessment of drinking water quality. Five stations reflecting different habitats and hydrological conditions in Danjiangkou Reservoir, one of the largest drinking water reservoirs in Asia, were sampled May 2016. Non-metric multi-dimensional scaling (NMDS analysis showed that plankton assemblages varied among the stations and the spatial patterns revealed by the two methods were consistent. The correlation between TOM and HTS in a symmetric Procrustes analysis was 0.61, revealing overall good concordance between the two methods. Procrustes analysis also showed that site-specific differences between the two methods varied among the stations. Station Heijizui (H, a site heavily influenced by two tributaries, had the largest difference while station Qushou (Q, a confluence site close to the outlet dam, had the smallest difference between the two methods. Our results show that DNA sequencing has the potential to provide consistent identification of taxa, and reliable bioassessment in a long-term biomonitoring and assessment program for drinking water reservoirs.

  6. Evaluation of a pooled strategy for high-throughput sequencing of cosmid clones from metagenomic libraries.

    Science.gov (United States)

    Lam, Kathy N; Hall, Michael W; Engel, Katja; Vey, Gregory; Cheng, Jiujun; Neufeld, Josh D; Charles, Trevor C

    2014-01-01

    High-throughput sequencing methods have been instrumental in the growing field of metagenomics, with technological improvements enabling greater throughput at decreased costs. Nonetheless, the economy of high-throughput sequencing cannot be fully leveraged in the subdiscipline of functional metagenomics. In this area of research, environmental DNA is typically cloned to generate large-insert libraries from which individual clones are isolated, based on specific activities of interest. Sequence data are required for complete characterization of such clones, but the sequencing of a large set of clones requires individual barcode-based sample preparation; this can become costly, as the cost of clone barcoding scales linearly with the number of clones processed, and thus sequencing a large number of metagenomic clones often remains cost-prohibitive. We investigated a hybrid Sanger/Illumina pooled sequencing strategy that omits barcoding altogether, and we evaluated this strategy by comparing the pooled sequencing results to reference sequence data obtained from traditional barcode-based sequencing of the same set of clones. Using identity and coverage metrics in our evaluation, we show that pooled sequencing can generate high-quality sequence data, without producing problematic chimeras. Though caveats of a pooled strategy exist and further optimization of the method is required to improve recovery of complete clone sequences and to avoid circumstances that generate unrecoverable clone sequences, our results demonstrate that pooled sequencing represents an effective and low-cost alternative for sequencing large sets of metagenomic clones.

  7. Genome-wide identification and comparative analysis of conserved and novel microRNAs in grafted watermelon by high-throughput sequencing.

    Science.gov (United States)

    Liu, Na; Yang, Jinghua; Guo, Shaogui; Xu, Yong; Zhang, Mingfang

    2013-01-01

    MicroRNAs (miRNAs) are a class of endogenous small non-coding RNAs involved in the post-transcriptional gene regulation and play a critical role in plant growth, development and stresses response. However less is known about miRNAs involvement in grafting behaviors, especially with the watermelon (Citrullus lanatus L.) crop, which is one of the most important agricultural crops worldwide. Grafting method is commonly used in watermelon production in attempts to improve its adaptation to abiotic and biotic stresses, in particular to the soil-borne fusarium wilt disease. In this study, Solexa sequencing has been used to discover small RNA populations and compare miRNAs on genome-wide scale in watermelon grafting system. A total of 11,458,476, 11,614,094 and 9,339,089 raw reads representing 2,957,751, 2,880,328 and 2,964,990 unique sequences were obtained from the scions of self-grafted watermelon and watermelon grafted on-to bottle gourd and squash at two true-leaf stage, respectively. 39 known miRNAs belonging to 30 miRNA families and 80 novel miRNAs were identified in our small RNA dataset. Compared with self-grafted watermelon, 20 (5 known miRNA families and 15 novel miRNAs) and 47 (17 known miRNA families and 30 novel miRNAs) miRNAs were expressed significantly different in watermelon grafted on to bottle gourd and squash, respectively. MiRNAs expressed differentially when watermelon was grafted onto different rootstocks, suggesting that miRNAs might play an important role in diverse biological and metabolic processes in watermelon and grafting may possibly by changing miRNAs expressions to regulate plant growth and development as well as adaptation to stresses. The small RNA transcriptomes obtained in this study provided insights into molecular aspects of miRNA-mediated regulation in grafted watermelon. Obviously, this result would provide a basis for further unravelling the mechanism on how miRNAs information is exchanged between scion and rootstock in grafted

  8. HTSeq--a Python framework to work with high-throughput sequencing data.

    Science.gov (United States)

    Anders, Simon; Pyl, Paul Theodor; Huber, Wolfgang

    2015-01-15

    A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed. We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes. HTSeq is released as an open-source software under the GNU General Public Licence and available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq. © The Author 2014. Published by Oxford University Press.

  9. The application of the high throughput sequencing technology in the transposable elements.

    Science.gov (United States)

    Liu, Zhen; Xu, Jian-hong

    2015-09-01

    High throughput sequencing technology has dramatically improved the efficiency of DNA sequencing, and decreased the costs to a great extent. Meanwhile, this technology usually has advantages of better specificity, higher sensitivity and accuracy. Therefore, it has been applied to the research on genetic variations, transcriptomics and epigenomics. Recently, this technology has been widely employed in the studies of transposable elements and has achieved fruitful results. In this review, we summarize the application of high throughput sequencing technology in the fields of transposable elements, including the estimation of transposon content, preference of target sites and distribution, insertion polymorphism and population frequency, identification of rare copies, transposon horizontal transfers as well as transposon tagging. We also briefly introduce the major common sequencing strategies and algorithms, their advantages and disadvantages, and the corresponding solutions. Finally, we envision the developing trends of high throughput sequencing technology, especially the third generation sequencing technology, and its application in transposon studies in the future, hopefully providing a comprehensive understanding and reference for related scientific researchers.

  10. Targeted Capture and High-Throughput Sequencing Using Molecular Inversion Probes (MIPs).

    Science.gov (United States)

    Cantsilieris, Stuart; Stessman, Holly A; Shendure, Jay; Eichler, Evan E

    2017-01-01

    Molecular inversion probes (MIPs) in combination with massively parallel DNA sequencing represent a versatile, yet economical tool for targeted sequencing of genomic DNA. Several thousand genomic targets can be selectively captured using long oligonucleotides containing unique targeting arms and universal linkers. The ability to append sequencing adaptors and sample-specific barcodes allows large-scale pooling and subsequent high-throughput sequencing at relatively low cost per sample. Here, we describe a "wet bench" protocol detailing the capture and subsequent sequencing of >2000 genomic targets from 192 samples, representative of a single lane on the Illumina HiSeq 2000 platform.

  11. High-throughput sequencing of three Lemnoideae (duckweeds chloroplast genomes from total DNA.

    Directory of Open Access Journals (Sweden)

    Wenqin Wang

    Full Text Available BACKGROUND: Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaption for aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily of Lemnoideae represents such a collection of aquatic species that because of photosynthesis represents one of the fastest growing plant species on earth. METHODS: We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. CONCLUSIONS: This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. Over 1,000-time coverage of chloroplast from total DNA were reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution, abundant deletions and insertions occurred in non-coding regions of these genomes, indicating a greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Noticeably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power.

  12. High-throughput transcriptome sequencing analysis provides preliminary insights into the biotransformation mechanism of Rhodopseudomonas palustris treated with alpha-rhamnetin-3-rhamnoside.

    Science.gov (United States)

    Bi, Lei; Guan, Chun-jie; Yang, Guan-e; Yang, Fei; Yan, Hong-yu; Li, Qing-shan

    2016-04-01

    The purple photosynthetic bacterium Rhodopseudomonas palustris has been widely applied to enhance the therapeutic effects of traditional Chinese medicine using novel biotransformation technology. However, comprehensive studies of the R. palustris biotransformation mechanism are rare. Therefore, investigation of the expression patterns of genes involved in metabolic pathways that are active during the biotransformation process is essential to elucidate this complicated mechanism. To promote further study of the biotransformation of R. palustris, we assembled all R. palustris transcripts using Trinity software and performed differential expression analysis of the resulting unigenes. A total of 9725, 7341 and 10,963 unigenes were obtained by assembling the alpha-rhamnetin-3-rhamnoside-treated R. palustris (RPB) reads, control R. palustris (RPS) reads and combined RPB&RPS reads, respectively. A total of 9971 unigenes assembled from the RPB&RPS reads were mapped to the nr, nt, Swiss-Prot, Gene Ontology (GO), Clusters of Orthologous Groups (COGs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) (E-value biotransformation in R. palustris. Furthermore, we propose two putative ARR biotransformation mechanisms in R. palustris. These analytical results represent a useful genomic resource for in-depth research into the molecular basis of biotransformation and genetic modification in R. palustris. Copyright © 2016 Elsevier GmbH. All rights reserved.

  13. High-Throughput Analysis and Automation for Glycomics Studies

    NARCIS (Netherlands)

    Shubhakar, A.; Reiding, K.R.; Gardner, R.A.; Spencer, D.I.R.; Fernandes, D.L.; Wuhrer, M.

    2015-01-01

    This review covers advances in analytical technologies for high-throughput (HTP) glycomics. Our focus is on structural studies of glycoprotein glycosylation to support biopharmaceutical realization and the discovery of glycan biomarkers for human disease. For biopharmaceuticals, there is increasing

  14. MIPHENO: Data normalization for high throughput metabolic analysis.

    Science.gov (United States)

    High throughput methodologies such as microarrays, mass spectrometry and plate-based small molecule screens are increasingly used to facilitate discoveries from gene function to drug candidate identification. These large-scale experiments are typically carried out over the course...

  15. Bacterial diversity of the Colombian fermented milk "Suero Costeño" assessed by culturing and high-throughput sequencing and DGGE analysis of 16S rRNA gene amplicons.

    Science.gov (United States)

    Motato, Karina Edith; Milani, Christian; Ventura, Marco; Valencia, Francia Elena; Ruas-Madiedo, Patricia; Delgado, Susana

    2017-12-01

    "Suero Costeño" (SC) is a traditional soured cream elaborated from raw milk in the Northern-Caribbean coast of Colombia. The natural microbiota that characterizes this popular Colombian fermented milk is unknown, although several culturing studies have previously been attempted. In this work, the microbiota associated with SC from three manufacturers in two regions, "Planeta Rica" (Córdoba) and "Caucasia" (Antioquia), was analysed by means of culturing methods in combination with high-throughput sequencing and DGGE analysis of 16S rRNA gene amplicons. The bacterial ecosystem of SC samples was revealed to be composed of lactic acid bacteria belonging to the Streptococcaceae and Lactobacillaceae families; the proportions and genera varying among manufacturers and region of elaboration. Members of the Lactobacillus acidophilus group, Lactocococcus lactis, Streptococcus infantarius and Streptococcus salivarius characterized this artisanal product. In comparison with culturing, the use of molecular in deep culture-independent techniques provides a more realistic picture of the overall bacterial communities residing in SC. Besides the descriptive purpose, these approaches will facilitate a rational strategy to follow (culture media and growing conditions) for the isolation of indigenous strains that allow standardization in the manufacture of SC. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. High-throughput Sequencing Based Immune Repertoire Study during Infectious Disease

    Directory of Open Access Journals (Sweden)

    Dongni Hou

    2016-08-01

    Full Text Available The selectivity of the adaptive immune response is based on the enormous diversity of T and B cell antigen-specific receptors. The immune repertoire, the collection of T and B cells with functional diversity in the circulatory system at any given time, is dynamic and reflects the essence of immune selectivity. In this article, we review the recent advances in immune repertoire study of infectious diseases that achieved by traditional techniques and high-throughput sequencing techniques. High-throughput sequencing techniques enable the determination of complementary regions of lymphocyte receptors with unprecedented efficiency and scale. This progress in methodology enhances the understanding of immunologic changes during pathogen challenge, and also provides a basis for further development of novel diagnostic markers, immunotherapies and vaccines.

  17. Sources of PCR-induced distortions in high-throughput sequencing data sets

    Science.gov (United States)

    Kebschull, Justus M.; Zador, Anthony M.

    2015-01-01

    PCR permits the exponential and sequence-specific amplification of DNA, even from minute starting quantities. PCR is a fundamental step in preparing DNA samples for high-throughput sequencing. However, there are errors associated with PCR-mediated amplification. Here we examine the effects of four important sources of error—bias, stochasticity, template switches and polymerase errors—on sequence representation in low-input next-generation sequencing libraries. We designed a pool of diverse PCR amplicons with a defined structure, and then used Illumina sequencing to search for signatures of each process. We further developed quantitative models for each process, and compared predictions of these models to our experimental data. We find that PCR stochasticity is the major force skewing sequence representation after amplification of a pool of unique DNA amplicons. Polymerase errors become very common in later cycles of PCR but have little impact on the overall sequence distribution as they are confined to small copy numbers. PCR template switches are rare and confined to low copy numbers. Our results provide a theoretical basis for removing distortions from high-throughput sequencing data. In addition, our findings on PCR stochasticity will have particular relevance to quantification of results from single cell sequencing, in which sequences are represented by only one or a few molecules. PMID:26187991

  18. High-throughput sequencing of forensic genetic samples using punches of FTA cards with buccal swabs.

    Science.gov (United States)

    Kampmann, Marie-Louise; Buchard, Anders; Børsting, Claus; Morling, Niels

    2016-01-01

    Here, we demonstrate that punches from buccal swab samples preserved on FTA cards can be used for high-throughput DNA sequencing, also known as massively parallel sequencing (MPS). We typed 44 reference samples with the HID-Ion AmpliSeq Identity Panel using washed 1.2 mm punches from FTA cards with buccal swabs and compared the results with those obtained with DNA extracted using the EZ1 DNA Investigator Kit. Concordant profiles were obtained for all samples. Our protocol includes simple punch, wash, and PCR steps, reducing cost and hands-on time in the laboratory. Furthermore, it facilitates automation of DNA sequencing.

  19. High-Throughput Analysis and Automation for Glycomics Studies.

    Science.gov (United States)

    Shubhakar, Archana; Reiding, Karli R; Gardner, Richard A; Spencer, Daniel I R; Fernandes, Daryl L; Wuhrer, Manfred

    This review covers advances in analytical technologies for high-throughput (HTP) glycomics. Our focus is on structural studies of glycoprotein glycosylation to support biopharmaceutical realization and the discovery of glycan biomarkers for human disease. For biopharmaceuticals, there is increasing use of glycomics in Quality by Design studies to help optimize glycan profiles of drugs with a view to improving their clinical performance. Glycomics is also used in comparability studies to ensure consistency of glycosylation both throughout product development and between biosimilars and innovator drugs. In clinical studies there is as well an expanding interest in the use of glycomics-for example in Genome Wide Association Studies-to follow changes in glycosylation patterns of biological tissues and fluids with the progress of certain diseases. These include cancers, neurodegenerative disorders and inflammatory conditions. Despite rising activity in this field, there are significant challenges in performing large scale glycomics studies. The requirement is accurate identification and quantitation of individual glycan structures. However, glycoconjugate samples are often very complex and heterogeneous and contain many diverse branched glycan structures. In this article we cover HTP sample preparation and derivatization methods, sample purification, robotization, optimized glycan profiling by UHPLC, MS and multiplexed CE, as well as hyphenated techniques and automated data analysis tools. Throughout, we summarize the advantages and challenges with each of these technologies. The issues considered include reliability of the methods for glycan identification and quantitation, sample throughput, labor intensity, and affordability for large sample numbers.

  20. High-throughput metagenomic technologies for complex microbial community analysis: open and closed formats.

    Science.gov (United States)

    Zhou, Jizhong; He, Zhili; Yang, Yunfeng; Deng, Ye; Tringe, Susannah G; Alvarez-Cohen, Lisa

    2015-01-27

    Understanding the structure, functions, activities and dynamics of microbial communities in natural environments is one of the grand challenges of 21st century science. To address this challenge, over the past decade, numerous technologies have been developed for interrogating microbial communities, of which some are amenable to exploratory work (e.g., high-throughput sequencing and phenotypic screening) and others depend on reference genes or genomes (e.g., phylogenetic and functional gene arrays). Here, we provide a critical review and synthesis of the most commonly applied "open-format" and "closed-format" detection technologies. We discuss their characteristics, advantages, and disadvantages within the context of environmental applications and focus on analysis of complex microbial systems, such as those in soils, in which diversity is high and reference genomes are few. In addition, we discuss crucial issues and considerations associated with applying complementary high-throughput molecular technologies to address important ecological questions. Copyright © 2015 Zhou et al.

  1. Association Study of Gut Flora in Coronary Heart Disease through High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Li Cui

    2017-01-01

    Full Text Available Objectives. We aimed to explore the impact of gut microbiota in coronary heart disease (CHD patients through high-throughput sequencing. Methods. A total of 29 CHD in-hospital patients and 35 healthy volunteers as controls were included. Nucleic acids were extracted from fecal samples, followed by α diversity and principal coordinate analysis (PCoA. Based on unweighted UniFrac distance matrices, unweighted-pair group method with arithmetic mean (UPGMA trees were created. Results. After data optimization, an average of 121312±19293 reads in CHD patients and 234372±108725 reads in controls was obtained. Reads corresponding to 38 phyla, 90 classes, and 584 genera were detected in CHD patients, whereas 40 phyla, 99 classes, and 775 genera were detected in controls. The proportion of phylum Bacteroidetes (56.12% was lower and that of phylum Firmicutes was higher (37.06% in CHD patients than those in the controls (60.92% and 32.06%, P<0.05. PCoA and UPGMA tree analysis showed that there were significant differences of gut microbial compositions between the two groups. Conclusion. The diversity and compositions of gut flora were different between CHD patients and healthy controls. The incidence of CHD might be associated with the alteration of gut microbiota.

  2. Association Study of Gut Flora in Coronary Heart Disease through High-Throughput Sequencing.

    Science.gov (United States)

    Cui, Li; Zhao, Tingting; Hu, Haibing; Zhang, Wen; Hua, Xiuguo

    2017-01-01

    Objectives. We aimed to explore the impact of gut microbiota in coronary heart disease (CHD) patients through high-throughput sequencing. Methods. A total of 29 CHD in-hospital patients and 35 healthy volunteers as controls were included. Nucleic acids were extracted from fecal samples, followed by α diversity and principal coordinate analysis (PCoA). Based on unweighted UniFrac distance matrices, unweighted-pair group method with arithmetic mean (UPGMA) trees were created. Results. After data optimization, an average of 121312 ± 19293 reads in CHD patients and 234372 ± 108725 reads in controls was obtained. Reads corresponding to 38 phyla, 90 classes, and 584 genera were detected in CHD patients, whereas 40 phyla, 99 classes, and 775 genera were detected in controls. The proportion of phylum Bacteroidetes (56.12%) was lower and that of phylum Firmicutes was higher (37.06%) in CHD patients than those in the controls (60.92% and 32.06%, P UPGMA tree analysis showed that there were significant differences of gut microbial compositions between the two groups. Conclusion. The diversity and compositions of gut flora were different between CHD patients and healthy controls. The incidence of CHD might be associated with the alteration of gut microbiota.

  3. The efficacy of high-throughput sequencing and target enrichment on charred archaeobotanical remains

    DEFF Research Database (Denmark)

    Nistelberger, H. M.; Smith, O.; Wales, Nathan

    2016-01-01

    . It has been suggested that high-throughput sequencing (HTS) technologies coupled with DNA enrichment techniques may overcome some of these limitations. Here we report the findings of HTS and target enrichment on four important archaeological crops (barley, grape, maize and rice) performed in three...... lightly-charred maize cob. Even with target enrichment, this sample failed to yield adequate data required to address fundamental questions in archaeology and biology. We further reanalysed part of an existing dataset on charred plant material, and found all purported endogenous DNA sequences were likely...

  4. Evaluation of the microbial diversity in amyotrophic lateral sclerosis using high-throughput sequencing

    Directory of Open Access Journals (Sweden)

    Xin Fang

    2016-09-01

    Full Text Available More and more evidences indicate that diseases of the central nervous system (CNS have been seriously affected by faecal microbes. However, little work is done to explore interaction between amyotrophic lateral sclerosis (ALS and faecal microbes. In the present study, high-throughput sequencing method was used to compare the intestinal microbial diversity of healthy people and ALS patients. The principal coordinate analysis (PCoA, Venn and unweighted pair-group method using arithmetic averages (UPGMA showed an obvious microbial changes between healthy people (group H and ALS patients (group A, and the average ratios of Bacteroides, Faecalibacterium, Anaerostipes, Prevotella, Escherichia and Lachnospira at genus level between ALS patients and healthy people were 0.78, 2.18, 3.41, 0.35, 0.79 and 13.07. Furthermore, the decreased Firmicutes/Bacteroidetes ratio at phylum level using LEfSE (LDA >4.0, together with the significant increased genus Dorea (harmful microorganisms and significant reduced genus Oscillibacter, Anaerostipes, Lachnospiraceae (beneficial microorganisms in ALS patients, indicated that the imbalance in intestinal microflora constitution had a strong association with the pathogenesis of ALS.

  5. Evaluation of the Microbial Diversity in Amyotrophic Lateral Sclerosis Using High-Throughput Sequencing.

    Science.gov (United States)

    Fang, Xin; Wang, Xin; Yang, Shaoguo; Meng, Fanjing; Wang, Xiaolei; Wei, Hua; Chen, Tingtao

    2016-01-01

    More and more evidences indicate that diseases of the central nervous system have been seriously affected by fecal microbes. However, little work is done to explore interaction between amyotrophic lateral sclerosis (ALS) and fecal microbes. In the present study, high-throughput sequencing method was used to compare the intestinal microbial diversity of healthy people and ALS patients. The principal coordinate analysis, Venn and unweighted pair-group method using arithmetic averages (UPGMA) showed an obvious microbial changes between healthy people (group H) and ALS patients (group A), and the average ratios of Bacteroides , Faecalibacterium , Anaerostipes , Prevotella , Escherichia , and Lachnospira at genus level between ALS patients and healthy people were 0.78, 2.18, 3.41, 0.35, 0.79, and 13.07. Furthermore, the decreased Firmicutes/Bacteroidetes ratio at phylum level using LEfSE (LDA > 4.0), together with the significant increased genus Dorea (harmful microorganisms) and significant reduced genus Oscillibacter , Anaerostipes , Lachnospiraceae (beneficial microorganisms) in ALS patients, indicated that the imbalance in intestinal microflora constitution had a strong association with the pathogenesis of ALS.

  6. High-throughput sequencing of plasma microRNA in chronic fatigue syndrome/myalgic encephalomyelitis.

    Directory of Open Access Journals (Sweden)

    Ekua W Brenu

    Full Text Available BACKGROUND: MicroRNAs (miRNAs are known to regulate many biological processes and their dysregulation has been associated with a variety of diseases including Chronic Fatigue Syndrome/Myalgic Encephalomyelitis (CFS/ME. The recent discovery of stable and reproducible miRNA in plasma has raised the possibility that circulating miRNAs may serve as novel diagnostic markers. The objective of this study was to determine the role of plasma miRNA in CFS/ME. RESULTS: Using Illumina high-throughput sequencing we identified 19 miRNAs that were differentially expressed in the plasma of CFS/ME patients in comparison to non-fatigued controls. Following RT-qPCR analysis, we were able to confirm the significant up-regulation of three miRNAs (hsa-miR-127-3p, hsa-miR-142-5p and hsa-miR-143-3p in the CFS/ME patients. CONCLUSION: Our study is the first to identify circulating miRNAs from CFS/ME patients and also to confirm three differentially expressed circulating miRNAs in CFS/ME patients, providing a basis for further study to find useful CFS/ME biomarkers.

  7. High-Throughput Sequencing of Plasma MicroRNA in Chronic Fatigue Syndrome/Myalgic Encephalomyelitis

    Science.gov (United States)

    Brenu, Ekua W.; Ashton, Kevin J.; Batovska, Jana; Staines, Donald R.; Marshall-Gradisnik, Sonya M.

    2014-01-01

    Background MicroRNAs (miRNAs) are known to regulate many biological processes and their dysregulation has been associated with a variety of diseases including Chronic Fatigue Syndrome/Myalgic Encephalomyelitis (CFS/ME). The recent discovery of stable and reproducible miRNA in plasma has raised the possibility that circulating miRNAs may serve as novel diagnostic markers. The objective of this study was to determine the role of plasma miRNA in CFS/ME. Results Using Illumina high-throughput sequencing we identified 19 miRNAs that were differentially expressed in the plasma of CFS/ME patients in comparison to non-fatigued controls. Following RT-qPCR analysis, we were able to confirm the significant up-regulation of three miRNAs (hsa-miR-127-3p, hsa-miR-142-5p and hsa-miR-143-3p) in the CFS/ME patients. Conclusion Our study is the first to identify circulating miRNAs from CFS/ME patients and also to confirm three differentially expressed circulating miRNAs in CFS/ME patients, providing a basis for further study to find useful CFS/ME biomarkers. PMID:25238588

  8. On the optimal trimming of high-throughput mRNA sequence data

    Directory of Open Access Journals (Sweden)

    Matthew D MacManes

    2014-01-01

    Full Text Available The widespread and rapid adoption of high-throughput sequencing technologies has afforded researchers the opportunity to gain a deep understanding of genome level processes that underlie evolutionary change, and perhaps more importantly, the links between genotype and phenotype. In particular, researchers interested in functional biology and adaptation have used these technologies to sequence mRNA transcriptomes of specific tissues, which in turn are often compared to other tissues, or other individuals with different phenotypes. While these techniques are extremely powerful, careful attention to data quality is required. In particular, because high-throughput sequencing is more error-prone than traditional Sanger sequencing, quality trimming of sequence reads should be an important step in all data processing pipelines. While several software packages for quality trimming exist, no general guidelines for the specifics of trimming have been developed. Here, using empirically derived sequence data, I provide general recommendations regarding the optimal strength of trimming, specifically in mRNA-Seq studies. Although very aggressive quality trimming is common, this study suggests that a more gentle trimming, specifically of those nucleotides whose Phred score < 2 or < 5, is optimal for most studies across a wide variety of metrics.

  9. Genetic profiles of cervical tumors by high-throughput sequencing for personalized medical care

    International Nuclear Information System (INIS)

    Muller, Etienne; Brault, Baptiste; Holmes, Allyson; Legros, Angelina; Jeannot, Emmanuelle; Campitelli, Maura; Rousselin, Antoine; Goardon, Nicolas; Frébourg, Thierry; Krieger, Sophie; Crouet, Hubert; Nicolas, Alain; Sastre, Xavier; Vaur, Dominique; Castéra, Laurent

    2015-01-01

    Cancer treatment is facing major evolution since the advent of targeted therapies. Building genetic profiles could predict sensitivity or resistance to these therapies and highlight disease-specific abnormalities, supporting personalized patient care. In the context of biomedical research and clinical diagnosis, our laboratory has developed an oncogenic panel comprised of 226 genes and a dedicated bioinformatic pipeline to explore somatic mutations in cervical carcinomas, using high-throughput sequencing. Twenty-nine tumors were sequenced for exons within 226 genes. The automated pipeline used includes a database and a filtration system dedicated to identifying mutations of interest and excluding false positive and germline mutations. One-hundred and seventy-six total mutational events were found among the 29 tumors. Our cervical tumor mutational landscape shows that most mutations are found in PIK3CA (E545K, E542K) and KRAS (G12D, G13D) and others in FBXW7 (R465C, R505G, R479Q). Mutations have also been found in ALK (V1149L, A1266T) and EGFR (T259M). These results showed that 48% of patients display at least one deleterious mutation in genes that have been already targeted by the Food and Drug Administration approved therapies. Considering deleterious mutations, 59% of patients could be eligible for clinical trials. Sequencing hundreds of genes in a clinical context has become feasible, in terms of time and cost. In the near future, such an analysis could be a part of a battery of examinations along the diagnosis and treatment of cancer, helping to detect sensitivity or resistance to targeted therapies and allow advancements towards personalized oncology

  10. Diversity and Structure of Diazotrophic Communities in Mangrove Rhizosphere, Revealed by High-Throughput Sequencing.

    Science.gov (United States)

    Zhang, Yanying; Yang, Qingsong; Ling, Juan; Van Nostrand, Joy D; Shi, Zhou; Zhou, Jizhong; Dong, Junde

    2017-01-01

    Diazotrophic communities make an essential contribution to the productivity through providing new nitrogen. However, knowledge of the roles that both mangrove tree species and geochemical parameters play in shaping mangove rhizosphere diazotrophic communities is still elusive. Here, a comprehensive examination of the diversity and structure of microbial communities in the rhizospheres of three mangrove species, Rhizophora apiculata , Avicennia marina , and Ceriops tagal , was undertaken using high - throughput sequencing of the 16S rRNA and nifH genes. Our results revealed a great diversity of both the total microbial composition and the diazotrophic composition specifically in the mangrove rhizosphere. Deltaproteobacteria and Gammaproteobacteria were both ubiquitous and dominant, comprising an average of 45.87 and 86.66% of total microbial and diazotrophic communities, respectively. Sulfate-reducing bacteria belonging to the Desulfobacteraceae and Desulfovibrionaceae were the dominant diazotrophs. Community statistical analyses suggested that both mangrove tree species and additional environmental variables played important roles in shaping total microbial and potential diazotroph communities in mangrove rhizospheres. In contrast to the total microbial community investigated by analysis of 16S rRNA gene sequences, most of the dominant diazotrophic groups identified by nifH gene sequences were significantly different among mangrove species. The dominant diazotrophs of the family Desulfobacteraceae were positively correlated with total phosphorus, but negatively correlated with the nitrogen to phosphorus ratio. The Pseudomonadaceae were positively correlated with the concentration of available potassium, suggesting that diazotrophs potentially play an important role in biogeochemical cycles, such as those of nitrogen, phosphorus, sulfur, and potassium, in the mangrove ecosystem.

  11. Diversity and Structure of Diazotrophic Communities in Mangrove Rhizosphere, Revealed by High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Yanying Zhang

    2017-10-01

    Full Text Available Diazotrophic communities make an essential contribution to the productivity through providing new nitrogen. However, knowledge of the roles that both mangrove tree species and geochemical parameters play in shaping mangove rhizosphere diazotrophic communities is still elusive. Here, a comprehensive examination of the diversity and structure of microbial communities in the rhizospheres of three mangrove species, Rhizophora apiculata, Avicennia marina, and Ceriops tagal, was undertaken using high-throughput sequencing of the 16S rRNA and nifH genes. Our results revealed a great diversity of both the total microbial composition and the diazotrophic composition specifically in the mangrove rhizosphere. Deltaproteobacteria and Gammaproteobacteria were both ubiquitous and dominant, comprising an average of 45.87 and 86.66% of total microbial and diazotrophic communities, respectively. Sulfate-reducing bacteria belonging to the Desulfobacteraceae and Desulfovibrionaceae were the dominant diazotrophs. Community statistical analyses suggested that both mangrove tree species and additional environmental variables played important roles in shaping total microbial and potential diazotroph communities in mangrove rhizospheres. In contrast to the total microbial community investigated by analysis of 16S rRNA gene sequences, most of the dominant diazotrophic groups identified by nifH gene sequences were significantly different among mangrove species. The dominant diazotrophs of the family Desulfobacteraceae were positively correlated with total phosphorus, but negatively correlated with the nitrogen to phosphorus ratio. The Pseudomonadaceae were positively correlated with the concentration of available potassium, suggesting that diazotrophs potentially play an important role in biogeochemical cycles, such as those of nitrogen, phosphorus, sulfur, and potassium, in the mangrove ecosystem.

  12. Exploring fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing

    Science.gov (United States)

    Zhang, Xiao-Yong; Wang, Guang-Hua; Xu, Xin-Ya; Nong, Xu-Hua; Wang, Jie; Amin, Muhammad; Qi, Shu-Hua

    2016-10-01

    The present study investigated the fungal diversity in four different deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing of the nuclear ribosomal internal transcribed spacer-1 (ITS1). A total of 40,297 fungal ITS1 sequences clustered into 420 operational taxonomic units (OTUs) with 97% sequence similarity and 170 taxa were recovered from these sediments. Most ITS1 sequences (78%) belonged to the phylum Ascomycota, followed by Basidiomycota (17.3%), Zygomycota (1.5%) and Chytridiomycota (0.8%), and a small proportion (2.4%) belonged to unassigned fungal phyla. Compared with previous studies on fungal diversity of sediments from deep-sea environments by culture-dependent approach and clone library analysis, the present result suggested that Illumina sequencing had been dramatically accelerating the discovery of fungal community of deep-sea sediments. Furthermore, our results revealed that Sordariomycetes was the most diverse and abundant fungal class in this study, challenging the traditional view that the diversity of Sordariomycetes phylotypes was low in the deep-sea environments. In addition, more than 12 taxa accounted for 21.5% sequences were found to be rarely reported as deep-sea fungi, suggesting the deep-sea sediments from Okinawa Trough harbored a plethora of different fungal communities compared with other deep-sea environments. To our knowledge, this study is the first exploration of the fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing.

  13. Differential Expression and Functional Analysis of High-Throughput -Omics Data Using Open Source Tools.

    Science.gov (United States)

    Kebschull, Moritz; Fittler, Melanie Julia; Demmer, Ryan T; Papapanou, Panos N

    2017-01-01

    Today, -omics analyses, including the systematic cataloging of messenger RNA and microRNA sequences or DNA methylation patterns in a cell population, organ, or tissue sample, allow for an unbiased, comprehensive genome-level analysis of complex diseases, offering a large advantage over earlier "candidate" gene or pathway analyses. A primary goal in the analysis of these high-throughput assays is the detection of those features among several thousand that differ between different groups of samples. In the context of oral biology, our group has successfully utilized -omics technology to identify key molecules and pathways in different diagnostic entities of periodontal disease.A major issue when inferring biological information from high-throughput -omics studies is the fact that the sheer volume of high-dimensional data generated by contemporary technology is not appropriately analyzed using common statistical methods employed in the biomedical sciences.In this chapter, we outline a robust and well-accepted bioinformatics workflow for the initial analysis of -omics data generated using microarrays or next-generation sequencing technology using open-source tools. Starting with quality control measures and necessary preprocessing steps for data originating from different -omics technologies, we next outline a differential expression analysis pipeline that can be used for data from both microarray and sequencing experiments, and offers the possibility to account for random or fixed effects. Finally, we present an overview of the possibilities for a functional analysis of the obtained data.

  14. Whole-exome sequencing and high throughput genotyping identified KCNJ11 as the thirteenth MODY gene.

    Science.gov (United States)

    Bonnefond, Amélie; Philippe, Julien; Durand, Emmanuelle; Dechaume, Aurélie; Huyvaert, Marlène; Montagne, Louise; Marre, Michel; Balkau, Beverley; Fajardy, Isabelle; Vambergue, Anne; Vatin, Vincent; Delplanque, Jérôme; Le Guilcher, David; De Graeve, Franck; Lecoeur, Cécile; Sand, Olivier; Vaxillaire, Martine; Froguel, Philippe

    2012-01-01

    Maturity-onset of the young (MODY) is a clinically heterogeneous form of diabetes characterized by an autosomal-dominant mode of inheritance, an onset before the age of 25 years, and a primary defect in the pancreatic beta-cell function. Approximately 30% of MODY families remain genetically unexplained (MODY-X). Here, we aimed to use whole-exome sequencing (WES) in a four-generation MODY-X family to identify a new susceptibility gene for MODY. WES (Agilent-SureSelect capture/Illumina-GAIIx sequencing) was performed in three affected and one non-affected relatives in the MODY-X family. We then performed a high-throughput multiplex genotyping (Illumina-GoldenGate assay) of the putative causal mutations in the whole family and in 406 controls. A linkage analysis was also carried out. By focusing on variants of interest (i.e. gains of stop codon, frameshift, non-synonymous and splice-site variants not reported in dbSNP130) present in the three affected relatives and not present in the control, we found 69 mutations. However, as WES was not uniform between samples, a total of 324 mutations had to be assessed in the whole family and in controls. Only one mutation (p.Glu227Lys in KCNJ11) co-segregated with diabetes in the family (with a LOD-score of 3.68). No KCNJ11 mutation was found in 25 other MODY-X unrelated subjects. Beyond neonatal diabetes mellitus (NDM), KCNJ11 is also a MODY gene ('MODY13'), confirming the wide spectrum of diabetes related phenotypes due to mutations in NDM genes (i.e. KCNJ11, ABCC8 and INS). Therefore, the molecular diagnosis of MODY should include KCNJ11 as affected carriers can be ideally treated with oral sulfonylureas.

  15. Whole-exome sequencing and high throughput genotyping identified KCNJ11 as the thirteenth MODY gene.

    Directory of Open Access Journals (Sweden)

    Amélie Bonnefond

    Full Text Available BACKGROUND: Maturity-onset of the young (MODY is a clinically heterogeneous form of diabetes characterized by an autosomal-dominant mode of inheritance, an onset before the age of 25 years, and a primary defect in the pancreatic beta-cell function. Approximately 30% of MODY families remain genetically unexplained (MODY-X. Here, we aimed to use whole-exome sequencing (WES in a four-generation MODY-X family to identify a new susceptibility gene for MODY. METHODOLOGY: WES (Agilent-SureSelect capture/Illumina-GAIIx sequencing was performed in three affected and one non-affected relatives in the MODY-X family. We then performed a high-throughput multiplex genotyping (Illumina-GoldenGate assay of the putative causal mutations in the whole family and in 406 controls. A linkage analysis was also carried out. PRINCIPAL FINDINGS: By focusing on variants of interest (i.e. gains of stop codon, frameshift, non-synonymous and splice-site variants not reported in dbSNP130 present in the three affected relatives and not present in the control, we found 69 mutations. However, as WES was not uniform between samples, a total of 324 mutations had to be assessed in the whole family and in controls. Only one mutation (p.Glu227Lys in KCNJ11 co-segregated with diabetes in the family (with a LOD-score of 3.68. No KCNJ11 mutation was found in 25 other MODY-X unrelated subjects. CONCLUSIONS/SIGNIFICANCE: Beyond neonatal diabetes mellitus (NDM, KCNJ11 is also a MODY gene ('MODY13', confirming the wide spectrum of diabetes related phenotypes due to mutations in NDM genes (i.e. KCNJ11, ABCC8 and INS. Therefore, the molecular diagnosis of MODY should include KCNJ11 as affected carriers can be ideally treated with oral sulfonylureas.

  16. Combining Amplification Typing of L1 Active Subfamilies (ATLAS) with High-Throughput Sequencing.

    Science.gov (United States)

    Rahbari, Raheleh; Badge, Richard M

    2016-01-01

    With the advent of new generations of high-throughput sequencing technologies, the catalog of human genome variants created by retrotransposon activity is expanding rapidly. However, despite these advances in describing L1 diversity and the fact that L1 must retrotranspose in the germline or prior to germline partitioning to be evolutionarily successful, direct assessment of de novo L1 retrotransposition in the germline or early embryogenesis has not been achieved for endogenous L1 elements. A direct study of de novo L1 retrotransposition into susceptible loci within sperm DNA (Freeman et al., Hum Mutat 32(8):978-988, 2011) suggested that the rate of L1 retrotransposition in the germline is much lower than previously estimated (ATLAS L1 display technique (Badge et al., Am J Hum Genet 72(4):823-838, 2003) to investigate de novo L1 retrotransposition in human genomes. In this chapter, we describe how we combined a high-coverage ATLAS variant with high-throughput sequencing, achieving 11-25× sequence depth per single amplicon, to study L1 retrotransposition in whole genome amplified (WGA) DNAs.

  17. Improving High-Throughput Sequencing Approaches for Reconstructing the Evolutionary Dynamics of Upper Paleolithic Human Groups

    DEFF Research Database (Denmark)

    Seguin-Orlando, Andaine

    the development and testing of innovative molecular approaches aiming at improving the amount of informative HTS data one can recover from ancient DNA extracts. We have characterized important ligation and amplification biases in the sequencing library building and enrichment steps, which can impede further...... been mainly driven by the development of High-Throughput DNA Sequencing (HTS) technologies but also by the implementation of novel molecular tools tailored to the manipulation of ultra short and damaged DNA molecules. Our ability to retrieve traces of genetic material has tremendously improved, pushing......, that impact on the overall efficacy of the method. In a second part, we implemented some of these molecular tools to the processing of five Upper Paleolithic human samples from the Kostenki and Sunghir sites in Western Eurasia, in order to reconstruct the deep genomic history of European populations...

  18. Accurate molecular diagnosis of phenylketonuria and tetrahydrobiopterin-deficient hyperphenylalaninemias using high-throughput targeted sequencing

    Science.gov (United States)

    Trujillano, Daniel; Perez, Belén; González, Justo; Tornador, Cristian; Navarrete, Rosa; Escaramis, Georgia; Ossowski, Stephan; Armengol, Lluís; Cornejo, Verónica; Desviat, Lourdes R; Ugarte, Magdalena; Estivill, Xavier

    2014-01-01

    Genetic diagnostics of phenylketonuria (PKU) and tetrahydrobiopterin (BH4) deficient hyperphenylalaninemia (BH4DH) rely on methods that scan for known mutations or on laborious molecular tools that use Sanger sequencing. We have implemented a novel and much more efficient strategy based on high-throughput multiplex-targeted resequencing of four genes (PAH, GCH1, PTS, and QDPR) that, when affected by loss-of-function mutations, cause PKU and BH4DH. We have validated this approach in a cohort of 95 samples with the previously known PAH, GCH1, PTS, and QDPR mutations and one control sample. Pooled barcoded DNA libraries were enriched using a custom NimbleGen SeqCap EZ Choice array and sequenced using a HiSeq2000 sequencer. The combination of several robust bioinformatics tools allowed us to detect all known pathogenic mutations (point mutations, short insertions/deletions, and large genomic rearrangements) in the 95 samples, without detecting spurious calls in these genes in the control sample. We then used the same capture assay in a discovery cohort of 11 uncharacterized HPA patients using a MiSeq sequencer. In addition, we report the precise characterization of the breakpoints of four genomic rearrangements in PAH, including a novel deletion of 899 bp in intron 3. Our study is a proof-of-principle that high-throughput-targeted resequencing is ready to substitute classical molecular methods to perform differential genetic diagnosis of hyperphenylalaninemias, allowing the establishment of specifically tailored treatments a few days after birth. PMID:23942198

  19. High-throughput genome sequencing of two Listeria monocytogenes clinical isolates during a large foodborne outbreak

    Directory of Open Access Journals (Sweden)

    Trout-Yakel Keri M

    2010-02-01

    Full Text Available Abstract Background A large, multi-province outbreak of listeriosis associated with ready-to-eat meat products contaminated with Listeria monocytogenes serotype 1/2a occurred in Canada in 2008. Subtyping of outbreak-associated isolates using pulsed-field gel electrophoresis (PFGE revealed two similar but distinct AscI PFGE patterns. High-throughput pyrosequencing of two L. monocytogenes isolates was used to rapidly provide the genome sequence of the primary outbreak strain and to investigate the extent of genetic diversity associated with a change of a single restriction enzyme fragment during PFGE. Results The chromosomes were collinear, but differences included 28 single nucleotide polymorphisms (SNPs and three indels, including a 33 kbp prophage that accounted for the observed difference in AscI PFGE patterns. The distribution of these traits was assessed within further clinical, environmental and food isolates associated with the outbreak, and this comparison indicated that three distinct, but highly related strains may have been involved in this nationwide outbreak. Notably, these two isolates were found to harbor a 50 kbp putative mobile genomic island encoding translocation and efflux functions that has not been observed in other Listeria genomes. Conclusions High-throughput genome sequencing provided a more detailed real-time assessment of genetic traits characteristic of the outbreak strains than could be achieved with routine subtyping methods. This study confirms that the latest generation of DNA sequencing technologies can be applied during high priority public health events, and laboratories need to prepare for this inevitability and assess how to properly analyze and interpret whole genome sequences in the context of molecular epidemiology.

  20. Detection of genomic variation by selection of a 9 mb DNA region and high throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Sergey I Nikolaev

    Full Text Available Detection of the rare polymorphisms and causative mutations of genetic diseases in a targeted genomic area has become a major goal in order to understand genomic and phenotypic variability. We have interrogated repeat-masked regions of 8.9 Mb on human chromosomes 21 (7.8 Mb and 7 (1.1 Mb from an individual from the International HapMap Project (NA12872. We have optimized a method of genomic selection for high throughput sequencing. Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. 83% of SNPs in the targeted region had at least 4-fold sequence coverage and 54% at least 15-fold. When assaying HapMap SNPs in NA12872, our sequence genotypes are 91.3% concordant in regions with coverage > or = 4-fold, and 97.9% concordant in regions with coverage > or = 15-fold. About 81% of the SNPs recovered with both thresholds are listed in dbSNP. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15-20 fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.

  1. SEED 2: a user-friendly platform for amplicon high-throughput sequencing data analyses.

    Science.gov (United States)

    Vetrovský, Tomáš; Baldrian, Petr; Morais, Daniel; Berger, Bonnie

    2018-02-14

    Modern molecular methods have increased our ability to describe microbial communities. Along with the advances brought by new sequencing technologies, we now require intensive computational resources to make sense of the large numbers of sequences continuously produced. The software developed by the scientific community to address this demand, although very useful, require experience of the command-line environment, extensive training and have steep learning curves, limiting their use. We created SEED 2, a graphical user interface for handling high-throughput amplicon-sequencing data under Windows operating systems. SEED 2 is the only sequence visualizer that empowers users with tools to handle amplicon-sequencing data of microbial community markers. It is suitable for any marker genes sequences obtained through Illumina, IonTorrent or Sanger sequencing. SEED 2 allows the user to process raw sequencing data, identify specific taxa, produce of OTU-tables, create sequence alignments and construct phylogenetic trees. Standard dual core laptops with 8 GB of RAM can handle ca. 8 million of Illumina PE 300 bp sequences, ca. 4GB of data. SEED 2 was implemented in Object Pascal and uses internal functions and external software for amplicon data processing. SEED 2 is a freeware software, available at http://www.biomed.cas.cz/mbu/lbwrf/seed/ as a self-contained file, including all the dependencies, and does not require installation. Supplementary data contain a comprehensive list of supported functions. daniel.morais@biomed.cas.cz. Supplementary data are available at Bioinformatics online. © The Author(s) 2018. Published by Oxford University Press.

  2. SNP calling using genotype model selection on high-throughput sequencing data

    KAUST Repository

    You, Na

    2012-01-16

    Motivation: A review of the available single nucleotide polymorphism (SNP) calling procedures for Illumina high-throughput sequencing (HTS) platform data reveals that most rely mainly on base-calling and mapping qualities as sources of error when calling SNPs. Thus, errors not involved in base-calling or alignment, such as those in genomic sample preparation, are not accounted for.Results: A novel method of consensus and SNP calling, Genotype Model Selection (GeMS), is given which accounts for the errors that occur during the preparation of the genomic sample. Simulations and real data analyses indicate that GeMS has the best performance balance of sensitivity and positive predictive value among the tested SNP callers. © The Author 2012. Published by Oxford University Press. All rights reserved.

  3. Tracking TCRβ sequence clonotype expansions during antiviral therapy using high-throughput sequencing of the hypervariable region

    Directory of Open Access Journals (Sweden)

    Mark W Robinson

    2016-04-01

    Full Text Available To maintain a persistent infection viruses such as hepatitis C virus (HCV employ a range of mechanisms that subvert protective T cell responses. The suppression of antigen-specific T cell responses by HCV hinders efforts to profile T cell responses during chronic infection and antiviral therapy. Conventional methods of detecting antigen-specific T cells utilise either antigen stimulation (e.g. ELISpot, proliferation assays, cytokine production or antigen-loaded tetramer staining. This limits the ability to profile T cell responses during chronic infection due to suppressed effector function and the requirement for prior knowledge of antigenic viral peptide sequences. Recently high-throughput sequencing (HTS technologies have been developed for the analysis of T cell repertoires. In the present study we have assessed the feasibility of HTS of the TCRβ complementarity determining region (CDR3 to track T cell expansions in an antigen-independent manner. Using sequential blood samples from HCV-infected individuals undergoing anti-viral therapy we were able to measure the population frequencies of >35,000 TCRβ sequence clonotypes in each individual over the course of 12 weeks. TRBV/TRBJ gene segment usage varied markedly between individuals but remained relatively constant within individuals across the course of therapy. Despite this stable TRBV/TRBJ gene segment usage, a number of TCRβ sequence clonotypes showed dramatic changes in read frequency. These changes could not be linked to therapy outcomes in the present study however the TCRβ CDR3 sequences with the largest fold changes did include sequences with identical TRBV/TRBJ gene segment usage and high joining region homology to previously published CDR3 sequences from HCV-specific T cells targeting the HLA-B*0801-restricted 1395HSKKKCDEL1403 and HLA-A*0101–restricted 1435ATDALMTGY1443 epitopes. The pipeline developed in this proof of concept study provides a platform for the design of

  4. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

    Science.gov (United States)

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).

  5. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data.

    Science.gov (United States)

    Miller, Mark P; Knaus, Brian J; Mullins, Thomas D; Haig, Susan M

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).

  6. Reliable Detection of Herpes Simplex Virus Sequence Variation by High-Throughput Resequencing.

    Science.gov (United States)

    Morse, Alison M; Calabro, Kaitlyn R; Fear, Justin M; Bloom, David C; McIntyre, Lauren M

    2017-08-16

    High-throughput sequencing (HTS) has resulted in data for a number of herpes simplex virus (HSV) laboratory strains and clinical isolates. The knowledge of these sequences has been critical for investigating viral pathogenicity. However, the assembly of complete herpesviral genomes, including HSV, is complicated due to the existence of large repeat regions and arrays of smaller reiterated sequences that are commonly found in these genomes. In addition, the inherent genetic variation in populations of isolates for viruses and other microorganisms presents an additional challenge to many existing HTS sequence assembly pipelines. Here, we evaluate two approaches for the identification of genetic variants in HSV1 strains using Illumina short read sequencing data. The first, a reference-based approach, identifies variants from reads aligned to a reference sequence and the second, a de novo assembly approach, identifies variants from reads aligned to de novo assembled consensus sequences. Of critical importance for both approaches is the reduction in the number of low complexity regions through the construction of a non-redundant reference genome. We compared variants identified in the two methods. Our results indicate that approximately 85% of variants are identified regardless of the approach. The reference-based approach to variant discovery captures an additional 15% representing variants divergent from the HSV1 reference possibly due to viral passage. Reference-based approaches are significantly less labor-intensive and identify variants across the genome where de novo assembly-based approaches are limited to regions where contigs have been successfully assembled. In addition, regions of poor quality assembly can lead to false variant identification in de novo consensus sequences. For viruses with a well-assembled reference genome, a reference-based approach is recommended.

  7. Improving Hierarchical Models Using Historical Data with Applications in High-Throughput Genomics Data Analysis.

    Science.gov (United States)

    Li, Ben; Li, Yunxiao; Qin, Zhaohui S

    2017-06-01

    Modern high-throughput biotechnologies such as microarray and next generation sequencing produce a massive amount of information for each sample assayed. However, in a typical high-throughput experiment, only limited amount of data are observed for each individual feature, thus the classical 'large p , small n ' problem. Bayesian hierarchical model, capable of borrowing strength across features within the same dataset, has been recognized as an effective tool in analyzing such data. However, the shrinkage effect, the most prominent feature of hierarchical features, can lead to undesirable over-correction for some features. In this work, we discuss possible causes of the over-correction problem and propose several alternative solutions. Our strategy is rooted in the fact that in the Big Data era, large amount of historical data are available which should be taken advantage of. Our strategy presents a new framework to enhance the Bayesian hierarchical model. Through simulation and real data analysis, we demonstrated superior performance of the proposed strategy. Our new strategy also enables borrowing information across different platforms which could be extremely useful with emergence of new technologies and accumulation of data from different platforms in the Big Data era. Our method has been implemented in R package "adaptiveHM", which is freely available from https://github.com/benliemory/adaptiveHM.

  8. [Study on Microbial Diversity of Peri-implantitis Subgingival by High-throughput Sequencing].

    Science.gov (United States)

    Li, Zhi-jie; Wang, Shao-guo; Li, Yue-hong; Tu, Dong-xiang; Liu, Shi-yun; Nie, Hong-bing; Li, Zhi-qiang; Zhang, Ju-mei

    2015-07-01

    To study microbial diversity of peri-implantitis subgingival with high-throughput sequencing, and investigate microbiological etiology of peri-implantitis. Subgingival plaques were sampled from the patients with peri-implantitis (D group) and non-peri-implantitis subjects (N group). The microbiological diversity of the subgingival plaques was detected by sequencing V4 region of 16S rRNA with Illumina Miseq platform. The diversity of the community structure was analyzed using Mothur software. A total of 156 507 gene sequences were detected in nine samples and 4 402 operational taxonomic units (OTUs) were found. Selenomonas, Pseudomonas, and Fusobacterium were dominant bacteria in D group, while Fusobacterium, Veillonella and Streptococcus were dominant bacteria in N group. Differences between peri-implantitis and non-peri-implantitis bacterial communities were observed at all phylogenetic levels by LEfSe, which was also found in PcoA test. The occurrence of peri-implantitis is not only related to periodontitis pathogenic microbe, but also related with the changes of oral microbial community structure. Treponema, Herbaspirillum, Butyricimonas and Phaeobacte may be closely related to the occurrence and development of peri-implantitis.

  9. Direct metagenomic detection of viral pathogens in nasal and fecal specimens using an unbiased high-throughput sequencing approach.

    Directory of Open Access Journals (Sweden)

    Shota Nakamura

    Full Text Available With the severe acute respiratory syndrome epidemic of 2003 and renewed attention on avian influenza viral pandemics, new surveillance systems are needed for the earlier detection of emerging infectious diseases. We applied a "next-generation" parallel sequencing platform for viral detection in nasopharyngeal and fecal samples collected during seasonal influenza virus (Flu infections and norovirus outbreaks from 2005 to 2007 in Osaka, Japan. Random RT-PCR was performed to amplify RNA extracted from 0.1-0.25 ml of nasopharyngeal aspirates (N = 3 and fecal specimens (N = 5, and more than 10 microg of cDNA was synthesized. Unbiased high-throughput sequencing of these 8 samples yielded 15,298-32,335 (average 24,738 reads in a single 7.5 h run. In nasopharyngeal samples, although whole genome analysis was not available because the majority (>90% of reads were host genome-derived, 20-460 Flu-reads were detected, which was sufficient for subtype identification. In fecal samples, bacteria and host cells were removed by centrifugation, resulting in gain of 484-15,260 reads of norovirus sequence (78-98% of the whole genome was covered, except for one specimen that was under-detectable by RT-PCR. These results suggest that our unbiased high-throughput sequencing approach is useful for directly detecting pathogenic viruses without advance genetic information. Although its cost and technological availability make it unlikely that this system will very soon be the diagnostic standard worldwide, this system could be useful for the earlier discovery of novel emerging viruses and bioterrorism, which are difficult to detect with conventional procedures.

  10. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis.

    Science.gov (United States)

    Du, Yushen; Wu, Nicholas C; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting; Sun, Ren

    2016-11-01

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. Current methods are highly dependent on evolutionary sequence conservation, which is

  11. Genotyping by PCR and High-Throughput Sequencing of Commercial Probiotic Products Reveals Composition Biases.

    Directory of Open Access Journals (Sweden)

    Wesley Morovic

    2016-11-01

    Full Text Available Recent advances in microbiome research have brought renewed focus on beneficial bacteria, many of which are available in food and dietary supplements. Although probiotics have historically been defined as microorganisms that convey health benefits when ingested in sufficient viable amounts, this description now includes the stipulation well defined strains, encompassing definitive taxonomy for consumer consideration and regulatory oversight. Here, we evaluated 52 commercial dietary supplements covering a range of labeled species, and determined their content using plate counting, targeted genotyping. Additionally, strain identities were assessed using methods recently published by the United States Pharmacopeial Convention. We also determined the relative abundance of individual bacteria by high-throughput sequencing (HTS of the 16S rRNA sequence using paired-end 2x250bp Illumina MiSeq technology. Using multiple methods, we tested the hypothesis that products do contain the quantitative amount of labeled bacteria, and qualitative list of labeled microbial species. We found that 17 samples (33% were below label claim for CFU prior to their expiration dates. A multiplexed-PCR scheme showed that only 30/52 (58% of the products contained a correctly labeled classification, with issues encompassing incorrect taxonomy, missing species and un-labeled species. The HTS revealed that many blended products consisted predominantly of Lactobacillus acidophilus and Bifidobacterium animalis subsp. lactis. These results highlight the need for reliable methods to qualitatively determine the correct taxonomy and quantitatively ascertain the relative amounts of mixed microbial populations in commercial probiotic products.

  12. Comprehensive evaluation and optimization of amplicon library preparation methods for high-throughput antibody sequencing.

    Science.gov (United States)

    Menzel, Ulrike; Greiff, Victor; Khan, Tarik A; Haessler, Ulrike; Hellmann, Ina; Friedensohn, Simon; Cook, Skylar C; Pogson, Mark; Reddy, Sai T

    2014-01-01

    High-throughput sequencing (HTS) of antibody repertoire libraries has become a powerful tool in the field of systems immunology. However, numerous sources of bias in HTS workflows may affect the obtained antibody repertoire data. A crucial step in antibody library preparation is the addition of short platform-specific nucleotide adapter sequences. As of yet, the impact of the method of adapter addition on experimental library preparation and the resulting antibody repertoire HTS datasets has not been thoroughly investigated. Therefore, we compared three standard library preparation methods by performing Illumina HTS on antibody variable heavy genes from murine antibody-secreting cells. Clonal overlap and rank statistics demonstrated that the investigated methods produced equivalent HTS datasets. PCR-based methods were experimentally superior to ligation with respect to speed, efficiency, and practicality. Finally, using a two-step PCR based method we established a protocol for antibody repertoire library generation, beginning from inputs as low as 1 ng of total RNA. In summary, this study represents a major advance towards a standardized experimental framework for antibody HTS, thus opening up the potential for systems-based, cross-experiment meta-analyses of antibody repertoires.

  13. High-throughput gender identification of penguin species using melting curve analysis.

    Science.gov (United States)

    Tseng, Chao-Neng; Chang, Yung-Ting; Chiu, Hui-Tzu; Chou, Yii-Cheng; Huang, Hurng-Wern; Cheng, Chien-Chung; Liao, Ming-Hui; Chang, Hsueh-Wei

    2014-04-03

    Most species of penguins are sexual monomorphic and therefore it is difficult to visually identify their genders for monitoring population stability in terms of sex ratio analysis. In this study, we evaluated the suitability using melting curve analysis (MCA) for high-throughput gender identification of penguins. Preliminary test indicated that the Griffiths's P2/P8 primers were not suitable for MCA analysis. Based on sequence alignment of Chromo-Helicase-DNA binding protein (CHD)-W and CHD-Z genes from four species of penguins (Pygoscelis papua, Aptenodytes patagonicus, Spheniscus magellanicus, and Eudyptes chrysocome), we redesigned forward primers for the CHD-W/CHD-Z-common region (PGU-ZW2) and the CHD-W-specific region (PGU-W2) to be used in combination with the reverse Griffiths's P2 primer. When tested with P. papua samples, PCR using P2/PGU-ZW2 and P2/PGU-W2 primer sets generated two amplicons of 148- and 356-bp, respectively, which were easily resolved in 1.5% agarose gels. MCA analysis indicated the melting temperature (Tm) values for P2/PGU-ZW2 and P2/PGU-W2 amplicons of P. papua samples were 79.75°C-80.5°C and 81.0°C-81.5°C, respectively. Females displayed both ZW-common and W-specific Tm peaks, whereas male was positive only for ZW-common peak. Taken together, our redesigned primers coupled with MCA analysis allows precise high throughput gender identification for P. papua, and potentially for other penguin species such as A. patagonicus, S. magellanicus, and E. chrysocome as well.

  14. Environmental microbiology through the lens of high-throughput DNA sequencing: synopsis of current platforms and bioinformatics approaches.

    Science.gov (United States)

    Logares, Ramiro; Haverkamp, Thomas H A; Kumar, Surendra; Lanzén, Anders; Nederbragt, Alexander J; Quince, Christopher; Kauserud, Håvard

    2012-10-01

    The incursion of High-Throughput Sequencing (HTS) in environmental microbiology brings unique opportunities and challenges. HTS now allows a high-resolution exploration of the vast taxonomic and metabolic diversity present in the microbial world, which can provide an exceptional insight on global ecosystem functioning, ecological processes and evolution. This exploration has also economic potential, as we will have access to the evolutionary innovation present in microbial metabolisms, which could be used for biotechnological development. HTS is also challenging the research community, and the current bottleneck is present in the data analysis side. At the moment, researchers are in a sequence data deluge, with sequencing throughput advancing faster than the computer power needed for data analysis. However, new tools and approaches are being developed constantly and the whole process could be depicted as a fast co-evolution between sequencing technology, informatics and microbiologists. In this work, we examine the most popular and recently commercialized HTS platforms as well as bioinformatics methods for data handling and analysis used in microbial metagenomics. This non-exhaustive review is intended to serve as a broad state-of-the-art guide to researchers expanding into this rapidly evolving field. Copyright © 2012 Elsevier B.V. All rights reserved.

  15. MicroRNA from Moringa oleifera: Identification by High Throughput Sequencing and Their Potential Contribution to Plant Medicinal Value.

    Science.gov (United States)

    Pirrò, Stefano; Zanella, Letizia; Kenzo, Maurice; Montesano, Carla; Minutolo, Antonella; Potestà, Marina; Sobze, Martin Sanou; Canini, Antonella; Cirilli, Marco; Muleo, Rosario; Colizzi, Vittorio; Galgani, Andrea

    2016-01-01

    Moringa oleifera is a widespread plant with substantial nutritional and medicinal value. We postulated that microRNAs (miRNAs), which are endogenous, noncoding small RNAs regulating gene expression at the post-transcriptional level, might contribute to the medicinal properties of plants of this species after ingestion into human body, regulating human gene expression. However, the knowledge is scarce about miRNA in Moringa. Furthermore, in order to test the hypothesis on the pharmacological potential properties of miRNA, we conducted a high-throughput sequencing analysis using the Illumina platform. A total of 31,290,964 raw reads were produced from a library of small RNA isolated from M. oleifera seeds. We identified 94 conserved and two novel miRNAs that were validated by qRT-PCR assays. Results from qRT-PCR trials conducted on the expression of 20 Moringa miRNA showed that are conserved across multiple plant species as determined by their detection in tissue of other common crop plants. In silico analyses predicted target genes for the conserved miRNA that in turn allowed to relate the miRNAs to the regulation of physiological processes. Some of the predicted plant miRNAs have functional homology to their mammalian counterparts and regulated human genes when they were transfected into cell lines. To our knowledge, this is the first report of discovering M. oleifera miRNAs based on high-throughput sequencing and bioinformatics analysis and we provided new insight into a potential cross-species control of human gene expression. The widespread cultivation and consumption of M. oleifera, for nutritional and medicinal purposes, brings humans into close contact with products and extracts of this plant species. The potential for miRNA transfer should be evaluated as one possible mechanism of action to account for beneficial properties of this valuable species.

  16. Investigation of Human Cancers for Retrovirus by Low-Stringency Target Enrichment and High-Throughput Sequencing

    DEFF Research Database (Denmark)

    Vinner, Lasse; Mourier, Tobias; Friis-Nielsen, Jens

    2015-01-01

    -stringency in-solution hybridization method enables detection of discovery of hitherto unknown viral sequences by high-throughput sequencing. The sensitivity was sufficient to detect retroviral...... sequences in clinical samples. We used this method to conduct an investigation for novel retrovirus in samples from three cancer types. In accordance with recent studies our investigation revealed no retroviral infections in human B-cell lymphoma cells, cutaneous T-cell lymphoma or colorectal cancer...

  17. Identification of microRNAs and their targets in Finger millet by high throughput sequencing.

    Science.gov (United States)

    Usha, S; Jyothi, M N; Sharadamma, N; Dixit, Rekha; Devaraj, V R; Nagesh Babu, R

    2015-12-15

    MicroRNAs are short non-coding RNAs which play an important role in regulating gene expression by mRNA cleavage or by translational repression. The majority of identified miRNAs were evolutionarily conserved; however, others expressed in a species-specific manner. Finger millet is an important cereal crop; nonetheless, no practical information is available on microRNAs to date. In this study, we have identified 95 conserved microRNAs belonging to 39 families and 3 novel microRNAs by high throughput sequencing. For the identified conserved and novel miRNAs a total of 507 targets were predicted. 11 miRNAs were validated and tissue specificity was determined by stem loop RT-qPCR, Northern blot. GO analyses revealed targets of miRNA were involved in wide range of regulatory functions. This study implies large number of known and novel miRNAs found in Finger millet which may play important role in growth and development. Copyright © 2015 Elsevier B.V. All rights reserved.

  18. High-throughput sequencing, characterization and detection of new and conserved cucumber miRNAs.

    Directory of Open Access Journals (Sweden)

    Germán Martínez

    Full Text Available Micro RNAS (miRNAs are a class of endogenous small non coding RNAs involved in the post-transcriptional regulation of gene expression. In plants, a great number of conserved and specific miRNAs, mainly arising from model species, have been identified to date. However less is known about the diversity of these regulatory RNAs in vegetal species with agricultural and/or horticultural importance. Here we report a combined approach of bioinformatics prediction, high-throughput sequencing data and molecular methods to analyze miRNAs populations in cucumber (Cucumis sativus plants. A set of 19 conserved and 6 known but non-conserved miRNA families were found in our cucumber small RNA dataset. We also identified 7 (3 with their miRNA* strand not previously described miRNAs, candidates to be cucumber-specific. To validate their description these new C. sativus miRNAs were detected by northern blot hybridization. Additionally, potential targets for most conserved and new miRNAs were identified in cucumber genome.In summary, in this study we have identified, by first time, conserved, known non-conserved and new miRNAs arising from an agronomically important species such as C. sativus. The detection of this complex population of regulatory small RNAs suggests that similarly to that observe in other plant species, cucumber miRNAs may possibly play an important role in diverse biological and metabolic processes.

  19. Insight into the transcriptome of Arthrobotrys conoides using high throughput sequencing.

    Science.gov (United States)

    Ramesh, Pandit; Reena, Patel; Amitbikram, Mohapatra; Chaitanya, Joshi; Anju, Kunjadia

    2015-12-01

    Arthrobotrys conoides is a nematode-trapping fungus belonging to Orbiliales, Ascomycota group, and traps prey nematodes by means of adhesive network. Fungus has a potential to be used as a biocontrol agent against plant parasitic nematodes. In the present study, we characterized the transcriptome of A. conoides using high-throughput sequencing technology and characterized its virulence unigenes. Total 7,255 cDNA contigs with an average length of 425 bp were generated and 6184 (61.81%) transcripts were functionally annotated and characterized. Majority of unigenes were found analogous to the genes of plant pathogenic fungi. A total of 1749 transcripts were found to be orthologous with eukaryotic proteins of KOG database. Several carbohydrate active enzymes and peptidases were identified. We also analyzed classically and nonclassically secreted proteins and confirmed by BLASTP against fungal secretome database. A total of 916 contigs were analogous to 556 unique proteins of Pathogen Host Interaction (PHI) database. Further, we identified 91 unigenes homologous to the database of fungal virulence factor (DFVF). A total of 104 putative protein kinases coding transcripts were identified by BLASTP against KinBase database, which are major players in signaling pathways. This study provides a comprehensive look at the transcriptome of A. conoides and the identified unigenes might have a role in catching and killing prey nematodes by A. conoides. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. LncRNA Expression Profile of Human Thoracic Aortic Dissection by High-Throughput Sequencing.

    Science.gov (United States)

    Sun, Jie; Chen, Guojun; Jing, Yuanwen; He, Xiang; Dong, Jianting; Zheng, Junmeng; Zou, Meisheng; Li, Hairui; Wang, Shifei; Sun, Yili; Liao, Wangjun; Liao, Yulin; Feng, Li; Bin, Jianping

    2018-01-01

    In this study, the long non-coding RNA (lncRNA) expression profile in human thoracic aortic dissection (TAD), a highly lethal cardiovascular disease, was investigated. Human TAD (n=3) and normal aortic tissues (NA) (n=3) were examined by high-throughput sequencing. Bioinformatics analyses were performed to predict the roles of aberrantly expressed lncRNAs. Quantitative real-time polymerase chain reaction (qRT-PCR) was applied to validate the results. A total of 269 lncRNAs (159 up-regulated and 110 down-regulated) and 2, 255 mRNAs (1 294 up-regulated and 961 down-regulated) were aberrantly expressed in human TAD (fold-change> 1.5, PTAD than in NA. The predicted binding motifs of three up-regulated lncRNAs (ENSG00000248508, ENSG00000226530, and EG00000259719) were correlated with up-regulated RUNX1 (R=0.982, PTAD. These findings suggest that lncRNAs are novel potential therapeutic targets for human TAD. © 2018 The Author(s). Published by S. Karger AG, Basel.

  1. High-throughput sequencing enhanced phage display enables the identification of patient-specific epitope motifs in serum

    DEFF Research Database (Denmark)

    Christiansen, Anders; Kringelum, Jens Vindahl; Hansen, Christian Skjødt

    2015-01-01

    of the bioinformatic approach was demonstrated by identifying epitopes of a prominent peanut allergen, Ara h 1, in sera from patients with severe peanut allergy. The identified epitopes were confirmed by high-density peptide micro-arrays. The present study demonstrates that high-throughput sequencing can empower phage...

  2. Isolation and characterization of antigen-specific alpaca (Lama pacos) VHH antibodies by biopanning followed by high-throughput sequencing.

    Science.gov (United States)

    Miyazaki, Nobuo; Kiyose, Norihiko; Akazawa, Yoko; Takashima, Mizuki; Hagihara, Yosihisa; Inoue, Naokazu; Matsuda, Tomonari; Ogawa, Ryu; Inoue, Seiya; Ito, Yuji

    2015-09-01

    The antigen-binding domain of camelid dimeric heavy chain antibodies, known as VHH or Nanobody, has much potential in pharmaceutical and industrial applications. To establish the isolation process of antigen-specific VHH, a VHH phage library was constructed with a diversity of 8.4 × 10(7) from cDNA of peripheral blood mononuclear cells of an alpaca (Lama pacos) immunized with a fragment of IZUMO1 (IZUMO1PFF) as a model antigen. By conventional biopanning, 13 antigen-specific VHHs were isolated. The amino acid sequences of these VHHs, designated as N-group VHHs, were very similar to each other (>93% identity). To find more diverse antibodies, we performed high-throughput sequencing (HTS) of VHH genes. By comparing the frequencies of each sequence between before and after biopanning, we found the sequences whose frequencies were increased by biopanning. The top 100 sequences of them were supplied for phylogenic tree analysis. In total 75% of them belonged to N-group VHHs, but the other were phylogenically apart from N-group VHHs (Non N-group). Two of three VHHs selected from non N-group VHHs showed sufficient antigen binding ability. These results suggested that biopanning followed by HTS provided a useful method for finding minor and diverse antigen-specific clones that could not be identified by conventional biopanning. © The Authors 2015. Published by Oxford University Press on behalf of the Japanese Biochemical Society. All rights reserved.

  3. Seasonal diversity and dynamics of haptophytes in the Skagerrak, Norway, explored by high-throughput sequencing

    Science.gov (United States)

    Egge, Elianne Sirnæs; Johannessen, Torill Vik; Andersen, Tom; Eikrem, Wenche; Bittner, Lucie; Larsen, Aud; Sandaa, Ruth-Anne; Edvardsen, Bente

    2015-01-01

    Microalgae in the division Haptophyta play key roles in the marine ecosystem and in global biogeochemical processes. Despite their ecological importance, knowledge on seasonal dynamics, community composition and abundance at the species level is limited due to their small cell size and few morphological features visible under the light microscope. Here, we present unique data on haptophyte seasonal diversity and dynamics from two annual cycles, with the taxonomic resolution and sampling depth obtained with high-throughput sequencing. From outer Oslofjorden, S Norway, nano- and picoplanktonic samples were collected monthly for 2 years, and the haptophytes targeted by amplification of RNA/cDNA with Haptophyta-specific 18S rDNA V4 primers. We obtained 156 operational taxonomic units (OTUs), from c. 400.000 454 pyrosequencing reads, after rigorous bioinformatic filtering and clustering at 99.5%. Most OTUs represented uncultured and/or not yet 18S rDNA-sequenced species. Haptophyte OTU richness and community composition exhibited high temporal variation and significant yearly periodicity. Richness was highest in September–October (autumn) and lowest in April–May (spring). Some taxa were detected all year, such as Chrysochromulina simplex, Emiliania huxleyi and Phaeocystis cordata, whereas most calcifying coccolithophores only appeared from summer to early winter. We also revealed the seasonal dynamics of OTUs representing putative novel classes (clades HAP-3–5) or orders (clades D, E, F). Season, light and temperature accounted for 29% of the variation in OTU composition. Residual variation may be related to biotic factors, such as competition and viral infection. This study provides new, in-depth knowledge on seasonal diversity and dynamics of haptophytes in North Atlantic coastal waters. PMID:25893259

  4. Seasonal diversity and dynamics of haptophytes in the Skagerrak, Norway, explored by high-throughput sequencing.

    Science.gov (United States)

    Egge, Elianne Sirnaes; Johannessen, Torill Vik; Andersen, Tom; Eikrem, Wenche; Bittner, Lucie; Larsen, Aud; Sandaa, Ruth-Anne; Edvardsen, Bente

    2015-06-01

    Microalgae in the division Haptophyta play key roles in the marine ecosystem and in global biogeochemical processes. Despite their ecological importance, knowledge on seasonal dynamics, community composition and abundance at the species level is limited due to their small cell size and few morphological features visible under the light microscope. Here, we present unique data on haptophyte seasonal diversity and dynamics from two annual cycles, with the taxonomic resolution and sampling depth obtained with high-throughput sequencing. From outer Oslofjorden, S Norway, nano- and picoplanktonic samples were collected monthly for 2 years, and the haptophytes targeted by amplification of RNA/cDNA with Haptophyta-specific 18S rDNA V4 primers. We obtained 156 operational taxonomic units (OTUs), from c. 400.000 454 pyrosequencing reads, after rigorous bioinformatic filtering and clustering at 99.5%. Most OTUs represented uncultured and/or not yet 18S rDNA-sequenced species. Haptophyte OTU richness and community composition exhibited high temporal variation and significant yearly periodicity. Richness was highest in September-October (autumn) and lowest in April-May (spring). Some taxa were detected all year, such as Chrysochromulina simplex, Emiliania huxleyi and Phaeocystis cordata, whereas most calcifying coccolithophores only appeared from summer to early winter. We also revealed the seasonal dynamics of OTUs representing putative novel classes (clades HAP-3-5) or orders (clades D, E, F). Season, light and temperature accounted for 29% of the variation in OTU composition. Residual variation may be related to biotic factors, such as competition and viral infection. This study provides new, in-depth knowledge on seasonal diversity and dynamics of haptophytes in North Atlantic coastal waters. © 2015 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.

  5. Robust DNA Isolation and High-throughput Sequencing Library Construction for Herbarium Specimens.

    Science.gov (United States)

    Saeidi, Saman; McKain, Michael R; Kellogg, Elizabeth A

    2018-03-08

    Herbaria are an invaluable source of plant material that can be used in a variety of biological studies. The use of herbarium specimens is associated with a number of challenges including sample preservation quality, degraded DNA, and destructive sampling of rare specimens. In order to more effectively use herbarium material in large sequencing projects, a dependable and scalable method of DNA isolation and library preparation is needed. This paper demonstrates a robust, beginning-to-end protocol for DNA isolation and high-throughput library construction from herbarium specimens that does not require modification for individual samples. This protocol is tailored for low quality dried plant material and takes advantage of existing methods by optimizing tissue grinding, modifying library size selection, and introducing an optional reamplification step for low yield libraries. Reamplification of low yield DNA libraries can rescue samples derived from irreplaceable and potentially valuable herbarium specimens, negating the need for additional destructive sampling and without introducing discernible sequencing bias for common phylogenetic applications. The protocol has been tested on hundreds of grass species, but is expected to be adaptable for use in other plant lineages after verification. This protocol can be limited by extremely degraded DNA, where fragments do not exist in the desired size range, and by secondary metabolites present in some plant material that inhibit clean DNA isolation. Overall, this protocol introduces a fast and comprehensive method that allows for DNA isolation and library preparation of 24 samples in less than 13 h, with only 8 h of active hands-on time with minimal modifications.

  6. High-throughput sequencing of RNA silencing-associated small RNAs in olive (Olea europaea L..

    Directory of Open Access Journals (Sweden)

    Livia Donaire

    Full Text Available Small RNAs (sRNAs of 20 to 25 nucleotides (nt in length maintain genome integrity and control gene expression in a multitude of developmental and physiological processes. Despite RNA silencing has been primarily studied in model plants, the advent of high-throughput sequencing technologies has enabled profiling of the sRNA component of more than 40 plant species. Here, we used deep sequencing and molecular methods to report the first inventory of sRNAs in olive (Olea europaea L.. sRNA libraries prepared from juvenile and adult shoots revealed that the 24-nt class dominates the sRNA transcriptome and atypically accumulates to levels never seen in other plant species, suggesting an active role of heterochromatin silencing in the maintenance and integrity of its large genome. A total of 18 known miRNA families were identified in the libraries. Also, 5 other sRNAs derived from potential hairpin-like precursors remain as plausible miRNA candidates. RNA blots confirmed miRNA expression and suggested tissue- and/or developmental-specific expression patterns. Target mRNAs of conserved miRNAs were computationally predicted among the olive cDNA collection and experimentally validated through endonucleolytic cleavage assays. Finally, we use expression data to uncover genetic components of the miR156, miR172 and miR390/TAS3-derived trans-acting small interfering RNA (tasiRNA regulatory nodes, suggesting that these interactive networks controlling developmental transitions are fully operational in olive.

  7. Development of automatic image analysis methods for high-throughput and high-content screening

    NARCIS (Netherlands)

    Di, Zi

    2013-01-01

    This thesis focuses on the development of image analysis methods for ultra-high content analysis of high-throughput screens where cellular phenotype responses to various genetic or chemical perturbations that are under investigation. Our primary goal is to deliver efficient and robust image analysis

  8. Freud: a software suite for high-throughput simulation analysis

    Science.gov (United States)

    Harper, Eric; Spellings, Matthew; Anderson, Joshua; Glotzer, Sharon

    Computer simulation is an indispensable tool for the study of a wide variety of systems. As simulations scale to fill petascale and exascale supercomputing clusters, so too does the size of the data produced, as well as the difficulty in analyzing these data. We present Freud, an analysis software suite for efficient analysis of simulation data. Freud makes no assumptions about the system being analyzed, allowing for general analysis methods to be applied to nearly any type of simulation. Freud includes standard analysis methods such as the radial distribution function, as well as new methods including the potential of mean force and torque and local crystal environment analysis. Freud combines a Python interface with fast, parallel C + + analysis routines to run efficiently on laptops, workstations, and supercomputing clusters. Data analysis on clusters reduces data transfer requirements, a prohibitive cost for petascale computing. Used in conjunction with simulation software, Freud allows for smart simulations that adapt to the current state of the system, enabling the study of phenomena such as nucleation and growth, intelligent investigation of phases and phase transitions, and determination of effective pair potentials.

  9. High-throughput Transcriptome analysis, CAGE and beyond

    KAUST Repository

    Kodzius, Rimantas

    2008-01-01

    1. Current research - PhD work on discovery of new allergens - Postdoctoral work on Transcriptional Start Sites a) Tag based technologies allow higher throughput b) CAGE technology to define promoters c) CAGE data analysis to understand Transcription - Wo

  10. High-throughput Transcriptome analysis, CAGE and beyond

    KAUST Repository

    Kodzius, Rimantas

    2008-11-25

    1. Current research - PhD work on discovery of new allergens - Postdoctoral work on Transcriptional Start Sites a) Tag based technologies allow higher throughput b) CAGE technology to define promoters c) CAGE data analysis to understand Transcription - Wo

  11. Evolution of blue-flowered species of genus Linum based on high-throughput sequencing of ribosomal RNA genes.

    Science.gov (United States)

    Bolsheva, Nadezhda L; Melnikova, Nataliya V; Kirov, Ilya V; Speranskaya, Anna S; Krinitsina, Anastasia A; Dmitriev, Alexey A; Belenikin, Maxim S; Krasnov, George S; Lakunina, Valentina A; Snezhkina, Anastasiya V; Rozhmina, Tatiana A; Samatadze, Tatiana E; Yurkevich, Olga Yu; Zoshchuk, Svyatoslav A; Amosova, Аlexandra V; Kudryavtseva, Anna V; Muravenko, Olga V

    2017-12-28

    The species relationships within the genus Linum have already been studied several times by means of different molecular and phylogenetic approaches. Nevertheless, a number of ambiguities in phylogeny of Linum still remain unresolved. In particular, the species relationships within the sections Stellerolinum and Dasylinum need further clarification. Also, the question of independence of the species of the section Adenolinum still remains unanswered. Moreover, the relationships of L. narbonense and other species of the section Linum require further clarification. Additionally, the origin of tetraploid species of the section Linum (2n = 30) including the cultivated species L. usitatissimum has not been explored. The present study examines the phylogeny of blue-flowered species of Linum by comparisons of 5S rRNA gene sequences as well as ITS1 and ITS2 sequences of 35S rRNA genes. High-throughput sequencing has been used for analysis of multicopy rRNA gene families. In addition to the molecular phylogenetic analysis, the number and chromosomal localization of 5S and 35S rDNA sites has been determined by FISH. Our findings confirm that L. stelleroides forms a basal branch from the clade of blue-flowered flaxes which is independent of the branch formed by species of the sect. Dasylinum. The current molecular phylogenetic approaches, the cytogenetic analysis as well as different genomic DNA fingerprinting methods applied previously did not discriminate certain species within the sect. Adenolinum. The allotetraploid cultivated species L. usitatissimum and its wild ancestor L. angustifolium (2n = 30) could originate either as the result of hybridization of two diploid species (2n = 16) related to the modern L. gandiflorum and L. decumbens, or hybridization of a diploid species (2n = 16) and a diploid ancestor of modern L. narbonense (2n = 14). High-throughput sequencing of multicopy rRNA gene families allowed us to make several adjustments to the

  12. High Throughput Analysis of Breast Cancer Specimens on the Grid

    OpenAIRE

    Yang, Lin; Chen, Wenjin; Meer, Peter; Salaru, Gratian; Feldman, Michael D.; Foran, David J.

    2007-01-01

    Breast cancer accounts for about 30% of all cancers and 15% of all cancer deaths in women in the United States. Advances in computer assisted diagnosis (CAD) holds promise for early detecting and staging disease progression. In this paper we introduce a Grid-enabled CAD to perform automatic analysis of imaged histopathology breast tissue specimens. More than 100,000 digitized samples (1200 × 1200 pixels) have already been processed on the Grid. We have analyzed results for 3744 breast tissue ...

  13. Assessing the Diversity of Rodent-Borne Viruses: Exploring of High-Throughput Sequencing and Classical Amplification/Sequencing Approaches.

    Science.gov (United States)

    Drewes, Stephan; Straková, Petra; Drexler, Jan F; Jacob, Jens; Ulrich, Rainer G

    2017-01-01

    Rodents are distributed throughout the world and interact with humans in many ways. They provide vital ecosystem services, some species are useful models in biomedical research and some are held as pet animals. However, many rodent species can have adverse effects such as damage to crops and stored produce, and they are of health concern because of the transmission of pathogens to humans and livestock. The first rodent viruses were discovered by isolation approaches and resulted in break-through knowledge in immunology, molecular and cell biology, and cancer research. In addition to rodent-specific viruses, rodent-borne viruses are causing a large number of zoonotic diseases. Most prominent examples are reemerging outbreaks of human hemorrhagic fever disease cases caused by arena- and hantaviruses. In addition, rodents are reservoirs for vector-borne pathogens, such as tick-borne encephalitis virus and Borrelia spp., and may carry human pathogenic agents, but likely are not involved in their transmission to human. In our days, next-generation sequencing or high-throughput sequencing (HTS) is revolutionizing the speed of the discovery of novel viruses, but other molecular approaches, such as generic RT-PCR/PCR and rolling circle amplification techniques, contribute significantly to the rapidly ongoing process. However, the current knowledge still represents only the tip of the iceberg, when comparing the known human viruses to those known for rodents, the mammalian taxon with the largest species number. The diagnostic potential of HTS-based metagenomic approaches is illustrated by their use in the discovery and complete genome determination of novel borna- and adenoviruses as causative disease agents in squirrels. In conclusion, HTS, in combination with conventional RT-PCR/PCR-based approaches, resulted in a drastically increased knowledge of the diversity of rodent viruses. Future improvements of the used workflows, including bioinformatics analysis, will further

  14. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis

    Directory of Open Access Journals (Sweden)

    Yushen Du

    2016-11-01

    Full Text Available Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp, we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available.

  15. Exploring the sources of bacterial spoilers in beefsteaks by culture-independent high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Francesca De Filippis

    Full Text Available Microbial growth on meat to unacceptable levels contributes significantly to change meat structure, color and flavor and to cause meat spoilage. The types of microorganisms initially present in meat depend on several factors and multiple sources of contamination can be identified. The aims of this study were to evaluate the microbial diversity in beefsteaks before and after aerobic storage at 4°C and to investigate the sources of microbial contamination by examining the microbiota of carcasses wherefrom the steaks originated and of the processing environment where the beef was handled. Carcass, environmental (processing plant and meat samples were analyzed by culture-independent high-throughput sequencing of 16S rRNA gene amplicons. The microbiota of carcass swabs was very complex, including more than 600 operational taxonomic units (OTUs belonging to 15 different phyla. A significant association was found between beef microbiota and specific beef cuts (P<0.01 indicating that different cuts of the same carcass can influence the microbial contamination of beef. Despite the initially high complexity of the carcass microbiota, the steaks after aerobic storage at 4°C showed a dramatic decrease in microbial complexity. Pseudomonas sp. and Brochothrix thermosphacta were the main contaminants, and Acinetobacter, Psychrobacter and Enterobacteriaceae were also found. Comparing the relative abundance of OTUs in the different samples it was shown that abundant OTUs in beefsteaks after storage occurred in the corresponding carcass. However, the abundance of these same OTUs clearly increased in environmental samples taken in the processing plant suggesting that spoilage-associated microbial species originate from carcasses, they are carried to the processing environment where the meat is handled and there they become a resident microbiota. Such microbiota is then further spread on meat when it is handled and it represents the starting microbial association

  16. Regulatory pathway analysis by high-throughput in situ hybridization.

    Directory of Open Access Journals (Sweden)

    Axel Visel

    2007-10-01

    Full Text Available Automated in situ hybridization enables the construction of comprehensive atlases of gene expression patterns in mammals. Such atlases can become Web-searchable digital expression maps of individual genes and thus offer an entryway to elucidate genetic interactions and signaling pathways. Towards this end, an atlas housing approximately 1,000 spatial gene expression patterns of the midgestation mouse embryo was generated. Patterns were textually annotated using a controlled vocabulary comprising >90 anatomical features. Hierarchical clustering of annotations was carried out using distance scores calculated from the similarity between pairs of patterns across all anatomical structures. This process ordered hundreds of complex expression patterns into a matrix that reflects the embryonic architecture and the relatedness of patterns of expression. Clustering yielded 12 distinct groups of expression patterns. Because of the similarity of expression patterns within a group, members of each group may be components of regulatory cascades. We focused on the group containing Pax6, an evolutionary conserved transcriptional master mediator of development. Seventeen of the 82 genes in this group showed a change of expression in the developing neocortex of Pax6-deficient embryos. Electromobility shift assays were used to test for the presence of Pax6-paired domain binding sites. This led to the identification of 12 genes not previously known as potential targets of Pax6 regulation. These findings suggest that cluster analysis of annotated gene expression patterns obtained by automated in situ hybridization is a novel approach for identifying components of signaling cascades.

  17. High-throughput genetic analysis in a cohort of patients with Ocular Developmental Anomalies

    Directory of Open Access Journals (Sweden)

    Suganya Kandeeban

    2017-10-01

    Full Text Available Anophthalmia and microphthalmia (A/M are developmental ocular malformations in which the eye fails to form or is smaller than normal with both genetic and environmental etiology. Microphthalmia is often associated with additional ocular anomalies, most commonly coloboma or cataract [1, 2]. A/M has a combined incidence between 1-3.2 cases per 10,000 live births in Caucasians [3, 4]. The spectrum of genetic abnormalities (chromosomal and molecular associated with these ocular developmental defects are being investigated in the current study. A detailed pedigree analysis and ophthalmic examination have been documented for the enrolled patients followed by blood collection and DNA extraction. The strategies for genetic analysis included chromosomal analysis by conventional and array based (affymetrix cytoscan HD array methods, targeted re-sequencing of the candidate genes and whole exome sequencing (WES in Illumina HiSEQ 2500. WES was done in families excluded for mutations in candidate genes. Twenty four samples (Microphthalmia (M-5, Anophthalmia (A-7,Coloboma-2, M&A-1, microphthalmia and coloboma / other ocular features-9 were initially analyzed using conventional Geimsa Trypsin Geimsa banding of which 4 samples revealed gross chromosomal aberrations (deletions in 3q26.3-28, 11p13 (N=2 and 11q23 regions. Targeted re sequencing of candidate genes showed mutations in CHX10, PAX6, FOXE3, ABCB6 and SHH genes in 6 samples. High throughput array based chromosomal analysis revealed aberrations in 4 samples (17q21dup (n=2, 8p11del (n=2. Overall, genetic alterations in known candidate genes are seen in 50% of the study subjects. Whole exome sequencing was performed in samples that were excluded for mutations in candidate genes and the results are discussed.

  18. Experimental design-based functional mining and characterization of high-throughput sequencing data in the sequence read archive.

    Directory of Open Access Journals (Sweden)

    Takeru Nakazato

    Full Text Available High-throughput sequencing technology, also called next-generation sequencing (NGS, has the potential to revolutionize the whole process of genome sequencing, transcriptomics, and epigenetics. Sequencing data is captured in a public primary data archive, the Sequence Read Archive (SRA. As of January 2013, data from more than 14,000 projects have been submitted to SRA, which is double that of the previous year. Researchers can download raw sequence data from SRA website to perform further analyses and to compare with their own data. However, it is extremely difficult to search entries and download raw sequences of interests with SRA because the data structure is complicated, and experimental conditions along with raw sequences are partly described in natural language. Additionally, some sequences are of inconsistent quality because anyone can submit sequencing data to SRA with no quality check. Therefore, as a criterion of data quality, we focused on SRA entries that were cited in journal articles. We extracted SRA IDs and PubMed IDs (PMIDs from SRA and full-text versions of journal articles and retrieved 2748 SRA ID-PMID pairs. We constructed a publication list referring to SRA entries. Since, one of the main themes of -omics analyses is clarification of disease mechanisms, we also characterized SRA entries by disease keywords, according to the Medical Subject Headings (MeSH extracted from articles assigned to each SRA entry. We obtained 989 SRA ID-MeSH disease term pairs, and constructed a disease list referring to SRA data. We previously developed feature profiles of diseases in a system called "Gendoo". We generated hyperlinks between diseases extracted from SRA and the feature profiles of it. The developed project, publication and disease lists resulting from this study are available at our web service, called "DBCLS SRA" (http://sra.dbcls.jp/. This service will improve accessibility to high-quality data from SRA.

  19. Whole Genome Sequencing of Enterovirus species C Isolates by High-throughput Sequencing: Development of Generic Primers

    Directory of Open Access Journals (Sweden)

    Maël Bessaud

    2016-08-01

    Full Text Available Enteroviruses are among the most common viruses infecting humans and can cause diverse clinical syndromes ranging from minor febrile illness to severe and potentially fatal diseases. Enterovirus species C (EV-C consists of more than 20 types, among which the 3 serotypes of polioviruses, the etiological agents of poliomyelitis, are included. Biodiversity and evolution of EV-C genomes are shaped by frequent recombination events. Therefore, identification and characterization of circulating EV-C strains require the sequencing of different genomic regions.A simple method was developed to sequence quickly the entire genome of EV-C isolates. Four overlapping fragments were produced separately by RT-PCR performed with generic primers. The four amplicons were then pooled and purified prior to be sequenced by high-throughput technique.The method was assessed on a panel of EV-Cs belonging to a wide-range of types. It can be used to determine full-length genome sequences through de novo assembly of thousands of reads. It was also able to discriminate reads from closely related viruses in mixtures.By decreasing the workload compared to classical Sanger-based techniques, this method will serve as a precious tool for sequencing large panels of EV-Cs isolated in cell cultures during environmental surveillance or from patients, including vaccine-derived polioviruses.

  20. Identification of protoplast-isolation responsive microRNAs in Citrus reticulata Blanco by high-throughput sequencing.

    Science.gov (United States)

    Xu, Xiaoyong; Xu, Xiaoling; Zhou, Yipeng; Zeng, Shaohua; Kong, Weiwen

    2017-01-01

    Protoplast isolation is a stress-inducing process, during which a variety of physiological and molecular alterations take place. Such stress response affects the expression of totipotency of cultured protoplasts. MicroRNAs (miRNAs) play important roles in plant growth, development and stress responses. However, the underlying mechanism of miRNAs involved in the protoplast totipotency remains unclear. In this study, high-throughput sequencing technology was used to sequence two populations of small RNA from calli and callus-derived protoplasts in Citrus reticulata Blanco. A total of 67 known miRNAs from 35 families and 277 novel miRNAs were identified. Among these miRNAs, 18 known miRNAs and 64 novel miRNAs were identified by differentially expressed miRNAs (DEMs) analysis. The expression patterns of the eight DEMs were verified by qRT-PCR. Target prediction showed most targets of the miRNAs were transcription factors. The expression levels of half targets showed a negative correlation to those of the miRNAs. Furthermore, the physiological analysis showed high levels of antioxidant activities in isolated protoplasts. In short, our results indicated that miRNAs may play important roles in protoplast-isolation response.

  1. [Research on soil bacteria under the impact of sealed CO2 leakage by high-throughput sequencing technology].

    Science.gov (United States)

    Tian, Di; Ma, Xin; Li, Yu-E; Zha, Liang-Song; Wu, Yang; Zou, Xiao-Xia; Liu, Shuang

    2013-10-01

    Carbon dioxide Capture and Storage has provided a new option for mitigating global anthropogenic CO2 emission with its unique advantages. However, there is a risk of the sealed CO2 leakage, bringing a serious threat to the ecology system. It is widely known that soil microorganisms are closely related to soil health, while the study on the impact of sequestered CO2 leakage on soil microorganisms is quite deficient. In this study, the leakage scenarios of sealed CO2 were constructed and the 16S rRNA genes of soil bacteria were sequenced by Illumina high-throughput sequencing technology on Miseq platform, and related biological analysis was conducted to explore the changes of soil bacterial abundance, diversity and structure. There were 486,645 reads for 43,017 OTUs of 15 soil samples and the results of biological analysis showed that there were differences in the abundance, diversity and community structure of soil bacterial community under different CO, leakage scenarios while the abundance and diversity of the bacterial community declined with the amplification of CO2 leakage quantity and leakage time, and some bacteria species became the dominant bacteria species in the bacteria community, therefore the increase of Acidobacteria species would be a biological indicator for the impact of sealed CO2 leakage on soil ecology system.

  2. Target-dependent enrichment of virions determines the reduction of high-throughput sequencing in virus discovery.

    Directory of Open Access Journals (Sweden)

    Randi Holm Jensen

    Full Text Available Viral infections cause many different diseases stemming both from well-characterized viral pathogens but also from emerging viruses, and the search for novel viruses continues to be of great importance. High-throughput sequencing is an important technology for this purpose. However, viral nucleic acids often constitute a minute proportion of the total genetic material in a sample from infected tissue. Techniques to enrich viral targets in high-throughput sequencing have been reported, but the sensitivity of such methods is not well established. This study compares different library preparation techniques targeting both DNA and RNA with and without virion enrichment. By optimizing the selection of intact virus particles, both by physical and enzymatic approaches, we assessed the effectiveness of the specific enrichment of viral sequences as compared to non-enriched sample preparations by selectively looking for and counting read sequences obtained from shotgun sequencing. Using shotgun sequencing of total DNA or RNA, viral targets were detected at concentrations corresponding to the predicted level, providing a foundation for estimating the effectiveness of virion enrichment. Virion enrichment typically produced a 1000-fold increase in the proportion of DNA virus sequences. For RNA virions the gain was less pronounced with a maximum 13-fold increase. This enrichment varied between the different sample concentrations, with no clear trend. Despite that less sequencing was required to identify target sequences, it was not evident from our data that a lower detection level was achieved by virion enrichment compared to shotgun sequencing.

  3. High-throughput sequencing of microbial community diversity in soil, grapes, leaves, grape juice and wine of grapevine from China.

    Science.gov (United States)

    Wei, Yu-Jie; Wu, Yun; Yan, Yin-Zhuo; Zou, Wan; Xue, Jie; Ma, Wen-Rui; Wang, Wei; Tian, Ge; Wang, Li-Ye

    2018-01-01

    In this study Illumina MiSeq was performed to investigate microbial diversity in soil, leaves, grape, grape juice and wine. A total of 1,043,102 fungal Internal Transcribed Spacer (ITS) reads and 2,422,188 high quality bacterial 16S rDNA sequences were used for taxonomic classification, revealed five fungal and eight bacterial phyla. At the genus level, the dominant fungi were Ascomycota, Sordariales, Tetracladium and Geomyces in soil, Aureobasidium and Pleosporaceae in grapes leaves, Aureobasidium in grape and grape juice. The dominant bacteria were Kaistobacter, Arthrobacter, Skermanella and Sphingomonas in soil, Pseudomonas, Acinetobacter and Kaistobacter in grape and grapes leaves, and Oenococcus in grape juice and wine. Principal coordinate analysis showed structural separation between the composition of fungi and bacteria in all samples. This is the first study to understand microbiome population in soil, grape, grapes leaves, grape juice and wine in Xinjiang through High-throughput Sequencing and identify microorganisms like Saccharomyces cerevisiae and Oenococcus spp. that may contribute to the quality and flavor of wine.

  4. High-throughput sequencing of microbial community diversity in soil, grapes, leaves, grape juice and wine of grapevine from China

    Science.gov (United States)

    Yan, Yin-zhuo; Zou, Wan; Ma, Wen-rui; Wang, Wei; Tian, Ge; Wang, Li-ye

    2018-01-01

    In this study Illumina MiSeq was performed to investigate microbial diversity in soil, leaves, grape, grape juice and wine. A total of 1,043,102 fungal Internal Transcribed Spacer (ITS) reads and 2,422,188 high quality bacterial 16S rDNA sequences were used for taxonomic classification, revealed five fungal and eight bacterial phyla. At the genus level, the dominant fungi were Ascomycota, Sordariales, Tetracladium and Geomyces in soil, Aureobasidium and Pleosporaceae in grapes leaves, Aureobasidium in grape and grape juice. The dominant bacteria were Kaistobacter, Arthrobacter, Skermanella and Sphingomonas in soil, Pseudomonas, Acinetobacter and Kaistobacter in grape and grapes leaves, and Oenococcus in grape juice and wine. Principal coordinate analysis showed structural separation between the composition of fungi and bacteria in all samples. This is the first study to understand microbiome population in soil, grape, grapes leaves, grape juice and wine in Xinjiang through High-throughput Sequencing and identify microorganisms like Saccharomyces cerevisiae and Oenococcus spp. that may contribute to the quality and flavor of wine. PMID:29565999

  5. Model SNP development for complex genomes based on hexaploid oat using high-throughput 454 sequencing technology

    Directory of Open Access Journals (Sweden)

    Chao Shiaoman

    2011-01-01

    Full Text Available Abstract Background Genetic markers are pivotal to modern genomics research; however, discovery and genotyping of molecular markers in oat has been hindered by the size and complexity of the genome, and by a scarcity of sequence data. The purpose of this study was to generate oat expressed sequence tag (EST information, develop a bioinformatics pipeline for SNP discovery, and establish a method for rapid, cost-effective, and straightforward genotyping of SNP markers in complex polyploid genomes such as oat. Results Based on cDNA libraries of four cultivated oat genotypes, approximately 127,000 contigs were assembled from approximately one million Roche 454 sequence reads. Contigs were filtered through a novel bioinformatics pipeline to eliminate ambiguous polymorphism caused by subgenome homology, and 96 in silico SNPs were selected from 9,448 candidate loci for validation using high-resolution melting (HRM analysis. Of these, 52 (54% were polymorphic between parents of the Ogle1040 × TAM O-301 (OT mapping population, with 48 segregating as single Mendelian loci, and 44 being placed on the existing OT linkage map. Ogle and TAM amplicons from 12 primers were sequenced for SNP validation, revealing complex polymorphism in seven amplicons but general sequence conservation within SNP loci. Whole-amplicon interrogation with HRM revealed insertions, deletions, and heterozygotes in secondary oat germplasm pools, generating multiple alleles at some primer targets. To validate marker utility, 36 SNP assays were used to evaluate the genetic diversity of 34 diverse oat genotypes. Dendrogram clusters corresponded generally to known genome composition and genetic ancestry. Conclusions The high-throughput SNP discovery pipeline presented here is a rapid and effective method for identification of polymorphic SNP alleles in the oat genome. The current-generation HRM system is a simple and highly-informative platform for SNP genotyping. These techniques provide

  6. High-throughput sequencing of the B-cell receptor in African Burkitt lymphoma reveals clues to pathogenesis.

    Science.gov (United States)

    Lombardo, Katharine A; Coffey, David G; Morales, Alicia J; Carlson, Christopher S; Towlerton, Andrea M H; Gerdts, Sarah E; Nkrumah, Francis K; Neequaye, Janet; Biggar, Robert J; Orem, Jackson; Casper, Corey; Mbulaiteye, Sam M; Bhatia, Kishor G; Warren, Edus H

    2017-03-28

    Burkitt lymphoma (BL), the most common pediatric cancer in sub-Saharan Africa, is a malignancy of antigen-experienced B lymphocytes. High-throughput sequencing (HTS) of the immunoglobulin heavy ( IGH ) and light chain ( IGK / IGL ) loci was performed on genomic DNA from 51 primary BL tumors: 19 from Uganda and 32 from Ghana. Reverse transcription polymerase chain reaction analysis and tumor RNA sequencing (RNAseq) was performed on the Ugandan tumors to confirm and extend the findings from the HTS of tumor DNA. Clonal IGH and IGK / IGL rearrangements were identified in 41 and 46 tumors, respectively. Evidence for rearrangement of the second IGH allele was observed in only 6 of 41 tumor samples with a clonal IGH rearrangement, suggesting that the normal process of biallelic IGHD to IGHJ diversity-joining (DJ) rearrangement is often disrupted in BL progenitor cells. Most tumors, including those with a sole dominant, nonexpressed DJ rearrangement, contained many IGH and IGK / IGL sequences that differed from the dominant rearrangement by < 10 nucleotides, suggesting that the target of ongoing mutagenesis of these loci in BL tumor cells is not limited to expressed alleles. IGHV usage in both BL tumor cohorts revealed enrichment for IGHV genes that are infrequently used in memory B cells from healthy subjects. Analysis of publicly available DNA sequencing and RNAseq data revealed that these same IGHV genes were overrepresented in dominant tumor-associated IGH rearrangements in several independent BL tumor cohorts. These data suggest that BL derives from an abnormal B-cell progenitor and that aberrant mutational processes are active on the immunoglobulin loci in BL cells.

  7. Taxonomy of anaerobic digestion microbiome reveals biases associated with the applied high throughput sequencing strategies

    DEFF Research Database (Denmark)

    Campanaro, Stefano; Treu, Laura; Kougias, Panagiotis

    2018-01-01

    In the past few years, many studies investigated the anaerobic digestion microbiome by means of 16S rRNA amplicon sequencing. Results obtained from these studies were compared to each other without taking into consideration the followed procedure for amplicons preparation and data analysis...... specifically, the microbial compositions of three laboratory scale biogas reactors were analyzed before and after addition of sodium oleate by sequencing the microbiome with three different approaches: 16S rRNA amplicon sequencing, shotgun DNA and shotgun RNA. This comparative analysis revealed that......, in amplicon sequencing, abundance of some taxa (Euryarchaeota and Spirochaetes) was biased by the inefficiency of universal primers to hybridize all the templates. Reliability of the results obtained was also influenced by the number of hypervariable regions under investigation. Finally, amplicon sequencing...

  8. Sequencing quality assessment tools to enable data-driven informatics for high throughput genomics

    Directory of Open Access Journals (Sweden)

    Richard Mark Leggett

    2013-12-01

    Full Text Available The processes of quality assessment and control are an active area of research at The Genome Analysis Centre (TGAC. Unlike other sequencing centres that often concentrate on a certain species or technology, TGAC applies expertise in genomics and bioinformatics to a wide range of projects, often requiring bespoke wet lab and in silico workflows. TGAC is fortunate to have access to a diverse range of sequencing and analysis platforms, and we are at the forefront of investigations into library quality and sequence data assessment. We have developed and implemented a number of algorithms, tools, pipelines and packages to ascertain, store, and expose quality metrics across a number of next-generation sequencing platforms, allowing rapid and in-depth cross-platform QC bioinformatics. In this review, we describe these tools as a vehicle for data-driven informatics, offering the potential to provide richer context for downstream analysis and to inform experimental design.

  9. Computational and statistical methods for high-throughput analysis of post-translational modifications of proteins

    DEFF Research Database (Denmark)

    Schwämmle, Veit; Braga, Thiago Verano; Roepstorff, Peter

    2015-01-01

    The investigation of post-translational modifications (PTMs) represents one of the main research focuses for the study of protein function and cell signaling. Mass spectrometry instrumentation with increasing sensitivity improved protocols for PTM enrichment and recently established pipelines...... for high-throughput experiments allow large-scale identification and quantification of several PTM types. This review addresses the concurrently emerging challenges for the computational analysis of the resulting data and presents PTM-centered approaches for spectra identification, statistical analysis...

  10. Unraveling Core Functional Microbiota in Traditional Solid-State Fermentation by High-Throughput Amplicons and Metatranscriptomics Sequencing.

    Science.gov (United States)

    Song, Zhewei; Du, Hai; Zhang, Yan; Xu, Yan

    2017-01-01

    Fermentation microbiota is specific microorganisms that generate different types of metabolites in many productions. In traditional solid-state fermentation, the structural composition and functional capacity of the core microbiota determine the quality and quantity of products. As a typical example of food fermentation, Chinese Maotai-flavor liquor production involves a complex of various microorganisms and a wide variety of metabolites. However, the microbial succession and functional shift of the core microbiota in this traditional food fermentation remain unclear. Here, high-throughput amplicons (16S rRNA gene amplicon sequencing and internal transcribed space amplicon sequencing) and metatranscriptomics sequencing technologies were combined to reveal the structure and function of the core microbiota in Chinese soy sauce aroma type liquor production. In addition, ultra-performance liquid chromatography and headspace-solid phase microextraction-gas chromatography-mass spectrometry were employed to provide qualitative and quantitative analysis of the major flavor metabolites. A total of 10 fungal and 11 bacterial genera were identified as the core microbiota. In addition, metatranscriptomic analysis revealed pyruvate metabolism in yeasts (genera Pichia, Schizosaccharomyces, Saccharomyces , and Zygosaccharomyces ) and lactic acid bacteria (genus Lactobacillus ) classified into two stages in the production of flavor components. Stage I involved high-level alcohol (ethanol) production, with the genus Schizosaccharomyces serving as the core functional microorganism. Stage II involved high-level acid (lactic acid and acetic acid) production, with the genus Lactobacillus serving as the core functional microorganism. The functional shift from the genus Schizosaccharomyces to the genus Lactobacillus drives flavor component conversion from alcohol (ethanol) to acid (lactic acid and acetic acid) in Chinese Maotai-flavor liquor production. Our findings provide insight into

  11. Unraveling Core Functional Microbiota in Traditional Solid-State Fermentation by High-Throughput Amplicons and Metatranscriptomics Sequencing

    Directory of Open Access Journals (Sweden)

    Zhewei Song

    2017-07-01

    Full Text Available Fermentation microbiota is specific microorganisms that generate different types of metabolites in many productions. In traditional solid-state fermentation, the structural composition and functional capacity of the core microbiota determine the quality and quantity of products. As a typical example of food fermentation, Chinese Maotai-flavor liquor production involves a complex of various microorganisms and a wide variety of metabolites. However, the microbial succession and functional shift of the core microbiota in this traditional food fermentation remain unclear. Here, high-throughput amplicons (16S rRNA gene amplicon sequencing and internal transcribed space amplicon sequencing and metatranscriptomics sequencing technologies were combined to reveal the structure and function of the core microbiota in Chinese soy sauce aroma type liquor production. In addition, ultra-performance liquid chromatography and headspace-solid phase microextraction-gas chromatography-mass spectrometry were employed to provide qualitative and quantitative analysis of the major flavor metabolites. A total of 10 fungal and 11 bacterial genera were identified as the core microbiota. In addition, metatranscriptomic analysis revealed pyruvate metabolism in yeasts (genera Pichia, Schizosaccharomyces, Saccharomyces, and Zygosaccharomyces and lactic acid bacteria (genus Lactobacillus classified into two stages in the production of flavor components. Stage I involved high-level alcohol (ethanol production, with the genus Schizosaccharomyces serving as the core functional microorganism. Stage II involved high-level acid (lactic acid and acetic acid production, with the genus Lactobacillus serving as the core functional microorganism. The functional shift from the genus Schizosaccharomyces to the genus Lactobacillus drives flavor component conversion from alcohol (ethanol to acid (lactic acid and acetic acid in Chinese Maotai-flavor liquor production. Our findings provide

  12. A high-throughput splinkerette-PCR method for the isolation and sequencing of retroviral insertion sites

    DEFF Research Database (Denmark)

    Uren, Anthony G; Mikkers, Harald; Kool, Jaap

    2009-01-01

    sites has been a major limitation to performing screens on this scale. Here we present a method for the high-throughput isolation of insertion sites using a highly efficient splinkerette-PCR method coupled with capillary or 454 sequencing. This protocol includes a description of the procedure for DNA......Insertional mutagens such as viruses and transposons are a useful tool for performing forward genetic screens in mice to discover cancer genes. These screens are most effective when performed using hundreds of mice; however, until recently, the cost-effective isolation and sequencing of insertion...

  13. High throughput resistance profiling of Plasmodium falciparum infections based on custom dual indexing and Illumina next generation sequencing-technology

    DEFF Research Database (Denmark)

    Nag, Sidsel; Dalgaard, Marlene Danner; Kofoed, Poul-Erik

    2017-01-01

    Genetic polymorphisms in P. falciparum can be used to indicate the parasite's susceptibility to antimalarial drugs as well as its geographical origin. Both of these factors are key to monitoring development and spread of antimalarial drug resistance. In this study, we combine multiplex PCR, custom...... designed dual indexing and Miseq sequencing for high throughput SNP-profiling of 457 malaria infections from Guinea-Bissau, at the cost of 10 USD per sample. By amplifying and sequencing 15 genetic fragments, we cover 20 resistance-conferring SNPs occurring in pfcrt, pfmdr1, pfdhfr, pfdhps, as well...

  14. On the use of high-throughput sequencing for the study of cyanobacterial diversity in Antarctic aquatic mats.

    Science.gov (United States)

    Pessi, Igor Stelmach; Maalouf, Pedro De Carvalho; Laughinghouse, Haywood Dail; Baurain, Denis; Wilmotte, Annick

    2016-06-01

    The study of Antarctic cyanobacterial diversity has been mostly limited to morphological identification and traditional molecular techniques. High-throughput sequencing (HTS) allows a much better understanding of microbial distribution in the environment, but its application is hampered by several methodological and analytical challenges. In this work, we explored the use of HTS as a tool for the study of cyanobacterial diversity in Antarctic aquatic mats. Our results highlight the importance of using artificial communities to validate the parameters of the bioinformatics procedure used to analyze natural communities, since pipeline-dependent biases had a strong effect on the observed community structures. Analysis of microbial mats from five Antarctic lakes and an aquatic biofilm from the Sub-Antarctic showed that HTS is a valuable tool for the assessment of cyanobacterial diversity. The majority of the operational taxonomic units retrieved were related to filamentous taxa such as Leptolyngbya and Phormidium, which are common genera in Antarctic lacustrine microbial mats. However, other phylotypes related to different taxa such as Geitlerinema, Pseudanabaena, Synechococcus, Chamaesiphon, Calothrix, and Coleodesmium were also found. Results revealed a much higher diversity than what had been reported using traditional methods and also highlighted remarkable differences between the cyanobacterial communities of the studied lakes. The aquatic biofilm from the Sub-Antarctic had a distinct cyanobacterial community from the Antarctic lakes, which in turn displayed a salinity-dependent community structure at the phylotype level. © 2016 Phycological Society of America.

  15. Bacterial community compositions of coking wastewater treatment plants in steel industry revealed by Illumina high-throughput sequencing.

    Science.gov (United States)

    Ma, Qiao; Qu, Yuanyuan; Shen, Wenli; Zhang, Zhaojing; Wang, Jingwei; Liu, Ziyan; Li, Duanxing; Li, Huijie; Zhou, Jiti

    2015-03-01

    In this study, Illumina high-throughput sequencing was used to reveal the community structures of nine coking wastewater treatment plants (CWWTPs) in China for the first time. The sludge systems exhibited a similar community composition at each taxonomic level. Compared to previous studies, some of the core genera in municipal wastewater treatment plants such as Zoogloea, Prosthecobacter and Gp6 were detected as minor species. Thiobacillus (20.83%), Comamonas (6.58%), Thauera (4.02%), Azoarcus (7.78%) and Rhodoplanes (1.42%) were the dominant genera shared by at least six CWWTPs. The percentages of autotrophic ammonia-oxidizing bacteria and nitrite-oxidizing bacteria were unexpectedly low, which were verified by both real-time PCR and fluorescence in situ hybridization analyses. Hierarchical clustering and canonical correspondence analysis indicated that operation mode, flow rate and temperature might be the key factors in community formation. This study provides new insights into our understanding of microbial community compositions and structures of CWWTPs. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. SSR_pipeline--computer software for the identification of microsatellite sequences from paired-end Illumina high-throughput DNA sequence data

    Science.gov (United States)

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (SSRs; for example, microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains three analysis modules along with a fourth control module that can be used to automate analyses of large volumes of data. The modules are used to (1) identify the subset of paired-end sequences that pass quality standards, (2) align paired-end reads into a single composite DNA sequence, and (3) identify sequences that possess microsatellites conforming to user specified parameters. Each of the three separate analysis modules also can be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc). All modules are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, Windows). The program suite relies on a compiled Python extension module to perform paired-end alignments. Instructions for compiling the extension from source code are provided in the documentation. Users who do not have Python installed on their computers or who do not have the ability to compile software also may choose to download packaged executable files. These files include all Python scripts, a copy of the compiled extension module, and a minimal installation of Python in a single binary executable. See program documentation for more information.

  17. High-throughput sequencing of microRNAs in peripheral blood mononuclear cells: identification of potential weight loss biomarkers.

    Directory of Open Access Journals (Sweden)

    Fermín I Milagro

    Full Text Available INTRODUCTION: MicroRNAs (miRNAs are being increasingly studied in relation to energy metabolism and body composition homeostasis. Indeed, the quantitative analysis of miRNAs expression in different adiposity conditions may contribute to understand the intimate mechanisms participating in body weight control and to find new biomarkers with diagnostic or prognostic value in obesity management. OBJECTIVE: The aim of this study was the search for miRNAs in blood cells whose expression could be used as prognostic biomarkers of weight loss. METHODS: Ten Caucasian obese women were selected among the participants in a weight-loss trial that consisted in following an energy-restricted treatment. Weight loss was considered unsuccessful when 5% (responders. At baseline, total miRNA isolated from peripheral blood mononuclear cells (PBMC was sequenced with SOLiD v4. The miRNA sequencing data were validated by RT-PCR. RESULTS: Differential baseline expression of several miRNAs was found between responders and non-responders. Two miRNAs were up-regulated in the non-responder group (mir-935 and mir-4772 and three others were down-regulated (mir-223, mir-224 and mir-376b. Both mir-935 and mir-4772 showed relevant associations with the magnitude of weight loss, although the expression of other transcripts (mir-874, mir-199b, mir-766, mir-589 and mir-148b also correlated with weight loss. CONCLUSIONS: This research addresses the use of high-throughput sequencing technologies in the search for miRNA expression biomarkers in obesity, by determining the miRNA transcriptome of PBMC. Basal expression of different miRNAs, particularly mir-935 and mir-4772, could be prognostic biomarkers and may forecast the response to a hypocaloric diet.

  18. Metabolomic and high-throughput sequencing analysis—modern approach for the assessment of biodeterioration of materials from historic buildings

    Science.gov (United States)

    Gutarowska, Beata; Celikkol-Aydin, Sukriye; Bonifay, Vincent; Otlewska, Anna; Aydin, Egemen; Oldham, Athenia L.; Brauer, Jonathan I.; Duncan, Kathleen E.; Adamiak, Justyna; Sunner, Jan A.; Beech, Iwona B.

    2015-01-01

    Preservation of cultural heritage is of paramount importance worldwide. Microbial colonization of construction materials, such as wood, brick, mortar, and stone in historic buildings can lead to severe deterioration. The aim of the present study was to give modern insight into the phylogenetic diversity and activated metabolic pathways of microbial communities colonized historic objects located in the former Auschwitz II–Birkenau concentration and extermination camp in Oświecim, Poland. For this purpose we combined molecular, microscopic and chemical methods. Selected specimens were examined using Field Emission Scanning Electron Microscopy (FESEM), metabolomic analysis and high-throughput Illumina sequencing. FESEM imaging revealed the presence of complex microbial communities comprising diatoms, fungi and bacteria, mainly cyanobacteria and actinobacteria, on sample surfaces. Microbial diversity of brick specimens appeared higher than that of the wood and was dominated by algae and cyanobacteria, while wood was mainly colonized by fungi. DNA sequences documented the presence of 15 bacterial phyla representing 99 genera including Halomonas, Halorhodospira, Salinisphaera, Salinibacterium, Rubrobacter, Streptomyces, Arthrobacter and nine fungal classes represented by 113 genera including Cladosporium, Acremonium, Alternaria, Engyodontium, Penicillium, Rhizopus, and Aureobasidium. Most of the identified sequences were characteristic of organisms implicated in deterioration of wood and brick. Metabolomic data indicated the activation of numerous metabolic pathways, including those regulating the production of primary and secondary metabolites, for example, metabolites associated with the production of antibiotics, organic acids and deterioration of organic compounds. The study demonstrated that a combination of electron microscopy imaging with metabolomic and genomic techniques allows to link the phylogenetic information and metabolic profiles of microbial

  19. A standardized framework for accurate, high-throughput genotyping of recombinant and non-recombinant viral sequences.

    Science.gov (United States)

    Alcantara, Luiz Carlos Junior; Cassol, Sharon; Libin, Pieter; Deforche, Koen; Pybus, Oliver G; Van Ranst, Marc; Galvão-Castro, Bernardo; Vandamme, Anne-Mieke; de Oliveira, Tulio

    2009-07-01

    Human immunodeficiency virus type-1 (HIV-1), hepatitis B and C and other rapidly evolving viruses are characterized by extremely high levels of genetic diversity. To facilitate diagnosis and the development of prevention and treatment strategies that efficiently target the diversity of these viruses, and other pathogens such as human T-lymphotropic virus type-1 (HTLV-1), human herpes virus type-8 (HHV8) and human papillomavirus (HPV), we developed a rapid high-throughput-genotyping system. The method involves the alignment of a query sequence with a carefully selected set of pre-defined reference strains, followed by phylogenetic analysis of multiple overlapping segments of the alignment using a sliding window. Each segment of the query sequence is assigned the genotype and sub-genotype of the reference strain with the highest bootstrap (>70%) and bootscanning (>90%) scores. Results from all windows are combined and displayed graphically using color-coded genotypes. The new Virus-Genotyping Tools provide accurate classification of recombinant and non-recombinant viruses and are currently being assessed for their diagnostic utility. They have incorporated into several HIV drug resistance algorithms including the Stanford (http://hivdb.stanford.edu) and two European databases (http://www.umcutrecht.nl/subsite/spread-programme/ and http://www.hivrdb.org.uk/) and have been successfully used to genotype a large number of sequences in these and other databases. The tools are a PHP/JAVA web application and are freely accessible on a number of servers including: http://bioafrica.mrc.ac.za/rega-genotype/html/, http://lasp.cpqgm.fiocruz.br/virus-genotype/html/, http://jose.med.kuleuven.be/genotypetool/html/.

  20. Yeast diversity during the fermentation of Andean chicha: A comparison of high-throughput sequencing and culture-dependent approaches.

    Science.gov (United States)

    Mendoza, Lucía M; Neef, Alexander; Vignolo, Graciela; Belloch, Carmela

    2017-10-01

    Diversity and dynamics of yeasts associated with the fermentation of Argentinian maize-based beverage chicha was investigated. Samples taken at different stages from two chicha productions were analyzed by culture-dependent and culture-independent methods. Five hundred and ninety six yeasts were isolated by classical microbiological methods and 16 species identified by RFLPs and sequencing of D1/D2 26S rRNA gene. Genetic typing of isolates from the dominant species, Saccharomyces cerevisiae, by PCR of delta elements revealed up to 42 different patterns. High-throughput sequencing (HTS) of D1/D2 26S rRNA gene amplicons from chicha samples detected more than one hundred yeast species and almost fifty filamentous fungi taxa. Analysis of the data revealed that yeasts dominated the fermentation, although, a significant percentage of filamentous fungi appeared in the first step of the process. Statistical analysis of results showed that very few taxa were represented by more than 1% of the reads per sample at any step of the process. S. cerevisiae represented more than 90% of the reads in the fermentative samples. Other yeast species dominated the pre-fermentative steps and abounded in fermented samples when S. cerevisiae was in percentages below 90%. Most yeasts species detected by pyrosequencing were not recovered by cultivation. In contrast, the cultivation-based methodology detected very few yeast taxa, and most of them corresponded with very few reads in the pyrosequencing analysis. Copyright © 2017 Elsevier Ltd. All rights reserved.

  1. Data for automated, high-throughput microscopy analysis of intracellular bacterial colonies using spot detection.

    Science.gov (United States)

    Ernstsen, Christina L; Login, Frédéric H; Jensen, Helene H; Nørregaard, Rikke; Møller-Jensen, Jakob; Nejsum, Lene N

    2017-10-01

    Quantification of intracellular bacterial colonies is useful in strategies directed against bacterial attachment, subsequent cellular invasion and intracellular proliferation. An automated, high-throughput microscopy-method was established to quantify the number and size of intracellular bacterial colonies in infected host cells (Detection and quantification of intracellular bacterial colonies by automated, high-throughput microscopy, Ernstsen et al., 2017 [1]). The infected cells were imaged with a 10× objective and number of intracellular bacterial colonies, their size distribution and the number of cell nuclei were automatically quantified using a spot detection-tool. The spot detection-output was exported to Excel, where data analysis was performed. In this article, micrographs and spot detection data are made available to facilitate implementation of the method.

  2. Meta-Analysis of High-Throughput Datasets Reveals Cellular Responses Following Hemorrhagic Fever Virus Infection

    Directory of Open Access Journals (Sweden)

    Gavin C. Bowick

    2011-05-01

    Full Text Available The continuing use of high-throughput assays to investigate cellular responses to infection is providing a large repository of information. Due to the large number of differentially expressed transcripts, often running into the thousands, the majority of these data have not been thoroughly investigated. Advances in techniques for the downstream analysis of high-throughput datasets are providing additional methods for the generation of additional hypotheses for further investigation. The large number of experimental observations, combined with databases that correlate particular genes and proteins with canonical pathways, functions and diseases, allows for the bioinformatic exploration of functional networks that may be implicated in replication or pathogenesis. Herein, we provide an example of how analysis of published high-throughput datasets of cellular responses to hemorrhagic fever virus infection can generate additional functional data. We describe enrichment of genes involved in metabolism, post-translational modification and cardiac damage; potential roles for specific transcription factors and a conserved involvement of a pathway based around cyclooxygenase-2. We believe that these types of analyses can provide virologists with additional hypotheses for continued investigation.

  3. Image Harvest: an open-source platform for high-throughput plant image processing and analysis.

    Science.gov (United States)

    Knecht, Avi C; Campbell, Malachy T; Caprez, Adam; Swanson, David R; Walia, Harkamal

    2016-05-01

    High-throughput plant phenotyping is an effective approach to bridge the genotype-to-phenotype gap in crops. Phenomics experiments typically result in large-scale image datasets, which are not amenable for processing on desktop computers, thus creating a bottleneck in the image-analysis pipeline. Here, we present an open-source, flexible image-analysis framework, called Image Harvest (IH), for processing images originating from high-throughput plant phenotyping platforms. Image Harvest is developed to perform parallel processing on computing grids and provides an integrated feature for metadata extraction from large-scale file organization. Moreover, the integration of IH with the Open Science Grid provides academic researchers with the computational resources required for processing large image datasets at no cost. Image Harvest also offers functionalities to extract digital traits from images to interpret plant architecture-related characteristics. To demonstrate the applications of these digital traits, a rice (Oryza sativa) diversity panel was phenotyped and genome-wide association mapping was performed using digital traits that are used to describe different plant ideotypes. Three major quantitative trait loci were identified on rice chromosomes 4 and 6, which co-localize with quantitative trait loci known to regulate agronomically important traits in rice. Image Harvest is an open-source software for high-throughput image processing that requires a minimal learning curve for plant biologists to analyzephenomics datasets. © The Author 2016. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  4. Image Harvest: an open-source platform for high-throughput plant image processing and analysis

    Science.gov (United States)

    Knecht, Avi C.; Campbell, Malachy T.; Caprez, Adam; Swanson, David R.; Walia, Harkamal

    2016-01-01

    High-throughput plant phenotyping is an effective approach to bridge the genotype-to-phenotype gap in crops. Phenomics experiments typically result in large-scale image datasets, which are not amenable for processing on desktop computers, thus creating a bottleneck in the image-analysis pipeline. Here, we present an open-source, flexible image-analysis framework, called Image Harvest (IH), for processing images originating from high-throughput plant phenotyping platforms. Image Harvest is developed to perform parallel processing on computing grids and provides an integrated feature for metadata extraction from large-scale file organization. Moreover, the integration of IH with the Open Science Grid provides academic researchers with the computational resources required for processing large image datasets at no cost. Image Harvest also offers functionalities to extract digital traits from images to interpret plant architecture-related characteristics. To demonstrate the applications of these digital traits, a rice (Oryza sativa) diversity panel was phenotyped and genome-wide association mapping was performed using digital traits that are used to describe different plant ideotypes. Three major quantitative trait loci were identified on rice chromosomes 4 and 6, which co-localize with quantitative trait loci known to regulate agronomically important traits in rice. Image Harvest is an open-source software for high-throughput image processing that requires a minimal learning curve for plant biologists to analyzephenomics datasets. PMID:27141917

  5. CrossCheck: an open-source web tool for high-throughput screen data analysis.

    Science.gov (United States)

    Najafov, Jamil; Najafov, Ayaz

    2017-07-19

    Modern high-throughput screening methods allow researchers to generate large datasets that potentially contain important biological information. However, oftentimes, picking relevant hits from such screens and generating testable hypotheses requires training in bioinformatics and the skills to efficiently perform database mining. There are currently no tools available to general public that allow users to cross-reference their screen datasets with published screen datasets. To this end, we developed CrossCheck, an online platform for high-throughput screen data analysis. CrossCheck is a centralized database that allows effortless comparison of the user-entered list of gene symbols with 16,231 published datasets. These datasets include published data from genome-wide RNAi and CRISPR screens, interactome proteomics and phosphoproteomics screens, cancer mutation databases, low-throughput studies of major cell signaling mediators, such as kinases, E3 ubiquitin ligases and phosphatases, and gene ontological information. Moreover, CrossCheck includes a novel database of predicted protein kinase substrates, which was developed using proteome-wide consensus motif searches. CrossCheck dramatically simplifies high-throughput screen data analysis and enables researchers to dig deep into the published literature and streamline data-driven hypothesis generation. CrossCheck is freely accessible as a web-based application at http://proteinguru.com/crosscheck.

  6. Cyber-T web server: differential analysis of high-throughput data.

    Science.gov (United States)

    Kayala, Matthew A; Baldi, Pierre

    2012-07-01

    The Bayesian regularization method for high-throughput differential analysis, described in Baldi and Long (A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics 2001: 17: 509-519) and implemented in the Cyber-T web server, is one of the most widely validated. Cyber-T implements a t-test using a Bayesian framework to compute a regularized variance of the measurements associated with each probe under each condition. This regularized estimate is derived by flexibly combining the empirical measurements with a prior, or background, derived from pooling measurements associated with probes in the same neighborhood. This approach flexibly addresses problems associated with low replication levels and technology biases, not only for DNA microarrays, but also for other technologies, such as protein arrays, quantitative mass spectrometry and next-generation sequencing (RNA-seq). Here we present an update to the Cyber-T web server, incorporating several useful new additions and improvements. Several preprocessing data normalization options including logarithmic and (Variance Stabilizing Normalization) VSN transforms are included. To augment two-sample t-tests, a one-way analysis of variance is implemented. Several methods for multiple tests correction, including standard frequentist methods and a probabilistic mixture model treatment, are available. Diagnostic plots allow visual assessment of the results. The web server provides comprehensive documentation and example data sets. The Cyber-T web server, with R source code and data sets, is publicly available at http://cybert.ics.uci.edu/.

  7. Genetic high throughput screening in Retinitis Pigmentosa based on high resolution melting (HRM) analysis.

    Science.gov (United States)

    Anasagasti, Ander; Barandika, Olatz; Irigoyen, Cristina; Benitez, Bruno A; Cooper, Breanna; Cruchaga, Carlos; López de Munain, Adolfo; Ruiz-Ederra, Javier

    2013-11-01

    Retinitis Pigmentosa (RP) involves a group of genetically determined retinal diseases caused by a large number of mutations that result in rod photoreceptor cell death followed by gradual death of cone cells. Most cases of RP are monogenic, with more than 80 associated genes identified so far. The high number of genes and variants involved in RP, among other factors, is making the molecular characterization of RP a real challenge for many patients. Although HRM has been used for the analysis of isolated variants or single RP genes, as far as we are concerned, this is the first study that uses HRM analysis for a high-throughput screening of several RP genes. Our main goal was to test the suitability of HRM analysis as a genetic screening technique in RP, and to compare its performance with two of the most widely used NGS platforms, Illumina and PGM-Ion Torrent technologies. RP patients (n = 96) were clinically diagnosed at the Ophthalmology Department of Donostia University Hospital, Spain. We analyzed a total of 16 RP genes that meet the following inclusion criteria: 1) size: genes with transcripts of less than 4 kb; 2) number of exons: genes with up to 22 exons; and 3) prevalence: genes reported to account for, at least, 0.4% of total RP cases worldwide. For comparison purposes, RHO gene was also sequenced with Illumina (GAII; Illumina), Ion semiconductor technologies (PGM; Life Technologies) and Sanger sequencing (ABI 3130xl platform; Applied Biosystems). Detected variants were confirmed in all cases by Sanger sequencing and tested for co-segregation in the family of affected probands. We identified a total of 65 genetic variants, 15 of which (23%) were novel, in 49 out of 96 patients. Among them, 14 (4 novel) are probable disease-causing genetic variants in 7 RP genes, affecting 15 patients. Our HRM analysis-based study, proved to be a cost-effective and rapid method that provides an accurate identification of genetic RP variants. This approach is effective for

  8. An Automated High Throughput Proteolysis and Desalting Platform for Quantitative Proteomic Analysis

    Directory of Open Access Journals (Sweden)

    Albert-Baskar Arul

    2013-06-01

    Full Text Available Proteomics for biomarker validation needs high throughput instrumentation to analyze huge set of clinical samples for quantitative and reproducible analysis at a minimum time without manual experimental errors. Sample preparation, a vital step in proteomics plays a major role in identification and quantification of proteins from biological samples. Tryptic digestion a major check point in sample preparation for mass spectrometry based proteomics needs to be more accurate with rapid processing time. The present study focuses on establishing a high throughput automated online system for proteolytic digestion and desalting of proteins from biological samples quantitatively and qualitatively in a reproducible manner. The present study compares online protein digestion and desalting of BSA with conventional off-line (in-solution method and validated for real time sample for reproducibility. Proteins were identified using SEQUEST data base search engine and the data were quantified using IDEALQ software. The present study shows that the online system capable of handling high throughput samples in 96 well formats carries out protein digestion and peptide desalting efficiently in a reproducible and quantitative manner. Label free quantification showed clear increase of peptide quantities with increase in concentration with much linearity compared to off line method. Hence we would like to suggest that inclusion of this online system in proteomic pipeline will be effective in quantification of proteins in comparative proteomics were the quantification is really very crucial.

  9. Ontology-based meta-analysis of global collections of high-throughput public data.

    Directory of Open Access Journals (Sweden)

    Ilya Kupershmidt

    2010-09-01

    Full Text Available The investigation of the interconnections between the molecular and genetic events that govern biological systems is essential if we are to understand the development of disease and design effective novel treatments. Microarray and next-generation sequencing technologies have the potential to provide this information. However, taking full advantage of these approaches requires that biological connections be made across large quantities of highly heterogeneous genomic datasets. Leveraging the increasingly huge quantities of genomic data in the public domain is fast becoming one of the key challenges in the research community today.We have developed a novel data mining framework that enables researchers to use this growing collection of public high-throughput data to investigate any set of genes or proteins. The connectivity between molecular states across thousands of heterogeneous datasets from microarrays and other genomic platforms is determined through a combination of rank-based enrichment statistics, meta-analyses, and biomedical ontologies. We address data quality concerns through dataset replication and meta-analysis and ensure that the majority of the findings are derived using multiple lines of evidence. As an example of our strategy and the utility of this framework, we apply our data mining approach to explore the biology of brown fat within the context of the thousands of publicly available gene expression datasets.Our work presents a practical strategy for organizing, mining, and correlating global collections of large-scale genomic data to explore normal and disease biology. Using a hypothesis-free approach, we demonstrate how a data-driven analysis across very large collections of genomic data can reveal novel discoveries and evidence to support existing hypothesis.

  10. Ontology-based meta-analysis of global collections of high-throughput public data.

    Science.gov (United States)

    Kupershmidt, Ilya; Su, Qiaojuan Jane; Grewal, Anoop; Sundaresh, Suman; Halperin, Inbal; Flynn, James; Shekar, Mamatha; Wang, Helen; Park, Jenny; Cui, Wenwu; Wall, Gregory D; Wisotzkey, Robert; Alag, Satnam; Akhtari, Saeid; Ronaghi, Mostafa

    2010-09-29

    The investigation of the interconnections between the molecular and genetic events that govern biological systems is essential if we are to understand the development of disease and design effective novel treatments. Microarray and next-generation sequencing technologies have the potential to provide this information. However, taking full advantage of these approaches requires that biological connections be made across large quantities of highly heterogeneous genomic datasets. Leveraging the increasingly huge quantities of genomic data in the public domain is fast becoming one of the key challenges in the research community today. We have developed a novel data mining framework that enables researchers to use this growing collection of public high-throughput data to investigate any set of genes or proteins. The connectivity between molecular states across thousands of heterogeneous datasets from microarrays and other genomic platforms is determined through a combination of rank-based enrichment statistics, meta-analyses, and biomedical ontologies. We address data quality concerns through dataset replication and meta-analysis and ensure that the majority of the findings are derived using multiple lines of evidence. As an example of our strategy and the utility of this framework, we apply our data mining approach to explore the biology of brown fat within the context of the thousands of publicly available gene expression datasets. Our work presents a practical strategy for organizing, mining, and correlating global collections of large-scale genomic data to explore normal and disease biology. Using a hypothesis-free approach, we demonstrate how a data-driven analysis across very large collections of genomic data can reveal novel discoveries and evidence to support existing hypothesis.

  11. Optimizing transformations for automated, high throughput analysis of flow cytometry data.

    Science.gov (United States)

    Finak, Greg; Perez, Juan-Manuel; Weng, Andrew; Gottardo, Raphael

    2010-11-04

    In a high throughput setting, effective flow cytometry data analysis depends heavily on proper data preprocessing. While usual preprocessing steps of quality assessment, outlier removal, normalization, and gating have received considerable scrutiny from the community, the influence of data transformation on the output of high throughput analysis has been largely overlooked. Flow cytometry measurements can vary over several orders of magnitude, cell populations can have variances that depend on their mean fluorescence intensities, and may exhibit heavily-skewed distributions. Consequently, the choice of data transformation can influence the output of automated gating. An appropriate data transformation aids in data visualization and gating of cell populations across the range of data. Experience shows that the choice of transformation is data specific. Our goal here is to compare the performance of different transformations applied to flow cytometry data in the context of automated gating in a high throughput, fully automated setting. We examine the most common transformations used in flow cytometry, including the generalized hyperbolic arcsine, biexponential, linlog, and generalized Box-Cox, all within the BioConductor flowCore framework that is widely used in high throughput, automated flow cytometry data analysis. All of these transformations have adjustable parameters whose effects upon the data are non-intuitive for most users. By making some modelling assumptions about the transformed data, we develop maximum likelihood criteria to optimize parameter choice for these different transformations. We compare the performance of parameter-optimized and default-parameter (in flowCore) data transformations on real and simulated data by measuring the variation in the locations of cell populations across samples, discovered via automated gating in both the scatter and fluorescence channels. We find that parameter-optimized transformations improve visualization, reduce

  12. Optimizing transformations for automated, high throughput analysis of flow cytometry data

    Directory of Open Access Journals (Sweden)

    Weng Andrew

    2010-11-01

    Full Text Available Abstract Background In a high throughput setting, effective flow cytometry data analysis depends heavily on proper data preprocessing. While usual preprocessing steps of quality assessment, outlier removal, normalization, and gating have received considerable scrutiny from the community, the influence of data transformation on the output of high throughput analysis has been largely overlooked. Flow cytometry measurements can vary over several orders of magnitude, cell populations can have variances that depend on their mean fluorescence intensities, and may exhibit heavily-skewed distributions. Consequently, the choice of data transformation can influence the output of automated gating. An appropriate data transformation aids in data visualization and gating of cell populations across the range of data. Experience shows that the choice of transformation is data specific. Our goal here is to compare the performance of different transformations applied to flow cytometry data in the context of automated gating in a high throughput, fully automated setting. We examine the most common transformations used in flow cytometry, including the generalized hyperbolic arcsine, biexponential, linlog, and generalized Box-Cox, all within the BioConductor flowCore framework that is widely used in high throughput, automated flow cytometry data analysis. All of these transformations have adjustable parameters whose effects upon the data are non-intuitive for most users. By making some modelling assumptions about the transformed data, we develop maximum likelihood criteria to optimize parameter choice for these different transformations. Results We compare the performance of parameter-optimized and default-parameter (in flowCore data transformations on real and simulated data by measuring the variation in the locations of cell populations across samples, discovered via automated gating in both the scatter and fluorescence channels. We find that parameter

  13. Deciphering the Diversities of Astroviruses and Noroviruses in Wastewater Treatment Plant Effluents by a High-Throughput Sequencing Method.

    Science.gov (United States)

    Prevost, B; Lucas, F S; Ambert-Balay, K; Pothier, P; Moulin, L; Wurtzer, S

    2015-10-01

    Although clinical epidemiology lists human enteric viruses to be among the primary causes of acute gastroenteritis in the human population, their circulation in the environment remains poorly investigated. These viruses are excreted by the human population into sewers and may be released into rivers through the effluents of wastewater treatment plants (WWTPs). In order to evaluate the viral diversity and loads in WWTP effluents of the Paris, France, urban area, which includes about 9 million inhabitants (approximately 15% of the French population), the seasonal occurrence of astroviruses and noroviruses in 100 WWTP effluent samples was investigated over 1 year. The coupling of these measurements with a high-throughput sequencing approach allowed the specific estimation of the diversity of human astroviruses (human astrovirus genotype 1 [HAstV-1], HAstV-2, HAstV-5, and HAstV-6), 7 genotypes of noroviruses (NoVs) of genogroup I (NoV GI.1 to NoV GI.6 and NoV GI.8), and 16 genotypes of NoVs of genogroup II (NoV GII.1 to NoV GII.7, NoV GII.9, NoV GII.12 to NoV GII.17, NoV GII.20, and NoV GII.21) in effluent samples. Comparison of the viral diversity in WWTP effluents to the viral diversity found by analysis of clinical data obtained throughout France underlined the consistency between the identified genotypes. However, some genotypes were locally present in effluents and were not found in the analysis of the clinical data. These findings could highlight an underestimation of the diversity of enteric viruses circulating in the human population. Consequently, analysis of WWTP effluents could allow the exploration of viral diversity not only in environmental waters but also in a human population linked to a sewerage network in order to better comprehend viral epidemiology and to forecast seasonal outbreaks. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  14. Fixing Formalin: A Method to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed Museum Specimens Using High-Throughput Sequencing.

    Directory of Open Access Journals (Sweden)

    Sarah M Hykin

    Full Text Available For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles, attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens-particularly for use in phylogenetic analyses-has been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp. We suspect this could be due either to the conditions of preservation and/or the amount of tissue used for extraction purposes. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens

  15. Fixing Formalin: A Method to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed Museum Specimens Using High-Throughput Sequencing.

    Science.gov (United States)

    Hykin, Sarah M; Bi, Ke; McGuire, Jimmy A

    2015-01-01

    For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens-particularly for use in phylogenetic analyses-has been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due either to the conditions of preservation and/or the amount of tissue used for extraction purposes. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens available for

  16. High-throughput sequencing of human plasma RNA by using thermostable group II intron reverse transcriptases

    Science.gov (United States)

    Qin, Yidan; Yao, Jun; Wu, Douglas C.; Nottingham, Ryan M.; Mohr, Sabine; Hunicke-Smith, Scott; Lambowitz, Alan M.

    2016-01-01

    Next-generation RNA-sequencing (RNA-seq) has revolutionized transcriptome profiling, gene expression analysis, and RNA-based diagnostics. Here, we developed a new RNA-seq method that exploits thermostable group II intron reverse transcriptases (TGIRTs) and used it to profile human plasma RNAs. TGIRTs have higher thermostability, processivity, and fidelity than conventional reverse transcriptases, plus a novel template-switching activity that can efficiently attach RNA-seq adapters to target RNA sequences without RNA ligation. The new TGIRT-seq method enabled construction of RNA-seq libraries from RNA in RNA in 1-mL plasma samples from a healthy individual revealed RNA fragments mapping to a diverse population of protein-coding gene and long ncRNAs, which are enriched in intron and antisense sequences, as well as nearly all known classes of small ncRNAs, some of which have never before been seen in plasma. Surprisingly, many of the small ncRNA species were present as full-length transcripts, suggesting that they are protected from plasma RNases in ribonucleoprotein (RNP) complexes and/or exosomes. This TGIRT-seq method is readily adaptable for profiling of whole-cell, exosomal, and miRNAs, and for related procedures, such as HITS-CLIP and ribosome profiling. PMID:26554030

  17. Needs Assessment for Research Use of High-Throughput Sequencing at a Large Academic Medical Center.

    Directory of Open Access Journals (Sweden)

    Albert Geskin

    Full Text Available Next Generation Sequencing (NGS methods are driving profound changes in biomedical research, with a growing impact on patient care. Many academic medical centers are evaluating potential models to prepare for the rapid increase in NGS information needs. This study sought to investigate (1 how and where sequencing data is generated and analyzed, (2 research objectives and goals for NGS, (3 workforce capacity and unmet needs, (4 storage capacity and unmet needs, (5 available and anticipated funding resources, and (6 future challenges. As a precursor to informed decision making at our institution, we undertook a systematic needs assessment of investigators using survey methods. We recruited 331 investigators from over 60 departments and divisions at the University of Pittsburgh Schools of Health Sciences and had 140 respondents, or a 42% response rate. Results suggest that both sequencing and analysis bottlenecks currently exist. Significant educational needs were identified, including both investigator-focused needs, such as selection of NGS methods suitable for specific research objectives, and program-focused needs, such as support for training an analytic workforce. The absence of centralized infrastructure was identified as an important institutional gap. Key principles for organizations managing this change were formulated based on the survey responses. This needs assessment provides an in-depth case study which may be useful to other academic medical centers as they identify and plan for future needs.

  18. High-throughput sequencing of microRNA transcriptome and expression assay in the sturgeon, Acipenser schrenckii.

    Directory of Open Access Journals (Sweden)

    Lihong Yuan

    Full Text Available Sturgeons are considered as living fossils and have very high evolutionary, economical and conservation values. The multiploidy of sturgeon that has been caused by chromosome duplication may lead to the emergence of new microRNAs (miRNAs involved in the ploidy and physiological processes. In the present study, we performed the first sturgeon miRNAs analysis by RNA-seq high-throughput sequencing combined with expression assay of microarray and real-time PCR, and aimed to discover the sturgeon-specific miRNAs, confirm the expressed pattern of miRNAs and illustrate the potential role of miRNAs-targets on sturgeon biological processes. A total of 103 miRNAs were identified, including 58 miRNAs with strongly detected signals (signal >500 and P≤0.01, which were detected by microarray. Real-time PCR assay supported the expression pattern obtained by microarray. Moreover, co-expression of 21 miRNAs in all five tissues and tissue-specific expression of 16 miRNAs implied the crucial and particular function of them in sturgeon physiological processes. Target gene prediction, especially the enriched functional gene groups (369 GO terms and pathways (37 KEGG regulated by 58 miRNAs (P<0.05, illustrated the interaction of miRNAs and putative mRNAs, and also the potential mechanism involved in these biological processes. Our new findings of sturgeon miRNAs expand the public database of transcriptome information for this species, contribute to our understanding of sturgeon biology, and also provide invaluable data that may be applied in sturgeon breeding.

  19. High-throughput sequencing of microRNA transcriptome and expression assay in the sturgeon, Acipenser schrenckii.

    Science.gov (United States)

    Yuan, Lihong; Zhang, Xiujuan; Li, Linmiao; Jiang, Haiying; Chen, Jinping

    2014-01-01

    Sturgeons are considered as living fossils and have very high evolutionary, economical and conservation values. The multiploidy of sturgeon that has been caused by chromosome duplication may lead to the emergence of new microRNAs (miRNAs) involved in the ploidy and physiological processes. In the present study, we performed the first sturgeon miRNAs analysis by RNA-seq high-throughput sequencing combined with expression assay of microarray and real-time PCR, and aimed to discover the sturgeon-specific miRNAs, confirm the expressed pattern of miRNAs and illustrate the potential role of miRNAs-targets on sturgeon biological processes. A total of 103 miRNAs were identified, including 58 miRNAs with strongly detected signals (signal >500 and P≤0.01), which were detected by microarray. Real-time PCR assay supported the expression pattern obtained by microarray. Moreover, co-expression of 21 miRNAs in all five tissues and tissue-specific expression of 16 miRNAs implied the crucial and particular function of them in sturgeon physiological processes. Target gene prediction, especially the enriched functional gene groups (369 GO terms) and pathways (37 KEGG) regulated by 58 miRNAs (P<0.05), illustrated the interaction of miRNAs and putative mRNAs, and also the potential mechanism involved in these biological processes. Our new findings of sturgeon miRNAs expand the public database of transcriptome information for this species, contribute to our understanding of sturgeon biology, and also provide invaluable data that may be applied in sturgeon breeding.

  20. Cancer panomics: computational methods and infrastructure for integrative analysis of cancer high-throughput "omics" data

    DEFF Research Database (Denmark)

    Brunak, Søren; De La Vega, Francisco M.; Rätsch, Gunnar

    2014-01-01

    Targeted cancer treatment is becoming the goal of newly developed oncology medicines and has already shown promise in some spectacular cases such as the case of BRAF kinase inhibitors in BRAF-mutant (e.g. V600E) melanoma. These developments are driven by the advent of high-throughput sequencing......, which continues to drop in cost, and that has enabled the sequencing of the genome, transcriptome, and epigenome of the tumors of a large number of cancer patients in order to discover the molecular aberrations that drive the oncogenesis of several types of cancer. Applying these technologies...... in the clinic promises to transform cancer treatment by identifying therapeutic vulnerabilities of each patient's tumor. These approaches will need to address the panomics of cancer--the integration of the complex combination of patient-specific characteristics that drive the development of each person's tumor...

  1. Detecting DNA double-stranded breaks in mammalian genomes by linear amplification-mediated high-throughput genome-wide translocation sequencing.

    Science.gov (United States)

    Hu, Jiazhi; Meyers, Robin M; Dong, Junchao; Panchakshari, Rohit A; Alt, Frederick W; Frock, Richard L

    2016-05-01

    Unbiased, high-throughput assays for detecting and quantifying DNA double-stranded breaks (DSBs) across the genome in mammalian cells will facilitate basic studies of the mechanisms that generate and repair endogenous DSBs. They will also enable more applied studies, such as those to evaluate the on- and off-target activities of engineered nucleases. Here we describe a linear amplification-mediated high-throughput genome-wide sequencing (LAM-HTGTS) method for the detection of genome-wide 'prey' DSBs via their translocation in cultured mammalian cells to a fixed 'bait' DSB. Bait-prey junctions are cloned directly from isolated genomic DNA using LAM-PCR and unidirectionally ligated to bridge adapters; subsequent PCR steps amplify the single-stranded DNA junction library in preparation for Illumina Miseq paired-end sequencing. A custom bioinformatics pipeline identifies prey sequences that contribute to junctions and maps them across the genome. LAM-HTGTS differs from related approaches because it detects a wide range of broken end structures with nucleotide-level resolution. Familiarity with nucleic acid methods and next-generation sequencing analysis is necessary for library generation and data interpretation. LAM-HTGTS assays are sensitive, reproducible, relatively inexpensive, scalable and straightforward to implement with a turnaround time of <1 week.

  2. Global repeat discovery and estimation of genomic copy number in a large, complex genome using a high-throughput 454 sequence survey

    Directory of Open Access Journals (Sweden)

    Varala Kranthi

    2007-05-01

    Full Text Available Abstract Background Extensive computational and database tools are available to mine genomic and genetic databases for model organisms, but little genomic data is available for many species of ecological or agricultural significance, especially those with large genomes. Genome surveys using conventional sequencing techniques are powerful, particularly for detecting sequences present in many copies per genome. However these methods are time-consuming and have potential drawbacks. High throughput 454 sequencing provides an alternative method by which much information can be gained quickly and cheaply from high-coverage surveys of genomic DNA. Results We sequenced 78 million base-pairs of randomly sheared soybean DNA which passed our quality criteria. Computational analysis of the survey sequences provided global information on the abundant repetitive sequences in soybean. The sequence was used to determine the copy number across regions of large genomic clones or contigs and discover higher-order structures within satellite repeats. We have created an annotated, online database of sequences present in multiple copies in the soybean genome. The low bias of pyrosequencing against repeat sequences is demonstrated by the overall composition of the survey data, which matches well with past estimates of repetitive DNA content obtained by DNA re-association kinetics (Cot analysis. Conclusion This approach provides a potential aid to conventional or shotgun genome assembly, by allowing rapid assessment of copy number in any clone or clone-end sequence. In addition, we show that partial sequencing can provide access to partial protein-coding sequences.

  3. High-Throughput Sequencing, a VersatileWeapon to Support Genome-Based Diagnosis in Infectious Diseases: Applications to Clinical Bacteriology

    Directory of Open Access Journals (Sweden)

    Ségolène Caboche

    2014-04-01

    Full Text Available The recent progresses of high-throughput sequencing (HTS technologies enable easy and cost-reduced access to whole genome sequencing (WGS or re-sequencing. HTS associated with adapted, automatic and fast bioinformatics solutions for sequencing applications promises an accurate and timely identification and characterization of pathogenic agents. Many studies have demonstrated that data obtained from HTS analysis have allowed genome-based diagnosis, which has been consistent with phenotypic observations. These proofs of concept are probably the first steps toward the future of clinical microbiology. From concept to routine use, many parameters need to be considered to promote HTS as a powerful tool to help physicians and clinicians in microbiological investigations. This review highlights the milestones to be completed toward this purpose.

  4. Analysis of JC virus DNA replication using a quantitative and high-throughput assay

    International Nuclear Information System (INIS)

    Shin, Jong; Phelan, Paul J.; Chhum, Panharith; Bashkenova, Nazym; Yim, Sung; Parker, Robert; Gagnon, David; Gjoerup, Ole; Archambault, Jacques; Bullock, Peter A.

    2014-01-01

    Progressive Multifocal Leukoencephalopathy (PML) is caused by lytic replication of JC virus (JCV) in specific cells of the central nervous system. Like other polyomaviruses, JCV encodes a large T-antigen helicase needed for replication of the viral DNA. Here, we report the development of a luciferase-based, quantitative and high-throughput assay of JCV DNA replication in C33A cells, which, unlike the glial cell lines Hs 683 and U87, accumulate high levels of nuclear T-ag needed for robust replication. Using this assay, we investigated the requirement for different domains of T-ag, and for specific sequences within and flanking the viral origin, in JCV DNA replication. Beyond providing validation of the assay, these studies revealed an important stimulatory role of the transcription factor NF1 in JCV DNA replication. Finally, we show that the assay can be used for inhibitor testing, highlighting its value for the identification of antiviral drugs targeting JCV DNA replication. - Highlights: • Development of a high-throughput screening assay for JCV DNA replication using C33A cells. • Evidence that T-ag fails to accumulate in the nuclei of established glioma cell lines. • Evidence that NF-1 directly promotes JCV DNA replication in C33A cells. • Proof-of-concept that the HTS assay can be used to identify pharmacological inhibitor of JCV DNA replication

  5. Analysis of JC virus DNA replication using a quantitative and high-throughput assay

    Energy Technology Data Exchange (ETDEWEB)

    Shin, Jong; Phelan, Paul J.; Chhum, Panharith; Bashkenova, Nazym; Yim, Sung; Parker, Robert [Department of Developmental, Molecular and Chemical Biology, Tufts University School of Medicine, Boston, MA 02111 (United States); Gagnon, David [Institut de Recherches Cliniques de Montreal (IRCM), 110 Pine Avenue West, Montreal, Quebec, Canada H2W 1R7 (Canada); Department of Biochemistry and Molecular Medicine, Université de Montréal, Montréal, Quebec (Canada); Gjoerup, Ole [Molecular Oncology Research Institute, Tufts Medical Center, Boston, MA 02111 (United States); Archambault, Jacques [Institut de Recherches Cliniques de Montreal (IRCM), 110 Pine Avenue West, Montreal, Quebec, Canada H2W 1R7 (Canada); Department of Biochemistry and Molecular Medicine, Université de Montréal, Montréal, Quebec (Canada); Bullock, Peter A., E-mail: Peter.Bullock@tufts.edu [Department of Developmental, Molecular and Chemical Biology, Tufts University School of Medicine, Boston, MA 02111 (United States)

    2014-11-15

    Progressive Multifocal Leukoencephalopathy (PML) is caused by lytic replication of JC virus (JCV) in specific cells of the central nervous system. Like other polyomaviruses, JCV encodes a large T-antigen helicase needed for replication of the viral DNA. Here, we report the development of a luciferase-based, quantitative and high-throughput assay of JCV DNA replication in C33A cells, which, unlike the glial cell lines Hs 683 and U87, accumulate high levels of nuclear T-ag needed for robust replication. Using this assay, we investigated the requirement for different domains of T-ag, and for specific sequences within and flanking the viral origin, in JCV DNA replication. Beyond providing validation of the assay, these studies revealed an important stimulatory role of the transcription factor NF1 in JCV DNA replication. Finally, we show that the assay can be used for inhibitor testing, highlighting its value for the identification of antiviral drugs targeting JCV DNA replication. - Highlights: • Development of a high-throughput screening assay for JCV DNA replication using C33A cells. • Evidence that T-ag fails to accumulate in the nuclei of established glioma cell lines. • Evidence that NF-1 directly promotes JCV DNA replication in C33A cells. • Proof-of-concept that the HTS assay can be used to identify pharmacological inhibitor of JCV DNA replication.

  6. Recent advances in quantitative high throughput and high content data analysis.

    Science.gov (United States)

    Moutsatsos, Ioannis K; Parker, Christian N

    2016-01-01

    High throughput screening has become a basic technique with which to explore biological systems. Advances in technology, including increased screening capacity, as well as methods that generate multiparametric readouts, are driving the need for improvements in the analysis of data sets derived from such screens. This article covers the recent advances in the analysis of high throughput screening data sets from arrayed samples, as well as the recent advances in the analysis of cell-by-cell data sets derived from image or flow cytometry application. Screening multiple genomic reagents targeting any given gene creates additional challenges and so methods that prioritize individual gene targets have been developed. The article reviews many of the open source data analysis methods that are now available and which are helping to define a consensus on the best practices to use when analyzing screening data. As data sets become larger, and more complex, the need for easily accessible data analysis tools will continue to grow. The presentation of such complex data sets, to facilitate quality control monitoring and interpretation of the results will require the development of novel visualizations. In addition, advanced statistical and machine learning algorithms that can help identify patterns, correlations and the best features in massive data sets will be required. The ease of use for these tools will be important, as they will need to be used iteratively by laboratory scientists to improve the outcomes of complex analyses.

  7. Digital PCR provides sensitive and absolute calibration for high throughput sequencing

    Directory of Open Access Journals (Sweden)

    Fan H Christina

    2009-03-01

    Full Text Available Abstract Background Next-generation DNA sequencing on the 454, Solexa, and SOLiD platforms requires absolute calibration of the number of molecules to be sequenced. This requirement has two unfavorable consequences. First, large amounts of sample-typically micrograms-are needed for library preparation, thereby limiting the scope of samples which can be sequenced. For many applications, including metagenomics and the sequencing of ancient, forensic, and clinical samples, the quantity of input DNA can be critically limiting. Second, each library requires a titration sequencing run, thereby increasing the cost and lowering the throughput of sequencing. Results We demonstrate the use of digital PCR to accurately quantify 454 and Solexa sequencing libraries, enabling the preparation of sequencing libraries from nanogram quantities of input material while eliminating costly and time-consuming titration runs of the sequencer. We successfully sequenced low-nanogram scale bacterial and mammalian DNA samples on the 454 FLX and Solexa DNA sequencing platforms. This study is the first to definitively demonstrate the successful sequencing of picogram quantities of input DNA on the 454 platform, reducing the sample requirement more than 1000-fold without pre-amplification and the associated bias and reduction in library depth. Conclusion The digital PCR assay allows absolute quantification of sequencing libraries, eliminates uncertainties associated with the construction and application of standard curves to PCR-based quantification, and with a coefficient of variation close to 10%, is sufficiently precise to enable direct sequencing without titration runs.

  8. Bacterial diversity of the American sand fly Lutzomyia intermedia using high-throughput metagenomic sequencing.

    Science.gov (United States)

    Monteiro, Carolina Cunha; Villegas, Luis Eduardo Martinez; Campolina, Thais Bonifácio; Pires, Ana Clara Machado Araújo; Miranda, Jose Carlos; Pimenta, Paulo Filemon Paolucci; Secundino, Nagila Francinete Costa

    2016-08-31

    Parasites of the genus Leishmania cause a broad spectrum of diseases, collectively known as leishmaniasis, in humans worldwide. American cutaneous leishmaniasis is a neglected disease transmitted by sand fly vectors including Lutzomyia intermedia, a proven vector. The female sand fly can acquire or deliver Leishmania spp. parasites while feeding on a blood meal, which is required for nutrition, egg development and survival. The microbiota composition and abundance varies by food source, life stages and physiological conditions. The sand fly microbiota can affect parasite life-cycle in the vector. We performed a metagenomic analysis for microbiota composition and abundance in Lu. intermedia, from an endemic area in Brazil. The adult insects were collected using CDC light traps, morphologically identified, carefully sterilized, dissected under a microscope and the females separated into groups according to their physiological condition: (i) absence of blood meal (unfed = UN); (ii) presence of blood meal (blood-fed = BF); and (iii) presence of developed ovaries (gravid = GR). Then, they were processed for metagenomics with Illumina Hiseq Sequencing in order to be sequence analyzed and to obtain the taxonomic profiles of the microbiota. Bacterial metagenomic analysis revealed differences in microbiota composition based upon the distinct physiological stages of the adult insect. Sequence identification revealed two phyla (Proteobacteria and Actinobacteria), 11 families and 15 genera; 87 % of the bacteria were Gram-negative, while only one family and two genera were identified as Gram-positive. The genera Ochrobactrum, Bradyrhizobium and Pseudomonas were found across all of the groups. The metagenomic analysis revealed that the microbiota of the Lu. intermedia female sand flies are distinct under specific physiological conditions and consist of 15 bacterial genera. The Ochrobactrum, Bradyrhizobium and Pseudomonas were the common genera. Our results detailing

  9. Why barcode? High-throughput multiplex sequencing of mitochondrial genomes for molecular systematics.

    Science.gov (United States)

    Timmermans, M J T N; Dodsworth, S; Culverwell, C L; Bocak, L; Ahrens, D; Littlewood, D T J; Pons, J; Vogler, A P

    2010-11-01

    Mitochondrial genome sequences are important markers for phylogenetics but taxon sampling remains sporadic because of the great effort and cost required to acquire full-length sequences. Here, we demonstrate a simple, cost-effective way to sequence the full complement of protein coding mitochondrial genes from pooled samples using the 454/Roche platform. Multiplexing was achieved without the need for expensive indexing tags ('barcodes'). The method was trialled with a set of long-range polymerase chain reaction (PCR) fragments from 30 species of Coleoptera (beetles) sequenced in a 1/16th sector of a sequencing plate. Long contigs were produced from the pooled sequences with sequencing depths ranging from ∼10 to 100× per contig. Species identity of individual contigs was established via three 'bait' sequences matching disparate parts of the mitochondrial genome obtained by conventional PCR and Sanger sequencing. This proved that assembly of contigs from the sequencing pool was correct. Our study produced sequences for 21 nearly complete and seven partial sets of protein coding mitochondrial genes. Combined with existing sequences for 25 taxa, an improved estimate of basal relationships in Coleoptera was obtained. The procedure could be employed routinely for mitochondrial genome sequencing at the species level, to provide improved species 'barcodes' that currently use the cox1 gene only.

  10. Characterization of the fecal microbiome in different swine groups by high-throughput sequencing.

    Science.gov (United States)

    Park, Soo-Je; Kim, Jinu; Lee, Jong-Soo; Rhee, Sung-Keun; Kim, Hongik

    2014-08-01

    Swine have a complex microbial community within their gastrointestinal tract that plays a critical role in both health and disease. High-throughput 16S rRNA gene-based pyrosequencing was used to identify the possible core microorganisms in the gut of swine groups that differ in meat quality and weight grades (level 1 as higher meat quality and level 2 as lower meat quality). Samples were taken from the rectum and/or stool from ten animals, DNA was extracted, and the V1-V3 regions of the 16S rRNA gene were amplified. Two bacterial populations (Bacteroidetes and Firmicutes) dominated and were shared between the two groups. Significant differences between the groups were found at the genus level. The genera Lactobacillus and Oscillibacter were found in slightly higher proportions in the level 2 group (12.6 and 12.4% of the classified reads, respectively) than those of level 1 (9.6 and 7.7%, respectively). By contrast, the proportion of reads assigned to the genus Roseburia in the level 1 group (13.0%) was higher than that of level 2 (4.8%). The largest differences were related to the genera Clostridium, Oscillibacter, and Roseburia as core microorganisms. Moreover, two genera, Roseburia and Clostridium, related to level 1 produced linoleic acid or short chain fatty acids that might contribute to swine health and development. In conclusion, the presence of core bacteria in the swine gut is associated with meat quality with reduced body fat in swine. Copyright © 2014 Elsevier Ltd. All rights reserved.

  11. msBiodat analysis tool, big data analysis for high-throughput experiments.

    Science.gov (United States)

    Muñoz-Torres, Pau M; Rokć, Filip; Belužic, Robert; Grbeša, Ivana; Vugrek, Oliver

    2016-01-01

    Mass spectrometry (MS) are a group of a high-throughput techniques used to increase knowledge about biomolecules. They produce a large amount of data which is presented as a list of hundreds or thousands of proteins. Filtering those data efficiently is the first step for extracting biologically relevant information. The filtering may increase interest by merging previous data with the data obtained from public databases, resulting in an accurate list of proteins which meet the predetermined conditions. In this article we present msBiodat Analysis Tool, a web-based application thought to approach proteomics to the big data analysis. With this tool, researchers can easily select the most relevant information from their MS experiments using an easy-to-use web interface. An interesting feature of msBiodat analysis tool is the possibility of selecting proteins by its annotation on Gene Ontology using its Gene Id, ensembl or UniProt codes. The msBiodat analysis tool is a web-based application that allows researchers with any programming experience to deal with efficient database querying advantages. Its versatility and user-friendly interface makes easy to perform fast and accurate data screening by using complex queries. Once the analysis is finished, the result is delivered by e-mail. msBiodat analysis tool is freely available at http://msbiodata.irb.hr.

  12. Identification of antigen-specific human monoclonal antibodies using high-throughput sequencing of the antibody repertoire.

    Science.gov (United States)

    Liu, Ju; Li, Ruihua; Liu, Kun; Li, Liangliang; Zai, Xiaodong; Chi, Xiangyang; Fu, Ling; Xu, Junjie; Chen, Wei

    2016-04-22

    High-throughput sequencing of the antibody repertoire provides a large number of antibody variable region sequences that can be used to generate human monoclonal antibodies. However, current screening methods for identifying antigen-specific antibodies are inefficient. In the present study, we developed an antibody clone screening strategy based on clone dynamics and relative frequency, and used it to identify antigen-specific human monoclonal antibodies. Enzyme-linked immunosorbent assay showed that at least 52% of putative positive immunoglobulin heavy chains composed antigen-specific antibodies. Combining information on dynamics and relative frequency improved identification of positive clones and elimination of negative clones. and increase the credibility of putative positive clones. Therefore the screening strategy could simplify the subsequent experimental screening and may facilitate the generation of antigen-specific antibodies. Copyright © 2016 Elsevier Inc. All rights reserved.

  13. A Proteomic Workflow Using High-Throughput De Novo Sequencing Towards Complementation of Genome Information for Improved Comparative Crop Science.

    Science.gov (United States)

    Turetschek, Reinhard; Lyon, David; Desalegn, Getinet; Kaul, Hans-Peter; Wienkoop, Stefanie

    2016-01-01

    The proteomic study of non-model organisms, such as many crop plants, is challenging due to the lack of comprehensive genome information. Changing environmental conditions require the study and selection of adapted cultivars. Mutations, inherent to cultivars, hamper protein identification and thus considerably complicate the qualitative and quantitative comparison in large-scale systems biology approaches. With this workflow, cultivar-specific mutations are detected from high-throughput comparative MS analyses, by extracting sequence polymorphisms with de novo sequencing. Stringent criteria are suggested to filter for confidential mutations. Subsequently, these polymorphisms complement the initially used database, which is ready to use with any preferred database search algorithm. In our example, we thereby identified 26 specific mutations in two cultivars of Pisum sativum and achieved an increased number (17 %) of peptide spectrum matches.

  14. Origin, diversity and maturation of human antiviral antibodies analyzed by high-throughput sequencing

    Directory of Open Access Journals (Sweden)

    Ponraj ePrabakaran

    2012-08-01

    Full Text Available Our understanding of how antibodies are generated and function could help develop effective vaccines and antibody-based therapeutics against viruses such as HIV-1, SARS Coronavirus (CoV, and Hendra and Nipah viruses (henipaviruses. Although broadly neutralizing antibodies (bnAbs against the HIV-1 were observed in patients, elicitation of such bnAbs remains a major challenge when compared to other viral targets. We previously hypothesized that HIV-1 could have evolved a strategy to evade the immune system due to absent or very weak binding of germline antibodies to the conserved epitopes that may not be sufficient to initiate and/or maintain an effective immune response. To further explore our hypothesis, we used the 454 sequence analysis of a large naïve library of human IgM antibodies which had been used for selecting antibodies against SARS Coronavirus (CoV receptor-binding domain (RBD, and soluble G proteins (sG of Hendra and Nipah viruses (henipaviruses. We found that the human IgM repertoires from the 454 sequencing have diverse germline usages, recombination patterns, junction diversity and a lower extent of somatic mutation. In this study, we identified germline intermediates of antibodies specific to HIV-1 and other viruses as observed in normal individuals, and compared their genetic diversity and somatic mutation level along with available structural and functional data. Further computational analysis will provide framework for understanding the underlying genetic and molecular determinants related to maturation pathways of antiviral bnAbs that could be useful for applying novel approaches to the design of effective vaccine immunogens and antibody-based therapeutics.

  15. Polymorphism discovery and allele frequency estimation using high-throughput DNA sequencing of target-enriched pooled DNA samples

    Directory of Open Access Journals (Sweden)

    Mullen Michael P

    2012-01-01

    Full Text Available Abstract Background The central role of the somatotrophic axis in animal post-natal growth, development and fertility is well established. Therefore, the identification of genetic variants affecting quantitative traits within this axis is an attractive goal. However, large sample numbers are a pre-requisite for the identification of genetic variants underlying complex traits and although technologies are improving rapidly, high-throughput sequencing of large numbers of complete individual genomes remains prohibitively expensive. Therefore using a pooled DNA approach coupled with target enrichment and high-throughput sequencing, the aim of this study was to identify polymorphisms and estimate allele frequency differences across 83 candidate genes of the somatotrophic axis, in 150 Holstein-Friesian dairy bulls divided into two groups divergent for genetic merit for fertility. Results In total, 4,135 SNPs and 893 indels were identified during the resequencing of the 83 candidate genes. Nineteen percent (n = 952 of variants were located within 5' and 3' UTRs. Seventy-two percent (n = 3,612 were intronic and 9% (n = 464 were exonic, including 65 indels and 236 SNPs resulting in non-synonymous substitutions (NSS. Significant (P ® MassARRAY. No significant differences (P > 0.1 were observed between the two methods for any of the 43 SNPs across both pools (i.e., 86 tests in total. Conclusions The results of the current study support previous findings of the use of DNA sample pooling and high-throughput sequencing as a viable strategy for polymorphism discovery and allele frequency estimation. Using this approach we have characterised the genetic variation within genes of the somatotrophic axis and related pathways, central to mammalian post-natal growth and development and subsequent lactogenesis and fertility. We have identified a large number of variants segregating at significantly different frequencies between cattle groups divergent for calving

  16. Fine mapping of a Phytophthora-resistance gene RpsWY in soybean (Glycine max L.) by high-throughput genome-wide sequencing.

    Science.gov (United States)

    Cheng, Yanbo; Ma, Qibin; Ren, Hailong; Xia, Qiuju; Song, Enliang; Tan, Zhiyuan; Li, Shuxian; Zhang, Gengyun; Nian, Hai

    2017-05-01

    Using a combination of phenotypic screening, genetic and statistical analyses, and high-throughput genome-wide sequencing, we have finely mapped a dominant Phytophthora resistance gene in soybean cultivar Wayao. Phytophthora root rot (PRR) caused by Phytophthora sojae is one of the most important soil-borne diseases in many soybean-production regions in the world. Identification of resistant gene(s) and incorporating them into elite varieties are an effective way for breeding to prevent soybean from being harmed by this disease. Two soybean populations of 191 F 2 individuals and 196 F 7:8 recombinant inbred lines (RILs) were developed to map Rps gene by crossing a susceptible cultivar Huachun 2 with the resistant cultivar Wayao. Genetic analysis of the F 2 population indicated that PRR resistance in Wayao was controlled by a single dominant gene, temporarily named RpsWY, which was mapped on chromosome 3. A high-density genetic linkage bin map was constructed using 3469 recombination bins of the RILs to explore the candidate genes by the high-throughput genome-wide sequencing. The results of genotypic analysis showed that the RpsWY gene was located in bin 401 between 4466230 and 4502773 bp on chromosome 3 through line 71 and 100 of the RILs. Four predicted genes (Glyma03g04350, Glyma03g04360, Glyma03g04370, and Glyma03g04380) were found at the narrowed region of 36.5 kb in bin 401. These results suggest that the high-throughput genome-wide resequencing is an effective method to fine map PRR candidate genes.

  17. Uncommon nucleotide excision repair phenotypes revealed by targeted high-throughput sequencing.

    Science.gov (United States)

    Calmels, Nadège; Greff, Géraldine; Obringer, Cathy; Kempf, Nadine; Gasnier, Claire; Tarabeux, Julien; Miguet, Marguerite; Baujat, Geneviève; Bessis, Didier; Bretones, Patricia; Cavau, Anne; Digeon, Béatrice; Doco-Fenzy, Martine; Doray, Bérénice; Feillet, François; Gardeazabal, Jesus; Gener, Blanca; Julia, Sophie; Llano-Rivas, Isabel; Mazur, Artur; Michot, Caroline; Renaldo-Robin, Florence; Rossi, Massimiliano; Sabouraud, Pascal; Keren, Boris; Depienne, Christel; Muller, Jean; Mandel, Jean-Louis; Laugel, Vincent

    2016-03-22

    Deficient nucleotide excision repair (NER) activity causes a variety of autosomal recessive diseases including xeroderma pigmentosum (XP) a disorder which pre-disposes to skin cancer, and the severe multisystem condition known as Cockayne syndrome (CS). In view of the clinical overlap between NER-related disorders, as well as the existence of multiple phenotypes and the numerous genes involved, we developed a new diagnostic approach based on the enrichment of 16 NER-related genes by multiplex amplification coupled with next-generation sequencing (NGS). Our test cohort consisted of 11 DNA samples, all with known mutations and/or non pathogenic SNPs in two of the tested genes. We then used the same technique to analyse samples from a prospective cohort of 40 patients. Multiplex amplification and sequencing were performed using AmpliSeq protocol on the Ion Torrent PGM (Life Technologies). We identified causative mutations in 17 out of the 40 patients (43%). Four patients showed biallelic mutations in the ERCC6(CSB) gene, five in the ERCC8(CSA) gene: most of them had classical CS features but some had very mild and incomplete phenotypes. A small cohort of 4 unrelated classic XP patients from the Basque country (Northern Spain) revealed a common splicing mutation in POLH (XP-variant), demonstrating a new founder effect in this population. Interestingly, our results also found ERCC2(XPD), ERCC3(XPB) or ERCC5(XPG) mutations in two cases of UV-sensitive syndrome and in two cases with mixed XP/CS phenotypes. Our study confirms that NGS is an efficient technique for the analysis of NER-related disorders on a molecular level. It is particularly useful for phenotypes with combined features or unusually mild symptoms. Targeted NGS used in conjunction with DNA repair functional tests and precise clinical evaluation permits rapid and cost-effective diagnosis in patients with NER-defects.

  18. High-Throughput Sequencing of Microbial Community Diversity and Dynamics during Douchi Fermentation

    Science.gov (United States)

    Tu, Zong-cai; Wang, Xiao-lan

    2016-01-01

    Douchi is a type of Chinese traditional fermented food that is an important source of protein and is used in flavouring ingredients. The end product is affected by the microbial community present during fermentation, but exactly how microbes influence the fermentation process remains poorly understood. We used an Illumina MiSeq approach to investigate bacterial and fungal community diversity during both douchi-koji making and fermentation. A total of 181,443 high quality bacterial 16S rRNA sequences and 221,059 high quality fungal internal transcribed spacer reads were used for taxonomic classification, revealing eight bacterial and three fungal phyla. Firmicutes, Actinobacteria and Proteobacteria were the dominant bacterial phyla, while Ascomycota and Zygomycota were the dominant fungal phyla. At the genus level, Staphylococcus and Weissella were the dominant bacteria, while Aspergillus and Lichtheimia were the dominant fungi. Principal coordinate analysis showed structural separation between the composition of bacteria in koji making and fermentation. However, multivariate analysis of variance based on unweighted UniFrac distances did identify distinct differences (p fermentation. This is the first investigation to integrate douchi fermentation and koji making and fermentation processes through this technological approach. The results provide insight into the microbiome of the douchi fermentation process, and reveal a structural separation that may be stratified by the environment during the production of this traditional fermented food. PMID:27992473

  19. High-throughput sequencing of nematode communities from total soil DNA extractions

    DEFF Research Database (Denmark)

    Sapkota, Rumakanta; Nicolaisen, Mogens

    2015-01-01

    nematodes without the need for enrichment was developed. Using this strategy on DNA templates from a set of 22 agricultural soils, we obtained 64.4% sequences of nematode origin in total, whereas the remaining sequences were almost entirely from other metazoans. The nematode sequences were derived from...... in previous sequence-based studies are not nematode specific but also amplify other groups of organisms such as fungi and plantae, and thus require a nematode enrichment step that may introduce biases. Results: In this study an amplification strategy which selectively amplifies a fragment of the SSU from...... a broad taxonomic range and most sequences were from nematode taxa that have previously been found to be abundant in soil such as Tylenchida, Rhabditida, Dorylaimida, Triplonchida and Araeolaimida. Conclusions: Our amplification and sequencing strategy for assessing nematode diversity was able to collect...

  20. Determining the diet of larvae of western rock lobster (Panulirus cygnus using high-throughput DNA sequencing techniques.

    Directory of Open Access Journals (Sweden)

    Richard O'Rorke

    Full Text Available The Western Australian rock lobster fishery has been both a highly productive and sustainable fishery. However, a recent dramatic and unexplained decline in post-larval recruitment threatens this sustainability. Our lack of knowledge of key processes in lobster larval ecology, such as their position in the food web, limits our ability to determine what underpins this decline. The present study uses a high-throughput amplicon sequencing approach on DNA obtained from the hepatopancreas of larvae to discover significant prey items. Two short regions of the 18S rRNA gene were amplified under the presence of lobster specific PNA to prevent lobster amplification and to improve prey amplification. In the resulting sequences either little prey was recovered, indicating that the larval gut was empty, or there was a high number of reads originating from multiple zooplankton taxa. The most abundant reads included colonial Radiolaria, Thaliacea, Actinopterygii, Hydrozoa and Sagittoidea, which supports the hypothesis that the larvae feed on multiple groups of mostly transparent gelatinous zooplankton. This hypothesis has prevailed as it has been tentatively inferred from the physiology of larvae, captive feeding trials and co-occurrence in situ. However, these prey have not been observed in the larval gut as traditional microscopic techniques cannot discern between transparent and gelatinous prey items in the gut. High-throughput amplicon sequencing of gut DNA has enabled us to classify these otherwise undetectable prey. The dominance of the colonial radiolarians among the gut contents is intriguing in that this group has been historically difficult to quantify in the water column, which may explain why they have not been connected to larval diet previously. Our results indicate that a PCR based technique is a very successful approach to identify the most abundant taxa in the natural diet of lobster larvae.

  1. Characterization of microbial biofilms in a thermophilic biogas system by high-throughput metagenome sequencing.

    Science.gov (United States)

    Rademacher, Antje; Zakrzewski, Martha; Schlüter, Andreas; Schönberg, Mandy; Szczepanowski, Rafael; Goesmann, Alexander; Pühler, Alfred; Klocke, Michael

    2012-03-01

    DNAs of two biofilms of a thermophilic two-phase leach-bed biogas reactor fed with rye silage and winter barley straw were sequenced by 454-pyrosequencing technology to assess the biofilm-based microbial community and their genetic potential for anaerobic digestion. The studied biofilms matured on the surface of the substrates in the hydrolysis reactor (HR) and on the packing in the anaerobic filter reactor (AF). The classification of metagenome reads showed Clostridium as most prevalent bacteria in the HR, indicating a predominant role for plant material digestion. Notably, insights into the genetic potential of plant-degrading bacteria were determined as well as further bacterial groups, which may assist Clostridium in carbohydrate degradation. Methanosarcina and Methanothermobacter were determined as most prevalent methanogenic archaea. In consequence, the biofilm-based methanogenesis in this system might be driven by the hydrogenotrophic pathway but also by the aceticlastic methanogenesis depending on metabolite concentrations such as the acetic acid concentration. Moreover, bacteria, which are capable of acetate oxidation in syntrophic interaction with methanogens, were also predicted. Finally, the metagenome analysis unveiled a large number of reads with unidentified microbial origin, indicating that the anaerobic degradation process may also be conducted by up to now unknown species. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.

  2. Genetic Bases of Bicuspid Aortic Valve: The Contribution of Traditional and High-Throughput Sequencing Approaches on Research and Diagnosis.

    Science.gov (United States)

    Giusti, Betti; Sticchi, Elena; De Cario, Rosina; Magi, Alberto; Nistri, Stefano; Pepe, Guglielmina

    2017-01-01

    Bicuspid aortic valve (BAV) is a common (0.5-2.0% of general population) congenital heart defect with increased prevalence of aortic dilatation and dissection. BAV has an autosomal dominant inheritance with reduced penetrance and variable expressivity. BAV has been described as an isolated trait or associated with syndromic conditions [e.g., Marfan Marfan syndrome or Loeys-Dietz syndrome (MFS, LDS)]. Identification of a syndromic condition in a BAV patient is clinically relevant to personalize aortic surgery indication. A 4-fold increase in BAV prevalence in a large cohort of unrelated MFS patients with respect to general population was reported, as well as in LDS patients (8-fold). It is also known that BAV is more frequent in patients with thoracic aortic aneurysm (TAA) related to mutations in ACTA2, FBN1 , and TGFBR2 genes. Moreover, in 8 patients with BAV and thoracic aortic dilation, not fulfilling the clinical criteria for MFS, FBN1 mutations in 2/8 patients were identified suggesting that FBN1 or other genes involved in syndromic conditions correlated to aortopathy could be involved in BAV. Beyond loci associated to syndromic disorders, studies in humans and animal models evidenced/suggested the role of further genes in non-syndromic BAV. The transcriptional regulator NOTCH1 has been associated with the development and acceleration of calcium deposition. Genome wide marker-based linkage analysis demonstrated a linkage of BAV to loci on chromosomes 18, 5, and 13q. Recently, a role for GATA4 / 5 in aortic valve morphogenesis and endocardial cell differentiation has been reported. BAV has also been associated with a reduced UFD1L gene expression or involvement of a locus containing AXIN1 / PDIA2 . Much remains to be understood about the genetics of BAV. In the last years, high-throughput sequencing technologies, allowing the analysis of large number of genes or entire exomes or genomes, progressively became available. The latter issue together with the

  3. Genetic Bases of Bicuspid Aortic Valve: The Contribution of Traditional and High-Throughput Sequencing Approaches on Research and Diagnosis

    Directory of Open Access Journals (Sweden)

    Betti Giusti

    2017-08-01

    Full Text Available Bicuspid aortic valve (BAV is a common (0.5–2.0% of general population congenital heart defect with increased prevalence of aortic dilatation and dissection. BAV has an autosomal dominant inheritance with reduced penetrance and variable expressivity. BAV has been described as an isolated trait or associated with syndromic conditions [e.g., Marfan Marfan syndrome or Loeys-Dietz syndrome (MFS, LDS]. Identification of a syndromic condition in a BAV patient is clinically relevant to personalize aortic surgery indication. A 4-fold increase in BAV prevalence in a large cohort of unrelated MFS patients with respect to general population was reported, as well as in LDS patients (8-fold. It is also known that BAV is more frequent in patients with thoracic aortic aneurysm (TAA related to mutations in ACTA2, FBN1, and TGFBR2 genes. Moreover, in 8 patients with BAV and thoracic aortic dilation, not fulfilling the clinical criteria for MFS, FBN1 mutations in 2/8 patients were identified suggesting that FBN1 or other genes involved in syndromic conditions correlated to aortopathy could be involved in BAV. Beyond loci associated to syndromic disorders, studies in humans and animal models evidenced/suggested the role of further genes in non-syndromic BAV. The transcriptional regulator NOTCH1 has been associated with the development and acceleration of calcium deposition. Genome wide marker-based linkage analysis demonstrated a linkage of BAV to loci on chromosomes 18, 5, and 13q. Recently, a role for GATA4/5 in aortic valve morphogenesis and endocardial cell differentiation has been reported. BAV has also been associated with a reduced UFD1L gene expression or involvement of a locus containing AXIN1/PDIA2. Much remains to be understood about the genetics of BAV. In the last years, high-throughput sequencing technologies, allowing the analysis of large number of genes or entire exomes or genomes, progressively became available. The latter issue together with

  4. High throughput protein production screening

    Science.gov (United States)

    Beernink, Peter T [Walnut Creek, CA; Coleman, Matthew A [Oakland, CA; Segelke, Brent W [San Ramon, CA

    2009-09-08

    Methods, compositions, and kits for the cell-free production and analysis of proteins are provided. The invention allows for the production of proteins from prokaryotic sequences or eukaryotic sequences, including human cDNAs using PCR and IVT methods and detecting the proteins through fluorescence or immunoblot techniques. This invention can be used to identify optimized PCR and WT conditions, codon usages and mutations. The methods are readily automated and can be used for high throughput analysis of protein expression levels, interactions, and functional states.

  5. Statistical methods for the analysis of high-throughput metabolomics data

    Directory of Open Access Journals (Sweden)

    Fabian J. Theis

    2013-01-01

    Full Text Available Metabolomics is a relatively new high-throughput technology that aims at measuring all endogenous metabolites within a biological sample in an unbiased fashion. The resulting metabolic profiles may be regarded as functional signatures of the physiological state, and have been shown to comprise effects of genetic regulation as well as environmental factors. This potential to connect genotypic to phenotypic information promises new insights and biomarkers for different research fields, including biomedical and pharmaceutical research. In the statistical analysis of metabolomics data, many techniques from other omics fields can be reused. However recently, a number of tools specific for metabolomics data have been developed as well. The focus of this mini review will be on recent advancements in the analysis of metabolomics data especially by utilizing Gaussian graphical models and independent component analysis.

  6. Insight into dynamic genome imaging: Canonical framework identification and high-throughput analysis.

    Science.gov (United States)

    Ronquist, Scott; Meixner, Walter; Rajapakse, Indika; Snyder, John

    2017-07-01

    The human genome is dynamic in structure, complicating researcher's attempts at fully understanding it. Time series "Fluorescent in situ Hybridization" (FISH) imaging has increased our ability to observe genome structure, but due to cell type and experimental variability this data is often noisy and difficult to analyze. Furthermore, computational analysis techniques are needed for homolog discrimination and canonical framework detection, in the case of time-series images. In this paper we introduce novel ideas for nucleus imaging analysis, present findings extracted using dynamic genome imaging, and propose an objective algorithm for high-throughput, time-series FISH imaging. While a canonical framework could not be detected beyond statistical significance in the analyzed dataset, a mathematical framework for detection has been outlined with extension to 3D image analysis. Copyright © 2017 Elsevier Inc. All rights reserved.

  7. Universal and blocking primer mismatches limit the use of high-throughput DNA sequencing for the quantitative metabarcoding of arthropods.

    Science.gov (United States)

    Piñol, J; Mir, G; Gomez-Polo, P; Agustí, N

    2015-07-01

    The quantification of the biological diversity in environmental samples using high-throughput DNA sequencing is hindered by the PCR bias caused by variable primer-template mismatches of the individual species. In some dietary studies, there is the added problem that samples are enriched with predator DNA, so often a predator-specific blocking oligonucleotide is used to alleviate the problem. However, specific blocking oligonucleotides could coblock nontarget species to some degree. Here, we accurately estimate the extent of the PCR biases induced by universal and blocking primers on a mock community prepared with DNA of twelve species of terrestrial arthropods. We also compare universal and blocking primer biases with those induced by variable annealing temperature and number of PCR cycles. The results show that reads of all species were recovered after PCR enrichment at our control conditions (no blocking oligonucleotide, 45 °C annealing temperature and 40 cycles) and high-throughput sequencing. They also show that the four factors considered biased the final proportions of the species to some degree. Among these factors, the number of primer-template mismatches of each species had a disproportionate effect (up to five orders of magnitude) on the amplification efficiency. In particular, the number of primer-template mismatches explained most of the variation (~3/4) in the amplification efficiency of the species. The effect of blocking oligonucleotide concentration on nontarget species relative abundance was also significant, but less important (below one order of magnitude). Considering the results reported here, the quantitative potential of the technique is limited, and only qualitative results (the species list) are reliable, at least when targeting the barcoding COI region. © 2014 John Wiley & Sons Ltd.

  8. High-throughput sequencing of the T cell receptor β gene identifies aggressive early-stage mycosis fungoides.

    Science.gov (United States)

    de Masson, Adele; O'Malley, John T; Elco, Christopher P; Garcia, Sarah S; Divito, Sherrie J; Lowry, Elizabeth L; Tawa, Marianne; Fisher, David C; Devlin, Phillip M; Teague, Jessica E; Leboeuf, Nicole R; Kirsch, Ilan R; Robins, Harlan; Clark, Rachael A; Kupper, Thomas S

    2018-05-09

    Mycosis fungoides (MF), the most common cutaneous T cell lymphoma (CTCL) is a malignancy of skin-tropic memory T cells. Most MF cases present as early stage (stage I A/B, limited to the skin), and these patients typically have a chronic, indolent clinical course. However, a small subset of early-stage cases develop progressive and fatal disease. Because outcomes can be so different, early identification of this high-risk population is an urgent unmet clinical need. We evaluated the use of next-generation high-throughput DNA sequencing of the T cell receptor β gene ( TCRB ) in lesional skin biopsies to predict progression and survival in a discovery cohort of 208 patients with CTCL (177 with MF) from a 15-year longitudinal observational clinical study. We compared these data to the results in an independent validation cohort of 101 CTCL patients (87 with MF). The tumor clone frequency (TCF) in lesional skin, measured by high-throughput sequencing of the TCRB gene, was an independent prognostic factor of both progression-free and overall survival in patients with CTCL and MF in particular. In early-stage patients, a TCF of >25% in the skin was a stronger predictor of progression than any other established prognostic factor (stage IB versus IA, presence of plaques, high blood lactate dehydrogenase concentration, large-cell transformation, or age). The TCF therefore may accurately predict disease progression in early-stage MF. Early identification of patients at high risk for progression could help identify candidates who may benefit from allogeneic hematopoietic stem cell transplantation before their disease becomes treatment-refractory. Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.

  9. Single-nucleotide polymorphism discovery by high-throughput sequencing in sorghum

    Directory of Open Access Journals (Sweden)

    White Frank F

    2011-07-01

    Full Text Available Abstract Background Eight diverse sorghum (Sorghum bicolor L. Moench accessions were subjected to short-read genome sequencing to characterize the distribution of single-nucleotide polymorphisms (SNPs. Two strategies were used for DNA library preparation. Missing SNP genotype data were imputed by local haplotype comparison. The effect of library type and genomic diversity on SNP discovery and imputation are evaluated. Results Alignment of eight genome equivalents (6 Gb to the public reference genome revealed 283,000 SNPs at ≥82% confirmation probability. Sequencing from libraries constructed to limit sequencing to start at defined restriction sites led to genotyping 10-fold more SNPs in all 8 accessions, and correctly imputing 11% more missing data, than from semirandom libraries. The SNP yield advantage of the reduced-representation method was less than expected, since up to one fifth of reads started at noncanonical restriction sites and up to one third of restriction sites predicted in silico to yield unique alignments were not sampled at near-saturation. For imputation accuracy, the availability of a genomically similar accession in the germplasm panel was more important than panel size or sequencing coverage. Conclusions A sequence quantity of 3 million 50-base reads per accession using a BsrFI library would conservatively provide satisfactory genotyping of 96,000 sorghum SNPs. For most reliable SNP-genotype imputation in shallowly sequenced genomes, germplasm panels should consist of pairs or groups of genomically similar entries. These results may help in designing strategies for economical genotyping-by-sequencing of large numbers of plant accessions.

  10. Robust Sub-nanomolar Library Preparation for High Throughput Next Generation Sequencing.

    Science.gov (United States)

    Wu, Wells W; Phue, Je-Nie; Lee, Chun-Ting; Lin, Changyi; Xu, Lai; Wang, Rong; Zhang, Yaqin; Shen, Rong-Fong

    2018-05-04

    Current library preparation protocols for Illumina HiSeq and MiSeq DNA sequencers require ≥2 nM initial library for subsequent loading of denatured cDNA onto flow cells. Such amounts are not always attainable from samples having a relatively low DNA or RNA input; or those for which a limited number of PCR amplification cycles is preferred (less PCR bias and/or more even coverage). A well-tested sub-nanomolar library preparation protocol for Illumina sequencers has however not been reported. The aim of this study is to provide a much needed working protocol for sub-nanomolar libraries to achieve outcomes as informative as those obtained with the higher library input (≥ 2 nM) recommended by Illumina's protocols. Extensive studies were conducted to validate a robust sub-nanomolar (initial library of 100 pM) protocol using PhiX DNA (as a control), genomic DNA (Bordetella bronchiseptica and microbial mock community B for 16S rRNA gene sequencing), messenger RNA, microRNA, and other small noncoding RNA samples. The utility of our protocol was further explored for PhiX library concentrations as low as 25 pM, which generated only slightly fewer than 50% of the reads achieved under the standard Illumina protocol starting with > 2 nM. A sub-nanomolar library preparation protocol (100 pM) could generate next generation sequencing (NGS) results as robust as the standard Illumina protocol. Following the sub-nanomolar protocol, libraries with initial concentrations as low as 25 pM could also be sequenced to yield satisfactory and reproducible sequencing results.

  11. Multispot single-molecule FRET: High-throughput analysis of freely diffusing molecules.

    Directory of Open Access Journals (Sweden)

    Antonino Ingargiola

    Full Text Available We describe an 8-spot confocal setup for high-throughput smFRET assays and illustrate its performance with two characteristic experiments. First, measurements on a series of freely diffusing doubly-labeled dsDNA samples allow us to demonstrate that data acquired in multiple spots in parallel can be properly corrected and result in measured sample characteristics consistent with those obtained with a standard single-spot setup. We then take advantage of the higher throughput provided by parallel acquisition to address an outstanding question about the kinetics of the initial steps of bacterial RNA transcription. Our real-time kinetic analysis of promoter escape by bacterial RNA polymerase confirms results obtained by a more indirect route, shedding additional light on the initial steps of transcription. Finally, we discuss the advantages of our multispot setup, while pointing potential limitations of the current single laser excitation design, as well as analysis challenges and their solutions.

  12. Hadoop and friends - first experience at CERN with a new platform for high throughput analysis steps

    Science.gov (United States)

    Duellmann, D.; Surdy, K.; Menichetti, L.; Toebbicke, R.

    2017-10-01

    The statistical analysis of infrastructure metrics comes with several specific challenges, including the fairly large volume of unstructured metrics from a large set of independent data sources. Hadoop and Spark provide an ideal environment in particular for the first steps of skimming rapidly through hundreds of TB of low relevance data to find and extract the much smaller data volume that is relevant for statistical analysis and modelling. This presentation will describe the new Hadoop service at CERN and the use of several of its components for high throughput data aggregation and ad-hoc pattern searches. We will describe the hardware setup used, the service structure with a small set of decoupled clusters and the first experience with co-hosting different applications and performing software upgrades. We will further detail the common infrastructure used for data extraction and preparation from continuous monitoring and database input sources.

  13. Data reduction for a high-throughput neutron activation analysis system

    International Nuclear Information System (INIS)

    Bowman, W.W.

    1979-01-01

    To analyze samples collected as part of a geochemical survey for the National Uranium Resource Evaluation program, Savannah River Laboratory has installed a high-throughput neutron activation analysis system. As part of that system, computer programs have been developed to reduce raw data to elemental concentrations in two steps. Program RAGS reduces gamma-ray spectra to lists of photopeak energies, peak areas, and statistical errors. Program RICHES determines the elemental concentrations from photopeak and delayed-neutron data, detector efficiencies, analysis parameters (neutron flux and activation, decay, and counting times), and spectrometric and cross-section data from libraries. Both programs have been streamlined for on-line operation with a minicomputer, each requiring approx. 64 kbytes of core. 3 tables

  14. HDAT: web-based high-throughput screening data analysis tools

    International Nuclear Information System (INIS)

    Liu, Rong; Hassan, Taimur; Rallo, Robert; Cohen, Yoram

    2013-01-01

    The increasing utilization of high-throughput screening (HTS) in toxicity studies of engineered nano-materials (ENMs) requires tools for rapid and reliable processing and analyses of large HTS datasets. In order to meet this need, a web-based platform for HTS data analyses tools (HDAT) was developed that provides statistical methods suitable for ENM toxicity data. As a publicly available computational nanoinformatics infrastructure, HDAT provides different plate normalization methods, various HTS summarization statistics, self-organizing map (SOM)-based clustering analysis, and visualization of raw and processed data using both heat map and SOM. HDAT has been successfully used in a number of HTS studies of ENM toxicity, thereby enabling analysis of toxicity mechanisms and development of structure–activity relationships for ENM toxicity. The online approach afforded by HDAT should encourage standardization of and future advances in HTS as well as facilitate convenient inter-laboratory comparisons of HTS datasets. (paper)

  15. Pathway Processor 2.0: a web resource for pathway-based analysis of high-throughput data.

    Science.gov (United States)

    Beltrame, Luca; Bianco, Luca; Fontana, Paolo; Cavalieri, Duccio

    2013-07-15

    Pathway Processor 2.0 is a web application designed to analyze high-throughput datasets, including but not limited to microarray and next-generation sequencing, using a pathway centric logic. In addition to well-established methods such as the Fisher's test and impact analysis, Pathway Processor 2.0 offers innovative methods that convert gene expression into pathway expression, leading to the identification of differentially regulated pathways in a dataset of choice. Pathway Processor 2.0 is available as a web service at http://compbiotoolbox.fmach.it/pathwayProcessor/. Sample datasets to test the functionality can be used directly from the application. duccio.cavalieri@fmach.it Supplementary data are available at Bioinformatics online.

  16. High-throughput physical map anchoring via BAC-pool sequencing

    Czech Academy of Sciences Publication Activity Database

    Cviková, Kateřina; Cattonaro, F.; Alaux, M.; Stein, N.; Mayer, K.F.X.; Doležel, Jaroslav; Bartoš, Jan

    2015-01-01

    Roč. 15, APR 11 (2015) ISSN 1471-2229 R&D Projects: GA ČR GA13-08786S; GA MŠk(CZ) LO1204 Institutional support: RVO:61389030 Keywords : Physical map * Contig anchoring * Next generation sequencing Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 3.631, year: 2015

  17. In-field High Throughput Phenotyping and Cotton Plant Growth Analysis Using LiDAR.

    Science.gov (United States)

    Sun, Shangpeng; Li, Changying; Paterson, Andrew H; Jiang, Yu; Xu, Rui; Robertson, Jon S; Snider, John L; Chee, Peng W

    2018-01-01

    Plant breeding programs and a wide range of plant science applications would greatly benefit from the development of in-field high throughput phenotyping technologies. In this study, a terrestrial LiDAR-based high throughput phenotyping system was developed. A 2D LiDAR was applied to scan plants from overhead in the field, and an RTK-GPS was used to provide spatial coordinates. Precise 3D models of scanned plants were reconstructed based on the LiDAR and RTK-GPS data. The ground plane of the 3D model was separated by RANSAC algorithm and a Euclidean clustering algorithm was applied to remove noise generated by weeds. After that, clean 3D surface models of cotton plants were obtained, from which three plot-level morphologic traits including canopy height, projected canopy area, and plant volume were derived. Canopy height ranging from 85th percentile to the maximum height were computed based on the histogram of the z coordinate for all measured points; projected canopy area was derived by projecting all points on a ground plane; and a Trapezoidal rule based algorithm was proposed to estimate plant volume. Results of validation experiments showed good agreement between LiDAR measurements and manual measurements for maximum canopy height, projected canopy area, and plant volume, with R 2 -values of 0.97, 0.97, and 0.98, respectively. The developed system was used to scan the whole field repeatedly over the period from 43 to 109 days after planting. Growth trends and growth rate curves for all three derived morphologic traits were established over the monitoring period for each cultivar. Overall, four different cultivars showed similar growth trends and growth rate patterns. Each cultivar continued to grow until ~88 days after planting, and from then on varied little. However, the actual values were cultivar specific. Correlation analysis between morphologic traits and final yield was conducted over the monitoring period. When considering each cultivar individually

  18. Exploring the environmental diversity of kinetoplastid flagellates in the high-throughput DNA sequencing era

    Directory of Open Access Journals (Sweden)

    Claudia Masini d’Avila-Levy

    2015-01-01

    Full Text Available The class Kinetoplastea encompasses both free-living and parasitic species from a wide range of hosts. Several representatives of this group are responsible for severe human diseases and for economic losses in agriculture and livestock. While this group encompasses over 30 genera, most of the available information has been derived from the vertebrate pathogenic genera Leishmaniaand Trypanosoma.Recent studies of the previously neglected groups of Kinetoplastea indicated that the actual diversity is much higher than previously thought. This article discusses the known segment of kinetoplastid diversity and how gene-directed Sanger sequencing and next-generation sequencing methods can help to deepen our knowledge of these interesting protists.

  19. WormScan: a technique for high-throughput phenotypic analysis of Caenorhabditis elegans.

    Directory of Open Access Journals (Sweden)

    Mark D Mathew

    Full Text Available BACKGROUND: There are four main phenotypes that are assessed in whole organism studies of Caenorhabditis elegans; mortality, movement, fecundity and size. Procedures have been developed that focus on the digital analysis of some, but not all of these phenotypes and may be limited by expense and limited throughput. We have developed WormScan, an automated image acquisition system that allows quantitative analysis of each of these four phenotypes on standard NGM plates seeded with E. coli. This system is very easy to implement and has the capacity to be used in high-throughput analysis. METHODOLOGY/PRINCIPAL FINDINGS: Our system employs a readily available consumer grade flatbed scanner. The method uses light stimulus from the scanner rather than physical stimulus to induce movement. With two sequential scans it is possible to quantify the induced phototactic response. To demonstrate the utility of the method, we measured the phenotypic response of C. elegans to phosphine gas exposure. We found that stimulation of movement by the light of the scanner was equivalent to physical stimulation for the determination of mortality. WormScan also provided a quantitative assessment of health for the survivors. Habituation from light stimulation of continuous scans was similar to habituation caused by physical stimulus. CONCLUSIONS/SIGNIFICANCE: There are existing systems for the automated phenotypic data collection of C. elegans. The specific advantages of our method over existing systems are high-throughput assessment of a greater range of phenotypic endpoints including determination of mortality and quantification of the mobility of survivors. Our system is also inexpensive and very easy to implement. Even though we have focused on demonstrating the usefulness of WormScan in toxicology, it can be used in a wide range of additional C. elegans studies including lifespan determination, development, pathology and behavior. Moreover, we have even adapted the

  20. An integrated multiple capillary array electrophoresis system for high-throughput DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Lu, X.

    1998-03-27

    A capillary array electrophoresis system was chosen to perform DNA sequencing because of several advantages such as rapid heat dissipation, multiplexing capabilities, gel matrix filling simplicity, and the mature nature of the associated manufacturing technologies. There are two major concerns for the multiple capillary systems. One concern is inter-capillary cross-talk, and the other concern is excitation and detection efficiency. Cross-talk is eliminated through proper optical coupling, good focusing and immersing capillary array into index matching fluid. A side-entry excitation scheme with orthogonal detection was established for large capillary array. Two 100 capillary array formats were used for DNA sequencing. One format is cylindrical capillary with 150 {micro}m o.d., 75 {micro}m i.d and the other format is square capillary with 300 {micro}m out edge and 75 {micro}m inner edge. This project is focused on the development of excitation and detection of DNA as well as performing DNA sequencing. The DNA injection schemes are discussed for the cases of single and bundled capillaries. An individual sampling device was designed. The base-calling was performed for a capillary from the capillary array with the accuracy of 98%.

  1. Quartz-Seq2: a high-throughput single-cell RNA-sequencing method that effectively uses limited sequence reads.

    Science.gov (United States)

    Sasagawa, Yohei; Danno, Hiroki; Takada, Hitomi; Ebisawa, Masashi; Tanaka, Kaori; Hayashi, Tetsutaro; Kurisaki, Akira; Nikaido, Itoshi

    2018-03-09

    High-throughput single-cell RNA-seq methods assign limited unique molecular identifier (UMI) counts as gene expression values to single cells from shallow sequence reads and detect limited gene counts. We thus developed a high-throughput single-cell RNA-seq method, Quartz-Seq2, to overcome these issues. Our improvements in the reaction steps make it possible to effectively convert initial reads to UMI counts, at a rate of 30-50%, and detect more genes. To demonstrate the power of Quartz-Seq2, we analyzed approximately 10,000 transcriptomes from in vitro embryonic stem cells and an in vivo stromal vascular fraction with a limited number of reads.

  2. High-throughput analysis of amino acids in plant materials by single quadrupole mass spectrometry

    DEFF Research Database (Denmark)

    Dahl-Lassen, Rasmus; van Hecke, Jan Julien Josef; Jørgensen, Henning

    2018-01-01

    that it is very time consuming with typical chromatographic run times of 70 min or more. Results: We have here developed a high-throughput method for analysis of amino acid profiles in plant materials. The method combines classical protein hydrolysis and derivatization with fast separation by UHPLC and detection...... reducing the overall analytical costs compared to methods based on more advanced mass spectrometers....... by a single quadrupole (QDa) mass spectrometer. The chromatographic run time is reduced to 10 min and the precision, accuracy and sensitivity of the method are in line with other recent methods utilizing advanced and more expensive mass spectrometers. The sensitivity of the method is at least a factor 10...

  3. Use of genotyping by sequencing data to develop a high-throughput and multifunctional SNP panel for conservation applications in Pacific lamprey.

    Science.gov (United States)

    Hess, Jon E; Campbell, Nathan R; Docker, Margaret F; Baker, Cyndi; Jackson, Aaron; Lampman, Ralph; McIlraith, Brian; Moser, Mary L; Statler, David P; Young, William P; Wildbill, Andrew J; Narum, Shawn R

    2015-01-01

    Next-generation sequencing data can be mined for highly informative single nucleotide polymorphisms (SNPs) to develop high-throughput genomic assays for nonmodel organisms. However, choosing a set of SNPs to address a variety of objectives can be difficult because SNPs are often not equally informative. We developed an optimal combination of 96 high-throughput SNP assays from a total of 4439 SNPs identified in a previous study of Pacific lamprey (Entosphenus tridentatus) and used them to address four disparate objectives: parentage analysis, species identification and characterization of neutral and adaptive variation. Nine of these SNPs are FST outliers, and five of these outliers are localized within genes and significantly associated with geography, run-timing and dwarf life history. Two of the 96 SNPs were diagnostic for two other lamprey species that were morphologically indistinguishable at early larval stages and were sympatric in the Pacific Northwest. The majority (85) of SNPs in the panel were highly informative for parentage analysis, that is, putatively neutral with high minor allele frequency across the species' range. Results from three case studies are presented to demonstrate the broad utility of this panel of SNP markers in this species. As Pacific lamprey populations are undergoing rapid decline, these SNPs provide an important resource to address critical uncertainties associated with the conservation and recovery of this imperiled species. © 2014 John Wiley & Sons Ltd.

  4. Use of high throughput sequencing to study oomycete communities in soil and roots

    DEFF Research Database (Denmark)

    Sapkota, Rumakanta; Nicolaisen, Mogens

    2015-01-01

    taxonomic units from symptomatic lesions in carrot resulted in 94% of the reads belonging to oomycetes with a dominance of species of Pythium that are known to be involved in causing cavity spot. Moreover, soil samples showed that 95% of the sequences could be assigned to oomycetes including Pythium......, Aphanomyces, Peronospora, Saprolegnia and Phytophthora. A high proportion of oomycete reads was consistently present in all symptomatic lesions and soil samples showing the versatility of the strategy and thus demonstrating the usefulness of the method in plant and soil DNA background....

  5. Characterization of unknown genetic modifications using high throughput sequencing and computational subtraction

    Directory of Open Access Journals (Sweden)

    Butenko Melinka A

    2009-10-01

    Full Text Available Abstract Background When generating a genetically modified organism (GMO, the primary goal is to give a target organism one or several novel traits by using biotechnology techniques. A GMO will differ from its parental strain in that its pool of transcripts will be altered. Currently, there are no methods that are reliably able to determine if an organism has been genetically altered if the nature of the modification is unknown. Results We show that the concept of computational subtraction can be used to identify transgenic cDNA sequences from genetically modified plants. Our datasets include 454-type sequences from a transgenic line of Arabidopsis thaliana and published EST datasets from commercially relevant species (rice and papaya. Conclusion We believe that computational subtraction represents a powerful new strategy for determining if an organism has been genetically modified as well as to define the nature of the modification. Fewer assumptions have to be made compared to methods currently in use and this is an advantage particularly when working with unknown GMOs.

  6. Improving transcriptome assembly through error correction of high-throughput sequence reads

    Directory of Open Access Journals (Sweden)

    Matthew D. MacManes

    2013-07-01

    Full Text Available The study of functional genomics, particularly in non-model organisms, has been dramatically improved over the last few years by the use of transcriptomes and RNAseq. While these studies are potentially extremely powerful, a computationally intensive procedure, the de novo construction of a reference transcriptome must be completed as a prerequisite to further analyses. The accurate reference is critically important as all downstream steps, including estimating transcript abundance are critically dependent on the construction of an accurate reference. Though a substantial amount of research has been done on assembly, only recently have the pre-assembly procedures been studied in detail. Specifically, several stand-alone error correction modules have been reported on and, while they have shown to be effective in reducing errors at the level of sequencing reads, how error correction impacts assembly accuracy is largely unknown. Here, we show via use of a simulated and empiric dataset, that applying error correction to sequencing reads has significant positive effects on assembly accuracy, and should be applied to all datasets. A complete collection of commands which will allow for the production of Reptile corrected reads is available at https://github.com/macmanes/error_correction/tree/master/scripts and as File S1.

  7. Inferring Variation in Copy Number Using High Throughput Sequencing Data in R.

    Science.gov (United States)

    Knaus, Brian J; Grünwald, Niklaus J

    2018-01-01

    Inference of copy number variation presents a technical challenge because variant callers typically require the copy number of a genome or genomic region to be known a priori . Here we present a method to infer copy number that uses variant call format (VCF) data as input and is implemented in the R package vcfR . This method is based on the relative frequency of each allele (in both genic and non-genic regions) sequenced at heterozygous positions throughout a genome. These heterozygous positions are summarized by using arbitrarily sized windows of heterozygous positions, binning the allele frequencies, and selecting the bin with the greatest abundance of positions. This provides a non-parametric summary of the frequency that alleles were sequenced at. The method is applicable to organisms that have reference genomes that consist of full chromosomes or sub-chromosomal contigs. In contrast to other software designed to detect copy number variation, our method does not rely on an assumption of base ploidy, but instead infers it. We validated these approaches with the model system of Saccharomyces cerevisiae and applied it to the oomycete Phytophthora infestans , both known to vary in copy number. This functionality has been incorporated into the current release of the R package vcfR to provide modular and flexible methods to investigate copy number variation in genomic projects.

  8. Opera: reconstructing optimal genomic scaffolds with high-throughput paired-end sequences.

    Science.gov (United States)

    Gao, Song; Sung, Wing-Kin; Nagarajan, Niranjan

    2011-11-01

    Scaffolding, the problem of ordering and orienting contigs, typically using paired-end reads, is a crucial step in the assembly of high-quality draft genomes. Even as sequencing technologies and mate-pair protocols have improved significantly, scaffolding programs still rely on heuristics, with no guarantees on the quality of the solution. In this work, we explored the feasibility of an exact solution for scaffolding and present a first tractable solution for this problem (Opera). We also describe a graph contraction procedure that allows the solution to scale to large scaffolding problems and demonstrate this by scaffolding several large real and synthetic datasets. In comparisons with existing scaffolders, Opera simultaneously produced longer and more accurate scaffolds demonstrating the utility of an exact approach. Opera also incorporates an exact quadratic programming formulation to precisely compute gap sizes (Availability: http://sourceforge.net/projects/operasf/ ).

  9. Discovery of precursor and mature microRNAs and their putative gene targets using high-throughput sequencing in pineapple (Ananas comosus var. comosus).

    Science.gov (United States)

    Yusuf, Noor Hydayaty Md; Ong, Wen Dee; Redwan, Raimi Mohamed; Latip, Mariam Abd; Kumar, S Vijay

    2015-10-15

    MicroRNAs (miRNAs) are a class of small, endogenous non-coding RNAs that negatively regulate gene expression, resulting in the silencing of target mRNA transcripts through mRNA cleavage or translational inhibition. MiRNAs play significant roles in various biological and physiological processes in plants. However, the miRNA-mediated gene regulatory network in pineapple, the model tropical non-climacteric fruit, remains largely unexplored. Here, we report a complete list of pineapple mature miRNAs obtained from high-throughput small RNA sequencing and precursor miRNAs (pre-miRNAs) obtained from ESTs. Two small RNA libraries were constructed from pineapple fruits and leaves, respectively, using Illumina's Solexa technology. Sequence similarity analysis using miRBase revealed 579,179 reads homologous to 153 miRNAs from 41 miRNA families. In addition, a pineapple fruit transcriptome library consisting of approximately 30,000 EST contigs constructed using Solexa sequencing was used for the discovery of pre-miRNAs. In all, four pre-miRNAs were identified (MIR156, MIR399, MIR444 and MIR2673). Furthermore, the same pineapple transcriptome was used to dissect the function of the miRNAs in pineapple by predicting their putative targets in conjunction with their regulatory networks. In total, 23 metabolic pathways were found to be regulated by miRNAs in pineapple. The use of high-throughput sequencing in pineapples to unveil the presence of miRNAs and their regulatory pathways provides insight into the repertoire of miRNA regulation used exclusively in this non-climacteric model plant. Copyright © 2015 Elsevier B.V. All rights reserved.

  10. DRUMS: Disk Repository with Update Management and Select option for high throughput sequencing data.

    Science.gov (United States)

    Nettling, Martin; Thieme, Nils; Both, Andreas; Grosse, Ivo

    2014-02-04

    New technologies for analyzing biological samples, like next generation sequencing, are producing a growing amount of data together with quality scores. Moreover, software tools (e.g., for mapping sequence reads), calculating transcription factor binding probabilities, estimating epigenetic modification enriched regions or determining single nucleotide polymorphism increase this amount of position-specific DNA-related data even further. Hence, requesting data becomes challenging and expensive and is often implemented using specialised hardware. In addition, picking specific data as fast as possible becomes increasingly important in many fields of science. The general problem of handling big data sets was addressed by developing specialized databases like HBase, HyperTable or Cassandra. However, these database solutions require also specialized or distributed hardware leading to expensive investments. To the best of our knowledge, there is no database capable of (i) storing billions of position-specific DNA-related records, (ii) performing fast and resource saving requests, and (iii) running on a single standard computer hardware. Here, we present DRUMS (Disk Repository with Update Management and Select option), satisfying demands (i)-(iii). It tackles the weaknesses of traditional databases while handling position-specific DNA-related data in an efficient manner. DRUMS is capable of storing up to billions of records. Moreover, it focuses on optimizing relating single lookups as range request, which are needed permanently for computations in bioinformatics. To validate the power of DRUMS, we compare it to the widely used MySQL database. The test setting considers two biological data sets. We use standard desktop hardware as test environment. DRUMS outperforms MySQL in writing and reading records by a factor of two up to a factor of 10000. Furthermore, it can work with significantly larger data sets. Our work focuses on mid-sized data sets up to several billion

  11. High-throughput sequencing of ancient plant and mammal DNA preserved in herbivore middens

    DEFF Research Database (Denmark)

    Murray, Dáithí C.; Pearson, Stuart G.; Fullagar, Richard

    2012-01-01

    DNA analysis identified unreported plant and animal taxa, some of which are locally extinct or endemic. The survival and preservation of DNA in hot, arid environments is a complex and poorly understood process that is both sporadic and rare, but the survival of DNA through desiccation may be important......The study of arid palaeoenvironments is often frustrated by the poor or non-existent preservation of plant and animal material, yet these environments are of considerable environmental importance. The analysis of pollen and macrofossils isolated from herbivore middens has been an invaluable source...

  12. Coupled high-throughput functional screening and next generation sequencing for identification of plant polymer decomposing enzymes in metagenomic libraries

    Directory of Open Access Journals (Sweden)

    Mari eNyyssönen

    2013-09-01

    Full Text Available Recent advances in sequencing technologies generate new predictions and hypotheses about the functional roles of environmental microorganisms. Yet, until we can test these predictions at a scale that matches our ability to generate them, most of them will remain as hypotheses. Function-based mining of metagenomic libraries can provide direct linkages between genes, metabolic traits and microbial taxa and thus bridge this gap between sequence data generation and functional predictions. Here we developed high-throughput screening assays for function-based characterization of activities involved in plant polymer decomposition from environmental metagenomic libraries. The multiplexed assays use fluorogenic and chromogenic substrates, combine automated liquid handling and use a genetically modified expression host to enable simultaneous screening of 12,160 clones for 14 activities in a total of 170,240 reactions. Using this platform we identified 374 (0.26 % cellulose, hemicellulose, chitin, starch, phosphate and protein hydrolyzing clones from fosmid libraries prepared from decomposing leaf litter. Sequencing on the Illumina MiSeq platform, followed by assembly and gene prediction of a subset of 95 fosmid clones, identified a broad range of bacterial phyla, including Actinobacteria, Bacteroidetes, multiple Proteobacteria sub-phyla in addition to some Fungi. Carbohydrate-active enzyme genes from 20 different glycoside hydrolase families were detected. Using tetranucleotide frequency binning of fosmid sequences, multiple enzyme activities from distinct fosmids were linked, demonstrating how biochemically-confirmed functional traits in environmental metagenomes may be attributed to groups of specific organisms. Overall, our results demonstrate how functional screening of metagenomic libraries can be used to connect microbial functionality to community composition and, as a result, complement large-scale metagenomic sequencing efforts.

  13. GUItars: a GUI tool for analysis of high-throughput RNA interference screening data.

    Directory of Open Access Journals (Sweden)

    Asli N Goktug

    Full Text Available High-throughput RNA interference (RNAi screening has become a widely used approach to elucidating gene functions. However, analysis and annotation of large data sets generated from these screens has been a challenge for researchers without a programming background. Over the years, numerous data analysis methods were produced for plate quality control and hit selection and implemented by a few open-access software packages. Recently, strictly standardized mean difference (SSMD has become a widely used method for RNAi screening analysis mainly due to its better control of false negative and false positive rates and its ability to quantify RNAi effects with a statistical basis. We have developed GUItars to enable researchers without a programming background to use SSMD as both a plate quality and a hit selection metric to analyze large data sets.The software is accompanied by an intuitive graphical user interface for easy and rapid analysis workflow. SSMD analysis methods have been provided to the users along with traditionally-used z-score, normalized percent activity, and t-test methods for hit selection. GUItars is capable of analyzing large-scale data sets from screens with or without replicates. The software is designed to automatically generate and save numerous graphical outputs known to be among the most informative high-throughput data visualization tools capturing plate-wise and screen-wise performances. Graphical outputs are also written in HTML format for easy access, and a comprehensive summary of screening results is written into tab-delimited output files.With GUItars, we demonstrated robust SSMD-based analysis workflow on a 3840-gene small interfering RNA (siRNA library and identified 200 siRNAs that increased and 150 siRNAs that decreased the assay activities with moderate to stronger effects. GUItars enables rapid analysis and illustration of data from large- or small-scale RNAi screens using SSMD and other traditional analysis

  14. Discovery of J chain in African lungfish (Protopterus dolloi, Sarcopterygii using high throughput transcriptome sequencing: implications in mucosal immunity.

    Directory of Open Access Journals (Sweden)

    Luca Tacchi

    Full Text Available J chain is a small polypeptide responsible for immunoglobulin (Ig polymerization and transport of Igs across mucosal surfaces in higher vertebrates. We identified a J chain in dipnoid fish, the African lungfish (Protopterus dolloi by high throughput sequencing of the transcriptome. P. dolloi J chain is 161 aa long and contains six of the eight Cys residues present in mammalian J chain. Phylogenetic studies place the lungfish J chain closer to tetrapod J chain than to the coelacanth or nurse shark sequences. J chain expression occurs in all P. dolloi immune tissues examined and it increases in the gut and kidney in response to an experimental bacterial infection. Double fluorescent in-situ hybridization shows that 88.5% of IgM⁺ cells in the gut co-express J chain, a significantly higher percentage than in the pre-pyloric spleen. Importantly, J chain expression is not restricted to the B-cell compartment since gut epithelial cells also express J chain. These results improve our current view of J chain from a phylogenetic perspective.

  15. A high-throughput method to detect RNA profiling by integration of RT-MLPA with next generation sequencing technology.

    Science.gov (United States)

    Wang, Jing; Yang, Xue; Chen, Haofeng; Wang, Xuewei; Wang, Xiangyu; Fang, Yi; Jia, Zhenyu; Gao, Jidong

    2017-07-11

    RNA in formalin-fixed and paraffin-embedded (FFPE) tissues provides large amount of information indicating disease stages, histological tumor types and grades, as well as clinical outcomes. However, Detection of RNA expression levels in formalin-fixed and paraffin-embedded samples is extremely difficult due to poor RNA quality. Here we developed a high-throughput method, Reverse Transcription-Multiple Ligation-dependent Probe Sequencing (RT-MLPSeq), to determine expression levels of multiple transcripts in FFPE samples. By combining Reverse Transcription-Multiple Ligation-dependent Amplification method and next generation sequencing technology, RT-MLPSeq overcomes the limit of probe length in multiplex ligation-dependent probe amplification assay and thus could detect expression levels of transcripts without quantitative limitations. We proved that different RT-MLPSeq probes targeting on the same transcripts have highly consistent results and the starting RNA/cDNA input could be as little as 1 ng. RT-MLPSeq also presented consistent relative RNA levels of selected 13 genes with reverse transcription quantitative PCR. Finally, we demonstrated the application of the new RT-MLPSeq method by measuring the mRNA expression levels of 21 genes which can be used for accurate calculation of the breast cancer recurrence score - an index that has been widely used for managing breast cancer patients.

  16. Fungi Sailing the Arctic Ocean: Speciose Communities in North Atlantic Driftwood as Revealed by High-Throughput Amplicon Sequencing.

    Science.gov (United States)

    Rämä, Teppo; Davey, Marie L; Nordén, Jenni; Halvorsen, Rune; Blaalid, Rakel; Mathiassen, Geir H; Alsos, Inger G; Kauserud, Håvard

    2016-08-01

    High amounts of driftwood sail across the oceans and provide habitat for organisms tolerating the rough and saline environment. Fungi have adapted to the extremely cold and saline conditions which driftwood faces in the high north. For the first time, we applied high-throughput sequencing to fungi residing in driftwood to reveal their taxonomic richness, community composition, and ecology in the North Atlantic. Using pyrosequencing of ITS2 amplicons obtained from 49 marine logs, we found 807 fungal operational taxonomic units (OTUs) based on clustering at 97 % sequence similarity cut-off level. The phylum Ascomycota comprised 74 % of the OTUs and 20 % belonged to Basidiomycota. The richness of basidiomycetes decreased with prolonged submersion in the sea, supporting the general view of ascomycetes being more extremotolerant. However, more than one fourth of the fungal OTUs remained unassigned to any fungal class, emphasising the need for better DNA reference data from the marine habitat. Different fungal communities were detected in coniferous and deciduous logs. Our results highlight that driftwood hosts a considerably higher fungal diversity than currently known. The driftwood fungal community is not a terrestrial relic but a speciose assemblage of fungi adapted to the stressful marine environment and different kinds of wooden substrates found in it.

  17. High-throughput sequencing and mutagenesis to accelerate the domestication of Microlaena stipoides as a new food crop.

    Directory of Open Access Journals (Sweden)

    Frances M Shapter

    Full Text Available Global food demand, climatic variability and reduced land availability are driving the need for domestication of new crop species. The accelerated domestication of a rice-like Australian dryland polyploid grass, Microlaena stipoides (Poaceae, was targeted using chemical mutagenesis in conjunction with high throughput sequencing of genes for key domestication traits. While M. stipoides has previously been identified as having potential as a new grain crop for human consumption, only a limited understanding of its genetic diversity and breeding system was available to aid the domestication process. Next generation sequencing of deeply-pooled target amplicons estimated allelic diversity of a selected base population at 14.3 SNP/Mb and identified novel, putatively mutation-induced polymorphisms at about 2.4 mutations/Mb. A 97% lethal dose (LD₉₇ of ethyl methanesulfonate treatment was applied without inducing sterility in this polyploid species. Forward and reverse genetic screens identified beneficial alleles for the domestication trait, seed-shattering. Unique phenotypes observed in the M2 population suggest the potential for rapid accumulation of beneficial traits without recourse to a traditional cross-breeding strategy. This approach may be applicable to other wild species, unlocking their potential as new food, fibre and fuel crops.

  18. Effort versus Reward: Preparing Samples for Fungal Community Characterization in High-Throughput Sequencing Surveys of Soils.

    Directory of Open Access Journals (Sweden)

    Zewei Song

    Full Text Available Next generation fungal amplicon sequencing is being used with increasing frequency to study fungal diversity in various ecosystems; however, the influence of sample preparation on the characterization of fungal community is poorly understood. We investigated the effects of four procedural modifications to library preparation for high-throughput sequencing (HTS. The following treatments were considered: 1 the amount of soil used in DNA extraction, 2 the inclusion of additional steps (freeze/thaw cycles, sonication, or hot water bath incubation in the extraction procedure, 3 the amount of DNA template used in PCR, and 4 the effect of sample pooling, either physically or computationally. Soils from two different ecosystems in Minnesota, USA, one prairie and one forest site, were used to assess the generality of our results. The first three treatments did not significantly influence observed fungal OTU richness or community structure at either site. Physical pooling captured more OTU richness compared to individual samples, but total OTU richness at each site was highest when individual samples were computationally combined. We conclude that standard extraction kit protocols are well optimized for fungal HTS surveys, but because sample pooling can significantly influence OTU richness estimates, it is important to carefully consider the study aims when planning sampling procedures.

  19. Next-generation phage display: integrating and comparing available molecular tools to enable cost-effective high-throughput analysis.

    Directory of Open Access Journals (Sweden)

    Emmanuel Dias-Neto

    2009-12-01

    Full Text Available Combinatorial phage display has been used in the last 20 years in the identification of protein-ligands and protein-protein interactions, uncovering relevant molecular recognition events. Rate-limiting steps of combinatorial phage display library selection are (i the counting of transducing units and (ii the sequencing of the encoded displayed ligands. Here, we adapted emerging genomic technologies to minimize such challenges.We gained efficiency by applying in tandem real-time PCR for rapid quantification to enable bacteria-free phage display library screening, and added phage DNA next-generation sequencing for large-scale ligand analysis, reporting a fully integrated set of high-throughput quantitative and analytical tools. The approach is far less labor-intensive and allows rigorous quantification; for medical applications, including selections in patients, it also represents an advance for quantitative distribution analysis and ligand identification of hundreds of thousands of targeted particles from patient-derived biopsy or autopsy in a longer timeframe post library administration. Additional advantages over current methods include increased sensitivity, less variability, enhanced linearity, scalability, and accuracy at much lower cost. Sequences obtained by qPhage plus pyrosequencing were similar to a dataset produced from conventional Sanger-sequenced transducing-units (TU, with no biases due to GC content, codon usage, and amino acid or peptide frequency. These tools allow phage display selection and ligand analysis at >1,000-fold faster rate, and reduce costs approximately 250-fold for generating 10(6 ligand sequences.Our analyses demonstrates that whereas this approach correlates with the traditional colony-counting, it is also capable of a much larger sampling, allowing a faster, less expensive, more accurate and consistent analysis of phage enrichment. Overall, qPhage plus pyrosequencing is superior to TU-counting plus Sanger

  20. High-throughput sequencing offers insight into mechanisms of resource partitioning in cryptic bat species

    DEFF Research Database (Denmark)

    Razgour, Orly; Clare, Elizabeth L.; Zeale, Matt R. K.

    2011-01-01

    Sympatric cryptic species, characterized by low morphological differentiation, pose a challenge to understanding the role of interspecific competition in structuring ecological communities. We used traditional (morphological) and novel molecular methods of diet analysis to study the diet of two...... of the cryptic bats, 60% of which were assigned to a likely species or genus. The findings from the molecular study supported the results of microscopic analyses in showing that the diets of both species were dominated by lepidopterans. However, HTS provided a sufficiently high resolution of prey identification...

  1. Computational and statistical methods for high-throughput mass spectrometry-based PTM analysis

    DEFF Research Database (Denmark)

    Schwämmle, Veit; Vaudel, Marc

    2017-01-01

    Cell signaling and functions heavily rely on post-translational modifications (PTMs) of proteins. Their high-throughput characterization is thus of utmost interest for multiple biological and medical investigations. In combination with efficient enrichment methods, peptide mass spectrometry analy...

  2. Data for automated, high-throughput microscopy analysis of intracellular bacterial colonies using spot detection

    DEFF Research Database (Denmark)

    Ernstsen, Christina Lundgaard; Login, Frédéric H.; Jensen, Helene Halkjær

    2017-01-01

    Quantification of intracellular bacterial colonies is useful in strategies directed against bacterial attachment, subsequent cellular invasion and intracellular proliferation. An automated, high-throughput microscopy-method was established to quantify the number and size of intracellular bacteria...

  3. Analysis of Active Methylotrophic Communities: When DNA-SIP Meets High-Throughput Technologies.

    Science.gov (United States)

    Taubert, Martin; Grob, Carolina; Howat, Alexandra M; Burns, Oliver J; Chen, Yin; Neufeld, Josh D; Murrell, J Colin

    2016-01-01

    Methylotrophs are microorganisms ubiquitous in the environment that can metabolize one-carbon (C1) compounds as carbon and/or energy sources. The activity of these prokaryotes impacts biogeochemical cycles within their respective habitats and can determine whether these habitats act as sources or sinks of C1 compounds. Due to the high importance of C1 compounds, not only in biogeochemical cycles, but also for climatic processes, it is vital to understand the contributions of these microorganisms to carbon cycling in different environments. One of the most challenging questions when investigating methylotrophs, but also in environmental microbiology in general, is which species contribute to the environmental processes of interest, or "who does what, where and when?" Metabolic labeling with C1 compounds substituted with (13)C, a technique called stable isotope probing, is a key method to trace carbon fluxes within methylotrophic communities. The incorporation of (13)C into the biomass of active methylotrophs leads to an increase in the molecular mass of their biomolecules. For DNA-based stable isotope probing (DNA-SIP), labeled and unlabeled DNA is separated by isopycnic ultracentrifugation. The ability to specifically analyze DNA of active methylotrophs from a complex background community by high-throughput sequencing techniques, i.e. targeted metagenomics, is the hallmark strength of DNA-SIP for elucidating ecosystem functioning, and a protocol is detailed in this chapter.

  4. Analysis of JC virus DNA replication using a quantitative and high-throughput assay

    Science.gov (United States)

    Shin, Jong; Phelan, Paul J.; Chhum, Panharith; Bashkenova, Nazym; Yim, Sung; Parker, Robert; Gagnon, David; Gjoerup, Ole; Archambault, Jacques; Bullock, Peter A.

    2015-01-01

    Progressive Multifocal Leukoencephalopathy (PML) is caused by lytic replication of JC virus (JCV) in specific cells of the central nervous system. Like other polyomaviruses, JCV encodes a large T-antigen helicase needed for replication of the viral DNA. Here, we report the development of a luciferase-based, quantitative and high-throughput assay of JCV DNA replication in C33A cells, which, unlike the glial cell lines Hs 683 and U87, accumulate high levels of nuclear T-ag needed for robust replication. Using this assay, we investigated the requirement for different domains of T-ag, and for specific sequences within and flanking the viral origin, in JCV DNA replication. Beyond providing validation of the assay, these studies revealed an important stimulatory role of the transcription factor NF1 in JCV DNA replication. Finally, we show that the assay can be used for inhibitor testing, highlighting its value for the identification of antiviral drugs targeting JCV DNA replication. PMID:25155200

  5. High throughput on-chip analysis of high-energy charged particle tracks using lensfree imaging

    Energy Technology Data Exchange (ETDEWEB)

    Luo, Wei; Shabbir, Faizan; Gong, Chao; Gulec, Cagatay; Pigeon, Jeremy; Shaw, Jessica; Greenbaum, Alon; Tochitsky, Sergei; Joshi, Chandrashekhar [Electrical Engineering Department, University of California, Los Angeles, California 90095 (United States); Ozcan, Aydogan, E-mail: ozcan@ucla.edu [Electrical Engineering Department, University of California, Los Angeles, California 90095 (United States); Bioengineering Department, University of California, Los Angeles, California 90095 (United States); California NanoSystems Institute (CNSI), University of California, Los Angeles, California 90095 (United States)

    2015-04-13

    We demonstrate a high-throughput charged particle analysis platform, which is based on lensfree on-chip microscopy for rapid ion track analysis using allyl diglycol carbonate, i.e., CR-39 plastic polymer as the sensing medium. By adopting a wide-area opto-electronic image sensor together with a source-shifting based pixel super-resolution technique, a large CR-39 sample volume (i.e., 4 cm × 4 cm × 0.1 cm) can be imaged in less than 1 min using a compact lensfree on-chip microscope, which detects partially coherent in-line holograms of the ion tracks recorded within the CR-39 detector. After the image capture, using highly parallelized reconstruction and ion track analysis algorithms running on graphics processing units, we reconstruct and analyze the entire volume of a CR-39 detector within ∼1.5 min. This significant reduction in the entire imaging and ion track analysis time not only increases our throughput but also allows us to perform time-resolved analysis of the etching process to monitor and optimize the growth of ion tracks during etching. This computational lensfree imaging platform can provide a much higher throughput and more cost-effective alternative to traditional lens-based scanning optical microscopes for ion track analysis using CR-39 and other passive high energy particle detectors.

  6. Diversity and functions of bacterial community in drinking water biofilms revealed by high-throughput sequencing

    Science.gov (United States)

    Chao, Yuanqing; Mao, Yanping; Wang, Zhiping; Zhang, Tong

    2015-06-01

    The development of biofilms in drinking water (DW) systems may cause various problems to water quality. To investigate the community structure of biofilms on different pipe materials and the global/specific metabolic functions of DW biofilms, PCR-based 454 pyrosequencing data for 16S rRNA genes and Illumina metagenomic data were generated and analysed. Considerable differences in bacterial diversity and taxonomic structure were identified between biofilms formed on stainless steel and biofilms formed on plastics, indicating that the metallic materials facilitate the formation of higher diversity biofilms. Moreover, variations in several dominant genera were observed during biofilm formation. Based on PCA analysis, the global functions in the DW biofilms were similar to other DW metagenomes. Beyond the global functions, the occurrences and abundances of specific protective genes involved in the glutathione metabolism, the SoxRS system, the OxyR system, RpoS regulated genes, and the production/degradation of extracellular polymeric substances were also evaluated. A near-complete and low-contamination draft genome was constructed from the metagenome of the DW biofilm, based on the coverage and tetranucleotide frequencies, and identified as a Bradyrhizobiaceae-like bacterium according to a phylogenetic analysis. Our findings provide new insight into DW biofilms, especially in terms of their metabolic functions.

  7. High-throughput SHAPE analysis reveals structures in HIV-1 genomic RNA strongly conserved across distinct biological states.

    Directory of Open Access Journals (Sweden)

    Kevin A Wilkinson

    2008-04-01

    Full Text Available Replication and pathogenesis of the human immunodeficiency virus (HIV is tightly linked to the structure of its RNA genome, but genome structure in infectious virions is poorly understood. We invent high-throughput SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension technology, which uses many of the same tools as DNA sequencing, to quantify RNA backbone flexibility at single-nucleotide resolution and from which robust structural information can be immediately derived. We analyze the structure of HIV-1 genomic RNA in four biologically instructive states, including the authentic viral genome inside native particles. Remarkably, given the large number of plausible local structures, the first 10% of the HIV-1 genome exists in a single, predominant conformation in all four states. We also discover that noncoding regions functioning in a regulatory role have significantly lower (p-value < 0.0001 SHAPE reactivities, and hence more structure, than do viral coding regions that function as the template for protein synthesis. By directly monitoring protein binding inside virions, we identify the RNA recognition motif for the viral nucleocapsid protein. Seven structurally homologous binding sites occur in a well-defined domain in the genome, consistent with a role in directing specific packaging of genomic RNA into nascent virions. In addition, we identify two distinct motifs that are targets for the duplex destabilizing activity of this same protein. The nucleocapsid protein destabilizes local HIV-1 RNA structure in ways likely to facilitate initial movement both of the retroviral reverse transcriptase from its tRNA primer and of the ribosome in coding regions. Each of the three nucleocapsid interaction motifs falls in a specific genome domain, indicating that local protein interactions can be organized by the long-range architecture of an RNA. High-throughput SHAPE reveals a comprehensive view of HIV-1 RNA genome structure, and further

  8. Commentary: Roles for Pathologists in a High-throughput Image Analysis Team.

    Science.gov (United States)

    Aeffner, Famke; Wilson, Kristin; Bolon, Brad; Kanaly, Suzanne; Mahrt, Charles R; Rudmann, Dan; Charles, Elaine; Young, G David

    2016-08-01

    Historically, pathologists perform manual evaluation of H&E- or immunohistochemically-stained slides, which can be subjective, inconsistent, and, at best, semiquantitative. As the complexity of staining and demand for increased precision of manual evaluation increase, the pathologist's assessment will include automated analyses (i.e., "digital pathology") to increase the accuracy, efficiency, and speed of diagnosis and hypothesis testing and as an important biomedical research and diagnostic tool. This commentary introduces the many roles for pathologists in designing and conducting high-throughput digital image analysis. Pathology review is central to the entire course of a digital pathology study, including experimental design, sample quality verification, specimen annotation, analytical algorithm development, and report preparation. The pathologist performs these roles by reviewing work undertaken by technicians and scientists with training and expertise in image analysis instruments and software. These roles require regular, face-to-face interactions between team members and the lead pathologist. Traditional pathology training is suitable preparation for entry-level participation on image analysis teams. The future of pathology is very exciting, with the expanding utilization of digital image analysis set to expand pathology roles in research and drug development with increasing and new career opportunities for pathologists. © 2016 by The Author(s) 2016.

  9. Integrative Analysis of High-throughput Cancer Studies with Contrasted Penalization

    Science.gov (United States)

    Shi, Xingjie; Liu, Jin; Huang, Jian; Zhou, Yong; Shia, BenChang; Ma, Shuangge

    2015-01-01

    In cancer studies with high-throughput genetic and genomic measurements, integrative analysis provides a way to effectively pool and analyze heterogeneous raw data from multiple independent studies and outperforms “classic” meta-analysis and single-dataset analysis. When marker selection is of interest, the genetic basis of multiple datasets can be described using the homogeneity model or the heterogeneity model. In this study, we consider marker selection under the heterogeneity model, which includes the homogeneity model as a special case and can be more flexible. Penalization methods have been developed in the literature for marker selection. This study advances from the published ones by introducing the contrast penalties, which can accommodate the within- and across-dataset structures of covariates/regression coefficients and, by doing so, further improve marker selection performance. Specifically, we develop a penalization method that accommodates the across-dataset structures by smoothing over regression coefficients. An effective iterative algorithm, which calls an inner coordinate descent iteration, is developed. Simulation shows that the proposed method outperforms the benchmark with more accurate marker identification. The analysis of breast cancer and lung cancer prognosis studies with gene expression measurements shows that the proposed method identifies genes different from those using the benchmark and has better prediction performance. PMID:24395534

  10. A DNA fingerprinting procedure for ultra high-throughput genetic analysis of insects.

    Science.gov (United States)

    Schlipalius, D I; Waldron, J; Carroll, B J; Collins, P J; Ebert, P R

    2001-12-01

    Existing procedures for the generation of polymorphic DNA markers are not optimal for insect studies in which the organisms are often tiny and background molecular information is often non-existent. We have used a new high throughput DNA marker generation protocol called randomly amplified DNA fingerprints (RAF) to analyse the genetic variability in three separate strains of the stored grain pest, Rhyzopertha dominica. This protocol is quick, robust and reliable even though it requires minimal sample preparation, minute amounts of DNA and no prior molecular analysis of the organism. Arbitrarily selected oligonucleotide primers routinely produced approximately 50 scoreable polymorphic DNA markers, between individuals of three independent field isolates of R. dominica. Multivariate cluster analysis using forty-nine arbitrarily selected polymorphisms generated from a single primer reliably separated individuals into three clades corresponding to their geographical origin. The resulting clades were quite distinct, with an average genetic difference of 37.5 +/- 6.0% between clades and of 21.0 +/- 7.1% between individuals within clades. As a prelude to future gene mapping efforts, we have also assessed the performance of RAF under conditions commonly used in gene mapping. In this analysis, fingerprints from pooled DNA samples accurately and reproducibly reflected RAF profiles obtained from individual DNA samples that had been combined to create the bulked samples.

  11. High-Throughput Quantitative Proteomic Analysis of Dengue Virus Type 2 Infected A549 Cells

    Science.gov (United States)

    Chiu, Han-Chen; Hannemann, Holger; Heesom, Kate J.; Matthews, David A.; Davidson, Andrew D.

    2014-01-01

    Disease caused by dengue virus is a global health concern with up to 390 million individuals infected annually worldwide. There are no vaccines or antiviral compounds available to either prevent or treat dengue disease which may be fatal. To increase our understanding of the interaction of dengue virus with the host cell, we analyzed changes in the proteome of human A549 cells in response to dengue virus type 2 infection using stable isotope labelling in cell culture (SILAC) in combination with high-throughput mass spectrometry (MS). Mock and infected A549 cells were fractionated into nuclear and cytoplasmic extracts before analysis to identify proteins that redistribute between cellular compartments during infection and reduce the complexity of the analysis. We identified and quantified 3098 and 2115 proteins in the cytoplasmic and nuclear fractions respectively. Proteins that showed a significant alteration in amount during infection were examined using gene enrichment, pathway and network analysis tools. The analyses revealed that dengue virus infection modulated the amounts of proteins involved in the interferon and unfolded protein responses, lipid metabolism and the cell cycle. The SILAC-MS results were validated for a select number of proteins over a time course of infection by Western blotting and immunofluorescence microscopy. Our study demonstrates for the first time the power of SILAC-MS for identifying and quantifying novel changes in cellular protein amounts in response to dengue virus infection. PMID:24671231

  12. High-throughput quantitative proteomic analysis of dengue virus type 2 infected A549 cells.

    Directory of Open Access Journals (Sweden)

    Han-Chen Chiu

    Full Text Available Disease caused by dengue virus is a global health concern with up to 390 million individuals infected annually worldwide. There are no vaccines or antiviral compounds available to either prevent or treat dengue disease which may be fatal. To increase our understanding of the interaction of dengue virus with the host cell, we analyzed changes in the proteome of human A549 cells in response to dengue virus type 2 infection using stable isotope labelling in cell culture (SILAC in combination with high-throughput mass spectrometry (MS. Mock and infected A549 cells were fractionated into nuclear and cytoplasmic extracts before analysis to identify proteins that redistribute between cellular compartments during infection and reduce the complexity of the analysis. We identified and quantified 3098 and 2115 proteins in the cytoplasmic and nuclear fractions respectively. Proteins that showed a significant alteration in amount during infection were examined using gene enrichment, pathway and network analysis tools. The analyses revealed that dengue virus infection modulated the amounts of proteins involved in the interferon and unfolded protein responses, lipid metabolism and the cell cycle. The SILAC-MS results were validated for a select number of proteins over a time course of infection by Western blotting and immunofluorescence microscopy. Our study demonstrates for the first time the power of SILAC-MS for identifying and quantifying novel changes in cellular protein amounts in response to dengue virus infection.

  13. The use of high-throughput DNA sequencing in the investigation of antigenic variation: application to Neisseria species.

    Directory of Open Access Journals (Sweden)

    John K Davies

    Full Text Available Antigenic variation occurs in a broad range of species. This process resembles gene conversion in that variant DNA is unidirectionally transferred from partial gene copies (or silent loci into an expression locus. Previous studies of antigenic variation have involved the amplification and sequencing of individual genes from hundreds of colonies. Using the pilE gene from Neisseria gonorrhoeae we have demonstrated that it is possible to use PCR amplification, followed by high-throughput DNA sequencing and a novel assembly process, to detect individual antigenic variation events. The ability to detect these events was much greater than has previously been possible. In N. gonorrhoeae most silent loci contain multiple partial gene copies. Here we show that there is a bias towards using the copy at the 3' end of the silent loci (copy 1 as the donor sequence. The pilE gene of N. gonorrhoeae and some strains of Neisseria meningitidis encode class I pilin, but strains of N. meningitidis from clonal complexes 8 and 11 encode a class II pilin. We have confirmed that the class II pili of meningococcal strain FAM18 (clonal complex 11 are non-variable, and this is also true for the class II pili of strain NMB from clonal complex 8. In addition when a gene encoding class I pilin was moved into the meningococcal strain NMB background there was no evidence of antigenic variation. Finally we investigated several members of the opa gene family of N. gonorrhoeae, where it has been suggested that limited variation occurs. Variation was detected in the opaK gene that is located close to pilE, but not at the opaJ gene located elsewhere on the genome. The approach described here promises to dramatically improve studies of the extent and nature of antigenic variation systems in a variety of species.

  14. Apple ring rot-responsive putative microRNAs revealed by high-throughput sequencing in Malus × domestica Borkh.

    Science.gov (United States)

    Yu, Xin-Yi; Du, Bei-Bei; Gao, Zhi-Hong; Zhang, Shi-Jie; Tu, Xu-Tong; Chen, Xiao-Yun; Zhang, Zhen; Qu, Shen-Chun

    2014-08-01

    MicroRNAs (miRNAs) are small non-coding RNAs, which silence target mRNA via cleavage or translational inhibition to function in regulating gene expression. MiRNAs act as important regulators of plant development and stress response. For understanding the role of miRNAs responsive to apple ring rot stress, we identified disease-responsive miRNAs using high-throughput sequencing in Malus × domestica Borkh.. Four small RNA libraries were constructed from two control strains in M. domestica, crabapple (CKHu) and Fuji Naga-fu No. 6 (CKFu), and two disease stress strains, crabapple (DSHu) and Fuji Naga-fu No. 6 (DSFu). A total of 59 miRNA families were identified and five miRNAs might be responsive to apple ring rot infection and validated via qRT-PCR. Furthermore, we predicted 76 target genes which were regulated by conserved miRNAs potentially. Our study demonstrated that miRNAs was responsive to apple ring rot infection and may have important implications on apple disease resistance.

  15. Understanding regulation of microRNAs on intestine regeneration in the sea cucumber Apostichopus japonicus using high-throughput sequencing.

    Science.gov (United States)

    Sun, Lina; Sun, Jingchun; Li, Xiaoni; Zhang, Libin; Yang, Hongsheng; Wang, Qing

    2017-06-01

    The sea cucumber, as a member of the Echinodermata, has the capacity to restore damaged organs and body parts, which has always been a key scientific issue. MicroRNAs (miRNAs), a class of short noncoding RNAs, play important roles in regulating gene expression. In the present study, we applied high-throughput sequencing to investigate alterations of miRNA expression in regenerative intestine compared to normal intestine. A total of 73 differentially expressed miRNAs were obtained, including 59 up-regulated miRNAs and 14 down-regulated miRNAs. Among these molecules, Aja-miR-1715-5p, Aja-miR-153, Aja-miR-252a, Aja-miR-153-5p, Aja-miR-252b, Aja-miR-2001, Aja-miR-64d-3p, and Aja-miR-252-5p were differentially expressed over 10-fold at 3days post-evisceration (dpe). Notably, real-time PCR revealed that Aja-miR-1715-5p was up-regulated 1390-fold at 3dpe. Moreover, putative target gene co-expression analyses, gene ontology, and pathway analyses suggest that these miRNAs play important roles in specific cellular events (cell proliferation, migration, and apoptosis), metabolic regulation, and energy redistribution. These results will provide a basis for future studies of miRNA regulation in sea cucumber regeneration. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. The pig gut microbial diversity: Understanding the pig gut microbial ecology through the next generation high throughput sequencing.

    Science.gov (United States)

    Kim, Hyeun Bum; Isaacson, Richard E

    2015-06-12

    The importance of the gut microbiota of animals is widely acknowledged because of its pivotal roles in the health and well being of animals. The genetic diversity of the gut microbiota contributes to the overall development and metabolic needs of the animal, and provides the host with many beneficial functions including production of volatile fatty acids, re-cycling of bile salts, production of vitamin K, cellulose digestion, and development of immune system. Thus the intestinal microbiota of animals has been the subject of study for many decades. Although most of the older studies have used culture dependent methods, the recent advent of high throughput sequencing of 16S rRNA genes has facilitated in depth studies exploring microbial populations and their dynamics in the animal gut. These culture independent DNA based studies generate large amounts of data and as a result contribute to a more detailed understanding of the microbiota dynamics in the gut and the ecology of the microbial populations. Of equal importance, is being able to identify and quantify microbes that are difficult to grow or that have not been grown in the laboratory. Interpreting the data obtained from this type of study requires using basic principles of microbial diversity to understand importance of the composition of microbial populations. In this review, we summarize the literature on culture independent studies of the pig gut microbiota with an emphasis on its succession and alterations caused by diverse factors. Copyright © 2015 Elsevier B.V. All rights reserved.

  17. High-throughput sequencing approach uncovers the miRNome of peritoneal endometriotic lesions and adjacent healthy tissues.

    Directory of Open Access Journals (Sweden)

    Merli Saare

    Full Text Available Accumulating data have shown the involvement of microRNAs (miRNAs in endometriosis pathogenesis. In this study, we used a novel approach to determine the endometriotic lesion-specific miRNAs by high-throughput small RNA sequencing of paired samples of peritoneal endometriotic lesions and matched healthy surrounding tissues together with eutopic endometria of the same patients. We found five miRNAs specific to epithelial cells--miR-34c, miR-449a, miR-200a, miR-200b and miR-141 showing significantly higher expression in peritoneal endometriotic lesions compared to healthy peritoneal tissues. We also determined the expression levels of miR-200 family target genes E-cadherin, ZEB1 and ZEB2 and found that the expression level of E-cadherin was significantly higher in endometriotic lesions compared to healthy tissues. Further evaluation verified that studied miRNAs could be used as diagnostic markers for confirming the presence of endometrial cells in endometriotic lesion biopsy samples. Furthermore, we demonstrated that the miRNA profile of peritoneal endometriotic lesion biopsies is largely masked by the surrounding peritoneal tissue, challenging the discovery of an accurate lesion-specific miRNA profile. Taken together, our findings indicate that only particular miRNAs with a significantly higher expression in endometriotic cells can be detected from lesion biopsies, and can serve as diagnostic markers for endometriosis.

  18. Distribution and Diversity of Bacteria and Fungi Colonization in Stone Monuments Analyzed by High-Throughput Sequencing.

    Directory of Open Access Journals (Sweden)

    Qiang Li

    Full Text Available The historical and cultural heritage of Qingxing palace and Lingyin and Kaihua temple, located in Hangzhou of China, include a large number of exquisite Buddhist statues and ancient stone sculptures which date back to the Northern Song (960-1219 A.D. and Qing dynasties (1636-1912 A.D. and are considered to be some of the best examples of ancient stone sculpting techniques. They were added to the World Heritage List in 2011 because of their unique craftsmanship and importance to the study of ancient Chinese Buddhist culture. However, biodeterioration of the surface of the ancient Buddhist statues and white marble pillars not only severely impairs their aesthetic value but also alters their material structure and thermo-hygric properties. In this study, high-throughput sequencing was utilized to identify the microbial communities colonizing the stone monuments. The diversity and distribution of the microbial communities in six samples collected from three different environmental conditions with signs of deterioration were analyzed by means of bioinformatics software and diversity indices. In addition, the impact of environmental factors, including temperature, light intensity, air humidity, and the concentration of NO2 and SO2, on the microbial communities' diversity and distribution was evaluated. The results indicate that the presence of predominantly phototrophic microorganisms was correlated with light and humidity, while nitrifying bacteria and Thiobacillus were associated with NO2 and SO2 from air pollution.

  19. Distribution and Diversity of Bacteria and Fungi Colonization in Stone Monuments Analyzed by High-Throughput Sequencing.

    Science.gov (United States)

    Li, Qiang; Zhang, Bingjian; He, Zhang; Yang, Xiaoru

    The historical and cultural heritage of Qingxing palace and Lingyin and Kaihua temple, located in Hangzhou of China, include a large number of exquisite Buddhist statues and ancient stone sculptures which date back to the Northern Song (960-1219 A.D.) and Qing dynasties (1636-1912 A.D.) and are considered to be some of the best examples of ancient stone sculpting techniques. They were added to the World Heritage List in 2011 because of their unique craftsmanship and importance to the study of ancient Chinese Buddhist culture. However, biodeterioration of the surface of the ancient Buddhist statues and white marble pillars not only severely impairs their aesthetic value but also alters their material structure and thermo-hygric properties. In this study, high-throughput sequencing was utilized to identify the microbial communities colonizing the stone monuments. The diversity and distribution of the microbial communities in six samples collected from three different environmental conditions with signs of deterioration were analyzed by means of bioinformatics software and diversity indices. In addition, the impact of environmental factors, including temperature, light intensity, air humidity, and the concentration of NO2 and SO2, on the microbial communities' diversity and distribution was evaluated. The results indicate that the presence of predominantly phototrophic microorganisms was correlated with light and humidity, while nitrifying bacteria and Thiobacillus were associated with NO2 and SO2 from air pollution.

  20. Mining environmental high-throughput sequence data sets to identify divergent amplicon clusters for phylogenetic reconstruction and morphotype visualization.

    Science.gov (United States)

    Gimmler, Anna; Stoeck, Thorsten

    2015-08-01

    Environmental high-throughput sequencing (envHTS) is a very powerful tool, which in protistan ecology is predominantly used for the exploration of diversity and its geographic and local patterns. We here used a pyrosequenced V4-SSU rDNA data set from a solar saltern pond as test case to exploit such massive protistan amplicon data sets beyond this descriptive purpose. Therefore, we combined a Swarm-based blastn network including 11 579 ciliate V4 amplicons to identify divergent amplicon clusters with targeted polymerase chain reaction (PCR) primer design for full-length small subunit of the ribosomal DNA retrieval and probe design for fluorescence in situ hybridization (FISH). This powerful strategy allows to benefit from envHTS data sets to (i) reveal the phylogenetic position of the taxon behind divergent amplicons; (ii) improve phylogenetic resolution and evolutionary history of specific taxon groups; (iii) solidly assess an amplicons (species') degree of similarity to its closest described relative; (iv) visualize the morphotype behind a divergent amplicons cluster; (v) rapidly FISH screen many environmental samples for geographic/habitat distribution and abundances of the respective organism and (vi) to monitor the success of enrichment strategies in live samples for cultivation and isolation of the respective organisms. © 2015 Society for Applied Microbiology and John Wiley & Sons Ltd.

  1. The First Report of miRNAs from a Thysanopteran Insect, Thrips palmi Karny Using High-Throughput Sequencing.

    Directory of Open Access Journals (Sweden)

    K B Rebijith

    Full Text Available Thrips palmi Karny (Thysanoptera: Thripidae is the sole vector of Watermelon bud necrosis tospovirus, where the crop loss has been estimated to be around USD 50 million annually. Chemical insecticides are of limited use in the management of T. palmi due to the thigmokinetic behaviour and development of high levels of resistance to insecticides. There is an urgent need to find out an effective futuristic management strategy, where the small RNAs especially microRNAs hold great promise as a key player in the growth and development. miRNAs are a class of short non-coding RNAs involved in regulation of gene expression either by mRNA cleavage or by translational repression. We identified and characterized a total of 77 miRNAs from T. palmi using high-throughput deep sequencing. Functional classifications of the targets for these miRNAs revealed that majority of them are involved in the regulation of transcription and translation, nucleotide binding and signal transduction. We have also validated few of these miRNAs employing stem-loop RT-PCR, qRT-PCR and Northern blot. The present study not only provides an in-depth understanding of the biological and physiological roles of miRNAs in governing gene expression but may also lead as an invaluable tool for the management of thysanopteran insects in the future.

  2. Towards high-throughput phenotyping of complex patterned behaviors in rodents: focus on mouse self-grooming and its sequencing.

    Science.gov (United States)

    Kyzar, Evan; Gaikwad, Siddharth; Roth, Andrew; Green, Jeremy; Pham, Mimi; Stewart, Adam; Liang, Yiqing; Kobla, Vikrant; Kalueff, Allan V

    2011-12-01

    Increasingly recognized in biological psychiatry, rodent self-grooming is a complex patterned behavior with evolutionarily conserved cephalo-caudal progression. While grooming is traditionally assessed by the latency, frequency and duration, its sequencing represents another important domain sensitive to various experimental manipulations. Such behavioral complexity requires novel objective approaches to quantify rodent grooming, in addition to time-consuming and highly variable manual observation. The present study combined modern behavior-recognition video-tracking technologies (CleverSys, Inc.) with manual observation to characterize in-depth spontaneous (novelty-induced) and artificial (water-induced) self-grooming in adult male C57BL/6J mice. We specifically focused on individual episodes of grooming (paw licking, head washing, body/leg washing, and tail/genital grooming), their duration and transitions between episodes. Overall, the frequency, duration and transitions detected using the automated approach significantly correlated with manual observations (R=0.51-0.7, pgrooming, also indicating that behavior-recognition tools can be applied to characterize both the amount and sequential organization (patterning) of rodent grooming. Together with further refinement and methodological advancement, this approach will foster high-throughput neurophenotyping of grooming, with multiple applications in drug screening and testing of genetically modified animals. Copyright © 2011 Elsevier B.V. All rights reserved.

  3. Spatially conserved regulatory elements identified within human and mouse Cd247 gene using high-throughput sequencing data from the ENCODE project

    DEFF Research Database (Denmark)

    Pundhir, Sachin; Hannibal, Tine Dahlbæk; Bang-Berthelsen, Claus Heiner

    2014-01-01

    . In this study, we have utilized the wealth of high-throughput sequencing data produced during the Encyclopedia of DNA Elements (ENCODE) project to identify spatially conserved regulatory elements within the Cd247 gene from human and mouse. We show the presence of two transcription factor binding sites...

  4. Characterization of the indigenous microflora in raw and pasteurized buffalo milk during storage at refrigeration temperature by high-throughput sequencing

    Science.gov (United States)

    The effect of refrigeration on bacterial communities within raw and pasteurized buffalo milk was studied using high-throughput sequencing. High quality samples of raw buffalo milk were obtained from five dairy farms in the Guangxi province of China. A sample of each milk was pasteurized, and both r...

  5. Identification and characterization of microRNAs in Humulus lupulus using high-throughput sequencing and their response to Citrus bark cracking viroid (CBCVd) infection

    Czech Academy of Sciences Publication Activity Database

    Mishra, Ajay Kumar; Duraisamy, Ganesh Selvaraj; Matoušek, Jaroslav; Radišek, S.; Javornik, B.; Jakše, J.

    2016-01-01

    Roč. 17, č. 919 (2016) ISSN 1471-2164 R&D Projects: GA MŠk(CZ) LH14255 Institutional support: RVO:60077344 Keywords : Humulus lupulus * High-throughput sequencing * Citrus bark cracking viroid Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 3.729, year: 2016

  6. Microfluidic PCR Amplification and MiSeq Amplicon Sequencing Techniques for High-Throughput Detection and Genotyping of Human Pathogenic RNA Viruses in Human Feces, Sewage, and Oysters

    Directory of Open Access Journals (Sweden)

    Mamoru Oshiki

    2018-04-01

    Full Text Available Detection and genotyping of pathogenic RNA viruses in human and environmental samples are useful for monitoring the circulation and prevalence of these pathogens, whereas a conventional PCR assay followed by Sanger sequencing is time-consuming and laborious. The present study aimed to develop a high-throughput detection-and-genotyping tool for 11 human RNA viruses [Aichi virus; astrovirus; enterovirus; norovirus genogroup I (GI, GII, and GIV; hepatitis A virus; hepatitis E virus; rotavirus; sapovirus; and human parechovirus] using a microfluidic device and next-generation sequencer. Microfluidic nested PCR was carried out on a 48.48 Access Array chip, and the amplicons were recovered and used for MiSeq sequencing (Illumina, Tokyo, Japan; genotyping was conducted by homology searching and phylogenetic analysis of the obtained sequence reads. The detection limit of the 11 tested viruses ranged from 100 to 103 copies/μL in cDNA sample, corresponding to 101–104 copies/mL-sewage, 105–108 copies/g-human feces, and 102–105 copies/g-digestive tissues of oyster. The developed assay was successfully applied for simultaneous detection and genotyping of RNA viruses to samples of human feces, sewage, and artificially contaminated oysters. Microfluidic nested PCR followed by MiSeq sequencing enables efficient tracking of the fate of multiple RNA viruses in various environments, which is essential for a better understanding of the circulation of human pathogenic RNA viruses in the human population.

  7. A high throughput mass spectrometry screening analysis based on two-dimensional carbon microfiber fractionation system.

    Science.gov (United States)

    Ma, Biao; Zou, Yilin; Xie, Xuan; Zhao, Jinhua; Piao, Xiangfan; Piao, Jingyi; Yao, Zhongping; Quinto, Maurizio; Wang, Gang; Li, Donghao

    2017-06-09

    A novel high-throughput, solvent saving and versatile integrated two-dimensional microscale carbon fiber/active carbon fiber system (2DμCFs) that allows a simply and rapid separation of compounds in low-polar, medium-polar and high-polar fractions, has been coupled with ambient ionization-mass spectrometry (ESI-Q-TOF-MS and ESI-QqQ-MS) for screening and quantitative analyses of real samples. 2DμCFs led to a substantial interference reduction and minimization of ionization suppression effects, thus increasing the sensitivity and the screening capabilities of the subsequent MS analysis. The method has been applied to the analysis of Schisandra Chinensis extracts, obtaining with a single injection a simultaneous determination of 33 compounds presenting different polarities, such as organic acids, lignans, and flavonoids in less than 7min, at low pressures and using small solvent amounts. The method was also validated using 10 model compounds, giving limit of detections (LODs) ranging from 0.3 to 30ngmL -1 , satisfactory recoveries (from 75.8 to 93.2%) and reproducibilities (relative standard deviations, RSDs, from 1.40 to 8.06%). Copyright © 2017 Elsevier B.V. All rights reserved.

  8. High-throughput peptide mass fingerprinting and protein macroarray analysis using chemical printing strategies

    International Nuclear Information System (INIS)

    Sloane, A.J.; Duff, J.L.; Hopwood, F.G.; Wilson, N.L.; Smith, P.E.; Hill, C.J.; Packer, N.H.; Williams, K.L.; Gooley, A.A.; Cole, R.A.; Cooley, P.W.; Wallace, D.B.

    2001-01-01

    We describe a 'chemical printer' that uses piezoelectric pulsing for rapid and accurate microdispensing of picolitre volumes of fluid for proteomic analysis of 'protein macroarrays'. Unlike positive transfer and pin transfer systems, our printer dispenses fluid in a non-contact process that ensures that the fluid source cannot be contaminated by substrate during a printing event. We demonstrate automated delivery of enzyme and matrix solutions for on-membrane protein digestion and subsequent peptide mass fingerprinting (pmf) analysis directly from the membrane surface using matrix-assisted laser-desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS). This approach bypasses the more commonly used multi-step procedures, thereby permitting a more rapid procedure for protein identification. We also highlight the advantage of printing different chemistries onto an individual protein spot for multiple microscale analyses. This ability is particularly useful when detailed characterisation of rare and valuable sample is required. Using a combination of PNGase F and trypsin we have mapped sites of N-glycosylation using on-membrane digestion strategies. We also demonstrate the ability to print multiple serum samples in a micro-ELISA format and rapidly screen a protein macroarray of human blood plasma for pathogen-derived antigens. We anticipate that the 'chemical printer' will be a major component of proteomic platforms for high-throughput protein identification and characterisation with widespread applications in biomedical and diagnostic discovery

  9. The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing

    DEFF Research Database (Denmark)

    Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P

    2007-01-01

    BACKGROUND: The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine...... primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regards to the distribution...

  10. Discriminating activated sludge flocs from biofilm microbial communities in a novel pilot-scale reciprocation MBR using high-throughput 16S rRNA gene sequencing.

    Science.gov (United States)

    De Sotto, Ryan; Ho, Jaeho; Lee, Woonyoung; Bae, Sungwoo

    2018-03-29

    Membrane bioreactors (MBRs) are a well-established filtration technology that has become a popular solution for treating wastewater. One of the drawbacks of MBRs, however, is the formation of biofilm on the surface of membrane modules. The occurrence of biofilms leads to biofouling, which eventually compromises water quality and damages the membranes. To prevent this, it is vital to understand the mechanism of biofilm formation on membrane surfaces. In this pilot-scale study, a novel reciprocation membrane bioreactor was operated for a period of 8 months and fed with domestic wastewater from an aerobic tank of a local WWTP. Water quality parameters were monitored and the microbial composition of the attached biofilm and suspended aggregates was evaluated in this reciprocating MBR configuration. The abundance of nitrifiers and composition of microbial communities from biofilm and suspended solids samples were investigated using qPCR and high throughput 16S amplicon sequencing. Removal efficiencies of 29%, 16%, and 15% of chemical oxygen demand, total phosphorus and total nitrogen from the influent were observed after the MBR process with average effluent concentrations of 16 mg/L, 4.6 mg/L, and 5.8 mg/L respectively. This suggests that the energy-efficient MBR, apart from reducing the total energy consumption, was able to maintain effluent concentrations that are within regulatory standards for discharge. Molecular analysis showed the presence of amoA Bacteria and 16S Nitrospira genes with the occurrence of nitrification. Candidatus Accumulibacter, a genus with organisms that can accumulate phosphorus, was found to be present in both groups which explains why phosphorus removal was observed in the system. High-throughput 16S rRNA amplicon sequencing revealed the genus Saprospira to be the most abundant species from the total OTUs of both the membrane tank and biofilm samples. Copyright © 2018 Elsevier Ltd. All rights reserved.

  11. High-throughput sequencing and copy number variation detection using formalin fixed embedded tissue in metastatic gastric cancer.

    Directory of Open Access Journals (Sweden)

    Seokhwi Kim

    Full Text Available In the era of targeted therapy, mutation profiling of cancer is a crucial aspect of making therapeutic decisions. To characterize cancer at a molecular level, the use of formalin-fixed paraffin-embedded tissue is important. We tested the Ion AmpliSeq Cancer Hotspot Panel v2 and nCounter Copy Number Variation Assay in 89 formalin-fixed paraffin-embedded gastric cancer samples to determine whether they are applicable in archival clinical samples for personalized targeted therapies. We validated the results with Sanger sequencing, real-time quantitative PCR, fluorescence in situ hybridization and immunohistochemistry. Frequently detected somatic mutations included TP53 (28.17%, APC (10.1%, PIK3CA (5.6%, KRAS (4.5%, SMO (3.4%, STK11 (3.4%, CDKN2A (3.4% and SMAD4 (3.4%. Amplifications of HER2, CCNE1, MYC, KRAS and EGFR genes were observed in 8 (8.9%, 4 (4.5%, 2 (2.2%, 1 (1.1% and 1 (1.1% cases, respectively. In the cases with amplification, fluorescence in situ hybridization for HER2 verified gene amplification and immunohistochemistry for HER2, EGFR and CCNE1 verified the overexpression of proteins in tumor cells. In conclusion, we successfully performed semiconductor-based sequencing and nCounter copy number variation analyses in formalin-fixed paraffin-embedded gastric cancer samples. High-throughput screening in archival clinical samples enables faster, more accurate and cost-effective detection of hotspot mutations or amplification in genes.

  12. The Open Connectome Project Data Cluster: Scalable Analysis and Vision for High-Throughput Neuroscience

    Science.gov (United States)

    Burns, Randal; Roncal, William Gray; Kleissas, Dean; Lillaney, Kunal; Manavalan, Priya; Perlman, Eric; Berger, Daniel R.; Bock, Davi D.; Chung, Kwanghun; Grosenick, Logan; Kasthuri, Narayanan; Weiler, Nicholas C.; Deisseroth, Karl; Kazhdan, Michael; Lichtman, Jeff; Reid, R. Clay; Smith, Stephen J.; Szalay, Alexander S.; Vogelstein, Joshua T.; Vogelstein, R. Jacob

    2013-01-01

    We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-d electron microscopy image stacks, but for time-series and multi-channel data as well. The system was designed primarily for workloads that build connectomes— neural connectivity maps of the brain—using the parallel execution of computer vision algorithms on high-performance compute clusters. These services and open-science data sets are publicly available at openconnecto.me. The system design inherits much from NoSQL scale-out and data-intensive computing architectures. We distribute data to cluster nodes by partitioning a spatial index. We direct I/O to different systems—reads to parallel disk arrays and writes to solid-state storage—to avoid I/O interference and maximize throughput. All programming interfaces are RESTful Web services, which are simple and stateless, improving scalability and usability. We include a performance evaluation of the production system, highlighting the effec-tiveness of spatial data organization. PMID:24401992

  13. MultiSense: A Multimodal Sensor Tool Enabling the High-Throughput Analysis of Respiration.

    Science.gov (United States)

    Keil, Peter; Liebsch, Gregor; Borisjuk, Ljudmilla; Rolletschek, Hardy

    2017-01-01

    The high-throughput analysis of respiratory activity has become an important component of many biological investigations. Here, a technological platform, denoted the "MultiSense tool," is described. The tool enables the parallel monitoring of respiration in 100 samples over an extended time period, by dynamically tracking the concentrations of oxygen (O 2 ) and/or carbon dioxide (CO 2 ) and/or pH within an airtight vial. Its flexible design supports the quantification of respiration based on either oxygen consumption or carbon dioxide release, thereby allowing for the determination of the physiologically significant respiratory quotient (the ratio between the quantities of CO 2 released and the O 2 consumed). It requires an LED light source to be mounted above the sample, together with a CCD camera system, adjusted to enable the capture of analyte-specific wavelengths, and fluorescent sensor spots inserted into the sample vial. Here, a demonstration is given of the use of the MultiSense tool to quantify respiration in imbibing plant seeds, for which an appropriate step-by-step protocol is provided. The technology can be easily adapted for a wide range of applications, including the monitoring of gas exchange in any kind of liquid culture system (algae, embryo and tissue culture, cell suspensions, microbial cultures).

  14. A Data Analysis Pipeline Accounting for Artifacts in Tox21 Quantitative High-Throughput Screening Assays.

    Science.gov (United States)

    Hsieh, Jui-Hua; Sedykh, Alexander; Huang, Ruili; Xia, Menghang; Tice, Raymond R

    2015-08-01

    A main goal of the U.S. Tox21 program is to profile a 10K-compound library for activity against a panel of stress-related and nuclear receptor signaling pathway assays using a quantitative high-throughput screening (qHTS) approach. However, assay artifacts, including nonreproducible signals and assay interference (e.g., autofluorescence), complicate compound activity interpretation. To address these issues, we have developed a data analysis pipeline that includes an updated signal noise-filtering/curation protocol and an assay interference flagging system. To better characterize various types of signals, we adopted a weighted version of the area under the curve (wAUC) to quantify the amount of activity across the tested concentration range in combination with the assay-dependent point-of-departure (POD) concentration. Based on the 32 Tox21 qHTS assays analyzed, we demonstrate that signal profiling using wAUC affords the best reproducibility (Pearson's r = 0.91) in comparison with the POD (0.82) only or the AC(50) (i.e., half-maximal activity concentration, 0.81). Among the activity artifacts characterized, cytotoxicity is the major confounding factor; on average, about 8% of Tox21 compounds are affected, whereas autofluorescence affects less than 0.5%. To facilitate data evaluation, we implemented two graphical user interface applications, allowing users to rapidly evaluate the in vitro activity of Tox21 compounds. © 2015 Society for Laboratory Automation and Screening.

  15. Digital Biomass Accumulation Using High-Throughput Plant Phenotype Data Analysis.

    Science.gov (United States)

    Rahaman, Md Matiur; Ahsan, Md Asif; Gillani, Zeeshan; Chen, Ming

    2017-09-01

    Biomass is an important phenotypic trait in functional ecology and growth analysis. The typical methods for measuring biomass are destructive, and they require numerous individuals to be cultivated for repeated measurements. With the advent of image-based high-throughput plant phenotyping facilities, non-destructive biomass measuring methods have attempted to overcome this problem. Thus, the estimation of plant biomass of individual plants from their digital images is becoming more important. In this paper, we propose an approach to biomass estimation based on image derived phenotypic traits. Several image-based biomass studies state that the estimation of plant biomass is only a linear function of the projected plant area in images. However, we modeled the plant volume as a function of plant area, plant compactness, and plant age to generalize the linear biomass model. The obtained results confirm the proposed model and can explain most of the observed variance during image-derived biomass estimation. Moreover, a small difference was observed between actual and estimated digital biomass, which indicates that our proposed approach can be used to estimate digital biomass accurately.

  16. Combinatorial approach toward high-throughput analysis of direct methanol fuel cells.

    Science.gov (United States)

    Jiang, Rongzhong; Rong, Charles; Chu, Deryn

    2005-01-01

    A 40-member array of direct methanol fuel cells (with stationary fuel and convective air supplies) was generated by electrically connecting the fuel cells in series. High-throughput analysis of these fuel cells was realized by fast screening of voltages between the two terminals of a fuel cell at constant current discharge. A large number of voltage-current curves (200) were obtained by screening the voltages through multiple small-current steps. Gaussian distribution was used to statistically analyze the large number of experimental data. The standard deviation (sigma) of voltages of these fuel cells increased linearly with discharge current. The voltage-current curves at various fuel concentrations were simulated with an empirical equation of voltage versus current and a linear equation of sigma versus current. The simulated voltage-current curves fitted the experimental data well. With increasing methanol concentration from 0.5 to 4.0 M, the Tafel slope of the voltage-current curves (at sigma=0.0), changed from 28 to 91 mV.dec-1, the cell resistance from 2.91 to 0.18 Omega, and the power output from 3 to 18 mW.cm-2.

  17. The Open Connectome Project Data Cluster: Scalable Analysis and Vision for High-Throughput Neuroscience.

    Science.gov (United States)

    Burns, Randal; Roncal, William Gray; Kleissas, Dean; Lillaney, Kunal; Manavalan, Priya; Perlman, Eric; Berger, Daniel R; Bock, Davi D; Chung, Kwanghun; Grosenick, Logan; Kasthuri, Narayanan; Weiler, Nicholas C; Deisseroth, Karl; Kazhdan, Michael; Lichtman, Jeff; Reid, R Clay; Smith, Stephen J; Szalay, Alexander S; Vogelstein, Joshua T; Vogelstein, R Jacob

    2013-01-01

    We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-d electron microscopy image stacks, but for time-series and multi-channel data as well. The system was designed primarily for workloads that build connectomes - neural connectivity maps of the brain-using the parallel execution of computer vision algorithms on high-performance compute clusters. These services and open-science data sets are publicly available at openconnecto.me. The system design inherits much from NoSQL scale-out and data-intensive computing architectures. We distribute data to cluster nodes by partitioning a spatial index. We direct I/O to different systems-reads to parallel disk arrays and writes to solid-state storage-to avoid I/O interference and maximize throughput. All programming interfaces are RESTful Web services, which are simple and stateless, improving scalability and usability. We include a performance evaluation of the production system, highlighting the effec-tiveness of spatial data organization.

  18. High-throughput sequencing of core STR loci for forensic genetic investigations using the Roche Genome Sequencer FLX platform

    DEFF Research Database (Denmark)

    Fordyce, Sarah Louise; Avila Arcos, Maria del Carmen; Rockenbauer, Eszter

    2011-01-01

    repeat units. These methods do not allow for the full resolution of STR base composition that sequencing approaches could provide. Here we present an STR profiling method based on the use of the Roche Genome Sequencer (GS) FLX to simultaneously sequence multiple core STR loci. Using this method...

  19. High-throughput transcriptome analysis of barley (Hordeum vulgare) exposed to excessive boron.

    Science.gov (United States)

    Tombuloglu, Guzin; Tombuloglu, Huseyin; Sakcali, M Serdal; Unver, Turgay

    2015-02-15

    Boron (B) is an essential micronutrient for optimum plant growth. However, above certain threshold B is toxic and causes yield loss in agricultural lands. While a number of studies were conducted to understand B tolerance mechanism, a transcriptome-wide approach for B tolerant barley is performed here for the first time. A high-throughput RNA-Seq (cDNA) sequencing technology (Illumina) was used with barley (Hordeum vulgare), yielding 208 million clean reads. In total, 256,874 unigenes were generated and assigned to known peptide databases: Gene Ontology (GO) (99,043), Swiss-Prot (38,266), Clusters of Orthologous Groups (COG) (26,250), and the Kyoto Encyclopedia of Genes and Genomes (KEGG) (36,860), as determined by BLASTx search. According to the digital gene expression (DGE) analyses, 16% and 17% of the transcripts were found to be differentially regulated in root and leaf tissues, respectively. Most of them were involved in cell wall, stress response, membrane, protein kinase and transporter mechanisms. Some of the genes detected as highly expressed in root tissue are phospholipases, predicted divalent heavy-metal cation transporters, formin-like proteins and calmodulin/Ca(2+)-binding proteins. In addition, chitin-binding lectin precursor, ubiquitin carboxyl-terminal hydrolase, and serine/threonine-protein kinase AFC2 genes were indicated to be highly regulated in leaf tissue upon excess B treatment. Some pathways, such as the Ca(2+)-calmodulin system, are activated in response to B toxicity. The differential regulation of 10 transcripts was confirmed by qRT-PCR, revealing the tissue-specific responses against B toxicity and their putative function in B-tolerance mechanisms. Copyright © 2014. Published by Elsevier B.V.

  20. High-throughput mutational analysis of TOR1A in primary dystonia

    Directory of Open Access Journals (Sweden)

    Truong Daniel D

    2009-03-01

    Full Text Available Abstract Background Although the c.904_906delGAG mutation in Exon 5 of TOR1A typically manifests as early-onset generalized dystonia, DYT1 dystonia is genetically and clinically heterogeneous. Recently, another Exon 5 mutation (c.863G>A has been associated with early-onset generalized dystonia and some ΔGAG mutation carriers present with late-onset focal dystonia. The aim of this study was to identify TOR1A Exon 5 mutations in a large cohort of subjects with mainly non-generalized primary dystonia. Methods High resolution melting (HRM was used to examine the entire TOR1A Exon 5 coding sequence in 1014 subjects with primary dystonia (422 spasmodic dysphonia, 285 cervical dystonia, 67 blepharospasm, 41 writer's cramp, 16 oromandibular dystonia, 38 other primary focal dystonia, 112 segmental dystonia, 16 multifocal dystonia, and 17 generalized dystonia and 250 controls (150 neurologically normal and 100 with other movement disorders. Diagnostic sensitivity and specificity were evaluated in an additional 8 subjects with known ΔGAG DYT1 dystonia and 88 subjects with ΔGAG-negative dystonia. Results HRM of TOR1A Exon 5 showed high (100% diagnostic sensitivity and specificity. HRM was rapid and economical. HRM reliably differentiated the TOR1A ΔGAG and c.863G>A mutations. Melting curves were normal in 250/250 controls and 1012/1014 subjects with primary dystonia. The two subjects with shifted melting curves were found to harbor the classic ΔGAG deletion: 1 a non-Jewish Caucasian female with childhood-onset multifocal dystonia and 2 an Ashkenazi Jewish female with adolescent-onset spasmodic dysphonia. Conclusion First, HRM is an inexpensive, diagnostically sensitive and specific, high-throughput method for mutation discovery. Second, Exon 5 mutations in TOR1A are rarely associated with non-generalized primary dystonia.

  1. Two-stage clustering (TSC: a pipeline for selecting operational taxonomic units for the high-throughput sequencing of PCR amplicons.

    Directory of Open Access Journals (Sweden)

    Xiao-Tao Jiang

    Full Text Available Clustering 16S/18S rRNA amplicon sequences into operational taxonomic units (OTUs is a critical step for the bioinformatic analysis of microbial diversity. Here, we report a pipeline for selecting OTUs with a relatively low computational demand and a high degree of accuracy. This pipeline is referred to as two-stage clustering (TSC because it divides tags into two groups according to their abundance and clusters them sequentially. The more abundant group is clustered using a hierarchical algorithm similar to that in ESPRIT, which has a high degree of accuracy but is computationally costly for large datasets. The rarer group, which includes the majority of tags, is then heuristically clustered to improve efficiency. To further improve the computational efficiency and accuracy, two preclustering steps are implemented. To maintain clustering accuracy, all tags are grouped into an OTU depending on their pairwise Needleman-Wunsch distance. This method not only improved the computational efficiency but also mitigated the spurious OTU estimation from 'noise' sequences. In addition, OTUs clustered using TSC showed comparable or improved performance in beta-diversity comparisons compared to existing OTU selection methods. This study suggests that the distribution of sequencing datasets is a useful property for improving the computational efficiency and increasing the clustering accuracy of the high-throughput sequencing of PCR amplicons. The software and user guide are freely available at http://hwzhoulab.smu.edu.cn/paperdata/.

  2. GiA Roots: software for the high throughput analysis of plant root system architecture

    Science.gov (United States)

    2012-01-01

    Background Characterizing root system architecture (RSA) is essential to understanding the development and function of vascular plants. Identifying RSA-associated genes also represents an underexplored opportunity for crop improvement. Software tools are needed to accelerate the pace at which quantitative traits of RSA are estimated from images of root networks. Results We have developed GiA Roots (General Image Analysis of Roots), a semi-automated software tool designed specifically for the high-throughput analysis of root system images. GiA Roots includes user-assisted algorithms to distinguish root from background and a fully automated pipeline that extracts dozens of root system phenotypes. Quantitative information on each phenotype, along with intermediate steps for full reproducibility, is returned to the end-user for downstream analysis. GiA Roots has a GUI front end and a command-line interface for interweaving the software into large-scale workflows. GiA Roots can also be extended to estimate novel phenotypes specified by the end-user. Conclusions We demonstrate the use of GiA Roots on a set of 2393 images of rice roots representing 12 genotypes from the species Oryza sativa. We validate trait measurements against prior analyses of this image set that demonstrated that RSA traits are likely heritable and associated with genotypic differences. Moreover, we demonstrate that GiA Roots is extensible and an end-user can add functionality so that GiA Roots can estimate novel RSA traits. In summary, we show that the software can function as an efficient tool as part of a workflow to move from large numbers of root images to downstream analysis. PMID:22834569

  3. Error correction and statistical analyses for intra-host comparisons of feline immunodeficiency virus diversity from high-throughput sequencing data.

    Science.gov (United States)

    Liu, Yang; Chiaromonte, Francesca; Ross, Howard; Malhotra, Raunaq; Elleder, Daniel; Poss, Mary

    2015-06-30

    Infection with feline immunodeficiency virus (FIV) causes an immunosuppressive disease whose consequences are less severe if cats are co-infected with an attenuated FIV strain (PLV). We use virus diversity measurements, which reflect replication ability and the virus response to various conditions, to test whether diversity of virulent FIV in lymphoid tissues is altered in the presence of PLV. Our data consisted of the 3' half of the FIV genome from three tissues of animals infected with FIV alone, or with FIV and PLV, sequenced by 454 technology. Since rare variants dominate virus populations, we had to carefully distinguish sequence variation from errors due to experimental protocols and sequencing. We considered an exponential-normal convolution model used for background correction of microarray data, and modified it to formulate an error correction approach for minor allele frequencies derived from high-throughput sequencing. Similar to accounting for over-dispersion in counts, this accounts for error-inflated variability in frequencies - and quite effectively reproduces empirically observed distributions. After obtaining error-corrected minor allele frequencies, we applied ANalysis Of VAriance (ANOVA) based on a linear mixed model and found that conserved sites and transition frequencies in FIV genes differ among tissues of dual and single infected cats. Furthermore, analysis of minor allele frequencies at individual FIV genome sites revealed 242 sites significantly affected by infection status (dual vs. single) or infection status by tissue interaction. All together, our results demonstrated a decrease in FIV diversity in bone marrow in the presence of PLV. Importantly, these effects were weakened or undetectable when error correction was performed with other approaches (thresholding of minor allele frequencies; probabilistic clustering of reads). We also queried the data for cytidine deaminase activity on the viral genome, which causes an asymmetric increase

  4. High-throughput analysis of ammonia oxidiser community composition via a novel, amoA-based functional gene array.

    Directory of Open Access Journals (Sweden)

    Guy C J Abell

    Full Text Available Advances in microbial ecology research are more often than not limited by the capabilities of available methodologies. Aerobic autotrophic nitrification is one of the most important and well studied microbiological processes in terrestrial and aquatic ecosystems. We have developed and validated a microbial diagnostic microarray based on the ammonia-monooxygenase subunit A (amoA gene, enabling the in-depth analysis of the community structure of bacterial and archaeal ammonia oxidisers. The amoA microarray has been successfully applied to analyse nitrifier diversity in marine, estuarine, soil and wastewater treatment plant environments. The microarray has moderate costs for labour and consumables and enables the analysis of hundreds of environmental DNA or RNA samples per week per person. The array has been thoroughly validated with a range of individual and complex targets (amoA clones and environmental samples, respectively, combined with parallel analysis using traditional sequencing methods. The moderate cost and high throughput of the microarray makes it possible to adequately address broader questions of the ecology of microbial ammonia oxidation requiring high sample numbers and high resolution of the community composition.

  5. Characterization of Bacterial and Fungal Community Dynamics by High-Throughput Sequencing (HTS Metabarcoding during Flax Dew-Retting

    Directory of Open Access Journals (Sweden)

    Christophe Djemiel

    2017-10-01

    Full Text Available Flax dew-retting is a key step in the industrial extraction of fibers from flax stems and is dependent upon the production of a battery of hydrolytic enzymes produced by micro-organisms during this process. To explore the diversity and dynamics of bacterial and fungal communities involved in this process we applied a high-throughput sequencing (HTS DNA metabarcoding approach (16S rRNA/ITS region, Illumina Miseq on plant and soil samples obtained over a period of 7 weeks in July and August 2014. Twenty-three bacterial and six fungal phyla were identified in soil samples and 11 bacterial and four fungal phyla in plant samples. Dominant phyla were Proteobacteria, Bacteroidetes, Actinobacteria, and Firmicutes (bacteria and Ascomycota, Basidiomycota, and Zygomycota (fungi all of which have been previously associated with flax dew-retting except for Bacteroidetes and Basidiomycota that were identified for the first time. Rare phyla also identified for the first time in this process included Acidobacteria, CKC4, Chlorobi, Fibrobacteres, Gemmatimonadetes, Nitrospirae and TM6 (bacteria, and Chytridiomycota (fungi. No differences in microbial communities and colonization dynamics were observed between early and standard flax harvests. In contrast, the common agricultural practice of swath turning affects both bacterial and fungal community membership and structure in straw samples and may contribute to a more uniform retting. Prediction of community function using PICRUSt indicated the presence of a large collection of potential bacterial enzymes capable of hydrolyzing backbones and side-chains of cell wall polysaccharides. Assignment of functional guild (functional group using FUNGuild software highlighted a change from parasitic to saprophytic trophic modes in fungi during retting. This work provides the first exhaustive description of the microbial communities involved in flax dew-retting and will provide a valuable benchmark in future studies aiming

  6. WormSizer: high-throughput analysis of nematode size and shape.

    Directory of Open Access Journals (Sweden)

    Brad T Moore

    Full Text Available The fundamental phenotypes of growth rate, size and morphology are the result of complex interactions between genotype and environment. We developed a high-throughput software application, WormSizer, which computes size and shape of nematodes from brightfield images. Existing methods for estimating volume either coarsely model the nematode as a cylinder or assume the worm shape or opacity is invariant. Our estimate is more robust to changes in morphology or optical density as it only assumes radial symmetry. This open source software is written as a plugin for the well-known image-processing framework Fiji/ImageJ. It may therefore be extended easily. We evaluated the technical performance of this framework, and we used it to analyze growth and shape of several canonical Caenorhabditis elegans mutants in a developmental time series. We confirm quantitatively that a Dumpy (Dpy mutant is short and fat and that a Long (Lon mutant is long and thin. We show that daf-2 insulin-like receptor mutants are larger than wild-type upon hatching but grow slow, and WormSizer can distinguish dauer larvae from normal larvae. We also show that a Small (Sma mutant is actually smaller than wild-type at all stages of larval development. WormSizer works with Uncoordinated (Unc and Roller (Rol mutants as well, indicating that it can be used with mutants despite behavioral phenotypes. We used our complete data set to perform a power analysis, giving users a sense of how many images are needed to detect different effect sizes. Our analysis confirms and extends on existing phenotypic characterization of well-characterized mutants, demonstrating the utility and robustness of WormSizer.

  7. ZebraZoom: an automated program for high-throughput behavioral analysis and categorization

    Science.gov (United States)

    Mirat, Olivier; Sternberg, Jenna R.; Severi, Kristen E.; Wyart, Claire

    2013-01-01

    The zebrafish larva stands out as an emergent model organism for translational studies involving gene or drug screening thanks to its size, genetics, and permeability. At the larval stage, locomotion occurs in short episodes punctuated by periods of rest. Although phenotyping behavior is a key component of large-scale screens, it has not yet been automated in this model system. We developed ZebraZoom, a program to automatically track larvae and identify maneuvers for many animals performing discrete movements. Our program detects each episodic movement and extracts large-scale statistics on motor patterns to produce a quantification of the locomotor repertoire. We used ZebraZoom to identify motor defects induced by a glycinergic receptor antagonist. The analysis of the blind mutant atoh7 revealed small locomotor defects associated with the mutation. Using multiclass supervised machine learning, ZebraZoom categorized all episodes of movement for each larva into one of three possible maneuvers: slow forward swim, routine turn, and escape. ZebraZoom reached 91% accuracy for categorization of stereotypical maneuvers that four independent experimenters unanimously identified. For all maneuvers in the data set, ZebraZoom agreed with four experimenters in 73.2–82.5% of cases. We modeled the series of maneuvers performed by larvae as Markov chains and observed that larvae often repeated the same maneuvers within a group. When analyzing subsequent maneuvers performed by different larvae, we found that larva–larva interactions occurred as series of escapes. Overall, ZebraZoom reached the level of precision found in manual analysis but accomplished tasks in a high-throughput format necessary for large screens. PMID:23781175

  8. ToxCast Workflow: High-throughput screening assay data processing, analysis and management (SOT)

    Science.gov (United States)

    US EPA’s ToxCast program is generating data in high-throughput screening (HTS) and high-content screening (HCS) assays for thousands of environmental chemicals, for use in developing predictive toxicity models. Currently the ToxCast screening program includes over 1800 unique c...

  9. MSP-HTPrimer: a high-throughput primer design tool to improve assay design for DNA methylation analysis in epigenetics.

    Science.gov (United States)

    Pandey, Ram Vinay; Pulverer, Walter; Kallmeyer, Rainer; Beikircher, Gabriel; Pabinger, Stephan; Kriegner, Albert; Weinhäusel, Andreas

    2016-01-01

    Bisulfite (BS) conversion-based and methylation-sensitive restriction enzyme (MSRE)-based PCR methods have been the most commonly used techniques for locus-specific DNA methylation analysis. However, both methods have advantages and limitations. Thus, an integrated approach would be extremely useful to quantify the DNA methylation status successfully with great sensitivity and specificity. Designing specific and optimized primers for target regions is the most critical and challenging step in obtaining the adequate DNA methylation results using PCR-based methods. Currently, no integrated, optimized, and high-throughput methylation-specific primer design software methods are available for both BS- and MSRE-based methods. Therefore an integrated, powerful, and easy-to-use methylation-specific primer design pipeline with great accuracy and success rate will be very useful. We have developed a new web-based pipeline, called MSP-HTPrimer, to design primers pairs for MSP, BSP, pyrosequencing, COBRA, and MSRE assays on both genomic strands. First, our pipeline converts all target sequences into bisulfite-treated templates for both forward and reverse strand and designs all possible primer pairs, followed by filtering for single nucleotide polymorphisms (SNPs) and known repeat regions. Next, each primer pairs are annotated with the upstream and downstream RefSeq genes, CpG island, and cut sites (for COBRA and MSRE). Finally, MSP-HTPrimer selects specific primers from both strands based on custom and user-defined hierarchical selection criteria. MSP-HTPrimer produces a primer pair summary output table in TXT and HTML format for display and UCSC custom tracks for resulting primer pairs in GTF format. MSP-HTPrimer is an integrated, web-based, and high-throughput pipeline and has no limitation on the number and size of target sequences and designs MSP, BSP, pyrosequencing, COBRA, and MSRE assays. It is the only pipeline, which automatically designs primers on both genomic

  10. EZH2 and CD79B mutational status over time in B-cell non-Hodgkin lymphomas detected by high-throughput sequencing using minimal samples

    Science.gov (United States)

    Saieg, Mauro Ajaj; Geddie, William R; Boerner, Scott L; Bailey, Denis; Crump, Michael; da Cunha Santos, Gilda

    2013-01-01

    BACKGROUND: Numerous genomic abnormalities in B-cell non-Hodgkin lymphomas (NHLs) have been revealed by novel high-throughput technologies, including recurrent mutations in EZH2 (enhancer of zeste homolog 2) and CD79B (B cell antigen receptor complex-associated protein beta chain) genes. This study sought to determine the evolution of the mutational status of EZH2 and CD79B over time in different samples from the same patient in a cohort of B-cell NHLs, through use of a customized multiplex mutation assay. METHODS: DNA that was extracted from cytological material stored on FTA cards as well as from additional specimens, including archived frozen and formalin-fixed histological specimens, archived stained smears, and cytospin preparations, were submitted to a multiplex mutation assay specifically designed for the detection of point mutations involving EZH2 and CD79B, using MassARRAY spectrometry followed by Sanger sequencing. RESULTS: All 121 samples from 80 B-cell NHL cases were successfully analyzed. Mutations in EZH2 (Y646) and CD79B (Y196) were detected in 13.2% and 8% of the samples, respectively, almost exclusively in follicular lymphomas and diffuse large B-cell lymphomas. In one-third of the positive cases, a wild type was detected in a different sample from the same patient during follow-up. CONCLUSIONS: Testing multiple minimal tissue samples using a high-throughput multiplex platform exponentially increases tissue availability for molecular analysis and might facilitate future studies of tumor progression and the related molecular events. Mutational status of EZH2 and CD79B may vary in B-cell NHL samples over time and support the concept that individualized therapy should be based on molecular findings at the time of treatment, rather than on results obtained from previous specimens. Cancer (Cancer Cytopathol) 2013;121:377–386. © 2013 American Cancer Society. PMID:23361872

  11. Development of high-throughput analysis system using highly-functional organic polymer monoliths

    International Nuclear Information System (INIS)

    Umemura, Tomonari; Kojima, Norihisa; Ueki, Yuji

    2008-01-01

    The growing demand for high-throughput analysis in the current competitive life sciences and industries has promoted the development of high-speed HPLC techniques and tools. As one of such tools, monolithic columns have attracted increasing attention and interest in the last decade due to the low flow-resistance and excellent mass transfer, allowing for rapid separations and reactions at high flow rates with minimal loss of column efficiency. Monolithic materials are classified into two main groups: silica- and organic polymer-based monoliths, each with their own advantages and disadvantages. Organic polymer monoliths have several distinct advantages in life-science research, including wide pH stability, less irreversible adsorption, facile preparation and modification. Thus, we have so far tried to develop organic polymer monoliths for various chemical operations, such as separation, extraction, preconcentration, and reaction. In the present paper, recent progress in the development of organic polymer monoliths is discussed. Especially, the procedure for the preparation of methacrylate-based monoliths with various functional groups is described, where the influence of different compositional and processing parameters on the monolithic structure is also addressed. Furthermore, the performance of the produced monoliths is demonstrated through the results for (1) rapid separations of alklybenzenes at high flow rates, (2) flow-through enzymatic digestion of cytochrome c on a trypsin-immobilized monolithic column, and (3) separation of the tryptic digest on a reversed-phase monolithic column. The flexibility and versatility of organic polymer monoliths will be beneficial for further enhancing analytical performance, and will open the way for new applications and opportunities both in scientific and industrial research. (author)

  12. High-throughput simultaneous analysis of RNA, protein, and lipid biomarkers in heterogeneous tissue samples.

    Science.gov (United States)

    Reiser, Vladimír; Smith, Ryan C; Xue, Jiyan; Kurtz, Marc M; Liu, Rong; Legrand, Cheryl; He, Xuanmin; Yu, Xiang; Wong, Peggy; Hinchcliffe, John S; Tanen, Michael R; Lazar, Gloria; Zieba, Renata; Ichetovkin, Marina; Chen, Zhu; O'Neill, Edward A; Tanaka, Wesley K; Marton, Matthew J; Liao, Jason; Morris, Mark; Hailman, Eric; Tokiwa, George Y; Plump, Andrew S

    2011-11-01

    With expanding biomarker discovery efforts and increasing costs of drug development, it is critical to maximize the value of mass-limited clinical samples. The main limitation of available methods is the inability to isolate and analyze, from a single sample, molecules requiring incompatible extraction methods. Thus, we developed a novel semiautomated method for tissue processing and tissue milling and division (TMAD). We used a SilverHawk atherectomy catheter to collect atherosclerotic plaques from patients requiring peripheral atherectomy. Tissue preservation by flash freezing was compared with immersion in RNAlater®, and tissue grinding by traditional mortar and pestle was compared with TMAD. Comparators were protein, RNA, and lipid yield and quality. Reproducibility of analyte yield from aliquots of the same tissue sample processed by TMAD was also measured. The quantity and quality of biomarkers extracted from tissue prepared by TMAD was at least as good as that extracted from tissue stored and prepared by traditional means. TMAD enabled parallel analysis of gene expression (quantitative reverse-transcription PCR, microarray), protein composition (ELISA), and lipid content (biochemical assay) from as little as 20 mg of tissue. The mean correlation was r = 0.97 in molecular composition (RNA, protein, or lipid) between aliquots of individual samples generated by TMAD. We also demonstrated that it is feasible to use TMAD in a large-scale clinical study setting. The TMAD methodology described here enables semiautomated, high-throughput sampling of small amounts of heterogeneous tissue specimens by multiple analytical techniques with generally improved quality of recovered biomolecules.

  13. WebPrInSeS: automated full-length clone sequence identification and verification using high-throughput sequencing data.

    Science.gov (United States)

    Massouras, Andreas; Decouttere, Frederik; Hens, Korneel; Deplancke, Bart

    2010-07-01

    High-throughput sequencing (HTS) is revolutionizing our ability to obtain cheap, fast and reliable sequence information. Many experimental approaches are expected to benefit from the incorporation of such sequencing features in their pipeline. Consequently, software tools that facilitate such an incorporation should be of great interest. In this context, we developed WebPrInSeS, a web server tool allowing automated full-length clone sequence identification and verification using HTS data. WebPrInSeS encompasses two separate software applications. The first is WebPrInSeS-C which performs automated sequence verification of user-defined open-reading frame (ORF) clone libraries. The second is WebPrInSeS-E, which identifies positive hits in cDNA or ORF-based library screening experiments such as yeast one- or two-hybrid assays. Both tools perform de novo assembly using HTS data from any of the three major sequencing platforms. Thus, WebPrInSeS provides a highly integrated, cost-effective and efficient way to sequence-verify or identify clones of interest. WebPrInSeS is available at http://webprinses.epfl.ch/ and is open to all users.

  14. High-throughput metagenomic analysis of petroleum-contaminated soil microbiome reveals the versatility in xenobiotic aromatics metabolism.

    Science.gov (United States)

    Bao, Yun-Juan; Xu, Zixiang; Li, Yang; Yao, Zhi; Sun, Jibin; Song, Hui

    2017-06-01

    The soil with petroleum contamination is one of the most studied soil ecosystems due to its rich microorganisms for hydrocarbon degradation and broad applications in bioremediation. However, our understanding of the genomic properties and functional traits of the soil microbiome is limited. In this study, we used high-throughput metagenomic sequencing to comprehensively study the microbial community from petroleum-contaminated soils near Tianjin Dagang oilfield in eastern China. The analysis reveals that the soil metagenome is characterized by high level of community diversity and metabolic versatility. The metageome community is predominated by γ-Proteobacteria and α-Proteobacteria, which are key players for petroleum hydrocarbon degradation. The functional study demonstrates over-represented enzyme groups and pathways involved in degradation of a broad set of xenobiotic aromatic compounds, including toluene, xylene, chlorobenzoate, aminobenzoate, DDT, methylnaphthalene, and bisphenol. A composite metabolic network is proposed for the identified pathways, thus consolidating our identification of the pathways. The overall data demonstrated the great potential of the studied soil microbiome in the xenobiotic aromatics degradation. The results not only establish a rich reservoir for novel enzyme discovery but also provide putative applications in bioremediation. Copyright © 2016. Published by Elsevier B.V.

  15. A new sieving matrix for DNA sequencing, genotyping and mutation detection and high-throughput genotyping with a 96-capillary array system

    Energy Technology Data Exchange (ETDEWEB)

    Gao, David [Iowa State Univ., Ames, IA (United States)

    1999-11-08

    Capillary electrophoresis has been widely accepted as a fast separation technique in DNA analysis. In this dissertation, a new sieving matrix is described for DNA analysis, especially DNA sequencing, genetic typing and mutation detection. A high-throughput 96 capillary array electrophoresis system was also demonstrated for simultaneous multiple genotyping. The authors first evaluated the influence of different capillary coatings on the performance of DNA sequencing. A bare capillary was compared with a DB-wax, an FC-coated and a polyvinylpyrrolidone dynamically coated capillary with PEO as sieving matrix. It was found that covalently-coated capillaries had no better performance than bare capillaries while PVP coating provided excellent and reproducible results. The authors also developed a new sieving Matrix for DNA separation based on commercially available poly(vinylpyrrolidone) (PVP). This sieving matrix has a very low viscosity and an excellent self-coating effect. Successful separations were achieved in uncoated capillaries. Sequencing of M13mp18 showed good resolution up to 500 bases in treated PVP solution. Temperature gradient capillary electrophoresis and PVP solution was applied to mutation detection. A heteroduplex sample and a homoduplex reference were injected during a pair of continuous runs. A temperature gradient of 10 C with a ramp of 0.7 C/min was swept throughout the capillary. Detection was accomplished by laser induced fluorescence detection. Mutation detection was performed by comparing the pattern changes between the homoduplex and the heteroduplex samples. High throughput, high detection rate and easy operation were achieved in this system. They further demonstrated fast and reliable genotyping based on CTTv STR system by multiple-capillary array electrophoresis. The PCR products from individuals were mixed with pooled allelic ladder as an absolute standard and coinjected with a 96-vial tray. Simultaneous one-color laser-induced fluorescence

  16. Micro-scaled high-throughput digestion of plant tissue samples for multi-elemental analysis

    Directory of Open Access Journals (Sweden)

    Husted Søren

    2009-09-01

    Full Text Available Abstract Background Quantitative multi-elemental analysis by inductively coupled plasma (ICP spectrometry depends on a complete digestion of solid samples. However, fast and thorough sample digestion is a challenging analytical task which constitutes a bottleneck in modern multi-elemental analysis. Additional obstacles may be that sample quantities are limited and elemental concentrations low. In such cases, digestion in small volumes with minimum dilution and contamination is required in order to obtain high accuracy data. Results We have developed a micro-scaled microwave digestion procedure and optimized it for accurate elemental profiling of plant materials (1-20 mg dry weight. A commercially available 64-position rotor with 5 ml disposable glass vials, originally designed for microwave-based parallel organic synthesis, was used as a platform for the digestion. The novel micro-scaled method was successfully validated by the use of various certified reference materials (CRM with matrices rich in starch, lipid or protein. When the micro-scaled digestion procedure was applied on single rice grains or small batches of Arabidopsis seeds (1 mg, corresponding to approximately 50 seeds, the obtained elemental profiles closely matched those obtained by conventional analysis using digestion in large volume vessels. Accumulated elemental contents derived from separate analyses of rice grain fractions (aleurone, embryo and endosperm closely matched the total content obtained by analysis of the whole rice grain. Conclusion A high-throughput micro-scaled method has been developed which enables digestion of small quantities of plant samples for subsequent elemental profiling by ICP-spectrometry. The method constitutes a valuable tool for screening of mutants and transformants. In addition, the method facilitates studies of the distribution of essential trace elements between and within plant organs which is relevant for, e.g., breeding programmes aiming at

  17. Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data.

    Science.gov (United States)

    Li, Peipei; Piao, Yongjun; Shon, Ho Sun; Ryu, Keun Ho

    2015-10-28

    Recently, rapid improvements in technology and decrease in sequencing costs have made RNA-Seq a widely used technique to quantify gene expression levels. Various normalization approaches have been proposed, owing to the importance of normalization in the analysis of RNA-Seq data. A comparison of recently proposed normalization methods is required to generate suitable guidelines for the selection of the most appropriate approach for future experiments. In this paper, we compared eight non-abundance (RC, UQ, Med, TMM, DESeq, Q, RPKM, and ERPKM) and two abundance estimation normalization methods (RSEM and Sailfish). The experiments were based on real Illumina high-throughput RNA-Seq of 35- and 76-nucleotide sequences produced in the MAQC project and simulation reads. Reads were mapped with human genome obtained from UCSC Genome Browser Database. For precise evaluation, we investigated Spearman correlation between the normalization results from RNA-Seq and MAQC qRT-PCR values for 996 genes. Based on this work, we showed that out of the eight non-abundance estimation normalization methods, RC, UQ, Med, TMM, DESeq, and Q gave similar normalization results for all data sets. For RNA-Seq of a 35-nucleotide sequence, RPKM showed the highest correlation results, but for RNA-Seq of a 76-nucleotide sequence, least correlation was observed than the other methods. ERPKM did not improve results than RPKM. Between two abundance estimation normalization methods, for RNA-Seq of a 35-nucleotide sequence, higher correlation was obtained with Sailfish than that with RSEM, which was better than without using abundance estimation methods. However, for RNA-Seq of a 76-nucleotide sequence, the results achieved by RSEM were similar to without applying abundance estimation methods, and were much better than with Sailfish. Furthermore, we found that adding a poly-A tail increased alignment numbers, but did not improve normalization results. Spearman correlation analysis revealed that RC, UQ

  18. DAG expression: high-throughput gene expression analysis of real-time PCR data using standard curves for relative quantification.

    Directory of Open Access Journals (Sweden)

    María Ballester

    Full Text Available BACKGROUND: Real-time quantitative PCR (qPCR is still the gold-standard technique for gene-expression quantification. Recent technological advances of this method allow for the high-throughput gene-expression analysis, without the limitations of sample space and reagent used. However, non-commercial and user-friendly software for the management and analysis of these data is not available. RESULTS: The recently developed commercial microarrays allow for the drawing of standard curves of multiple assays using the same n-fold diluted samples. Data Analysis Gene (DAG Expression software has been developed to perform high-throughput gene-expression data analysis using standard curves for relative quantification and one or multiple reference genes for sample normalization. We discuss the application of DAG Expression in the analysis of data from an experiment performed with Fluidigm technology, in which 48 genes and 115 samples were measured. Furthermore, the quality of our analysis was tested and compared with other available methods. CONCLUSIONS: DAG Expression is a freely available software that permits the automated analysis and visualization of high-throughput qPCR. A detailed manual and a demo-experiment are provided within the DAG Expression software at http://www.dagexpression.com/dage.zip.

  19. A community resource for high-throughput quantitative RT-PCR analysis of transcription factor gene expression in Medicago truncatula

    Directory of Open Access Journals (Sweden)

    Redman Julia C

    2008-07-01

    Full Text Available Abstract Background Medicago truncatula is a model legume species that is currently the focus of an international genome sequencing effort. Although several different oligonucleotide and cDNA arrays have been produced for genome-wide transcript analysis of this species, intrinsic limitations in the sensitivity of hybridization-based technologies mean that transcripts of genes expressed at low-levels cannot be measured accurately with these tools. Amongst such genes are many encoding transcription factors (TFs, which are arguably the most important class of regulatory proteins. Quantitative reverse transcription-polymerase chain reaction (qRT-PCR is the most sensitive method currently available for transcript quantification, and one that can be scaled up to analyze transcripts of thousands of genes in parallel. Thus, qRT-PCR is an ideal method to tackle the problem of TF transcript quantification in Medicago and other plants. Results We established a bioinformatics pipeline to identify putative TF genes in Medicago truncatula and to design gene-specific oligonucleotide primers for qRT-PCR analysis of TF transcripts. We validated the efficacy and gene-specificity of over 1000 TF primer pairs and utilized these to identify sets of organ-enhanced TF genes that may play important roles in organ development or differentiation in this species. This community resource will be developed further as more genome sequence becomes available, with the ultimate goal of producing validated, gene-specific primers for all Medicago TF genes. Conclusion High-throughput qRT-PCR using a 384-well plate format enables rapid, flexible, and sensitive quantification of all predicted Medicago transcription factor mRNAs. This resource has been utilized recently by several groups in Europe, Australia, and the USA, and we expect that it will become the 'gold-standard' for TF transcript profiling in Medicago truncatula.

  20. High-throughput sequencing of 16S rRNA gene amplicons: effects of extraction procedure, primer length and annealing temperature.

    Science.gov (United States)

    Sergeant, Martin J; Constantinidou, Chrystala; Cogan, Tristan; Penn, Charles W; Pallen, Mark J

    2012-01-01

    The analysis of 16S-rDNA sequences to assess the bacterial community composition of a sample is a widely used technique that has increased with the advent of high throughput sequencing. Although considerable effort has been devoted to identifying the most informative region of the 16S gene and the optimal informatics procedures to process the data, little attention has been paid to the PCR step, in particular annealing temperature and primer length. To address this, amplicons derived from 16S-rDNA were generated from chicken caecal content DNA using different annealing temperatures, primers and different DNA extraction procedures. The amplicons were pyrosequenced to determine the optimal protocols for capture of maximum bacterial diversity from a chicken caecal sample. Even at very low annealing temperatures there was little effect on the community structure, although the abundance of some OTUs such as Bifidobacterium increased. Using shorter primers did not reveal any novel OTUs but did change the community profile obtained. Mechanical disruption of the sample by bead beating had a significant effect on the results obtained, as did repeated freezing and thawing. In conclusion, existing primers and standard annealing temperatures captured as much diversity as lower annealing temperatures and shorter primers.

  1. Gold-coated polydimethylsiloxane microwells for high-throughput electrochemiluminescence analysis of intracellular glucose at single cells.

    Science.gov (United States)

    Xia, Juan; Zhou, Junyu; Zhang, Ronggui; Jiang, Dechen; Jiang, Depeng

    2018-06-04

    In this communication, a gold-coated polydimethylsiloxane (PDMS) chip with cell-sized microwells was prepared through a stamping and spraying process that was applied directly for high-throughput electrochemiluminescence (ECL) analysis of intracellular glucose at single cells. As compared with the previous multiple-step fabrication of photoresist-based microwells on the electrode, the preparation process is simple and offers fresh electrode surface for higher luminescence intensity. More luminescence intensity was recorded from cell-retained microwells than that at the planar region among the microwells that was correlated with the content of intracellular glucose. The successful monitoring of intracellular glucose at single cells using this PDMS chip will provide an alternative strategy for high-throughput single-cell analysis. Graphical abstract ᅟ.

  2. High-Throughput Fabrication of Nanocone Substrates through Polymer Injection Moulding For SERS Analysis in Microfluidic Systems

    DEFF Research Database (Denmark)

    Viehrig, Marlitt; Matteucci, Marco; Thilsted, Anil H.

    analysis. Metal-capped silicon nanopillars, fabricated through a maskless ion etch, are state-of-the-art for on-chip SERS substrates. A dense cluster of high aspect ratio polymer nanocones was achieved by using high-throughput polymer injection moulding over a large area replicating a silicon nanopillar...... structure. Gold-capped polymer nanocones display similar SERS sensitivity as silicon nanopillars, while being easily integrable into a microfluidic chips....

  3. High Diversity of Myocyanophage in Various Aquatic Environments Revealed by High-Throughput Sequencing of Major Capsid Protein Gene With a New Set of Primers

    Directory of Open Access Journals (Sweden)

    Weiguo Hou

    2018-05-01

    Full Text Available Myocyanophages, a group of viruses infecting cyanobacteria, are abundant and play important roles in elemental cycling. Here we investigated the particle-associated viral communities retained on 0.2 μm filters and in sediment samples (representing ancient cyanophage communities from four ocean and three lake locations, using high-throughput sequencing and a newly designed primer pair targeting a gene fragment (∼145-bp in length encoding the cyanophage gp23 major capsid protein (MCP. Diverse viral communities were detected in all samples. The fragments of 142-, 145-, and 148-bp in length were most abundant in the amplicons, and most sequences (>92% belonged to cyanophages. Additionally, different sequencing depths resulted in different diversity estimates of the viral community. Operational taxonomic units obtained from deep sequencing of the MCP gene covered the majority of those obtained from shallow sequencing, suggesting that deep sequencing exhibited a more complete picture of cyanophage community than shallow sequencing. Our results also revealed a wide geographic distribution of marine myocyanophages, i.e., higher dissimilarities of the myocyanophage communities corresponded with the larger distances between the sampling sites. Collectively, this study suggests that the newly designed primer pair can be effectively used to study the community and diversity of myocyanophage from different environments, and the high-throughput sequencing represents a good method to understand viral diversity.

  4. Laser desorption mass spectrometry for high-throughput DNA analysis and its applications

    Science.gov (United States)

    Chen, C. H. Winston; Golovlev, Valeri V.; Taranenko, N. I.; Allman, S. L.; Isola, Narayana R.; Potter, N. T.; Matteson, K. J.; Chang, Linus Y.

    1999-05-01

    Laser desorption mass spectrometry (LDMS) has been developed for DNA sequencing, disease diagnosis, and DNA fingerprinting for forensic applications. With LDMS, the speed of DNA analysis can be much faster than conventional gel electrophoresis. No dye or radioactive tagging to DNA segments for detection is needed. LDMS is emerging as a new alternative technology for DNA analysis.

  5. Abundance and diversity of bacterial nitrifiers and denitrifiers and their functional genes in tannery wastewater treatment plants revealed by high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Zhu Wang

    Full Text Available Biological nitrification/denitrification is frequently used to remove nitrogen from tannery wastewater containing high concentrations of ammonia. However, information is limited about the bacterial nitrifiers and denitrifiers and their functional genes in tannery wastewater treatment plants (WWTPs due to the low-throughput of the previously used methods. In this study, 454 pyrosequencing and Illumina high-throughput sequencing, combined with molecular methods, were used to comprehensively characterize structures and functions of nitrification and denitrification bacterial communities in aerobic and anaerobic sludge of two full-scale tannery WWTPs. Pyrosequencing of 16S rRNA genes showed that Proteobacteria and Synergistetes dominated in the aerobic and anaerobic sludge, respectively. Ammonia-oxidizing bacteria (AOB amoA gene cloning revealed that Nitrosomonas europaea dominated the ammonia-oxidizing community in the WWTPs. Metagenomic analysis showed that the denitrifiers mainly included the genera of Thauera, Paracoccus, Hyphomicrobium, Comamonas and Azoarcus, which may greatly contribute to the nitrogen removal in the two WWTPs. It is interesting that AOB and ammonia-oxidizing archaea had low abundance although both WWTPs demonstrated high ammonium removal efficiency. Good correlation between the qPCR and metagenomic analysis is observed for the quantification of functional genes amoA, nirK, nirS and nosZ, indicating that the metagenomic approach may be a promising method used to comprehensively investigate the abundance of functional genes of nitrifiers and denitrifiers in the environment.

  6. High-throughput analysis of endogenous fruit glycosyl hydrolases using a novel chromogenic hydrogel substrate assay

    DEFF Research Database (Denmark)

    Schückel, Julia; Kracun, Stjepan Kresimir; Lausen, Thomas Frederik

    2017-01-01

    A broad range of enzyme activities can be found in a wide range of different fruits and fruiting bodies but there is a lack of methods where many samples can be handled in a high-throughput and efficient manner. In particular, plant polysaccharide degrading enzymes – glycosyl hydrolases (GHs) play...... led to a more profound understanding of the importance of GH activity and regulation, current methods for determining glycosyl hydrolase activity are lacking in throughput and fail to keep up with data output from transcriptome research. Here we present the use of a versatile, easy...

  7. Gene Expression Analysis of Escherichia Coli Grown in Miniaturized Bioreactor Platforms for High-Throughput Analysis of Growth and genomic Data

    DEFF Research Database (Denmark)

    Boccazzi, P.; Zanzotto, A.; Szita, Nicolas

    2005-01-01

    Combining high-throughput growth physiology and global gene expression data analysis is of significant value for integrating metabolism and genomics. We compared global gene expression using 500 ng of total RNA from Escherichia coli cultures grown in rich or defined minimal media in a miniaturize...... cultures using just 500 ng of total RNA indicate that high-throughput integration of growth physiology and genomics will be possible with novel biochemical platforms and improved detection technologies....

  8. High throughput proteomic analysis of the secretome in an explant model of articular cartilage inflammation

    Science.gov (United States)

    Clutterbuck, Abigail L.; Smith, Julia R.; Allaway, David; Harris, Pat; Liddell, Susan; Mobasheri, Ali

    2011-01-01

    This study employed a targeted high-throughput proteomic approach to identify the major proteins present in the secretome of articular cartilage. Explants from equine metacarpophalangeal joints were incubated alone or with interleukin-1beta (IL-1β, 10 ng/ml), with or without carprofen, a non-steroidal anti-inflammatory drug, for six days. After tryptic digestion of culture medium supernatants, resulting peptides were separated by HPLC and detected in a Bruker amaZon ion trap instrument. The five most abundant peptides in each MS scan were fragmented and the fragmentation patterns compared to mammalian entries in the Swiss-Prot database, using the Mascot search engine. Tryptic peptides originating from aggrecan core protein, cartilage oligomeric matrix protein (COMP), fibronectin, fibromodulin, thrombospondin-1 (TSP-1), clusterin (CLU), cartilage intermediate layer protein-1 (CILP-1), chondroadherin (CHAD) and matrix metalloproteinases MMP-1 and MMP-3 were detected. Quantitative western blotting confirmed the presence of CILP-1, CLU, MMP-1, MMP-3 and TSP-1. Treatment with IL-1β increased MMP-1, MMP-3 and TSP-1 and decreased the CLU precursor but did not affect CILP-1 and CLU levels. Many of the proteins identified have well-established extracellular matrix functions and are involved in early repair/stress responses in cartilage. This high throughput approach may be used to study the changes that occur in the early stages of osteoarthritis. PMID:21354348

  9. Noninvasive High-Throughput Single-Cell Analysis of HIV Protease Activity Using Ratiometric Flow Cytometry

    Directory of Open Access Journals (Sweden)

    Rok Gaber

    2013-11-01

    Full Text Available To effectively fight against the human immunodeficiency virus infection/ acquired immunodeficiency syndrome (HIV/AIDS epidemic, ongoing development of novel HIV protease inhibitors is required. Inexpensive high-throughput screening assays are needed to quickly scan large sets of chemicals for potential inhibitors. We have developed a Förster resonance energy transfer (FRET-based, HIV protease-sensitive sensor using a combination of a fluorescent protein pair, namely mCerulean and mCitrine. Through extensive in vitro characterization, we show that the FRET-HIV sensor can be used in HIV protease screening assays. Furthermore, we have used the FRET-HIV sensor for intracellular quantitative detection of HIV protease activity in living cells, which more closely resembles an actual viral infection than an in vitro assay. We have developed a high-throughput method that employs a ratiometric flow cytometry for analyzing large populations of cells that express the FRET-HIV sensor. The method enables FRET measurement of single cells with high sensitivity and speed and should be used when subpopulation-specific intracellular activity of HIV protease needs to be estimated. In addition, we have used a confocal microscopy sensitized emission FRET technique to evaluate the usefulness of the FRET-HIV sensor for spatiotemporal detection of intracellular HIV protease activity.

  10. Noninvasive High-Throughput Single-Cell Analysis of HIV Protease Activity Using Ratiometric Flow Cytometry

    Science.gov (United States)

    Gaber, Rok; Majerle, Andreja; Jerala, Roman; Benčina, Mojca

    2013-01-01

    To effectively fight against the human immunodeficiency virus infection/acquired immunodeficiency syndrome (HIV/AIDS) epidemic, ongoing development of novel HIV protease inhibitors is required. Inexpensive high-throughput screening assays are needed to quickly scan large sets of chemicals for potential inhibitors. We have developed a Förster resonance energy transfer (FRET)-based, HIV protease-sensitive sensor using a combination of a fluorescent protein pair, namely mCerulean and mCitrine. Through extensive in vitro characterization, we show that the FRET-HIV sensor can be used in HIV protease screening assays. Furthermore, we have used the FRET-HIV sensor for intracellular quantitative detection of HIV protease activity in living cells, which more closely resembles an actual viral infection than an in vitro assay. We have developed a high-throughput method that employs a ratiometric flow cytometry for analyzing large populations of cells that express the FRET-HIV sensor. The method enables FRET measurement of single cells with high sensitivity and speed and should be used when subpopulation-specific intracellular activity of HIV protease needs to be estimated. In addition, we have used a confocal microscopy sensitized emission FRET technique to evaluate the usefulness of the FRET-HIV sensor for spatiotemporal detection of intracellular HIV protease activity. PMID:24287545

  11. A complementary role of multiparameter flow cytometry and high-throughput sequencing for minimal residual disease detection in chronic lymphocytic leukemia: an European Research Initiative on CLL study.

    LENUS (Irish Health Repository)

    Rawstron, A C

    2016-04-01

    In chronic lymphocytic leukemia (CLL) the level of minimal residual disease (MRD) after therapy is an independent predictor of outcome. Given the increasing number of new agents being explored for CLL therapy, using MRD as a surrogate could greatly reduce the time necessary to assess their efficacy. In this European Research Initiative on CLL (ERIC) project we have identified and validated a flow-cytometric approach to reliably quantitate CLL cells to the level of 0.0010% (10(-5)). The assay comprises a core panel of six markers (i.e. CD19, CD20, CD5, CD43, CD79b and CD81) with a component specification independent of instrument and reagents, which can be locally re-validated using normal peripheral blood. This method is directly comparable to previous ERIC-designed assays and also provides a backbone for investigation of new markers. A parallel analysis of high-throughput sequencing using the ClonoSEQ assay showed good concordance with flow cytometry results at the 0.010% (10(-4)) level, the MRD threshold defined in the 2008 International Workshop on CLL guidelines, but it also provides good linearity to a detection limit of 1 in a million (10(-6)). The combination of both technologies would permit a highly sensitive approach to MRD detection while providing a reproducible and broadly accessible method to quantify residual disease and optimize treatment in CLL.

  12. Accurate CpG and non-CpG cytosine methylation analysis by high-throughput locus-specific pyrosequencing in plants.

    Science.gov (United States)

    How-Kit, Alexandre; Daunay, Antoine; Mazaleyrat, Nicolas; Busato, Florence; Daviaud, Christian; Teyssier, Emeline; Deleuze, Jean-François; Gallusci, Philippe; Tost, Jörg

    2015-07-01

    Pyrosequencing permits accurate quantification of DNA methylation of specific regions where the proportions of the C/T polymorphism induced by sodium bisulfite treatment of DNA reflects the DNA methylation level. The commercially available high-throughput locus-specific pyrosequencing instruments allow for the simultaneous analysis of 96 samples, but restrict the DNA methylation analysis to CpG dinucleotide sites, which can be limiting in many biological systems. In contrast to mammals where DNA methylation occurs nearly exclusively on CpG dinucleotides, plants genomes harbor DNA methylation also in other sequence contexts including CHG and CHH motives, which cannot be evaluated by these pyrosequencing instruments due to software limitations. Here, we present a complete pipeline for accurate CpG and non-CpG cytosine methylation analysis at single base-resolution using high-throughput locus-specific pyrosequencing. The devised approach includes the design and validation of PCR amplification on bisulfite-treated DNA and pyrosequencing assays as well as the quantification of the methylation level at every cytosine from the raw peak intensities of the Pyrograms by two newly developed Visual Basic Applications. Our method presents accurate and reproducible results as exemplified by the cytosine methylation analysis of the promoter regions of two Tomato genes (NOR and CNR) encoding transcription regulators of fruit ripening during different stages of fruit development. Our results confirmed a significant and temporally coordinated loss of DNA methylation on specific cytosines during the early stages of fruit development in both promoters as previously shown by WGBS. The manuscript describes thus the first high-throughput locus-specific DNA methylation analysis in plants using pyrosequencing.

  13. Characterization of Intestinal Microbiomes of Hirschsprung's Disease Patients with or without Enterocolitis Using Illumina-MiSeq High-Throughput Sequencing.

    Directory of Open Access Journals (Sweden)

    Yuqing Li

    Full Text Available Hirschsprung-associated enterocolitis (HAEC is a life-threatening complication of Hirschsprung's disease (HD. Although the pathological mechanisms are still unclear, studies have shown that HAEC has a close relationship with the disturbance of intestinal microbiota. This study aimed to investigate the characteristics of the intestinal microbiome of HD patients with or without enterocolitis. During routine or emergency surgery, we collected 35 intestinal content samples from five patients with HAEC and eight HD patients, including three HD patients with a history of enterocolitis who were in a HAEC remission (HAEC-R phase. Using Illumina-MiSeq high-throughput sequencing, we sequenced the V4 region of bacterial 16S rRNA, and operational taxonomic units (OTUs were defined by 97% sequence similarity. Principal coordinate analysis (PCoA of weighted UniFrac distances was performed to evaluate the diversity of each intestinal microbiome sample. The microbiota differed significantly between the HD patients (characterized by the prevalence of Bacteroidetes and HAEC patients (characterized by the prevalence of Proteobacteria, while the microbiota of the HAEC-R patients was more similar to that of the HAEC patients. We also observed that the specimens from different intestinal sites of each HD patient differed significantly, while the specimens from different intestinal sites of each HAEC and HAEC-R patient were more similar. In conclusion, the microbiome pattern of the HAEC-R patients was more similar to that of the HAEC patients than to that of the HD patients. The HD patients had a relatively distinct, more stable community than the HAEC and HAEC-R patients, suggesting that enterocolitis may either be caused by or result in a disruption of the patient's uniquely adapted intestinal flora. The intestinal microbiota associated with enterocolitis may persist following symptom resolution and can be implicated in the symptom recurrence.

  14. The use of high-throughput sequencing to investigate an outbreak of glycopeptide-resistant Enterococcus faecium with a novel quinupristin-dalfopristin resistance mechanism.

    Science.gov (United States)

    Shaw, Timothy D; Fairley, D J; Schneiders, T; Pathiraja, M; Hill, R L R; Werner, G; Elborn, J S; McMullan, R

    2018-02-24

    High-throughput sequencing (HTS) has successfully identified novel resistance genes in enterococci and determined clonal relatedness in outbreak analysis. We report the use of HTS to investigate two concurrent outbreaks of glycopeptide-resistant Enterococcus faecium (GRE) with an uncharacterised resistance mechanism to quinupristin-dalfopristin (QD). Seven QD-resistant and five QD-susceptible GRE isolates from a two-centre outbreak were studied. HTS was performed to identify genes or predicted proteins that were associated with the QD-resistant phenotype. MLST and SNP typing on HTS data was used to determine clonal relatedness. Comparative genomic analysis confirmed this GRE outbreak involved two distinct clones (ST80 and ST192). HTS confirmed the absence of known QD resistance genes, suggesting a novel mechanism was conferring resistance. Genomic analysis identified two significant genetic determinants with explanatory power for the high level of QD resistance in the ST80 QD-resistant clone: an additional 56aa leader sequence at the N-terminus of the lsaE gene and a transposon containing seven genes encoding proteins with possible drug or drug-target modification activities. However, HTS was unable to conclusively determine the QD resistance mechanism and did not reveal any genetic basis for QD resistance in the ST192 clone. This study highlights the usefulness of HTS in deciphering the degree of relatedness in two concurrent GRE outbreaks. Although HTS was able to reveal some genetic candidates for uncharacterised QD resistance, this study demonstrates the limitations of HTS as a tool for identifying putative determinants of resistance to QD.

  15. web cellHTS2: A web-application for the analysis of high-throughput screening data

    Directory of Open Access Journals (Sweden)

    Boutros Michael

    2010-04-01

    Full Text Available Abstract Background The analysis of high-throughput screening data sets is an expanding field in bioinformatics. High-throughput screens by RNAi generate large primary data sets which need to be analyzed and annotated to identify relevant phenotypic hits. Large-scale RNAi screens are frequently used to identify novel factors that influence a broad range of cellular processes, including signaling pathway activity, cell proliferation, and host cell infection. Here, we present a web-based application utility for the end-to-end analysis of large cell-based screening experiments by cellHTS2. Results The software guides the user through the configuration steps that are required for the analysis of single or multi-channel experiments. The web-application provides options for various standardization and normalization methods, annotation of data sets and a comprehensive HTML report of the screening data analysis, including a ranked hit list. Sessions can be saved and restored for later re-analysis. The web frontend for the cellHTS2 R/Bioconductor package interacts with it through an R-server implementation that enables highly parallel analysis of screening data sets. web cellHTS2 further provides a file import and configuration module for common file formats. Conclusions The implemented web-application facilitates the analysis of high-throughput data sets and provides a user-friendly interface. web cellHTS2 is accessible online at http://web-cellHTS2.dkfz.de. A standalone version as a virtual appliance and source code for platforms supporting Java 1.5.0 can be downloaded from the web cellHTS2 page. web cellHTS2 is freely distributed under GPL.

  16. 3D-SURFER: software for high-throughput protein surface comparison and analysis.

    Science.gov (United States)

    La, David; Esquivel-Rodríguez, Juan; Venkatraman, Vishwesh; Li, Bin; Sael, Lee; Ueng, Stephen; Ahrendt, Steven; Kihara, Daisuke

    2009-11-01

    We present 3D-SURFER, a web-based tool designed to facilitate high-throughput comparison and characterization of proteins based on their surface shape. As each protein is effectively represented by a vector of 3D Zernike descriptors, comparison times for a query protein against the entire PDB take, on an average, only a couple of seconds. The web interface has been designed to be as interactive as possible with displays showing animated protein rotations, CATH codes and structural alignments using the CE program. In addition, geometrically interesting local features of the protein surface, such as pockets that often correspond to ligand binding sites as well as protrusions and flat regions can also be identified and visualized. 3D-SURFER is a web application that can be freely accessed from: http://dragon.bio.purdue.edu/3d-surfer dkihara@purdue.edu Supplementary data are available at Bioinformatics online.

  17. High-throughput liquid chromatography for drug analysis in biological fluids: investigation of extraction column life.

    Science.gov (United States)

    Zeng, Wei; Fisher, Alison L; Musson, Donald G; Wang, Amy Qiu

    2004-07-05

    A novel method was developed and assessed to extend the lifetime of extraction columns of high-throughput liquid chromatography (HTLC) for bioanalysis of human plasma samples. In this method, a 15% acetic acid solution and 90% THF were respectively used as mobile phases to clean up the proteins in human plasma samples and residual lipids from the extraction and analytical columns. The 15% acetic acid solution weakens the interactions between proteins and the stationary phase of the extraction column and increases the protein solubility in the mobile phase. The 90% THF mobile phase prevents the accumulation of lipids and thus reduces the potential damage on the columns. Using this novel method, the extraction column lifetime has been extended to about 2000 direct plasma injections, and this is the first time that high concentration acetic acid and THF are used in HTLC for on-line cleanup and extraction column lifetime extension.

  18. High Throughput Facility

    Data.gov (United States)

    Federal Laboratory Consortium — Argonne?s high throughput facility provides highly automated and parallel approaches to material and materials chemistry development. The facility allows scientists...

  19. Evaluation of a transposase protocol for rapid generation of shotgun high-throughput sequencing libraries from nanogram quantities of DNA.

    Science.gov (United States)

    Marine, Rachel; Polson, Shawn W; Ravel, Jacques; Hatfull, Graham; Russell, Daniel; Sullivan, Matthew; Syed, Fraz; Dumas, Michael; Wommack, K Eric

    2011-11-01

    Construction of DNA fragment libraries for next-generation sequencing can prove challenging, especially for samples with low DNA yield. Protocols devised to circumvent the problems associated with low starting quantities of DNA can result in amplification biases that skew the distribution of genomes in metagenomic data. Moreover, sample throughput can be slow, as current library construction techniques are time-consuming. This study evaluated Nextera, a new transposon-based method that is designed for quick production of DNA fragment libraries from a small quantity of DNA. The sequence read distribution across nine phage genomes in a mock viral assemblage met predictions for six of the least-abundant phages; however, the rank order of the most abundant phages differed slightly from predictions. De novo genome assemblies from Nextera libraries provided long contigs spanning over half of the phage genome; in four cases where full-length genome sequences were available for comparison, consensus sequences were found to match over 99% of the genome with near-perfect identity. Analysis of areas of low and high sequence coverage within phage genomes indicated that GC content may influence coverage of sequences from Nextera libraries. Comparisons of phage genomes prepared using both Nextera and a standard 454 FLX Titanium library preparation protocol suggested that the coverage biases according to GC content observed within the Nextera libraries were largely attributable to bias in the Nextera protocol rather than to the 454 sequencing technology. Nevertheless, given suitable sequence coverage, the Nextera protocol produced high-quality data for genomic studies. For metagenomics analyses, effects of GC amplification bias would need to be considered; however, the library preparation standardization that Nextera provides should benefit comparative metagenomic analyses.

  20. The use of FTA cards for preserving unfixed cytological material for high-throughput molecular analysis.

    Science.gov (United States)

    Saieg, Mauro Ajaj; Geddie, William R; Boerner, Scott L; Liu, Ni; Tsao, Ming; Zhang, Tong; Kamel-Reid, Suzanne; da Cunha Santos, Gilda

    2012-06-25

    Novel high-throughput molecular technologies have made the collection and storage of cells and small tissue specimens a critical issue. The FTA card provides an alternative to cryopreservation for biobanking fresh unfixed cells. The current study compared the quality and integrity of the DNA obtained from 2 types of FTA cards (Classic and Elute) using 2 different extraction protocols ("Classic" and "Elute") and assessed the feasibility of performing multiplex mutational screening using fine-needle aspiration (FNA) biopsy samples. Residual material from 42 FNA biopsies was collected in the cards (21 Classic and 21 Elute cards). DNA was extracted using the Classic protocol for Classic cards and both protocols for Elute cards. Polymerase chain reaction for p53 (1.5 kilobase) and CARD11 (500 base pair) was performed to assess DNA integrity. Successful p53 amplification was achieved in 95.2% of the samples from the Classic cards and in 80.9% of the samples from the Elute cards using the Classic protocol and 28.5% using the Elute protocol (P = .001). All samples (both cards) could be amplified for CARD11. There was no significant difference in the DNA concentration or 260/280 purity ratio when the 2 types of cards were compared. Five samples were also successfully analyzed by multiplex MassARRAY spectrometry, with a mutation in KRAS found in 1 case. High molecular weight DNA was extracted from the cards in sufficient amounts and quality to perform high-throughput multiplex mutation assays. The results of the current study also suggest that FTA Classic cards preserve better DNA integrity for molecular applications compared with the FTA Elute cards. Copyright © 2012 American Cancer Society.

  1. High-Throughput Method for Strontium Isotope Analysis by Multi-Collector-Inductively Coupled Plasma-Mass Spectrometer

    Energy Technology Data Exchange (ETDEWEB)

    Wall, Andrew J. [National Energy Technology Lab. (NETL), Pittsburgh, PA, (United States); Capo, Rosemary C. [Univ. of Pittsburgh, PA (United States); Stewart, Brian W. [Univ. of Pittsburgh, PA (United States); Phan, Thai T. [Univ. of Pittsburgh, PA (United States); Jain, Jinesh C. [National Energy Technology Lab. (NETL), Pittsburgh, PA, (United States); Hakala, Alexandra [National Energy Technology Lab. (NETL), Pittsburgh, PA, (United States); Guthrie, George D. [National Energy Technology Lab. (NETL), Pittsburgh, PA, (United States)

    2016-09-22

    This technical report presents the details of the Sr column configuration and the high-throughput Sr separation protocol. Data showing the performance of the method as well as the best practices for optimizing Sr isotope analysis by MC-ICP-MS is presented. Lastly, this report offers tools for data handling and data reduction of Sr isotope results from the Thermo Scientific Neptune software to assist in data quality assurance, which help avoid issues of data glut associated with high sample throughput rapid analysis.

  2. High-Throughput Method for Strontium Isotope Analysis by Multi-Collector-Inductively Coupled Plasma-Mass Spectrometer

    Energy Technology Data Exchange (ETDEWEB)

    Hakala, Jacqueline Alexandra [National Energy Technology Lab. (NETL), Morgantown, WV (United States)

    2016-11-22

    This technical report presents the details of the Sr column configuration and the high-throughput Sr separation protocol. Data showing the performance of the method as well as the best practices for optimizing Sr isotope analysis by MC-ICP-MS is presented. Lastly, this report offers tools for data handling and data reduction of Sr isotope results from the Thermo Scientific Neptune software to assist in data quality assurance, which help avoid issues of data glut associated with high sample throughput rapid analysis.

  3. elegantRingAnalysis An Interface for High-Throughput Analysis of Storage Ring Lattices Using elegant

    CERN Document Server

    Borland, Michael

    2005-01-01

    The code {\\tt elegant} is widely used for simulation of linacs for drivers for free-electron lasers. Less well known is that elegant is also a very capable code for simulation of storage rings. In this paper, we show a newly-developed graphical user interface that allows the user to easily take advantage of these capabilities. The interface is designed for use on a Linux cluster, providing very high throughput. It can also be used on a single computer. Among the features it gives access to are basic calculations (Twiss parameters, radiation integrals), phase-space tracking, nonlinear dispersion, dynamic aperture (on- and off-momentum), frequency map analysis, and collective effects (IBS, bunch-lengthening). Using a cluster, it is easy to get highly detailed dynamic aperture and frequency map results in a surprisingly short time.

  4. Cell surface profiling using high-throughput flow cytometry: a platform for biomarker discovery and analysis of cellular heterogeneity.

    Directory of Open Access Journals (Sweden)

    Craig A Gedye

    Full Text Available Cell surface proteins have a wide range of biological functions, and are often used as lineage-specific markers. Antibodies that recognize cell surface antigens are widely used as research tools, diagnostic markers, and even therapeutic agents. The ability to obtain broad cell surface protein profiles would thus be of great value in a wide range of fields. There are however currently few available methods for high-throughput analysis of large numbers of cell surface proteins. We describe here a high-throughput flow cytometry (HT-FC platform for rapid analysis of 363 cell surface antigens. Here we demonstrate that HT-FC provides reproducible results, and use the platform to identify cell surface antigens that are influenced by common cell preparation methods. We show that multiple populations within complex samples such as primary tumors can be simultaneously analyzed by co-staining of cells with lineage-specific antibodies, allowing unprecedented depth of analysis of heterogeneous cell populations. Furthermore, standard informatics methods can be used to visualize, cluster and downsample HT-FC data to reveal novel signatures and biomarkers. We show that the cell surface profile provides sufficient molecular information to classify samples from different cancers and tissue types into biologically relevant clusters using unsupervised hierarchical clustering. Finally, we describe the identification of a candidate lineage marker and its subsequent validation. In summary, HT-FC combines the advantages of a high-throughput screen with a detection method that is sensitive, quantitative, highly reproducible, and allows in-depth analysis of heterogeneous samples. The use of commercially available antibodies means that high quality reagents are immediately available for follow-up studies. HT-FC has a wide range of applications, including biomarker discovery, molecular classification of cancers, or identification of novel lineage specific or stem cell

  5. Cell surface profiling using high-throughput flow cytometry: a platform for biomarker discovery and analysis of cellular heterogeneity.

    Science.gov (United States)

    Gedye, Craig A; Hussain, Ali; Paterson, Joshua; Smrke, Alannah; Saini, Harleen; Sirskyj, Danylo; Pereira, Keira; Lobo, Nazleen; Stewart, Jocelyn; Go, Christopher; Ho, Jenny; Medrano, Mauricio; Hyatt, Elzbieta; Yuan, Julie; Lauriault, Stevan; Meyer, Mona; Kondratyev, Maria; van den Beucken, Twan; Jewett, Michael; Dirks, Peter; Guidos, Cynthia J; Danska, Jayne; Wang, Jean; Wouters, Bradly; Neel, Benjamin; Rottapel, Robert; Ailles, Laurie E

    2014-01-01

    Cell surface proteins have a wide range of biological functions, and are often used as lineage-specific markers. Antibodies that recognize cell surface antigens are widely used as research tools, diagnostic markers, and even therapeutic agents. The ability to obtain broad cell surface protein profiles would thus be of great value in a wide range of fields. There are however currently few available methods for high-throughput analysis of large numbers of cell surface proteins. We describe here a high-throughput flow cytometry (HT-FC) platform for rapid analysis of 363 cell surface antigens. Here we demonstrate that HT-FC provides reproducible results, and use the platform to identify cell surface antigens that are influenced by common cell preparation methods. We show that multiple populations within complex samples such as primary tumors can be simultaneously analyzed by co-staining of cells with lineage-specific antibodies, allowing unprecedented depth of analysis of heterogeneous cell populations. Furthermore, standard informatics methods can be used to visualize, cluster and downsample HT-FC data to reveal novel signatures and biomarkers. We show that the cell surface profile provides sufficient molecular information to classify samples from different cancers and tissue types into biologically relevant clusters using unsupervised hierarchical clustering. Finally, we describe the identification of a candidate lineage marker and its subsequent validation. In summary, HT-FC combines the advantages of a high-throughput screen with a detection method that is sensitive, quantitative, highly reproducible, and allows in-depth analysis of heterogeneous samples. The use of commercially available antibodies means that high quality reagents are immediately available for follow-up studies. HT-FC has a wide range of applications, including biomarker discovery, molecular classification of cancers, or identification of novel lineage specific or stem cell markers.

  6. Label-free cell-cycle analysis by high-throughput quantitative phase time-stretch imaging flow cytometry

    Science.gov (United States)

    Mok, Aaron T. Y.; Lee, Kelvin C. M.; Wong, Kenneth K. Y.; Tsia, Kevin K.

    2018-02-01

    Biophysical properties of cells could complement and correlate biochemical markers to characterize a multitude of cellular states. Changes in cell size, dry mass and subcellular morphology, for instance, are relevant to cell-cycle progression which is prevalently evaluated by DNA-targeted fluorescence measurements. Quantitative-phase microscopy (QPM) is among the effective biophysical phenotyping tools that can quantify cell sizes and sub-cellular dry mass density distribution of single cells at high spatial resolution. However, limited camera frame rate and thus imaging throughput makes QPM incompatible with high-throughput flow cytometry - a gold standard in multiparametric cell-based assay. Here we present a high-throughput approach for label-free analysis of cell cycle based on quantitative-phase time-stretch imaging flow cytometry at a throughput of > 10,000 cells/s. Our time-stretch QPM system enables sub-cellular resolution even at high speed, allowing us to extract a multitude (at least 24) of single-cell biophysical phenotypes (from both amplitude and phase images). Those phenotypes can be combined to track cell-cycle progression based on a t-distributed stochastic neighbor embedding (t-SNE) algorithm. Using multivariate analysis of variance (MANOVA) discriminant analysis, cell-cycle phases can also be predicted label-free with high accuracy at >90% in G1 and G2 phase, and >80% in S phase. We anticipate that high throughput label-free cell cycle characterization could open new approaches for large-scale single-cell analysis, bringing new mechanistic insights into complex biological processes including diseases pathogenesis.

  7. Gene discovery and molecular marker development, based on high-throughput transcript sequencing of Paspalum dilatatum Poir.

    Directory of Open Access Journals (Sweden)

    Andrea Giordano

    Full Text Available BACKGROUND: Paspalum dilatatum Poir. (common name dallisgrass is a native grass species of South America, with special relevance to dairy and red meat production. P. dilatatum exhibits higher forage quality than other C4 forage grasses and is tolerant to frost and water stress. This species is predominantly cultivated in an apomictic monoculture, with an inherent high risk that biotic and abiotic stresses could potentially devastate productivity. Therefore, advanced breeding strategies that characterise and use available genetic diversity, or assess germplasm collections effectively are required to deliver advanced cultivars for production systems. However, there are limited genomic resources available for this forage grass species. RESULTS: Transcriptome sequencing using second-generation sequencing platforms has been employed using pooled RNA from different tissues (stems, roots, leaves and inflorescences at the final reproductive stage of P. dilatatum cultivar Primo. A total of 324,695 sequence reads were obtained, corresponding to c. 102 Mbp. The sequences were assembled, generating 20,169 contigs of a combined length of 9,336,138 nucleotides. The contigs were BLAST analysed against the fully sequenced grass species of Oryza sativa subsp. japonica, Brachypodium distachyon, the closely related Sorghum bicolor and foxtail millet (Setaria italica genomes as well as against the UniRef 90 protein database allowing a comprehensive gene ontology analysis to be performed. The contigs generated from the transcript sequencing were also analysed for the presence of simple sequence repeats (SSRs. A total of 2,339 SSR motifs were identified within 1,989 contigs and corresponding primer pairs were designed. Empirical validation of a cohort of 96 SSRs was performed, with 34% being polymorphic between sexual and apomictic biotypes. CONCLUSIONS: The development of genetic and genomic resources for P. dilatatum will contribute to gene discovery and expression

  8. Who's for dinner? High-throughput sequencing reveals bat dietary differentiation in a biodiversity hotspot where prey taxonomy is largely undescribed.

    Science.gov (United States)

    Burgar, Joanna M; Murray, Daithi C; Craig, Michael D; Haile, James; Houston, Jayne; Stokes, Vicki; Bunce, Michael

    2014-08-01

    Effective management and conservation of biodiversity requires understanding of predator-prey relationships to ensure the continued existence of both predator and prey populations. Gathering dietary data from predatory species, such as insectivorous bats, often presents logistical challenges, further exacerbated in biodiversity hot spots because prey items are highly speciose, yet their taxonomy is largely undescribed. We used high-throughput sequencing (HTS) and bioinformatic analyses to phylogenetically group DNA sequences into molecular operational taxonomic units (MOTUs) to examine predator-prey dynamics of three sympatric insectivorous bat species in the biodiversity hotspot of south-western Australia. We could only assign between 4% and 20% of MOTUs to known genera or species, depending on the method used, underscoring the importance of examining dietary diversity irrespective of taxonomic knowledge in areas lacking a comprehensive genetic reference database. MOTU analysis confirmed that resource partitioning occurred, with dietary divergence positively related to the ecomorphological divergence of the three bat species. We predicted that bat species' diets would converge during times of high energetic requirements, that is, the maternity season for females and the mating season for males. There was an interactive effect of season on female, but not male, bat species' diets, although small sample sizes may have limited our findings. Contrary to our predictions, females of two ecomorphologically similar species showed dietary convergence during the mating season rather than the maternity season. HTS-based approaches can help elucidate complex predator-prey relationships in highly speciose regions, which should facilitate the conservation of biodiversity in genetically uncharacterized areas, such as biodiversity hotspots. © 2013 John Wiley & Sons Ltd.

  9. Capturing the Alternative Cleavage and Polyadenylation Sites of 14 NAC Genes in Populus Using a Combination of 3′-RACE and High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Haoran Wang

    2018-03-01

    Full Text Available Detection of complex splice sites (SSs and polyadenylation sites (PASs of eukaryotic genes is essential for the elucidation of gene regulatory mechanisms. Transcriptome-wide studies using high-throughput sequencing (HTS have revealed prevalent alternative splicing (AS and alternative polyadenylation (APA in plants. However, small-scale and high-depth HTS aimed at detecting genes or gene families are very few and limited. We explored a convenient and flexible method for profiling SSs and PASs, which combines rapid amplification of 3′-cDNA ends (3′-RACE and HTS. Fourteen NAC (NAM, ATAF1/2, CUC2 transcription factor genes of Populus trichocarpa were analyzed by 3′-RACE-seq. Based on experimental reproducibility, boundary sequence analysis and reverse transcription PCR (RT-PCR verification, only canonical SSs were considered to be authentic. Based on stringent criteria, candidate PASs without any internal priming features were chosen as authentic PASs and assumed to be PAS-rich markers. Thirty-four novel canonical SSs, six intronic/internal exons and thirty 3′-UTR PAS-rich markers were revealed by 3′-RACE-seq. Using 3′-RACE and real-time PCR, we confirmed that three APA transcripts ending in/around PAS-rich markers were differentially regulated in response to plant hormones. Our results indicate that 3′-RACE-seq is a robust and cost-effective method to discover SSs and label active regions subjected to APA for genes or gene families. The method is suitable for small-scale AS and APA research in the initial stage.

  10. DAVID Knowledgebase: a gene-centered database integrating heterogeneous gene annotation resources to facilitate high-throughput gene functional analysis

    Directory of Open Access Journals (Sweden)

    Baseler Michael W

    2007-11-01

    Full Text Available Abstract Background Due to the complex and distributed nature of biological research, our current biological knowledge is spread over many redundant annotation databases maintained by many independent groups. Analysts usually need to visit many of these bioinformatics databases in order to integrate comprehensive annotation information for their genes, which becomes one of the bottlenecks, particularly for the analytic task associated with a large gene list. Thus, a highly centralized and ready-to-use gene-annotation knowledgebase is in demand for high throughput gene functional analysis. Description The DAVID Knowledgebase is built around the DAVID Gene Concept, a single-linkage method to agglomerate tens of millions of gene/protein identifiers from a variety of public genomic resources into DAVID gene clusters. The grouping of such identifiers improves the cross-reference capability, particularly across NCBI and UniProt systems, enabling more than 40 publicly available functional annotation sources to be comprehensively integrated and centralized by the DAVID gene clusters. The simple, pair-wise, text format files which make up the DAVID Knowledgebase are freely downloadable for various data analysis uses. In addition, a well organized web interface allows users to query different types of heterogeneous annotations in a high-throughput manner. Conclusion The DAVID Knowledgebase is designed to facilitate high throughput gene functional analysis. For a given gene list, it not only provides the quick accessibility to a wide range of heterogeneous annotation data in a centralized location, but also enriches the level of biological information for an individual gene. Moreover, the entire DAVID Knowledgebase is freely downloadable or searchable at http://david.abcc.ncifcrf.gov/knowledgebase/.

  11. Quantitative digital image analysis of chromogenic assays for high throughput screening of alpha-amylase mutant libraries.

    Science.gov (United States)

    Shankar, Manoharan; Priyadharshini, Ramachandran; Gunasekaran, Paramasamy

    2009-08-01

    An image analysis-based method for high throughput screening of an alpha-amylase mutant library using chromogenic assays was developed. Assays were performed in microplates and high resolution images of the assay plates were read using the Virtual Microplate Reader (VMR) script to quantify the concentration of the chromogen. This method is fast and sensitive in quantifying 0.025-0.3 mg starch/ml as well as 0.05-0.75 mg glucose/ml. It was also an effective screening method for improved alpha-amylase activity with a coefficient of variance of 18%.

  12. High Throughput In vivo Analysis of Plant Leaf Chemical Properties Using Hyperspectral Imaging

    Directory of Open Access Journals (Sweden)

    Piyush Pandey

    2017-08-01

    Full Text Available Image-based high-throughput plant phenotyping in greenhouse has the potential to relieve the bottleneck currently presented by phenotypic scoring which limits the throughput of gene discovery and crop improvement efforts. Numerous studies have employed automated RGB imaging to characterize biomass and growth of agronomically important crops. The objective of this study was to investigate the utility of hyperspectral imaging for quantifying chemical properties of maize and soybean plants in vivo. These properties included leaf water content, as well as concentrations of macronutrients nitrogen (N, phosphorus (P, potassium (K, magnesium (Mg, calcium (Ca, and sulfur (S, and micronutrients sodium (Na, iron (Fe, manganese (Mn, boron (B, copper (Cu, and zinc (Zn. Hyperspectral images were collected from 60 maize and 60 soybean plants, each subjected to varying levels of either water deficit or nutrient limitation stress with the goal of creating a wide range of variation in the chemical properties of plant leaves. Plants were imaged on an automated conveyor belt system using a hyperspectral imager with a spectral range from 550 to 1,700 nm. Images were processed to extract reflectance spectrum from each plant and partial least squares regression models were developed to correlate spectral data with chemical data. Among all the chemical properties investigated, water content was predicted with the highest accuracy [R2 = 0.93 and RPD (Ratio of Performance to Deviation = 3.8]. All macronutrients were also quantified satisfactorily (R2 from 0.69 to 0.92, RPD from 1.62 to 3.62, with N predicted best followed by P, K, and S. The micronutrients group showed lower prediction accuracy (R2 from 0.19 to 0.86, RPD from 1.09 to 2.69 than the macronutrient groups. Cu and Zn were best predicted, followed by Fe and Mn. Na and B were the only two properties that hyperspectral imaging was not able to quantify satisfactorily (R2 < 0.3 and RPD < 1.2. This study suggested

  13. A high-throughput screening system for barley/powdery mildew interactions based on automated analysis of light micrographs.

    Science.gov (United States)

    Ihlow, Alexander; Schweizer, Patrick; Seiffert, Udo

    2008-01-23

    To find candidate genes that potentially influence the susceptibility or resistance of crop plants to powdery mildew fungi, an assay system based on transient-induced gene silencing (TIGS) as well as transient over-expression in single epidermal cells of barley has been developed. However, this system relies on quantitative microscopic analysis of the barley/powdery mildew interaction and will only become a high-throughput tool of phenomics upon automation of the most time-consuming steps. We have developed a high-throughput screening system based on a motorized microscope which evaluates the specimens fully automatically. A large-scale double-blind verification of the system showed an excellent agreement of manual and automated analysis and proved the system to work dependably. Furthermore, in a series of bombardment experiments an RNAi construct targeting the Mlo gene was included, which is expected to phenocopy resistance mediated by recessive loss-of-function alleles such as mlo5. In most cases, the automated analysis system recorded a shift towards resistance upon RNAi of Mlo, thus providing proof of concept for its usefulness in detecting gene-target effects. Besides saving labor and enabling a screening of thousands of candidate genes, this system offers continuous operation of expensive laboratory equipment and provides a less subjective analysis as well as a complete and enduring documentation of the experimental raw data in terms of digital images. In general, it proves the concept of enabling available microscope hardware to handle challenging screening tasks fully automatically.

  14. Analysis of high-throughput biological data using their rank values.

    Science.gov (United States)

    Dembélé, Doulaye

    2018-01-01

    High-throughput biological technologies are routinely used to generate gene expression profiling or cytogenetics data. To achieve high performance, methods available in the literature become more specialized and often require high computational resources. Here, we propose a new versatile method based on the data-ordering rank values. We use linear algebra, the Perron-Frobenius theorem and also extend a method presented earlier for searching differentially expressed genes for the detection of recurrent copy number aberration. A result derived from the proposed method is a one-sample Student's t-test based on rank values. The proposed method is to our knowledge the only that applies to gene expression profiling and to cytogenetics data sets. This new method is fast, deterministic, and requires a low computational load. Probabilities are associated with genes to allow a statistically significant subset selection in the data set. Stability scores are also introduced as quality parameters. The performance and comparative analyses were carried out using real data sets. The proposed method can be accessed through an R package available from the CRAN (Comprehensive R Archive Network) website: https://cran.r-project.org/web/packages/fcros .

  15. High-throughput DNA methylation analysis in anorexia nervosa confirms TNXB hypermethylation.

    Science.gov (United States)

    Kesselmeier, Miriam; Pütter, Carolin; Volckmar, Anna-Lena; Baurecht, Hansjörg; Grallert, Harald; Illig, Thomas; Ismail, Khadeeja; Ollikainen, Miina; Silén, Yasmina; Keski-Rahkonen, Anna; Bulik, Cynthia M; Collier, David A; Zeggini, Eleftheria; Hebebrand, Johannes; Scherag, André; Hinney, Anke

    2018-04-01

    Patients with anorexia nervosa (AN) are ideally suited to identify differentially methylated genes in response to starvation. We examined high-throughput DNA methylation derived from whole blood of 47 females with AN, 47 lean females without AN and 100 population-based females to compare AN with both controls. To account for different cell type compositions, we applied two reference-free methods (FastLMM-EWASher, RefFreeEWAS) and searched for consensus CpG sites identified by both methods. We used a validation sample of five monozygotic AN-discordant twin pairs. Fifty-one consensus sites were identified in AN vs. lean and 81 in AN vs. population-based comparisons. These sites have not been reported in AN methylation analyses, but for the latter comparison 54/81 sites showed directionally consistent differential methylation effects in the AN-discordant twins. For a single nucleotide polymorphism rs923768 in CSGALNACT1 a nearby site was nominally associated with AN. At the gene level, we confirmed hypermethylated sites at TNXB. We found support for a locus at NR1H3 in the AN vs. lean control comparison, but the methylation direction was opposite to the one previously reported. We confirm genes like TNXB previously described to comprise differentially methylated sites, and highlight further sites that might be specifically involved in AN starvation processes.

  16. High-throughput STR analysis for DNA database using direct PCR.

    Science.gov (United States)

    Sim, Jeong Eun; Park, Su Jeong; Lee, Han Chul; Kim, Se-Yong; Kim, Jong Yeol; Lee, Seung Hwan

    2013-07-01

    Since the Korean criminal DNA database was launched in 2010, we have focused on establishing an automated DNA database profiling system that analyzes short tandem repeat loci in a high-throughput and cost-effective manner. We established a DNA database profiling system without DNA purification using a direct PCR buffer system. The quality of direct PCR procedures was compared with that of conventional PCR system under their respective optimized conditions. The results revealed not only perfect concordance but also an excellent PCR success rate, good electropherogram quality, and an optimal intra/inter-loci peak height ratio. In particular, the proportion of DNA extraction required due to direct PCR failure could be minimized to <3%. In conclusion, the newly developed direct PCR system can be adopted for automated DNA database profiling systems to replace or supplement conventional PCR system in a time- and cost-saving manner. © 2013 American Academy of Forensic Sciences Published 2013. This article is a U.S. Government work and is in the public domain in the U.S.A.

  17. High-throughput molecular analysis in lung cancer: insights into biology and potential clinical applications.

    Science.gov (United States)

    Ocak, S; Sos, M L; Thomas, R K; Massion, P P

    2009-08-01

    During the last decade, high-throughput technologies including genomic, epigenomic, transcriptomic and proteomic have been applied to further our understanding of the molecular pathogenesis of this heterogeneous disease, and to develop strategies that aim to improve the management of patients with lung cancer. Ultimately, these approaches should lead to sensitive, specific and noninvasive methods for early diagnosis, and facilitate the prediction of response to therapy and outcome, as well as the identification of potential novel therapeutic targets. Genomic studies were the first to move this field forward by providing novel insights into the molecular biology of lung cancer and by generating candidate biomarkers of disease progression. Lung carcinogenesis is driven by genetic and epigenetic alterations that cause aberrant gene function; however, the challenge remains to pinpoint the key regulatory control mechanisms and to distinguish driver from passenger alterations that may have a small but additive effect on cancer development. Epigenetic regulation by DNA methylation and histone modifications modulate chromatin structure and, in turn, either activate or silence gene expression. Proteomic approaches critically complement these molecular studies, as the phenotype of a cancer cell is determined by proteins and cannot be predicted by genomics or transcriptomics alone. The present article focuses on the technological platforms available and some proposed clinical applications. We illustrate herein how the "-omics" have revolutionised our approach to lung cancer biology and hold promise for personalised management of lung cancer.

  18. An automated, high-throughput plant phenotyping system using machine learning-based plant segmentation and image analysis.

    Science.gov (United States)

    Lee, Unseok; Chang, Sungyul; Putra, Gian Anantrio; Kim, Hyoungseok; Kim, Dong Hwan

    2018-01-01

    A high-throughput plant phenotyping system automatically observes and grows many plant samples. Many plant sample images are acquired by the system to determine the characteristics of the plants (populations). Stable image acquisition and processing is very important to accurately determine the characteristics. However, hardware for acquiring plant images rapidly and stably, while minimizing plant stress, is lacking. Moreover, most software cannot adequately handle large-scale plant imaging. To address these problems, we developed a new, automated, high-throughput plant phenotyping system using simple and robust hardware, and an automated plant-imaging-analysis pipeline consisting of machine-learning-based plant segmentation. Our hardware acquires images reliably and quickly and minimizes plant stress. Furthermore, the images are processed automatically. In particular, large-scale plant-image datasets can be segmented precisely using a classifier developed using a superpixel-based machine-learning algorithm (Random Forest), and variations in plant parameters (such as area) over time can be assessed using the segmented images. We performed comparative evaluations to identify an appropriate learning algorithm for our proposed system, and tested three robust learning algorithms. We developed not only an automatic analysis pipeline but also a convenient means of plant-growth analysis that provides a learning data interface and visualization of plant growth trends. Thus, our system allows end-users such as plant biologists to analyze plant growth via large-scale plant image data easily.

  19. SONAR: A high-throughput pipeline for inferring antibody ontogenies from longitudinal sequencing of B cell transcripts

    Directory of Open Access Journals (Sweden)

    Chaim A Schramm

    2016-09-01

    Full Text Available The rapid advance of massively parallel or next-generation sequencing technologies has made possible the characterization of B cell receptor repertoires in ever greater detail, leading to a proliferation of software tools for processing and annotating this data. Of especial interest, however, is the capability to track the development of specific antibody lineages across time, which remains beyond the scope of most current programs. We have previously reported on the use of techniques such as inter- and intra-donor analysis and CDR3 tracing to identify transcripts related to an antibody of interest. Here, we present Software for the Ontogenic aNalysis of Antibody Repertoires (SONAR, capable of automating both general repertoire analysis and specialized techniques for investigating specific lineages. SONAR annotates next-generation sequencing data, identifies transcripts in a lineage of interest, and tracks lineage development across multiple time points. SONAR also generates figures, such as identity-divergence plots and longitudinal phylogenetic birthday trees, and provides interfaces to other programs such as DNAML and BEAST. SONAR can be downloaded as a ready-to-run Docker image or manually installed on a local machine. In the latter case, it can also be configured to take advantage of a high-performance computing cluster for the most computationally intensive steps, if available. In summary, this software provides a useful new tool for the processing of large next-generation sequencing datasets and the ontogenic analysis of neutralizing antibody lineages. SONAR can be found at https://github.com/scharch/SONAR and the Docker image can be obtained from https://hub.docker.com/r/scharch/sonar/.

  20. SONAR: A High-Throughput Pipeline for Inferring Antibody Ontogenies from Longitudinal Sequencing of B Cell Transcripts.

    Science.gov (United States)

    Schramm, Chaim A; Sheng, Zizhang; Zhang, Zhenhai; Mascola, John R; Kwong, Peter D; Shapiro, Lawrence

    2016-01-01

    The rapid advance of massively parallel or next-generation sequencing technologies has made possible the characterization of B cell receptor repertoires in ever greater detail, and these developments have triggered a proliferation of software tools for processing and annotating these data. Of especial interest, however, is the capability to track the development of specific antibody lineages across time, which remains beyond the scope of most current programs. We have previously reported on the use of techniques such as inter- and intradonor analysis and CDR3 tracing to identify transcripts related to an antibody of interest. Here, we present Software for the Ontogenic aNalysis of Antibody Repertoires (SONAR), capable of automating both general repertoire analysis and specialized techniques for investigating specific lineages. SONAR annotates next-generation sequencing data, identifies transcripts in a lineage of interest, and tracks lineage development across multiple time points. SONAR also generates figures, such as identity-divergence plots and longitudinal phylogenetic "birthday" trees, and provides interfaces to other programs such as DNAML and BEAST. SONAR can be downloaded as a ready-to-run Docker image or manually installed on a local machine. In the latter case, it can also be configured to take advantage of a high-performance computing cluster for the most computationally intensive steps, if available. In summary, this software provides a useful new tool for the processing of large next-generation sequencing datasets and the ontogenic analysis of neutralizing antibody lineages. SONAR can be found at https://github.com/scharch/SONAR, and the Docker image can be obtained from https://hub.docker.com/r/scharch/sonar/.

  1. Human Leukocyte Antigen Typing Using a Knowledge Base Coupled with a High-Throughput Oligonucleotide Probe Array Analysis

    Science.gov (United States)

    Zhang, Guang Lan; Keskin, Derin B.; Lin, Hsin-Nan; Lin, Hong Huang; DeLuca, David S.; Leppanen, Scott; Milford, Edgar L.; Reinherz, Ellis L.; Brusic, Vladimir

    2014-01-01

    Human leukocyte antigens (HLA) are important biomarkers because multiple diseases, drug toxicity, and vaccine responses reveal strong HLA associations. Current clinical HLA typing is an elimination process requiring serial testing. We present an alternative in situ synthesized DNA-based microarray method that contains hundreds of thousands of probes representing a complete overlapping set covering 1,610 clinically relevant HLA class I alleles accompanied by computational tools for assigning HLA type to 4-digit resolution. Our proof-of-concept experiment included 21 blood samples, 18 cell lines, and multiple controls. The method is accurate, robust, and amenable to automation. Typing errors were restricted to homozygous samples or those with very closely related alleles from the same locus, but readily resolved by targeted DNA sequencing validation of flagged samples. High-throughput HLA typing technologies that are effective, yet inexpensive, can be used to analyze the world’s populations, benefiting both global public health and personalized health care. PMID:25505899

  2. Heap: a highly sensitive and accurate SNP detection tool for low-coverage high-throughput sequencing data

    KAUST Repository

    Kobayashi, Masaaki

    2017-04-20

    Recent availability of large-scale genomic resources enables us to conduct so called genome-wide association studies (GWAS) and genomic prediction (GP) studies, particularly with next-generation sequencing (NGS) data. The effectiveness of GWAS and GP depends on not only their mathematical models, but the quality and quantity of variants employed in the analysis. In NGS single nucleotide polymorphism (SNP) calling, conventional tools ideally require more reads for higher SNP sensitivity and accuracy. In this study, we aimed to develop a tool, Heap, that enables robustly sensitive and accurate calling of SNPs, particularly with a low coverage NGS data, which must be aligned to the reference genome sequences in advance. To reduce false positive SNPs, Heap determines genotypes and calls SNPs at each site except for sites at the both ends of reads or containing a minor allele supported by only one read. Performance comparison with existing tools showed that Heap achieved the highest F-scores with low coverage (7X) restriction-site associated DNA sequencing reads of sorghum and rice individuals. This will facilitate cost-effective GWAS and GP studies in this NGS era. Code and documentation of Heap are freely available from https://github.com/meiji-bioinf/heap (29 March 2017, date last accessed) and our web site (http://bioinf.mind.meiji.ac.jp/lab/en/tools.html (29 March 2017, date last accessed)).

  3. Response of forest soil euglyphid testate amoebae (Rhizaria: Cercozoa) to pig cadavers assessed by high-throughput sequencing.

    Science.gov (United States)

    Seppey, Christophe V W; Fournier, Bertrand; Szelecz, Ildikò; Singer, David; Mitchell, Edward A D; Lara, Enrique

    2016-03-01

    Decomposing cadavers modify the soil environment, but the effect on soil organisms and especially on soil protists is still poorly documented. We conducted a 35-month experiment in a deciduous forest where soil samples were taken under pig cadavers, control plots and fake pigs (bags of similar volume as the pigs). We extracted total soil DNA, amplified the SSU ribosomal RNA (rRNA) gene V9 region and sequenced it by Illumina technology and analysed the data for euglyphid testate amoebae (Rhizaria: Euglyphida), a common group of protozoa known to respond to micro-environmental changes. We found 51 euglyphid operational taxonomic units (OTUs), 45 of which did not match any known sequence. Most OTUs decreased in abundance underneath cadavers between days 0 and 309, but some responded positively after a time lag. We sequenced the full-length SSU rRNA gene of two common OTUs that responded positively to cadavers; a phylogenetic analysis showed that they did not belong to any known euglyphid family. This study confirmed the existence of an unknown diversity of euglyphids and that they react to cadavers. Results suggest that metabarcoding of soil euglyphids could be used as a forensic tool to estimate the post-mortem interval (PMI) particularly for long-term (>2 months) PMI, for which no reliable tool exists.

  4. Acute effects of TiO2 nanomaterials on the viability and taxonomic composition of aquatic bacterial communities assessed via high-throughput screening and next generation sequencing.

    Directory of Open Access Journals (Sweden)

    Chu Thi Thanh Binh

    Full Text Available The nanotechnology industry is growing rapidly, leading to concerns about the potential ecological consequences of the release of engineered nanomaterials (ENMs to the environment. One challenge of assessing the ecological risks of ENMs is the incredible diversity of ENMs currently available and the rapid pace at which new ENMs are being developed. High-throughput screening (HTS is a popular approach to assessing ENM cytotoxicity that offers the opportunity to rapidly test in parallel a wide range of ENMs at multiple concentrations. However, current HTS approaches generally test one cell type at a time, which limits their ability to predict responses of complex microbial communities. In this study toxicity screening via a HTS platform was used in combination with next generation sequencing (NGS to assess responses of bacterial communities from two aquatic habitats, Lake Michigan (LM and the Chicago River (CR, to short-term exposure in their native waters to several commercial TiO2 nanomaterials under simulated solar irradiation. Results demonstrate that bacterial communities from LM and CR differed in their sensitivity to nano-TiO2, with the community from CR being more resistant. NGS analysis revealed that the composition of the bacterial communities from LM and CR were significantly altered by exposure to nano-TiO2, including decreases in overall bacterial diversity, decreases in the relative abundance of Actinomycetales, Sphingobacteriales, Limnohabitans, and Flavobacterium, and a significant increase in Limnobacter. These results suggest that the release of nano-TiO2 to the environment has the potential to alter the composition of aquatic bacterial communities, which could have implications for the stability and function of aquatic ecosystems. The novel combination of HTS and NGS described in this study represents a major advance over current methods for assessing ENM ecotoxicity because the relative toxicities of multiple ENMs to thousands

  5. Acute effects of TiO2 nanomaterials on the viability and taxonomic composition of aquatic bacterial communities assessed via high-throughput screening and next generation sequencing.

    Science.gov (United States)

    Binh, Chu Thi Thanh; Tong, Tiezheng; Gaillard, Jean-François; Gray, Kimberly A; Kelly, John J

    2014-01-01

    The nanotechnology industry is growing rapidly, leading to concerns about the potential ecological consequences of the release of engineered nanomaterials (ENMs) to the environment. One challenge of assessing the ecological risks of ENMs is the incredible diversity of ENMs currently available and the rapid pace at which new ENMs are being developed. High-throughput screening (HTS) is a popular approach to assessing ENM cytotoxicity that offers the opportunity to rapidly test in parallel a wide range of ENMs at multiple concentrations. However, current HTS approaches generally test one cell type at a time, which limits their ability to predict responses of complex microbial communities. In this study toxicity screening via a HTS platform was used in combination with next generation sequencing (NGS) to assess responses of bacterial communities from two aquatic habitats, Lake Michigan (LM) and the Chicago River (CR), to short-term exposure in their native waters to several commercial TiO2 nanomaterials under simulated solar irradiation. Results demonstrate that bacterial communities from LM and CR differed in their sensitivity to nano-TiO2, with the community from CR being more resistant. NGS analysis revealed that the composition of the bacterial communities from LM and CR were significantly altered by exposure to nano-TiO2, including decreases in overall bacterial diversity, decreases in the relative abundance of Actinomycetales, Sphingobacteriales, Limnohabitans, and Flavobacterium, and a significant increase in Limnobacter. These results suggest that the release of nano-TiO2 to the environment has the potential to alter the composition of aquatic bacterial communities, which could have implications for the stability and function of aquatic ecosystems. The novel combination of HTS and NGS described in this study represents a major advance over current methods for assessing ENM ecotoxicity because the relative toxicities of multiple ENMs to thousands of naturally

  6. Polymerase chain reaction-hybridization method using urease gene sequences for high-throughput Ureaplasma urealyticum and Ureaplasma parvum detection and differentiation.

    Science.gov (United States)

    Xu, Chen; Zhang, Nan; Huo, Qianyu; Chen, Minghui; Wang, Rengfeng; Liu, Zhili; Li, Xue; Liu, Yunde; Bao, Huijing

    2016-04-15

    In this article, we discuss the polymerase chain reaction (PCR)-hybridization assay that we developed for high-throughput simultaneous detection and differentiation of Ureaplasma urealyticum and Ureaplasma parvum using one set of primers and two specific DNA probes based on urease gene nucleotide sequence differences. First, U. urealyticum and U. parvum DNA samples were specifically amplified using one set of biotin-labeled primers. Furthermore, amine-modified DNA probes, which can specifically react with U. urealyticum or U. parvum DNA, were covalently immobilized to a DNA-BIND plate surface. The plate was then incubated with the PCR products to facilitate sequence-specific DNA binding. Horseradish peroxidase-streptavidin conjugation and a colorimetric assay were used. Based on the results, the PCR-hybridization assay we developed can specifically differentiate U. urealyticum and U. parvum with high sensitivity (95%) compared with cultivation (72.5%). Hence, this study demonstrates a new method for high-throughput simultaneous differentiation and detection of U. urealyticum and U. parvum with high sensitivity. Based on these observations, the PCR-hybridization assay developed in this study is ideal for detecting and discriminating U. urealyticum and U. parvum in clinical applications. Copyright © 2016 Elsevier Inc. All rights reserved.

  7. An RNA-Based Fluorescent Biosensor for High-Throughput Analysis of the cGAS-cGAMP-STING Pathway.

    Science.gov (United States)

    Bose, Debojit; Su, Yichi; Marcus, Assaf; Raulet, David H; Hammond, Ming C

    2016-12-22

    In mammalian cells, the second messenger (2'-5',3'-5') cyclic guanosine monophosphate-adenosine monophosphate (2',3'-cGAMP), is produced by the cytosolic DNA sensor cGAMP synthase (cGAS), and subsequently bound by the stimulator of interferon genes (STING) to trigger interferon response. Thus, the cGAS-cGAMP-STING pathway plays a critical role in pathogen detection, as well as pathophysiological conditions including cancer and autoimmune disorders. However, studying and targeting this immune signaling pathway has been challenging due to the absence of tools for high-throughput analysis. We have engineered an RNA-based fluorescent biosensor that responds to 2',3'-cGAMP. The resulting "mix-and-go" cGAS activity assay shows excellent statistical reliability as a high-throughput screening (HTS) assay and distinguishes between direct and indirect cGAS inhibitors. Furthermore, the biosensor enables quantitation of 2',3'-cGAMP in mammalian cell lysates. We envision this biosensor-based assay as a resource to study the cGAS-cGAMP-STING pathway in the context of infectious diseases, cancer immunotherapy, and autoimmune diseases. Copyright © 2016 Elsevier Ltd. All rights reserved.

  8. High-throughput quantitative biochemical characterization of algal biomass by NIR spectroscopy; multiple linear regression and multivariate linear regression analysis.

    Science.gov (United States)

    Laurens, L M L; Wolfrum, E J

    2013-12-18

    One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.

  9. Statistical significance approximation in local trend analysis of high-throughput time-series data using the theory of Markov chains.

    Science.gov (United States)

    Xia, Li C; Ai, Dongmei; Cram, Jacob A; Liang, Xiaoyi; Fuhrman, Jed A; Sun, Fengzhu

    2015-09-21

    Local trend (i.e. shape) analysis of time series data reveals co-changing patterns in dynamics of biological systems. However, slow permutation procedures to evaluate the statistical significance of local trend scores have limited its applications to high-throughput time series data analysis, e.g., data from the next generation sequencing technology based studies. By extending the theories for the tail probability of the range of sum of Markovian random variables, we propose formulae for approximating the statistical significance of local trend scores. Using simulations and real data, we show that the approximate p-value is close to that obtained using a large number of permutations (starting at time points >20 with no delay and >30 with delay of at most three time steps) in that the non-zero decimals of the p-values obtained by the approximation and the permutations are mostly the same when the approximate p-value is less than 0.05. In addition, the approximate p-value is slightly larger than that based on permutations making hypothesis testing based on the approximate p-value conservative. The approximation enables efficient calculation of p-values for pairwise local trend analysis, making large scale all-versus-all comparisons possible. We also propose a hybrid approach by integrating the approximation and permutations to obtain accurate p-values for significantly associated pairs. We further demonstrate its use with the analysis of the Polymouth Marine Laboratory (PML) microbial community time series from high-throughput sequencing data and found interesting organism co-occurrence dynamic patterns. The software tool is integrated into the eLSA software package that now provides accelerated local trend and similarity analysis pipelines for time series data. The package is freely available from the eLSA website: http://bitbucket.org/charade/elsa.

  10. High throughput deep degradome sequencing reveals microRNAs and their targets in response to drought stress in mulberry (Morus alba).

    Science.gov (United States)

    Li, Ruixue; Chen, Dandan; Wang, Taichu; Wan, Yizhen; Li, Rongfang; Fang, Rongjun; Wang, Yuting; Hu, Fei; Zhou, Hong; Li, Long; Zhao, Weiguo

    2017-01-01

    MicroRNAs (miRNAs) play important regulatory roles by targeting mRNAs for cleavage or translational repression. Identification of miRNA targets is essential to better understanding the roles of miRNAs. miRNA targets have not been well characterized in mulberry (Morus alba). To anatomize miRNA guided gene regulation under drought stress, transcriptome-wide high throughput degradome sequencing was used in this study to directly detect drought stress responsive miRNA targets in mulberry. A drought library (DL) and a contrast library (CL) were constructed to capture the cleaved mRNAs for sequencing. In CL, 409 target genes of 30 conserved miRNA families and 990 target genes of 199 novel miRNAs were identified. In DL, 373 target genes of 30 conserved miRNA families and 950 target genes of 195 novel miRNAs were identified. Of the conserved miRNA families in DL, mno-miR156, mno-miR172, and mno-miR396 had the highest number of targets with 54, 52 and 41 transcripts, respectively, indicating that these three miRNA families and their target genes might play important functions in response to drought stress in mulberry. Additionally, we found that many of the target genes were transcription factors. By analyzing the miRNA-target molecular network, we found that the DL independent networks consisted of 838 miRNA-mRNA pairs (63.34%). The expression patterns of 11 target genes and 12 correspondent miRNAs were detected using qRT-PCR. Six miRNA targets were further verified by RNA ligase-mediated 5' rapid amplification of cDNA ends (RLM-5' RACE). Gene Ontology (GO) annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that these target transcripts were implicated in a broad range of biological processes and various metabolic pathways. This is the first study to comprehensively characterize target genes and their associated miRNAs in response to drought stress by degradome sequencing in mulberry. This study provides a framework for understanding

  11. Assessment of whole genome amplification-induced bias through high-throughput, massively parallel whole genome sequencing

    Directory of Open Access Journals (Sweden)

    Plant Ramona N

    2006-08-01

    Full Text Available Abstract Background Whole genome amplification is an increasingly common technique through which minute amounts of DNA can be multiplied to generate quantities suitable for genetic testing and analysis. Questions of amplification-induced error and template bias generated by these methods have previously been addressed through either small scale (SNPs or large scale (CGH array, FISH methodologies. Here we utilized whole genome sequencing to assess amplification-induced bias in both coding and non-coding regions of two bacterial genomes. Halobacterium species NRC-1 DNA and Campylobacter jejuni were amplified by several common, commercially available protocols: multiple displacement amplification, primer extension pre-amplification and degenerate oligonucleotide primed PCR. The amplification-induced bias of each method was assessed by sequencing both genomes in their entirety using the 454 Sequencing System technology and comparing the results with those obtained from unamplified controls. Results All amplification methodologies induced statistically significant bias relative to the unamplified control. For the Halobacterium species NRC-1 genome, assessed at 100 base resolution, the D-statistics from GenomiPhi-amplified material were 119 times greater than those from unamplified material, 164.0 times greater for Repli-G, 165.0 times greater for PEP-PCR and 252.0 times greater than the unamplified controls for DOP-PCR. For Campylobacter jejuni, also analyzed at 100 base resolution, the D-statistics from GenomiPhi-amplified material were 15 times greater than those from unamplified material, 19.8 times greater for Repli-G, 61.8 times greater for PEP-PCR and 220.5 times greater than the unamplified controls for DOP-PCR. Conclusion Of the amplification methodologies examined in this paper, the multiple displacement amplification products generated the least bias, and produced significantly higher yields of amplified DNA.

  12. High Throughput In vivo Analysis of Plant Leaf Chemical Properties Using Hyperspectral Imaging.

    Science.gov (United States)

    Pandey, Piyush; Ge, Yufeng; Stoerger, Vincent; Schnable, James C

    2017-01-01

    Image-based high-throughput plant phenotyping in greenhouse has the potential to relieve the bottleneck currently presented by phenotypic scoring which limits the throughput of gene discovery and crop improvement efforts. Numerous studies have employed automated RGB imaging to characterize biomass and growth of agronomically important crops. The objective of this study was to investigate the utility of hyperspectral imaging for quantifying chemical properties of maize and soybean plants in vivo . These properties included leaf water content, as well as concentrations of macronutrients nitrogen (N), phosphorus (P), potassium (K), magnesium (Mg), calcium (Ca), and sulfur (S), and micronutrients sodium (Na), iron (Fe), manganese (Mn), boron (B), copper (Cu), and zinc (Zn). Hyperspectral images were collected from 60 maize and 60 soybean plants, each subjected to varying levels of either water deficit or nutrient limitation stress with the goal of creating a wide range of variation in the chemical properties of plant leaves. Plants were imaged on an automated conveyor belt system using a hyperspectral imager with a spectral range from 550 to 1,700 nm. Images were processed to extract reflectance spectrum from each plant and partial least squares regression models were developed to correlate spectral data with chemical data. Among all the chemical properties investigated, water content was predicted with the highest accuracy [ R 2 = 0.93 and RPD (Ratio of Performance to Deviation) = 3.8]. All macronutrients were also quantified satisfactorily ( R 2 from 0.69 to 0.92, RPD from 1.62 to 3.62), with N predicted best followed by P, K, and S. The micronutrients group showed lower prediction accuracy ( R 2 from 0.19 to 0.86, RPD from 1.09 to 2.69) than the macronutrient groups. Cu and Zn were best predicted, followed by Fe and Mn. Na and B were the only two properties that hyperspectral imaging was not able to quantify satisfactorily ( R 2 plant chemical traits. Future

  13. High Throughput In vivo Analysis of Plant Leaf Chemical Properties Using Hyperspectral Imaging

    Science.gov (United States)

    Pandey, Piyush; Ge, Yufeng; Stoerger, Vincent; Schnable, James C.

    2017-01-01

    Image-based high-throughput plant phenotyping in greenhouse has the potential to relieve the bottleneck currently presented by phenotypic scoring which limits the throughput of gene discovery and crop improvement efforts. Numerous studies have employed automated RGB imaging to characterize biomass and growth of agronomically important crops. The objective of this study was to investigate the utility of hyperspectral imaging for quantifying chemical properties of maize and soybean plants in vivo. These properties included leaf water content, as well as concentrations of macronutrients nitrogen (N), phosphorus (P), potassium (K), magnesium (Mg), calcium (Ca), and sulfur (S), and micronutrients sodium (Na), iron (Fe), manganese (Mn), boron (B), copper (Cu), and zinc (Zn). Hyperspectral images were collected from 60 maize and 60 soybean plants, each subjected to varying levels of either water deficit or nutrient limitation stress with the goal of creating a wide range of variation in the chemical properties of plant leaves. Plants were imaged on an automated conveyor belt system using a hyperspectral imager with a spectral range from 550 to 1,700 nm. Images were processed to extract reflectance spectrum from each plant and partial least squares regression models were developed to correlate spectral data with chemical data. Among all the chemical properties investigated, water content was predicted with the highest accuracy [R2 = 0.93 and RPD (Ratio of Performance to Deviation) = 3.8]. All macronutrients were also quantified satisfactorily (R2 from 0.69 to 0.92, RPD from 1.62 to 3.62), with N predicted best followed by P, K, and S. The micronutrients group showed lower prediction accuracy (R2 from 0.19 to 0.86, RPD from 1.09 to 2.69) than the macronutrient groups. Cu and Zn were best predicted, followed by Fe and Mn. Na and B were the only two properties that hyperspectral imaging was not able to quantify satisfactorily (R2 designing experiments to vary plant nutrients

  14. High-Throughput Genetic Analysis and Combinatorial Chiral Separations Based on Capillary Electrophoresis

    Energy Technology Data Exchange (ETDEWEB)

    Zhong, Wenwan [Iowa State Univ., Ames, IA (United States)

    2003-01-01

    Capillary electrophoresis (CE) offers many advantages over conventional analytical methods, such as speed, simplicity, high resolution, low cost, and small sample consumption, especially for the separation of enantiomers. However, chiral method developments still can be time consuming and tedious. They designed a comprehensive enantioseparation protocol employing neutral and sulfated cyclodextrins as chiral selectors for common basic, neutral, and acidic compounds with a 96-capillary array system. By using only four judiciously chosen separation buffers, successful enantioseparations were achieved for 49 out of 54 test compounds spanning a large variety of pKs and structures. Therefore, unknown compounds can be screened in this manner to identify optimal enantioselective conditions in just one rn. In addition to superior separation efficiency for small molecules, CE is also the most powerful technique for DNA separations. Using the same multiplexed capillary system with UV absorption detection, the sequence of a short DNA template can be acquired without any dye-labels. Two internal standards were utilized to adjust the migration time variations among capillaries, so that the four electropherograms for the A, T, C, G Sanger reactions can be aligned and base calling can be completed with a high level of confidence. the CE separation of DNA can be applied to study differential gene expression as well. Combined with pattern recognition techniques, small variations among electropherograms obtained by the separation of cDNA fragments produced from the total RNA samples of different human tissues can be revealed. These variations reflect the differences in total RNA expression among tissues. Thus, this Ce-based approach can serve as an alternative to the DNA array techniques in gene expression analysis.

  15. High-Throughput Sequencing of MicroRNAs in Adenovirus Type 3 Infected Human Laryngeal Epithelial Cells

    Directory of Open Access Journals (Sweden)

    Yuhua Qi

    2010-01-01

    Full Text Available Adenovirus infection can cause various illnesses depending on the infecting serotype, such as gastroenteritis, conjunctivitis, cystitis, and rash illness, but the infection mechanism is still unknown. MicroRNAs (miRNA have been reported to play essential roles in cell proliferation, cell differentiation, and pathogenesis of human diseases including viral infections. We analyzed the miRNA expression profiles from adenovirus type 3 (AD3 infected Human laryngeal epithelial (Hep2 cells using a SOLiD deep sequencing. 492 precursor miRNAs were identified in the AD3 infected Hep2 cells, and 540 precursor miRNAs were identified in the control. A total of 44 miRNAs demonstrated high expression and 36 miRNAs showed lower expression in the AD3 infected cells than control. The biogenesis of miRNAs has been analyzed, and some of the SOLiD results were confirmed by Quantitative PCR analysis. The present studies may provide a useful clue for the biological function research into AD3 infection.

  16. Validation of a Microscale Extraction and High Throughput UHPLC-QTOF-MS Analysis Method for Huperzine A in Huperzia

    Science.gov (United States)

    Cuthbertson, Daniel; Piljac-Žegarac, Jasenka; Lange, Bernd Markus

    2011-01-01

    Herein we report on an improved method for the microscale extraction of huperzine A (HupA), an acetylcholinesterase-inhibiting alkaloid, from as little as 3 mg of tissue homogenate from the clubmoss Huperzia squarrosa (G. Forst.) Trevis with 99.95 % recovery. We also validated a novel UHPLC-QTOF-MS method for the high-throughput analysis of H. squarrosa extracts in only 6 min, which, in combination with the very low limit of detection (20 pg on column) and the wide linear range for quantification (20 to 10,000 pg on column), allow for a highly efficient screening of extracts containing varying amounts of HupA. Utilization of this methodology has the potential to conserve valuable plant resources. PMID:22275140

  17. High-throughput sequencing of small RNA transcriptome reveals salt stress regulated microRNAs in sugarcane.

    Directory of Open Access Journals (Sweden)

    Mariana Carnavale Bottino

    Full Text Available Salt stress is a primary cause of crop losses worldwide, and it has been the subject of intense investigation to unravel the complex mechanisms responsible for salinity tolerance. MicroRNA is implicated in many developmental processes and in responses to various abiotic stresses, playing pivotal roles in plant adaptation. Deep sequencing technology was chosen to determine the small RNA transcriptome of Saccharum sp cultivars grown on saline conditions. We constructed four small RNAs libraries prepared from plants grown on hydroponic culture submitted to 170 mM NaCl and harvested after 1 h, 6 hs and 24 hs. Each library was sequenced individually and together generated more than 50 million short reads. Ninety-eight conserved miRNAs and 33 miRNAs* were identified by bioinformatics. Several of the microRNA showed considerable differences of expression in the four libraries. To confirm the results of the bioinformatics-based analysis, we studied the expression of the 10 most abundant miRNAs and 1 miRNA* in plants treated with 170 mM NaCl and in plants with a severe treatment of 340 mM NaCl. The results showed that 11 selected miRNAs had higher expression in samples treated with severe salt treatment compared to the mild one. We also investigated the regulation of the same miRNAs in shoots of four cultivars grown on soil treated with 170 mM NaCl. Cultivars could be grouped according to miRNAs expression in response to salt stress. Furthermore, the majority of the predicted target genes had an inverse regulation with their correspondent microRNAs. The targets encode a wide range of proteins, including transcription factors, metabolic enzymes and genes involved in hormone signaling, probably assisting the plants to develop tolerance to salinity. Our work provides insights into the regulatory functions of miRNAs, thereby expanding our knowledge on potential salt-stressed regulated genes.

  18. High throughput sequencing and proteomics to identify immunogenic proteins of a new pathogen: the dirty genome approach.

    Science.gov (United States)

    Greub, Gilbert; Kebbi-Beghdadi, Carole; Bertelli, Claire; Collyn, François; Riederer, Beat M; Yersin, Camille; Croxatto, Antony; Raoult, Didier

    2009-12-23

    With the availability of new generation sequencing technologies, bacterial genome projects have undergone a major boost. Still, chromosome completion needs a costly and time-consuming gap closure, especially when containing highly repetitive elements. However, incomplete genome data may be sufficiently informative to derive the pursued information. For emerging pathogens, i.e. newly identified pathogens, lack of release of genome data during gap closure stage is clearly medically counterproductive. We thus investigated the feasibility of a dirty genome approach, i.e. the release of unfinished genome sequences to develop serological diagnostic tools. We showed that almost the whole genome sequence of the emerging pathogen Parachlamydia acanthamoebae was retrieved even with relatively short reads from Genome Sequencer 20 and Solexa. The bacterial proteome was analyzed to select immunogenic proteins, which were then expressed and used to elaborate the first steps of an ELISA. This work constitutes the proof of principle for a dirty genome approach, i.e. the use of unfinished genome sequences of pathogenic bacteria, coupled with proteomics to rapidly identify new immunogenic proteins useful to develop in the future specific diagnostic tests such as ELISA, immunohistochemistry and direct antigen detection. Although applied here to an emerging pathogen, this combined dirty genome sequencing/proteomic approach may be used for any pathogen for which better diagnostics are needed. These genome sequences may also be very useful to develop DNA based diagnostic tests. All these diagnostic tools will allow further evaluations of the pathogenic potential of this obligate intracellular bacterium.

  19. High throughput LC-MS/MS method for the simultaneous analysis of multiple vitamin D analytes in serum.

    Science.gov (United States)

    Jenkinson, Carl; Taylor, Angela E; Hassan-Smith, Zaki K; Adams, John S; Stewart, Paul M; Hewison, Martin; Keevil, Brian G

    2016-03-01

    Recent studies suggest that vitamin D-deficiency is linked to increased risk of common human health problems. To define vitamin D 'status' most routine analytical methods quantify one particular vitamin D metabolite, 25-hydroxyvitamin D3 (25OHD3). However, vitamin D is characterized by complex metabolic pathways, and simultaneous measurement of multiple vitamin D metabolites may provide a more accurate interpretation of vitamin D status. To address this we developed a high-throughput liquid chromatography-tandem mass spectrometry (LC-MS/MS) method to analyse multiple vitamin D analytes, with particular emphasis on the separation of epimer metabolites. A supportive liquid-liquid extraction (SLE) and LC-MS/MS method was developed to quantify 10 vitamin D metabolites as well as separation of an interfering 7α-hydroxy-4-cholesten-3-one (7αC4) isobar (precursor of bile acid), and validated by analysis of human serum samples. In a cohort of 116 healthy subjects, circulating concentrations of 25-hydroxyvitamin D3 (25OHD3), 3-epi-25-hydroxyvitamin D3 (3-epi-25OHD3), 24,25-dihydroxyvitamin D3 (24R,25(OH)2D3), 1,25-dihydroxyvitamin D3 (1α,25(OH)2D3), and 25-hydroxyvitamin D2 (25OHD2) were quantifiable using 220μL of serum, with 25OHD3 and 24R,25(OH)2D3 showing significant seasonal variations. This high-throughput LC-MS/MS method provides a novel strategy for assessing the impact of vitamin D on human health and disease. Copyright © 2016 Elsevier B.V. All rights reserved.

  20. High throughput sequencing and proteomics to identify immunogenic proteins of a new pathogen: the dirty genome approach.

    Directory of Open Access Journals (Sweden)

    Gilbert Greub

    Full Text Available BACKGROUND: With the availability of new generation sequencing technologies, bacterial genome projects have undergone a major boost. Still, chromosome completion needs a costly and time-consuming gap closure, especially when containing highly repetitive elements. However, incomplete genome data may be sufficiently informative to derive the pursued information. For emerging pathogens, i.e. newly identified pathogens, lack of release of genome data during gap closure stage is clearly medically counterproductive. METHODS/PRINCIPAL FINDINGS: We thus investigated the feasibility of a dirty genome approach, i.e. the release of unfinished genome sequences to develop serological diagnostic tools. We showed that almost the whole genome sequence of the emerging pathogen Parachlamydia acanthamoebae was retrieved even with relatively short reads from Genome Sequencer 20 and Solexa. The bacterial proteome was analyzed to select immunogenic proteins, which were then expressed and used to elaborate the first steps of an ELISA. CONCLUSIONS/SIGNIFICANCE: This work constitutes the proof of principle for a dirty genome approach, i.e. the use of unfinished genome sequences of pathogenic bacteria, coupled with proteomics to rapidly identify new immunogenic proteins useful to develop in the future specific diagnostic tests such as ELISA, immunohistochemistry and direct antigen detection. Although applied here to an emerging pathogen, this combined dirty genome sequencing/proteomic approach may be used for any pathogen for which better diagnostics are needed. These genome sequences may also be very useful to develop DNA based diagnostic tests. All these diagnostic tools will allow further evaluations of the pathogenic potential of this obligate intracellular bacterium.

  1. Gold nanoparticle-mediated (GNOME) laser perforation: a new method for a high-throughput analysis of gap junction intercellular coupling.

    Science.gov (United States)

    Begandt, Daniela; Bader, Almke; Antonopoulos, Georgios C; Schomaker, Markus; Kalies, Stefan; Meyer, Heiko; Ripken, Tammo; Ngezahayo, Anaclet

    2015-10-01

    The present report evaluates the advantages of using the gold nanoparticle-mediated laser perforation (GNOME LP) technique as a computer-controlled cell optoperforation to introduce Lucifer yellow (LY) into cells in order to analyze the gap junction coupling in cell monolayers. To permeabilize GM-7373 endothelial cells grown in a 24 multiwell plate with GNOME LP, a laser beam of 88 μm in diameter was applied in the presence of gold nanoparticles and LY. After 10 min to allow dye uptake and diffusion through gap junctions, we observed a LY-positive cell band of 179 ± 8 μm width. The presence of the gap junction channel blocker carbenoxolone during the optoperforation reduced the LY-positive band to 95 ± 6 μm. Additionally, a forskolin-related enhancement of gap junction coupling, recently found using the scrape loading technique, was also observed using GNOME LP. Further, an automatic cell imaging and a subsequent semi-automatic quantification of the images using a java-based ImageJ-plugin were performed in a high-throughput sequence. Moreover, the GNOME LP was used on cells such as RBE4 rat brain endothelial cells, which cannot be mechanically scraped as well as on three-dimensionally cultivated cells, opening the possibility to implement the GNOME LP technique for analysis of gap junction coupling in tissues. We conclude that the GNOME LP technique allows a high-throughput automated analysis of gap junction coupling in cells. Moreover this non-invasive technique could be used on monolayers that do not support mechanical scraping as well as on cells in tissue allowing an in vivo/ex vivo analysis of gap junction coupling.

  2. Identification of QTLs for 14 Agronomically Important Traits in Setaria italica Based on SNPs Generated from High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Kai Zhang

    2017-05-01

    Full Text Available Foxtail millet (Setaria italica is an important crop possessing C4 photosynthesis capability. The S. italica genome was de novo sequenced in 2012, but the sequence lacked high-density genetic maps with agronomic and yield trait linkages. In the present study, we resequenced a foxtail millet population of 439 recombinant inbred lines (RILs and developed high-resolution bin map and high-density SNP markers, which could provide an effective approach for gene identification. A total of 59 QTL for 14 agronomic traits in plants grown under long- and short-day photoperiods were identified. The phenotypic variation explained ranged from 4.9 to 43.94%. In addition, we suggested that there may be segregation distortion on chromosome 6 that is significantly distorted toward Zhang gu. The newly identified QTL will provide a platform for sequence-based research on the S. italica genome, and for molecular marker-assisted breeding.

  3. Identification of QTLs for 14 Agronomically Important Traits in Setaria italica Based on SNPs Generated from High-Throughput Sequencing.

    Science.gov (United States)

    Zhang, Kai; Fan, Guangyu; Zhang, Xinxin; Zhao, Fang; Wei, Wei; Du, Guohua; Feng, Xiaolei; Wang, Xiaoming; Wang, Feng; Song, Guoliang; Zou, Hongfeng; Zhang, Xiaolei; Li, Shuangdong; Ni, Xuemei; Zhang, Gengyun; Zhao, Zhihai

    2017-05-05

    Foxtail millet ( Setaria italica ) is an important crop possessing C4 photosynthesis capability. The S. italica genome was de novo sequenced in 2012, but the sequence lacked high-density genetic maps with agronomic and yield trait linkages. In the present study, we resequenced a foxtail millet population of 439 recombinant inbred lines (RILs) and developed high-resolution bin map and high-density SNP markers, which could provide an effective approach for gene identification. A total of 59 QTL for 14 agronomic traits in plants grown under long- and short-day photoperiods were identified. The phenotypic variation explained ranged from 4.9 to 43.94%. In addition, we suggested that there may be segregation distortion on chromosome 6 that is significantly distorted toward Zhang gu. The newly identified QTL will provide a platform for sequence-based research on the S. italica genome, and for molecular marker-assisted breeding. Copyright © 2017 Zhang et al.

  4. Comparative high-throughput transcriptome sequencing and development of SiESTa, the Silene EST annotation database

    Directory of Open Access Journals (Sweden)

    Marais Gabriel AB

    2011-07-01

    Full Text Available Abstract Background The genus Silene is widely used as a model system for addressing ecological and evolutionary questions in plants, but advances in using the genus as a model system are impeded by the lack of available resources for studying its genome. Massively parallel sequencing cDNA has recently developed into an efficient method for characterizing the transcriptomes of non-model organisms, generating massive amounts of data that enable the study of multiple species in a comparative framework. The sequences generated provide an excellent resource for identifying expressed genes, characterizing functional variation and developing molecular markers, thereby laying the foundations for future studies on gene sequence and gene expression divergence. Here, we report the results of a comparative transcriptome sequencing study of eight individuals representing four Silene and one Dianthus species as outgroup. All sequences and annotations have been deposited in a newly developed and publicly available database called SiESTa, the Silene EST annotation database. Results A total of 1,041,122 EST reads were generated in two runs on a Roche GS-FLX 454 pyrosequencing platform. EST reads were analyzed separately for all eight individuals sequenced and were assembled into contigs using TGICL. These were annotated with results from BLASTX searches and Gene Ontology (GO terms, and thousands of single-nucleotide polymorphisms (SNPs were characterized. Unassembled reads were kept as singletons and together with the contigs contributed to the unigenes characterized in each individual. The high quality of unigenes is evidenced by the proportion (49% that have significant hits in similarity searches with the A. thaliana proteome. The SiESTa database is accessible at http://www.siesta.ethz.ch. Conclusion The sequence collections established in the present study provide an important genomic resource for four Silene and one Dianthus species and will help to

  5. Comparative high-throughput transcriptome sequencing and development of SiESTa, the Silene EST annotation database

    Science.gov (United States)

    2011-01-01

    Background The genus Silene is widely used as a model system for addressing ecological and evolutionary questions in plants, but advances in using the genus as a model system are impeded by the lack of available resources for studying its genome. Massively parallel sequencing cDNA has recently developed into an efficient method for characterizing the transcriptomes of non-model organisms, generating massive amounts of data that enable the study of multiple species in a comparative framework. The sequences generated provide an excellent resource for identifying expressed genes, characterizing functional variation and developing molecular markers, thereby laying the foundations for future studies on gene sequence and gene expression divergence. Here, we report the results of a comparative transcriptome sequencing study of eight individuals representing four Silene and one Dianthus species as outgroup. All sequences and annotations have been deposited in a newly developed and publicly available database called SiESTa, the Silene EST annotation database. Results A total of 1,041,122 EST reads were generated in two runs on a Roche GS-FLX 454 pyrosequencing platform. EST reads were analyzed separately for all eight individuals sequenced and were assembled into contigs using TGICL. These were annotated with results from BLASTX searches and Gene Ontology (GO) terms, and thousands of single-nucleotide polymorphisms (SNPs) were characterized. Unassembled reads were kept as singletons and together with the contigs contributed to the unigenes characterized in each individual. The high quality of unigenes is evidenced by the proportion (49%) that have significant hits in similarity searches with the A. thaliana proteome. The SiESTa database is accessible at http://www.siesta.ethz.ch. Conclusion The sequence collections established in the present study provide an important genomic resource for four Silene and one Dianthus species and will help to further develop Silene as a

  6. Identification and characterization of miRNAs in ripening fruit of Lycium barbarum L. using high-throughput sequencing

    Directory of Open Access Journals (Sweden)

    Shaohua eZeng

    2015-09-01

    Full Text Available MicroRNAs (miRNAs are master regulators of gene activity documented to play central roles in fruit ripening in model plant species, yet little is known of their roles in Lycium barbarum L. fruits. In this study, miRNA levels in L. barbarum fruit samples at four developmental stages, were assayed using Illumina HiSeqTM2000. This revealed the presence of 50 novel miRNAs and 38 known miRNAs in L. barbarum fruits. Of the novel miRNAs, 36 were specific to L. barbarum fruits compared with L. chinense. A number of stage-specific miRNAs were identified and GO terms were assigned to 194 unigenes targeted by miRNAs. The majority of GO terms of unigenes targeted by differentially expressed miRNAs are ‘intracellular organelle’, ‘binding’, ‘metabolic process’, ‘pigmentation’, and ‘biological regulation’. Enriched KEGG analysis indicated that nucleotide excision repair and ubiquitin mediated proteolysis were over-represented during the initial stage of ripening, with ABC transporters and sulfur metabolism pathways active during the middle stages and ABC transporters and spliceosome enriched in the final stages of ripening. Several miRNAs and their targets serving as potential regulators in L. barbarum fruit ripening were identified using quantitative reverse transcription polymerase chain reaction. The miRNA-target interactions were predicted for L. barbarum ripening regulators including miR156/157 with LbCNR and LbWRKY8, and miR171 with LbGRAS. Additionally, regulatory interactions potentially controlling fruit quality and nutritional value via sugar and secondary metabolite accumulation were identified. These include miR156 targeting of fructokinase and 1-deoxy-D-xylulose-5-phosphate synthase and miR164 targeting of beta-fructofuranosidase. In sum, valuable information revealed by small RNA sequencing in this study will provide a solid foundation for uncovering the miRNA-mediated mechanism of fruit ripening and quality in this

  7. The gut microbiotassay – a high-throughput real-time PCR chip combined with next generation sequencing

    DEFF Research Database (Denmark)

    Hermann-Bank, Marie Louise; Skovgaard, Kerstin; Mølbak, Lars

    it demonstrated distinct quantities of bacteria in the different gut sections, with the highest number found in colon as expected. From the sequence data it was evident that primer systems targeting lower taxonomical levels, contributed with a higher resolution, revealing species that primer system targeting...

  8. Analysis of high-throughput plant image data with the information system IAP

    Directory of Open Access Journals (Sweden)

    Klukas Christian

    2012-06-01

    Full Text Available This work presents a sophisticated information system, the Integrated Analysis Platform (IAP, an approach supporting large-scale image analysis for different species and imaging systems. In its current form, IAP supports the investigation of Maize, Barley and Arabidopsis plants based on images obtained in different spectra.

  9. Identification and characterization of microRNAs from peanut (Arachis hypogaea L. by high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Xiaoyuan Chi

    Full Text Available BACKGROUND: MicroRNAs (miRNAs are noncoding RNAs of approximately 21 nt that regulate gene expression in plants post-transcriptionally by endonucleolytic cleavage or translational inhibition. miRNAs play essential roles in numerous developmental and physiological processes and many of them are conserved across species. Extensive studies of miRNAs have been done in a few model plants; however, less is known about the diversity of these regulatory RNAs in peanut (Arachis hypogaea L., one of the most important oilseed crops cultivated worldwide. RESULTS: A library of small RNA from peanut was constructed for deep sequencing. In addition to 126 known miRNAs from 33 families, 25 novel peanut miRNAs were identified. The miRNA* sequences of four novel miRNAs were discovered, providing additional evidence for the existence of miRNAs. Twenty of the novel miRNAs were considered to be species-specific because no homolog has been found for other plant species. qRT-PCR was used to analyze the expression of seven miRNAs in different tissues and in seed at different developmental stages and some showed tissue- and/or growth stage-specific expression. Furthermore, potential targets of these putative miRNAs were predicted on the basis of the sequence homology search. CONCLUSIONS: We have identified large numbers of miRNAs and their related target genes through deep sequencing of a small RNA library. This study of the identification and characterization of miRNAs in peanut can initiate further study on peanut miRNA regulation mechanisms, and help toward a greater understanding of the important roles of miRNAs in peanut.

  10. Identification and Characterization of Wilt and Salt Stress-Responsive MicroRNAs in Chickpea through High-Throughput Sequencing

    Science.gov (United States)

    Deokar, Amit Atmaram; Bhardwaj, Ankur R.; Agarwal, Manu; Katiyar-Agarwal, Surekha; Srinivasan, Ramamurthy; Jain, Pradeep Kumar

    2014-01-01

    Chickpea (Cicer arietinum) is the second most widely grown legume worldwide and is the most important pulse crop in the Indian subcontinent. Chickpea productivity is adversely affected by a large number of biotic and abiotic stresses. MicroRNAs (miRNAs) have been implicated in the regulation of plant responses to several biotic and abiotic stresses. This study is the first attempt to identify chickpea miRNAs that are associated with biotic and abiotic stresses. The wilt infection that is caused by the fungus Fusarium oxysporum f.sp. ciceris is one of the major diseases severely affecting chickpea yields. Of late, increasing soil salinization has become a major problem in realizing these potential yields. Three chickpea libraries using fungal-infected, salt-treated and untreated seedlings were constructed and sequenced using next-generation sequencing technology. A total of 12,135,571 unique reads were obtained. In addition to 122 conserved miRNAs belonging to 25 different families, 59 novel miRNAs along with their star sequences were identified. Four legume-specific miRNAs, including miR5213, miR5232, miR2111 and miR2118, were found in all of the libraries. Poly(A)-based qRT-PCR (Quantitative real-time PCR) was used to validate eleven conserved and five novel miRNAs. miR530 was highly up regulated in response to fungal infection, which targets genes encoding zinc knuckle- and microtubule-associated proteins. Many miRNAs responded in a similar fashion under both biotic and abiotic stresses, indicating the existence of cross talk between the pathways that are involved in regulating these stresses. The potential target genes for the conserved and novel miRNAs were predicted based on sequence homologies. miR166 targets a HD-ZIPIII transcription factor and was validated by 5′ RLM-RACE. This study has identified several conserved and novel miRNAs in the chickpea that are associated with gene regulation following exposure to wilt and salt stress. PMID:25295754

  11. A new high-throughput LC-MS method for the analysis of complex fructan mixtures

    DEFF Research Database (Denmark)

    Verspreet, Joran; Hansen, Anders Holmgaard; Dornez, Emmie

    2014-01-01

    In this paper, a new liquid chromatography-mass spectrometry (LC-MS) method for the analysis of complex fructan mixtures is presented. In this method, columns with a trifunctional C18 alkyl stationary phase (T3) were used and their performance compared with that of a porous graphitized carbon (PGC...

  12. CRISPR-Cas9-Edited Site Sequencing (CRES-Seq): An Efficient and High-Throughput Method for the Selection of CRISPR-Cas9-Edited Clones.

    Science.gov (United States)

    Veeranagouda, Yaligara; Debono-Lagneaux, Delphine; Fournet, Hamida; Thill, Gilbert; Didier, Michel

    2018-01-16

    The emergence of clustered regularly interspaced short palindromic repeats-Cas9 (CRISPR-Cas9) gene editing systems has enabled the creation of specific mutants at low cost, in a short time and with high efficiency, in eukaryotic cells. Since a CRISPR-Cas9 system typically creates an array of mutations in targeted sites, a successful gene editing project requires careful selection of edited clones. This process can be very challenging, especially when working with multiallelic genes and/or polyploid cells (such as cancer and plants cells). Here we described a next-generation sequencing method called CRISPR-Cas9 Edited Site Sequencing (CRES-Seq) for the efficient and high-throughput screening of CRISPR-Cas9-edited clones. CRES-Seq facilitates the precise genotyping up to 96 CRISPR-Cas9-edited sites (CRES) in a single MiniSeq (Illumina) run with an approximate sequencing cost of $6/clone. CRES-Seq is particularly useful when multiple genes are simultaneously targeted by CRISPR-Cas9, and also for screening of clones generated from multiallelic genes/polyploid cells. © 2018 by John Wiley & Sons, Inc. Copyright © 2018 John Wiley & Sons, Inc.

  13. High-throughput sequencing of natively paired antibody chains provides evidence for original antigenic sin shaping the antibody response to influenza vaccination.

    Science.gov (United States)

    Tan, Yann-Chong; Blum, Lisa K; Kongpachith, Sarah; Ju, Chia-Hsin; Cai, Xiaoyong; Lindstrom, Tamsin M; Sokolove, Jeremy; Robinson, William H

    2014-03-01

    We developed a DNA barcoding method to enable high-throughput sequencing of the cognate heavy- and light-chain pairs of the antibodies expressed by individual B cells. We used this approach to elucidate the plasmablast antibody response to influenza vaccination. We show that >75% of the rationally selected plasmablast antibodies bind and neutralize influenza, and that antibodies from clonal families, defined by sharing both heavy-chain VJ and light-chain VJ sequence usage, do so most effectively. Vaccine-induced heavy-chain VJ regions contained on average >20 nucleotide mutations as compared to their predicted germline gene sequences, and some vaccine-induced antibodies exhibited higher binding affinities for hemagglutinins derived from prior years' seasonal influenza as compared to their affinities for the immunization strains. Our results show that influenza vaccination induces the recall of memory B cells that express antibodies that previously underwent affinity maturation against prior years' seasonal influenza, suggesting that 'original antigenic sin' shapes the antibody response to influenza vaccination. Published by Elsevier Inc.

  14. High-throughput phenotyping allows for QTL analysis of defense, symbiosis and development-related traits

    DEFF Research Database (Denmark)

    Hansen, Nina Eberhardtsen

    -throughput phenotyping of whole plants. Additionally, a system for automated confocal microscopy aiming at automated detection of infection thread formation as well as detection of lateral root and nodule primordia is being developed. The objective was to use both systems in genome wide association studies and mutant...... the analysis. Additional phenotyping of defense mutants revealed that MLO, which confers susceptibility towards Blumeria graminis in barley, is also a prime candidate for a S. trifoliorum susceptibility gene in Lotus....

  15. metaBIT, an integrative and automated metagenomic pipeline for analysing microbial profiles from high-throughput sequencing shotgun data

    DEFF Research Database (Denmark)

    Louvel, Guillaume; Der Sarkissian, Clio; Hanghøj, Kristian Ebbesen

    2016-01-01

    -throughput DNA sequencing (HTS). Here, we develop metaBIT, an open-source computational pipeline automatizing routine microbial profiling of shotgun HTS data. Customizable by the user at different stringency levels, it performs robust taxonomy-based assignment and relative abundance calculation of microbial taxa......, as well as cross-sample statistical analyses of microbial diversity distributions. We demonstrate the versatility of metaBIT within a range of published HTS data sets sampled from the environment (soil and seawater) and the human body (skin and gut), but also from archaeological specimens. We present......-friendly profiling of the microbial DNA present in HTS shotgun data sets. The applications of metaBIT are vast, from monitoring of laboratory errors and contaminations, to the reconstruction of past and present microbiota, and the detection of candidate species, including pathogens....

  16. Big data scalability for high throughput processing and analysis of vehicle engineering data

    OpenAIRE

    Lu, Feng

    2017-01-01

    "Sympathy for Data" is a platform that is utilized for Big Data automation analytics. It is based on visual interface and workflow configurations. The main purpose of the platform is to reuse parts of code for structured analysis of vehicle engineering data. However, there are some performance issues on a single machine for processing a large amount of data in Sympathy for Data. There are also disk and CPU IO intensive issues when the data is oversized and the platform need fits comfortably i...

  17. Multichannel microscale system for high throughput preparative separation with comprehensive collection and analysis

    Energy Technology Data Exchange (ETDEWEB)

    Karger, Barry L.; Kotler, Lev; Foret, Frantisek; Minarik, Marek; Kleparnik, Karel

    2003-12-09

    A modular multiple lane or capillary electrophoresis (chromatography) system that permits automated parallel separation and comprehensive collection of all fractions from samples in all lanes or columns, with the option of further on-line automated sample fraction analysis, is disclosed. Preferably, fractions are collected in a multi-well fraction collection unit, or plate (40). The multi-well collection plate (40) is preferably made of a solvent permeable gel, most preferably a hydrophilic, polymeric gel such as agarose or cross-linked polyacrylamide.

  18. Identification of miRNAs Responsive to Botrytis cinerea in Herbaceous Peony (Paeonia lactiflora Pall. by High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Daqiu Zhao

    2015-09-01

    Full Text Available Herbaceous peony (Paeonia lactiflora Pall., one of the world’s most important ornamental plants, is highly susceptible to Botrytis cinerea, and improving resistance to this pathogenic fungus is a problem yet to be solved. MicroRNAs (miRNAs play an essential role in resistance to B. cinerea, but until now, no studies have been reported concerning miRNAs induction in P. lactiflora. Here, we constructed and sequenced two small RNA (sRNA libraries from two B. cinerea-infected P. lactiflora cultivars (“Zifengyu” and “Dafugui” with significantly different levels of resistance to B. cinerea, using the Illumina HiSeq 2000 platform. From the raw reads generated, 4,592,881 and 5,809,796 sRNAs were obtained, and 280 and 306 miRNAs were identified from “Zifengyu” and “Dafugui”, respectively. A total of 237 conserved and 7 novel sequences of miRNAs were differentially expressed between the two cultivars, and we predicted and annotated their potential target genes. Subsequently, 7 differentially expressed candidate miRNAs were screened according to their target genes annotated in KEGG pathways, and the expression patterns of miRNAs and corresponding target genes were elucidated. We found that miR5254, miR165a-3p, miR3897-3p and miR6450a might be involved in the P. lactiflora response to B. cinerea infection. These results provide insight into the molecular mechanisms responsible for resistance to B. cinerea in P. lactiflora.

  19. High-throughput sequencing identification and characterization of potentially adhesion-related small RNAs in Streptococcus mutans.

    Science.gov (United States)

    Zhu, Wenhui; Liu, Shanshan; Liu, Jia; Zhou, Yan; Lin, Huancai

    2018-05-01

    Adherence capacity is one of the principal virulence factors of Streptococcus mutans, and adhesion virulence factors are controlled by small RNAs (sRNAs) at the post-transcriptional level in various bacteria. Here, we aimed to identify and decipher putative adhesion-related sRNAs in clinical strains of S. mutans. RNA deep-sequencing was performed to identify potential sRNAs under different adhesion conditions. The expression of sRNAs was analysed by quantitative real-time PCR (qRT-PCR), and bioinformatic methods were used to predict the functional characteristics of sRNAs. A total of 736 differentially expressed candidate sRNAs were predicted, and these included 352 sRNAs located on the antisense to mRNA (AM) and 384 sRNAs in intergenic regions (IGRs). The top 7 differentially expressed sRNAs were successfully validated by qRT-PCR in UA159, and 2 of these were further confirmed in 100 clinical isolates. Moreover, the sequences of two sRNAs were conserved in other Streptococcus species, indicating a conserved role in such closely related species. A good correlation between the expression of sRNAs and the adhesion of 100 clinical strains was observed, which, combined with GO and KEGG, provides a perspective for the comprehension of sRNA function annotation. This study revealed a multitude of novel putative adhesion-related sRNAs in S. mutans and contributed to a better understanding of information concerning the transcriptional regulation of adhesion in S. mutans.

  20. microRNA Biomarker Discovery and High-Throughput DNA Sequencing Are Possible Using Long-term Archived Serum Samples.

    Science.gov (United States)

    Rounge, Trine B; Lauritzen, Marianne; Langseth, Hilde; Enerly, Espen; Lyle, Robert; Gislefoss, Randi E

    2015-09-01

    The impacts of long-term storage and varying preanalytical factors on the quality and quantity of DNA and miRNA from archived serum have not been fully assessed. Preanalytical and analytical variations and degradation may introduce bias in representation of DNA and miRNA and may result in loss or corruption of quantitative data. We have evaluated DNA and miRNA quantity, quality, and variability in samples stored up to 40 years using one of the oldest prospective serum collections in the world, the Janus Serumbank, a biorepository dedicated to cancer research. miRNAs are present and stable in archived serum samples frozen at -25°C for at least 40 years. Long-time storage did not reduce miRNA yields; however, varying preanalytical conditions had a significant effect and should be taken into consideration during project design. Of note, 500 μL serum yielded sufficient miRNA for qPCR and small RNA sequencing and on average 650 unique miRNAs were detected in samples from presumably healthy donors. Of note, 500 μL serum yielded sufficient DNA for whole-genome sequencing and subsequent SNP calling, giving a uniform representation of the genomes. DNA and miRNA are stable during long-term storage, making large prospectively collected serum repositories an invaluable source for miRNA and DNA biomarker discovery. Large-scale biomarker studies with long follow-up time are possible utilizing biorepositories with archived serum and state-of-the-art technology. ©2015 American Association for Cancer Research.

  1. Assessing the impact of water treatment on bacterial biofilms in drinking water distribution systems using high-throughput DNA sequencing.

    Science.gov (United States)

    Shaw, Jennifer L A; Monis, Paul; Fabris, Rolando; Ho, Lionel; Braun, Kalan; Drikas, Mary; Cooper, Alan

    2014-12-01

    Biofilm control in drinking water distribution systems (DWDSs) is crucial, as biofilms are known to reduce flow efficiency, impair taste and quality of drinking water and have been implicated in the transmission of harmful pathogens. Microorganisms within biofilm communities are more resistant to disinfection compared to planktonic microorganisms, making them difficult to manage in DWDSs. This study evaluates the impact of four unique drinking water treatments on biofilm community structure using metagenomic DNA sequencing. Four experimental DWDSs were subjected to the following treatments: (1) conventional coagulation, (2) magnetic ion exchange contact (MIEX) plus conventional coagulation, (3) MIEX plus conventional coagulation plus granular activated carbon, and (4) membrane filtration (MF). Bacterial biofilms located inside the pipes of each system were sampled under sterile conditions both (a) immediately after treatment application ('inlet') and (b) at a 1 km distance from the treatment application ('outlet'). Bacterial 16S rRNA gene sequencing revealed that the outlet biofilms were more diverse than those sampled at the inlet for all treatments. The lowest number of unique operational taxonomic units (OTUs) and lowest diversity was observed in the MF inlet. However, the MF system revealed the greatest increase in diversity and OTU count from inlet to outlet. Further, the biofilm communities at the outlet of each system were more similar to one another than to their respective inlet, suggesting that biofilm communities converge towards a common established equilibrium as distance from treatment application increases. Based on the results, MF treatment is most effective at inhibiting biofilm growth, but a highly efficient post-treatment disinfection regime is also critical in order to prevent the high rates of post-treatment regrowth. Copyright © 2014 Elsevier Ltd. All rights reserved.

  2. PlantCV v2: Image analysis software for high-throughput plant phenotyping

    Directory of Open Access Journals (Sweden)

    Malia A. Gehan

    2017-12-01

    Full Text Available Systems for collecting image data in conjunction with computer vision techniques are a powerful tool for increasing the temporal resolution at which plant phenotypes can be measured non-destructively. Computational tools that are flexible and extendable are needed to address the diversity of plant phenotyping problems. We previously described the Plant Computer Vision (PlantCV software package, which is an image processing toolkit for plant phenotyping analysis. The goal of the PlantCV project is to develop a set of modular, reusable, and repurposable tools for plant image analysis that are open-source and community-developed. Here we present the details and rationale for major developments in the second major release of PlantCV. In addition to overall improvements in the organization of the PlantCV project, new functionality includes a set of new image processing and normalization tools, support for analyzing images that include multiple plants, leaf segmentation, landmark identification tools for morphometrics, and modules for machine learning.

  3. FIM imaging and FIMtrack: two new tools allowing high-throughput and cost effective locomotion analysis.

    Science.gov (United States)

    Risse, Benjamin; Otto, Nils; Berh, Dimitri; Jiang, Xiaoyi; Klämbt, Christian

    2014-12-24

    The analysis of neuronal network function requires a reliable measurement of behavioral traits. Since the behavior of freely moving animals is variable to a certain degree, many animals have to be analyzed, to obtain statistically significant data. This in turn requires a computer assisted automated quantification of locomotion patterns. To obtain high contrast images of almost translucent and small moving objects, a novel imaging technique based on frustrated total internal reflection called FIM was developed. In this setup, animals are only illuminated with infrared light at the very specific position of contact with the underlying crawling surface. This methodology results in very high contrast images. Subsequently, these high contrast images are processed using established contour tracking algorithms. Based on this, we developed the FIMTrack software, which serves to extract a number of features needed to quantitatively describe a large variety of locomotion characteristics. During the development of this software package, we focused our efforts on an open source architecture allowing the easy addition of further modules. The program operates platform independent and is accompanied by an intuitive GUI guiding the user through data analysis. All locomotion parameter values are given in form of csv files allowing further data analyses. In addition, a Results Viewer integrated into the tracking software provides the opportunity to interactively review and adjust the output, as might be needed during stimulus integration. The power of FIM and FIMTrack is demonstrated by studying the locomotion of Drosophila larvae.

  4. μTAS (micro total analysis systems) for the high-throughput measurement of nanomaterial solubility

    International Nuclear Information System (INIS)

    Tantra, R; Jarman, J

    2013-01-01

    There is a consensus in the nanoecotoxicology community that better analytical tools i.e. faster and more accurate ones, are needed for the physicochemical characterisation of nanomaterials in environmentally/biologically relevant media. In this study, we introduce the concept of μTAS (Micro Total Analysis Systems), which was a term coined to encapsulate the integration of laboratory processes on a single microchip. Our focus here is on the use of a capillary electrophoresis (CE) with conductivity detection microchip and how this may be used for the measurement of dissolution of metal oxide nanomaterials. Our preliminary results clearly show promise in that the device is able to: a) measure ionic zinc in various ecotox media with high selectivity b) track the dynamic dissolution events of zinc oxide (ZnO) nanomaterial when dispersed in fish medium.