WorldWideScience

Sample records for quantitative deep sequencing

  1. De novo peptide sequencing by deep learning.

    Science.gov (United States)

    Tran, Ngoc Hieu; Zhang, Xianglilan; Xin, Lei; Shan, Baozhen; Li, Ming

    2017-07-18

    De novo peptide sequencing from tandem MS data is the key technology in proteomics for the characterization of proteins, especially for new sequences, such as mAbs. In this study, we propose a deep neural network model, DeepNovo, for de novo peptide sequencing. The DeepNovo architecture combines recent advances in convolutional neural networks and recurrent neural networks to learn features of tandem mass spectra, fragment ions, and sequence patterns of peptides. The networks are further integrated with local dynamic programming to solve the complex optimization task of de novo sequencing. We evaluated the method on a wide variety of species and found that DeepNovo considerably outperformed state-of-the-art methods, achieving 7.7-22.9% higher accuracy at the amino acid level and 38.1-64.0% higher accuracy at the peptide level. We further used DeepNovo to automatically reconstruct the complete sequences of antibody light and heavy chains of mouse, achieving 97.5-100% coverage and 97.2-99.5% accuracy, without assisting databases. Moreover, DeepNovo is retrainable to adapt to any source of data and provides a complete end-to-end training and prediction solution to the de novo sequencing problem. Not only does our study extend the deep learning revolution to a new field, but it also shows an innovative approach to solving optimization problems by using deep learning and dynamic programming.

  2. A quantitative lubricant test for deep drawing

    DEFF Research Database (Denmark)

    Olsson, David Dam; Bay, Niels; Andreasen, Jan L.

    2010-01-01

    A tribological test for deep drawing has been developed by which the performance of lubricants may be evaluated quantitatively by measuring the maximum backstroke force on the punch owing to friction between tool and workpiece surface. The forming force is found not to give useful information...

  3. NGS-based deep bisulfite sequencing.

    Science.gov (United States)

    Lee, Suman; Kim, Joomyeong

    2016-01-01

    We have developed an NGS-based deep bisulfite sequencing protocol for the DNA methylation analysis of genomes. This approach allows the rapid and efficient construction of NGS-ready libraries with a large number of PCR products that have been individually amplified from bisulfite-converted DNA. This approach also employs a bioinformatics strategy to sort the raw sequence reads generated from NGS platforms and subsequently to derive DNA methylation levels for individual loci. The results demonstrated that this NGS-based deep bisulfite sequencing approach provides not only DNA methylation levels but also informative DNA methylation patterns that have not been seen through other existing methods.

    • This protocol provides an efficient method for generating NGS-ready libraries from individually amplified PCR products.
    • This protocol provides a bioinformatics strategy for sorting NGS-derived raw sequence reads.
    • This protocol provides deep bisulfite sequencing results that can measure DNA methylation levels and patterns of individual loci.

  4. Quantitative Prediction for Deep Mineral Exploration

    Institute of Scientific and Technical Information of China (English)

    Zhao Pengda; Cheng Qiuming; Xia Qinglin

    2008-01-01

    On reviewing the characteristics of deep mineral exploration, this article elaborates on the necessity of employing quantitative prediction to reduce uncertainty. This uncertainty arises from the complexity of mineral deposit formational environments and mineralization systems, which grows with exploration depth, and from the incompleteness of geo-information obtained from limited direct observation. The authors propose a "seeking difference" principle in addition to the "similar analogy" principle in deep mineral exploration, with particular focus on new ores at depth, either in an area with discovered shallow mineral deposits or in new areas where there are not enough mineral deposit models for comparison. An ongoing research project, involving quantitative prediction of Sn and Cu mineral deposits in the Gejiu (个旧) area of Yunnan (云南) Province, China, is briefly introduced to demonstrate how the "three-component" (geoanomaly-mineralization diversity-mineral deposit spectrum) theory and a series of non-linear methods, in conjunction with advanced GIS technology, can be applied in multi-scale and multi-task deep mineral prospecting and quantitative mineral resource assessment.

  5. Approaching marine bioprospecting in hexacorals by RNA deep sequencing.

    Science.gov (United States)

    Johansen, Steinar D; Emblem, Ase; Karlsen, Bård Ove; Okkenhaug, Siri; Hansen, Hilde; Moum, Truls; Coucheron, Dag H; Seternes, Ole Morten

    2010-07-31

    RNA deep sequencing represents a new complementary approach in marine bioprospecting. Next-generation sequencing platforms have recently been developed for de novo whole transcriptome analysis, small RNA discovery and gene expression profiling. Deep sequencing transcriptomics (sequencing the complete set of cellular transcripts at a specific stage or condition) leads to sequential identification of all expressed genes in a sample. When combined with high-throughput bioinformatics and protein synthesis, RNA deep sequencing represents a powerful new approach in gene product discovery and bioprospecting. Here we summarize recent progress in the analyses of hexacoral transcriptomes with the focus on cold-water sea anemones and related organisms.

  6. Deep sequencing: becoming a critical tool in clinical virology.

    Science.gov (United States)

    Quiñones-Mateu, Miguel E; Avila, Santiago; Reyes-Teran, Gustavo; Martinez, Miguel A

    2014-09-01

    Population (Sanger) sequencing has been the standard method in basic and clinical DNA sequencing for almost 40 years; however, next-generation (deep) sequencing methodologies are now revolutionizing the field of genomics, and clinical virology is no exception. Deep sequencing is highly efficient, producing an enormous amount of information at low cost in a relatively short period of time. High-throughput sequencing techniques have enabled significant contributions to multiple areas in virology, including virus discovery and metagenomics (viromes), molecular epidemiology, pathogenesis, and studies of how viruses escape the host immune system and antiviral pressures. In addition, new and more affordable deep sequencing-based assays are now being implemented in clinical laboratories. Here, we review the use of the current deep sequencing platforms in virology, focusing on three of the most studied viruses: human immunodeficiency virus (HIV), hepatitis C virus (HCV), and influenza virus. Copyright © 2014 Elsevier B.V. All rights reserved.

  7. Geoseq: a tool for dissecting deep-sequencing datasets

    OpenAIRE

    Homann Robert; George Ajish; Levovitz Chaya; Shah Hardik; Cancio Anthony; Gurtowski James; Sachidanandam Ravi

    2010-01-01

    Background: Datasets generated on deep-sequencing platforms have been deposited in various public repositories such as the Gene Expression Omnibus (GEO), Sequence Read Archive (SRA) hosted by the NCBI, or the DNA Data Bank of Japan (DDBJ). Despite being rich data sources, they have not been used much due to the difficulty in locating and analyzing datasets of interest. Results: Geoseq (http://geoseq.mssm.edu) provides a new method of analyzing short reads from deep sequencing experiments...

  8. Development of a quantitative lubricant test for deep drawing

    DEFF Research Database (Denmark)

    Olsson, David Dam; Bay, Niels; Andreasen, Jan Lasson

    2004-01-01

    A tribological test for deep drawing has been developed by which the performance of lubricants may be evaluated quantitatively by measuring the maximum backstroke force on the punch due to sliding friction between tool and work piece surface. The forming force is found not to give useful information...

  10. Quantitative photoacoustic image reconstruction improves accuracy in deep tissue structures.

    Science.gov (United States)

    Mastanduno, Michael A; Gambhir, Sanjiv S

    2016-10-01

    Photoacoustic imaging (PAI) is emerging as a potentially powerful imaging tool with multiple applications. Image reconstruction for PAI has been relatively limited because of limited or no modeling of light delivery to deep tissues. This work demonstrates a numerical approach to quantitative photoacoustic image reconstruction that minimizes depth and spectrally derived artifacts. We present the first time-domain quantitative photoacoustic image reconstruction algorithm that models optical sources through acoustic data to create quantitative images of absorption coefficients. We demonstrate quantitative accuracy of less than 5% error in large 3 cm diameter 2D geometries with multiple targets and within 22% error in the largest size quantitative photoacoustic studies to date (6 cm diameter). We extend the algorithm to spectral data, reconstructing 6 varying chromophores to within 17% of the true values. This quantitative PA tomography method was able to improve considerably on filtered-back projection from the standpoint of image quality and of absolute and relative quantification in all our simulation geometries. We characterize the effects of time step size, initial guess, and source configuration on final accuracy. This work could help to generate accurate quantitative images from both endogenous absorbers and exogenous photoacoustic dyes in both preclinical and clinical work, thereby increasing the information content obtained especially from deep-tissue photoacoustic imaging studies.

  11. Measuring cation dependent DNA polymerase fidelity landscapes by deep sequencing.

    Directory of Open Access Journals (Sweden)

    Bradley Michael Zamft

    High-throughput recording of signals embedded within inaccessible micro-environments is a technological challenge. The ideal recording device would be a nanoscale machine capable of quantitatively transducing a wide range of variables into a molecular recording medium suitable for long-term storage and facile readout in the form of digital data. We have recently proposed such a device, in which cation concentrations modulate the misincorporation rate of a DNA polymerase (DNAP) on a known template, allowing DNA sequences to encode information about the local cation concentration. In this work we quantify the cation sensitivity of DNAP misincorporation rates, making possible the indirect readout of cation concentration by DNA sequencing. Using multiplexed deep sequencing, we quantify the misincorporation properties of two DNA polymerases, Dpo4 and Klenow exo(-), obtaining the probability and base selectivity of misincorporation at all positions within the template. We find that Dpo4 acts as a DNA recording device for Mn(2+) with a misincorporation rate gain of ∼2%/mM. This modulation of misincorporation rate is selective to the template base: the probability of misincorporation on template T by Dpo4 increases >50-fold over the range tested, while the other template bases are affected less strongly. Furthermore, cation concentrations act as scaling factors for misincorporation: on a given template base, Mn(2+) and Mg(2+) change the overall misincorporation rate but do not alter the relative frequencies of incoming misincorporated nucleotides. Characterization of the ion dependence of DNAP misincorporation serves as the first step towards repurposing it as a molecular recording device.
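
    The readout idea in this abstract can be illustrated with a minimal sketch. It assumes a linear response in which the reported gain of roughly 2%/mM is read as two percentage points of misincorporation rate per mM of Mn(2+); the abstract does not state the functional form. The baseline rate, the read counts, and both function names are hypothetical.

```python
def misincorporation_rate(mismatches, total_bases):
    """Fraction of sequenced template positions that were read as a mismatch."""
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    return mismatches / total_bases


def estimate_cation_mm(rate, baseline_rate, gain_per_mm=0.02):
    """Invert an ASSUMED linear model: rate = baseline_rate + gain_per_mm * concentration.

    gain_per_mm=0.02 encodes the ~2%/mM figure as an absolute gain (an assumption).
    Rates below the baseline are clamped to a concentration of zero.
    """
    return max(0.0, (rate - baseline_rate) / gain_per_mm)


# Hypothetical counts: 700 mismatches observed over 10,000 sequenced template bases.
rate = misincorporation_rate(mismatches=700, total_bases=10_000)
concentration = estimate_cation_mm(rate, baseline_rate=0.01)
print(f"rate={rate:.3f}, estimated concentration={concentration:.2f} mM")
```

    Because the abstract reports that the effect depends on the template base, a fuller version would probably fit one baseline and one gain per template base rather than a single pair.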

  12. Genome-scale validation of deep-sequencing libraries.

    Directory of Open Access Journals (Sweden)

    Dominic Schmidt

    Chromatin immunoprecipitation followed by high-throughput (HTP) sequencing (ChIP-seq) is a powerful tool to establish protein-DNA interactions genome-wide. The primary limitation of its broad application at present is the often-limited access to sequencers. Here we report a protocol, Mab-seq, that generates genome-scale quality evaluations for nucleic acid libraries intended for deep-sequencing. We show how commercially available genomic microarrays can be used to maximize the efficiency of library creation and quickly generate reliable preliminary data on a chromosomal scale in advance of deep sequencing. We also exploit this technique to compare enriched regions identified using microarrays with those identified by sequencing, demonstrating that they agree on a core set of clearly identified enriched regions, while characterizing the additional enriched regions identifiable using HTP sequencing.

  13. Deep Sequencing Analysis of Apple Infecting Viruses in Korea

    OpenAIRE

    In-Sook Cho; Davaajargal Igori; Seungmo Lim; Gug-Seoun Choi; John Hammond; Hyoun-Sub Lim; Jae Sun Moon

    2016-01-01

    Deep sequencing generated 52 contigs derived from five viruses: Apple chlorotic leaf spot virus (ACLSV), Apple stem grooving virus (ASGV), Apple stem pitting virus (ASPV), Apple green crinkle associated virus (AGCaV), and Apricot latent virus (ApLV), which were identified from eight apple samples showing small leaves and/or growth retardation. Nucleotide (nt) sequence identity of the assembled contigs ranged from 68% to 99% compared to the reference sequences of the five respective viral genomes. S...

  14. Geoseq: a tool for dissecting deep-sequencing datasets

    Directory of Open Access Journals (Sweden)

    Homann Robert

    2010-10-01

    Background: Datasets generated on deep-sequencing platforms have been deposited in various public repositories such as the Gene Expression Omnibus (GEO), Sequence Read Archive (SRA) hosted by the NCBI, or the DNA Data Bank of Japan (DDBJ). Despite being rich data sources, they have not been used much due to the difficulty in locating and analyzing datasets of interest. Results: Geoseq (http://geoseq.mssm.edu) provides a new method of analyzing short reads from deep sequencing experiments. Instead of mapping the reads to reference genomes or sequences, Geoseq maps a reference sequence against the sequencing data. It is web-based, and holds pre-computed data from public libraries. The analysis reduces the input sequence to tiles and measures the coverage of each tile in a sequence library through the use of suffix arrays. The user can upload custom target sequences or use gene/miRNA names for the search and get back results as plots and spreadsheet files. Geoseq organizes the public sequencing data using a controlled vocabulary, allowing identification of relevant libraries by organism, tissue and type of experiment. Conclusions: Analysis of small sets of sequences against deep-sequencing datasets, as well as identification of public datasets of interest, is simplified by Geoseq. We applied Geoseq to (a) identify differential isoform expression in mRNA-seq datasets, (b) identify miRNAs (microRNAs) in libraries and identify mature and star sequences in miRNAs, and (c) identify potentially mis-annotated miRNAs. The ease of using Geoseq for these analyses suggests its utility and uniqueness as an analysis tool.
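
    The tile-and-count step described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not Geoseq's implementation: the suffix array is built naively, the tiles are non-overlapping, reads are joined with a "$" separator so that no tile can match across two reads, and all function names and sequences are hypothetical.

```python
from bisect import bisect_left, bisect_right


def build_suffix_array(text):
    """Start positions of all suffixes of text, in sorted suffix order (naive build)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_occurrences(text, suffix_array, pattern):
    """Count occurrences of pattern by binary search over the sorted suffixes.

    Truncating sorted suffixes to len(pattern) keeps them in order, so the
    matches form one contiguous run. The prefix list is materialized here for
    clarity; a real implementation would search the suffix array directly.
    """
    prefixes = [text[i:i + len(pattern)] for i in suffix_array]
    return bisect_right(prefixes, pattern) - bisect_left(prefixes, pattern)


def tile_coverage(reference, reads, tile_size):
    """Cut the reference into non-overlapping tiles and count each tile in the library."""
    library = "$".join(reads)  # separator never occurs in a nucleotide tile
    suffix_array = build_suffix_array(library)
    last_start = len(reference) - tile_size
    tiles = [reference[i:i + tile_size] for i in range(0, last_start + 1, tile_size)]
    return [(tile, count_occurrences(library, suffix_array, tile)) for tile in tiles]


coverage = tile_coverage("ACGTACGTTTGA", ["ACGTAC", "GGACGT", "TTTGAC"], tile_size=4)
print(coverage)
```

    Indexing the library once and querying it with many small targets matches the direction of mapping the abstract describes, where the reference is mapped against the sequencing data rather than the reverse.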

  15. Deep sequencing in the management of hepatitis virus infections.

    Science.gov (United States)

    Quer, Josep; Rodríguez-Frias, Francisco; Gregori, Josep; Tabernero, David; Soria, Maria Eugenia; García-Cehic, Damir; Homs, Maria; Bosch, Albert; Pintó, Rosa María; Esteban, Juan Ignacio; Domingo, Esteban; Perales, Celia

    2016-12-28

    The hepatitis viruses represent a major public health problem worldwide. Procedures for characterization of the genomic composition of their populations, accurate diagnosis, identification of multiple infections, and information on inhibitor-escape mutants for treatment decisions are needed. Deep sequencing methodologies are extremely useful for these viruses since they replicate as complex and dynamic quasispecies swarms whose complexity and mutant composition are biologically relevant traits. Population complexity is a major challenge for disease prevention and control, but also an opportunity to distinguish among related but phenotypically distinct variants that might anticipate disease progression and treatment outcome. Detailed characterization of mutant spectra should permit choosing better treatment options, given the increasing number of new antiviral inhibitors available. In the present review we briefly summarize our experience on the use of deep sequencing for the management of hepatitis virus infections, particularly for hepatitis B and C viruses, and outline some possible new applications of deep sequencing for these important human pathogens.

  16. Quantitative biostratigraphy and species evolutionary sequence

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Introducing species evolutionary sequences into quantitative biostratigraphy is significant both for studying biologic evolution and for making stratigraphic correlations and reconstructing geologic history. Quantitative biostratigraphy determines biostratigraphic event sequences by using probabilistic analysis. Evolutionary sequence systematics can efficiently ascertain species evolutionary sequences. Two methods have been proposed to determine the sequence of species-disappearance events: (1) species extinction events can be closed by last occurrence events using quantitative biostratigraphic analysis; (2) the duration of a species may be approximately replaced by the duration of its parent species. Combining these two methods is currently the best way to determine the sequence of species disappearance. A consulting standard sequence that consists of the speciation sequence of Permian waagenophylloid corals and the biostratigraphic event sequence of other important Permian fossils is used as an example. The group Spearman rank-correlation test is used to test the consulting standard sequence by comparing four types of calculations and two kinds of sequences and to find abnormal events. Based on the abnormal events found in the test, the consulting standard sequence is revised to deal with different conditions. Sequences of speciation and species disappearance, and species durations, are determined. Application of species evolutionary sequences to quantitative biostratigraphy can largely improve the quality of biostratigraphic event sequences. Furthermore, in stratigraphic correlation, event sequences have higher precision than range biozones.
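
    As an illustration of comparing event sequences by rank correlation, the sketch below computes the ordinary Spearman coefficient between two orderings of the same events. It is not the "group" variant of the test named in the abstract, whose details are not given there; the event labels and the function name are hypothetical, and ties are not handled.

```python
def spearman_rho(order_a, order_b):
    """Spearman rank correlation between two orderings of the same events (no ties)."""
    if len(set(order_a)) != len(order_a) or len(order_a) != len(order_b):
        raise ValueError("orderings must have equal length and no repeated events")
    if set(order_a) != set(order_b):
        raise ValueError("both orderings must contain the same events")
    n = len(order_a)
    if n < 2:
        raise ValueError("at least two events are required")
    rank_b = {event: rank for rank, event in enumerate(order_b)}
    # Sum of squared rank differences, then the classic untied formula.
    d_squared = sum((rank_a - rank_b[event]) ** 2 for rank_a, event in enumerate(order_a))
    return 1 - 6 * d_squared / (n * (n * n - 1))


standard = ["E1", "E2", "E3", "E4", "E5"]   # hypothetical standard sequence
observed = ["E1", "E3", "E2", "E4", "E5"]   # same events, two of them swapped
rho = spearman_rho(standard, observed)
print(rho)
```

    Events whose individual rank differences are large are natural candidates for the abnormal events that the abstract says are identified and then used to revise the standard sequence.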

  17. Preparing DNA libraries for multiplexed paired-end deep sequencing for Illumina GA sequencers.

    Science.gov (United States)

    Son, Mike S; Taylor, Ronald K

    2011-02-01

    Whole-genome sequencing, also known as deep sequencing, is becoming a more affordable and efficient way to identify SNP mutations, deletions, and insertions in DNA sequences across several different strains. Two major obstacles preventing the widespread use of deep sequencers are the costs involved in services used to prepare DNA libraries for sequencing and the overall accuracy of the sequencing data. This unit describes the preparation of DNA libraries for multiplexed paired-end sequencing using the Illumina GA series sequencer. Self-preparation of DNA libraries can help reduce overall expenses, especially if optimization is required for the different samples, and use of the Illumina GA Sequencer can improve the quality of the data.

  18. Deep-Sea, Deep-Sequencing: Metabarcoding Extracellular DNA from Sediments of Marine Canyons.

    Directory of Open Access Journals (Sweden)

    Magdalena Guardiola

    Marine sediments are home to one of the richest species pools on Earth, but logistics and a dearth of taxonomic workforce hinder the knowledge of their biodiversity. We characterized α- and β-diversity of deep-sea assemblages from submarine canyons in the western Mediterranean using environmental DNA metabarcoding. We used a new primer set targeting a short eukaryotic 18S sequence (ca. 110 bp). We applied a protocol designed to obtain extractions enriched in extracellular DNA from replicated sediment corers. With this strategy we captured information from DNA (local or deposited from the water column) that persists adsorbed to inorganic particles and buffered short-term spatial and temporal heterogeneity. We analysed replicated samples from 20 localities including 2 deep-sea canyons, 1 shallower canal, and two open slopes (depth range 100-2,250 m). We identified 1,629 MOTUs, among which the dominant groups were Metazoa (with representatives of 19 phyla), Alveolata, Stramenopiles, and Rhizaria. There was a marked small-scale heterogeneity as shown by differences in replicates within corers and within localities. The spatial variability between canyons was significant, as was the depth component in one of the canyons where it was tested. Likewise, the composition of the first layer (1 cm) of sediment was significantly different from deeper layers. We found that qualitative (presence-absence) and quantitative (relative number of reads) data showed consistent trends of differentiation between samples and geographic areas. The subset of exclusively benthic MOTUs showed similar patterns of β-diversity and community structure as the whole dataset. Separate analyses of the main metazoan phyla (in number of MOTUs) showed some differences in distribution attributable to different lifestyles. Our results highlight the differentiation that can be found even between geographically close assemblages, and set the ground for future monitoring and conservation
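
    The contrast between qualitative and quantitative data mentioned in this abstract can be illustrated with two common dissimilarity measures. The abstract does not say which indices were used, so the choice of Jaccard (presence-absence) and Bray-Curtis on relative read abundances is an assumption, as are the MOTU labels, the read counts, and the function names.

```python
def jaccard_dissimilarity(sample_a, sample_b):
    """Presence-absence dissimilarity: 1 - shared MOTUs / all MOTUs seen in either sample."""
    present_a = {motu for motu, reads in sample_a.items() if reads > 0}
    present_b = {motu for motu, reads in sample_b.items() if reads > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1 - len(present_a & present_b) / len(union)


def bray_curtis_dissimilarity(sample_a, sample_b):
    """Bray-Curtis on relative read abundances, which reduces to 1 - sum of minima."""
    total_a = sum(sample_a.values())
    total_b = sum(sample_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("each sample needs at least one read")
    motus = set(sample_a) | set(sample_b)
    shared = sum(
        min(sample_a.get(motu, 0) / total_a, sample_b.get(motu, 0) / total_b)
        for motu in motus
    )
    return 1 - shared


# Hypothetical read counts per MOTU in two sediment samples.
canyon = {"MOTU_1": 80, "MOTU_2": 20}
slope = {"MOTU_1": 10, "MOTU_2": 30, "MOTU_3": 60}
qualitative = jaccard_dissimilarity(canyon, slope)
quantitative = bray_curtis_dissimilarity(canyon, slope)
print(qualitative, quantitative)
```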

  19. Deep sequencing analysis of phage libraries using Illumina platform.

    Science.gov (United States)

    Matochko, Wadim L; Chu, Kiki; Jin, Bingjie; Lee, Sam W; Whitesides, George M; Derda, Ratmir

    2012-09-01

    This paper presents an analysis of phage-displayed libraries of peptides using Illumina. We describe steps for the preparation of short DNA fragments for deep sequencing and MATLAB software for the analysis of the results. Screening of peptide libraries displayed on the surface of bacteriophage (phage display) can be used to discover peptides that bind to any target. The key step in this discovery is the analysis of peptide sequences present in the library. This analysis is usually performed by Sanger sequencing, which is labor intensive and limited to examination of a few hundred phage clones. On the other hand, Illumina deep-sequencing technology can characterize over 10(7) reads in a single run. We applied Illumina sequencing to analyze phage libraries. Using PCR, we isolated the variable regions from M13KE phage vectors from a phage display library. The PCR primers contained (i) sequences flanking the variable region, (ii) barcodes, and (iii) variable 5'-terminal region. We used this approach to examine how diversity of peptides in phage display libraries changes as a result of amplification of libraries in bacteria. Using HiSeq single-end Illumina sequencing of these fragments, we acquired over 2×10(7) reads, 57 base pairs (bp) in length. Each read contained information about the barcode (6bp), one complementary region (12bp) and a variable region (36bp). We applied this sequencing to a model library of 10(6) unique clones and observed that amplification enriches ∼150 clones, which dominate ∼20% of the library. Deep sequencing, for the first time, characterized the collapse of diversity in phage libraries. The results suggest that screens based on repeated amplification and small-scale sequencing identify a few binding clones and miss thousands of useful clones. The deep sequencing approach described here could identify under-represented clones in phage screens. It could also be instrumental in developing new screening strategies, which can preserve
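
    The kind of summary behind the statement that a small set of clones dominates a share of the library can be sketched in a few lines. The example is in Python rather than the MATLAB software the abstract mentions, and it is not the authors' code; the sequences and the function name are hypothetical, and the input is assumed to be variable regions already extracted from the reads.

```python
from collections import Counter


def top_clone_share(variable_regions, top_n):
    """Fraction of all reads contributed by the top_n most abundant sequences."""
    counts = Counter(variable_regions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    top_reads = sum(count for _, count in counts.most_common(top_n))
    return top_reads / total


# Toy library: ten reads, four distinct clones, one of them strongly enriched.
reads = ["AAA"] * 6 + ["CCC"] * 2 + ["GGG", "TTT"]
share = top_clone_share(reads, top_n=1)
distinct_clones = len(set(reads))
print(share, distinct_clones)
```

    Comparing this share before and after amplification, for a fixed top_n, is one simple way to express the collapse of diversity that the abstract reports.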

  20. deepTools: a flexible platform for exploring deep-sequencing data

    OpenAIRE

    Ramírez, Fidel; Dündar, Friederike; Diehl, Sarah; Grüning, Björn A; Manke, Thomas

    2014-01-01

    We present a Galaxy-based web server for processing and visualizing deeply sequenced data. The web server's core functionality consists of a suite of newly developed tools, called deepTools, that enable users with little bioinformatic background to explore the results of their sequencing experiments in a standardized setting. Users can upload pre-processed files with continuous data in standard formats and generate heatmaps and summary plots in a straightforward, yet highly customizable mann...

  1. deepTools: a flexible platform for exploring deep-sequencing data.

    Science.gov (United States)

    Ramírez, Fidel; Dündar, Friederike; Diehl, Sarah; Grüning, Björn A; Manke, Thomas

    2014-07-01

    We present a Galaxy-based web server for processing and visualizing deeply sequenced data. The web server's core functionality consists of a suite of newly developed tools, called deepTools, that enable users with little bioinformatic background to explore the results of their sequencing experiments in a standardized setting. Users can upload pre-processed files with continuous data in standard formats and generate heatmaps and summary plots in a straightforward, yet highly customizable manner. In addition, we offer several tools for the analysis of files containing aligned reads and enable efficient and reproducible generation of normalized coverage files. As a modular and open-source platform, deepTools can easily be expanded and customized to future demands and developments. The deepTools webserver is freely available at http://deeptools.ie-freiburg.mpg.de and is accompanied by extensive documentation and tutorials aimed at conveying the principles of deep-sequencing data analysis. The web server can be used without registration. deepTools can be installed locally either stand-alone or as part of Galaxy.

  2. DSAP: deep-sequencing small RNA analysis pipeline.

    Science.gov (United States)

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
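
    The first two steps listed in this abstract (cleanup and clustering) can be sketched on the tab-delimited input it describes, one tag and its copy number per line. This is an illustration, not DSAP's code: the adaptor sequence is a hypothetical placeholder, the trimming rule is simplified to exact matching, and the function names are invented for the example.

```python
import re
from collections import Counter

ADAPTOR = "TCGTATGCC"  # hypothetical 3' adaptor fragment, for illustration only
SINGLE_NUCLEOTIDE_RUN = re.compile(r"^([ACGTN])\1*$")  # e.g. "AAAA" or "N"


def clean_tag(tag, adaptor=ADAPTOR):
    """Trim from the first exact adaptor match; return None for empty or poly-X tags."""
    tag = tag.upper()
    cut = tag.find(adaptor)
    if cut != -1:
        tag = tag[:cut]
    if not tag or SINGLE_NUCLEOTIDE_RUN.match(tag):
        return None
    return tag


def cluster_tags(lines):
    """Sum copy numbers of tags that become identical after cleanup."""
    clusters = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        tag, copies = line.split("\t")
        cleaned = clean_tag(tag)
        if cleaned is not None:
            clusters[cleaned] += int(copies)
    return clusters


example_lines = [
    "TGAGGTAGTAGGTTGTATAGTTTCGTATGCC\t120",  # tag followed by adaptor
    "TGAGGTAGTAGGTTGTATAGTT\t30",            # same tag, adaptor already absent
    "AAAAAAAAAAAA\t50",                      # poly-A, discarded
    "tcgtatgccGG\t4",                        # adaptor only, discarded
]
clusters = cluster_tags(example_lines)
print(clusters)
```

    The resulting cluster counts are what the later matching steps against Rfam and miRBase would consume; those steps need the external databases and are not sketched here.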

  3. Deep whole-genome sequencing of 100 southeast Asian Malays.

    Science.gov (United States)

    Wong, Lai-Ping; Ong, Rick Twee-Hee; Poh, Wan-Ting; Liu, Xuanyao; Chen, Peng; Li, Ruoying; Lam, Kevin Koi-Yau; Pillai, Nisha Esakimuthu; Sim, Kar-Seng; Xu, Haiyan; Sim, Ngak-Leng; Teo, Shu-Mei; Foo, Jia-Nee; Tan, Linda Wei-Lin; Lim, Yenly; Koo, Seok-Hwee; Gan, Linda Seo-Hwee; Cheng, Ching-Yu; Wee, Sharon; Yap, Eric Peng-Huat; Ng, Pauline Crystal; Lim, Wei-Yen; Soong, Richie; Wenk, Markus Rene; Aung, Tin; Wong, Tien-Yin; Khor, Chiea-Chuen; Little, Peter; Chia, Kee-Seng; Teo, Yik-Ying

    2013-01-10

    Whole-genome sequencing across multiple samples in a population provides an unprecedented opportunity for comprehensively characterizing the polymorphic variants in the population. Although the 1000 Genomes Project (1KGP) has offered brief insights into the value of population-level sequencing, the low coverage has compromised the ability to confidently detect rare and low-frequency variants. In addition, the composition of populations in the 1KGP is not complete, despite the fact that the study design has been extended to more than 2,500 samples from more than 20 population groups. The Malays are one of the Austronesian groups predominantly present in Southeast Asia and Oceania, and the Singapore Sequencing Malay Project (SSMP) aims to perform deep whole-genome sequencing of 100 healthy Malays. By sequencing at a minimum of 30× coverage, we have illustrated the higher sensitivity at detecting low-frequency and rare variants and the ability to investigate the presence of hotspots of functional mutations. Compared to the low-pass sequencing in the 1KGP, the deeper coverage allows more functional variants to be identified for each person. A comparison of the fidelity of genotype imputation of Malays indicated that a population-specific reference panel, such as the SSMP, outperforms a cosmopolitan panel with a larger number of individuals for common SNPs. For lower-frequency [...] population-level sequencing versus low-pass sequencing, especially in populations that are poorly represented in population-genetics studies.

  4. Deep Sequencing Analysis of Apple Infecting Viruses in Korea.

    Science.gov (United States)

    Cho, In-Sook; Igori, Davaajargal; Lim, Seungmo; Choi, Gug-Seoun; Hammond, John; Lim, Hyoun-Sub; Moon, Jae Sun

    2016-10-01

    Deep sequencing generated 52 contigs derived from five viruses: Apple chlorotic leaf spot virus (ACLSV), Apple stem grooving virus (ASGV), Apple stem pitting virus (ASPV), Apple green crinkle associated virus (AGCaV), and Apricot latent virus (ApLV), which were identified from eight apple samples showing small leaves and/or growth retardation. Nucleotide (nt) sequence identity of the assembled contigs ranged from 68% to 99% compared to the reference sequences of the five respective viral genomes. Sequences of ASPV and ASGV were the most abundantly represented among the 52 contigs assembled. The presence of the five viruses in the samples was confirmed by RT-PCR using specific primers based on the sequences of each assembled contig. All five viruses were detected in three of the samples, whereas all samples had mixed infections with at least two viruses. The most frequently detected virus was ASPV, followed by ASGV, ApLV, ACLSV, and AGCaV, all of which were found in mixed infections in the tested samples. AGCaV was identified in assembled contigs ID 1012480 and 93549, which showed 82% and 78% nt sequence identity with ORF1 of AGCaV isolate Aurora-1. ApLV was identified in three assembled contigs, ID 65587, 1802365, and 116777, which showed 77%, 78%, and 76% nt sequence identity respectively with ORF1 of ApLV isolate LA2. The deep sequencing assay was shown to be a valuable and powerful tool for detection and identification of known and unknown viruses in infected apple trees, here identifying ApLV and AGCaV in commercial orchards in Korea for the first time.

  5. Deep Sequencing Analysis of Apple Infecting Viruses in Korea

    Directory of Open Access Journals (Sweden)

    In-Sook Cho

    2016-10-01

    Deep sequencing has generated 52 contigs derived from five viruses; Apple chlorotic leaf spot virus (ACLSV), Apple stem grooving virus (ASGV), Apple stem pitting virus (ASPV), Apple green crinkle associated virus (AGCaV), and Apricot latent virus (ApLV) were identified from eight apple samples showing small leaves and/or growth retardation. Nucleotide (nt) sequence identity of the assembled contigs was from 68% to 99% compared to the reference sequences of the five respective viral genomes. Sequences of ASPV and ASGV were the most abundantly represented by the 52 contigs assembled. The presence of the five viruses in the samples was confirmed by RT-PCR using specific primers based on the sequences of each assembled contig. All five viruses were detected in three of the samples, whereas all samples had mixed infections with at least two viruses. The most frequently detected virus was ASPV, followed by ASGV, ApLV, ACLSV, and AGCaV which were withal found in mixed infections in the tested samples. AGCaV was identified in assembled contigs ID 1012480 and 93549, which showed 82% and 78% nt sequence identity with ORF1 of AGCaV isolate Aurora-1. ApLV was identified in three assembled contigs, ID 65587, 1802365, and 116777, which showed 77%, 78%, and 76% nt sequence identity respectively with ORF1 of ApLV isolate LA2. Deep sequencing assay was shown to be a valuable and powerful tool for detection and identification of known and unknown virome in infected apple trees, here identifying ApLV and AGCaV in commercial orchards in Korea for the first time.

  6. Microsatellite discovery by deep sequencing of enriched genomic libraries.

    Science.gov (United States)

    Santana, Quentin; Coetzee, Martin; Steenkamp, Emma; Mlonyeni, Osmond; Hammond, Gifty; Wingfield, Michael; Wingfield, Brenda

    2009-03-01

    Robust molecular markers such as microsatellites are important tools used to understand the dynamics of natural populations, but their identification and development are typically time consuming and labor intensive. The recent emergence of so-called next-generation sequencing raised the question as to whether this new technology might be applied to microsatellite development. Following this view, we considered whether deep sequencing using the 454 Life Sciences/Roche GS-FLX genome sequencing system could lead to a rapid protocol to develop microsatellite primers as markers for genetic studies. For this purpose, genomic DNA was sourced from three unrelated organisms: a fungus (the pine pathogen Fusarium circinatum), an insect (the pine-damaging wasp Sirex noctilio), and the wasp's associated nematode parasite (Deladenus siricidicola). Two methods, FIASCO (fast isolation by AFLP of sequences containing repeats) and ISSR-PCR (inter-simple sequence repeat PCR), were used to generate microsatellite-enriched DNA for the 454 libraries. From the resulting 1.2-1.7 megabases of DNA sequence data, we were able to identify 873 microsatellites that have sufficient flanking sequence available for primer design and potential amplification. This approach to microsatellite discovery was substantially more rapid, effective, and economical than other methods, and this study has shown that pyrosequencing provides an outstanding new technology that can be applied to this purpose.

  7. Deep sequencing approach for investigating infectious agents causing fever.

    Science.gov (United States)

    Susilawati, T N; Jex, A R; Cantacessi, C; Pearson, M; Navarro, S; Susianto, A; Loukas, A C; McBride, W J H

    2016-07-01

    Acute undifferentiated fever (AUF) poses a diagnostic challenge due to the variety of possible aetiologies. While the majority of AUFs resolve spontaneously, some cases become prolonged and cause significant morbidity and mortality, necessitating improved diagnostic methods. This study evaluated the utility of deep sequencing in fever investigation. DNA and RNA were isolated from plasma/sera of AUF cases being investigated at Cairns Hospital in northern Australia, including eight control samples from patients with a confirmed diagnosis. Following isolation, DNA and RNA were bulk amplified and RNA was reverse transcribed to cDNA. The resulting DNA and cDNA amplicons were subjected to deep sequencing on an Illumina HiSeq 2000 platform. Bioinformatics analysis was performed using the program Kraken and the CLC assembly-alignment pipeline. The results were compared with the outcomes of clinical tests. We generated between 4 and 20 million reads per sample. The results of Kraken and CLC analyses concurred with diagnoses obtained by other means in 87.5 % (7/8) and 25 % (2/8) of control samples, respectively. Some plausible causes of fever were identified in ten patients who remained undiagnosed following routine hospital investigations, including Escherichia coli bacteraemia and scrub typhus that eluded conventional tests. Achromobacter xylosoxidans, Alteromonas macleodii and Enterobacteria phage were prevalent in all samples. A deep sequencing approach of patient plasma/serum samples led to the identification of aetiological agents putatively implicated in AUFs and enabled the study of microbial diversity in human blood. The application of this approach in hospital practice is currently limited by sequencing input requirements and complicated data analysis.

  8. HIV-1 quasispecies delineation by tag linkage deep sequencing.

    Science.gov (United States)

    Wu, Nicholas C; De La Cruz, Justin; Al-Mawsawi, Laith Q; Olson, C Anders; Qi, Hangfei; Luan, Harding H; Nguyen, Nguyen; Du, Yushen; Le, Shuai; Wu, Ting-Ting; Li, Xinmin; Lewis, Martha J; Yang, Otto O; Sun, Ren

    2014-01-01

    Trade-offs between throughput, read length, and error rates in high-throughput sequencing limit certain applications such as monitoring viral quasispecies. Here, we describe a molecular-based tag linkage method that allows assemblage of short sequence reads into long DNA fragments. It enables haplotype phasing with high accuracy and sensitivity to interrogate individual viral sequences in a quasispecies. This approach is demonstrated to deduce ∼ 2000 unique 1.3 kb viral sequences from HIV-1 quasispecies in vivo and after passaging ex vivo with a detection limit of ∼ 0.005% to ∼ 0.001%. Reproducibility of the method is validated quantitatively and qualitatively by a technical replicate. This approach can improve monitoring of the genetic architecture and evolution dynamics in any quasispecies population.

  9. deepTools2: a next generation web server for deep-sequencing data analysis.

    Science.gov (United States)

    Ramírez, Fidel; Ryan, Devon P; Grüning, Björn; Bhardwaj, Vivek; Kilpert, Fabian; Richter, Andreas S; Heyne, Steffen; Dündar, Friederike; Manke, Thomas

    2016-07-08

    We present an update to our Galaxy-based web server for processing and visualizing deeply sequenced data. Its core tool set, deepTools, allows users to perform complete bioinformatic workflows ranging from quality controls and normalizations of aligned reads to integrative analyses, including clustering and visualization approaches. Since we first described our deepTools Galaxy server in 2014, we have implemented new solutions for many requests from the community and our users. Here, we introduce significant enhancements and new tools to further improve data visualization and interpretation. deepTools continue to be open to all users and freely available as a web service at deeptools.ie-freiburg.mpg.de. The new deepTools2 suite can be easily deployed within any Galaxy framework via the toolshed repository, and we also provide source code for command line usage under Linux and Mac OS X. A public and documented API for access to deepTools functionality is also available. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. Deep sequencing of HIV: clinical and research applications.

    Science.gov (United States)

    Chabria, Shiven B; Gupta, Shaili; Kozal, Michael J

    2014-01-01

    Human immunodeficiency virus (HIV) exhibits remarkable diversity in its genomic makeup and exists in any given individual as a complex distribution of closely related but nonidentical genomes called a viral quasispecies, which is subject to genetic variation, competition, and selection. This viral diversity clinically manifests as a selection of mutant variants based on viral fitness in treatment-naive individuals and based on drug-selective pressure in those on antiretroviral therapy (ART). The current standard-of-care ART consists of a combination of antiretroviral agents, which ensures maximal viral suppression while preventing the emergence of drug-resistant HIV variants. Unfortunately, transmission of drug-resistant HIV does occur, affecting 5% to >20% of newly infected individuals. To optimize therapy, clinicians rely on viral genotypic information obtained from conventional population sequencing-based assays, which cannot reliably detect viral variants that constitute <20% of the circulating viral quasispecies. These low-frequency variants can be detected by highly sensitive genotyping methods collectively grouped under the moniker of deep sequencing. Low-frequency variants have been correlated to treatment failures and HIV transmission, and detection of these variants is helping to inform strategies for vaccine development. Here, we discuss the molecular virology of HIV, viral heterogeneity, drug-resistance mutations, and the application of deep sequencing technologies in research and the clinical care of HIV-infected individuals.

  11. Molecular indexing enables quantitative targeted RNA sequencing and reveals poor efficiencies in standard library preparations.

    Science.gov (United States)

    Fu, Glenn K; Xu, Weihong; Wilhelmy, Julie; Mindrinos, Michael N; Davis, Ronald W; Xiao, Wenzhong; Fodor, Stephen P A

    2014-02-01

    We present a simple molecular indexing method for quantitative targeted RNA sequencing, in which mRNAs of interest are selectively captured from complex cDNA libraries and sequenced to determine their absolute concentrations. cDNA fragments are individually labeled so that each molecule can be tracked from the original sample through the library preparation and sequencing process. Multiple copies of cDNA fragments of identical sequence become distinct through labeling, and replicate clones created during PCR amplification steps can be identified and assigned to their distinct parent molecules. Selective capture enables efficient use of sequencing for deep sampling and for the absolute quantitation of rare or transient transcripts that would otherwise escape detection by standard sequencing methods. We have also constructed a set of synthetic barcoded RNA molecules, which can be introduced as controls into the sample preparation mix and used to monitor the efficiency of library construction. The quantitative targeted sequencing revealed extremely low efficiency in standard library preparations, which was further confirmed by using synthetic barcoded RNA molecules. This finding shows that standard library preparation methods result in the loss of rare transcripts and highlights the need for monitoring library efficiency and for developing more efficient sample preparation methods.
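    The counting idea behind molecular indexing can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the authors' pipeline: the function name `count_molecules`, the `(target, label)` read representation, and the example labels are all hypothetical, and the sketch ignores sequencing errors in the labels and chance collisions between labels.

```python
from collections import defaultdict

def count_molecules(reads):
    """Return {target: (read_count, molecule_count)}.

    `reads` is an iterable of (target, label) pairs, one per sequenced read.
    Reads that share both target and label are treated as PCR copies of a
    single parent molecule, so molecules are counted as distinct labels.
    """
    labels_per_target = defaultdict(set)
    reads_per_target = defaultdict(int)
    for target, label in reads:
        labels_per_target[target].add(label)
        reads_per_target[target] += 1
    return {
        target: (reads_per_target[target], len(labels))
        for target, labels in labels_per_target.items()
    }

# Hypothetical reads: two GAPDH reads carry the same label (PCR duplicates).
example_reads = [
    ("GAPDH", "AAC"),
    ("GAPDH", "AAC"),
    ("GAPDH", "GTT"),
    ("IL6", "CCA"),
]
molecule_counts = count_molecules(example_reads)
```

    In this toy input, GAPDH has three reads but only two distinct labels, so it is counted as two parent molecules rather than three.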

  12. Deep-sequencing protocols influence the results obtained in small-RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Joern Toedling

    Second-generation sequencing is a powerful method for identifying and quantifying small-RNA components of cells. However, little attention has been paid to the effects of the choice of sequencing platform and library preparation protocol on the results obtained. We present a thorough comparison of small-RNA sequencing libraries generated from the same embryonic stem cell lines, using different sequencing platforms, which represent the three major second-generation sequencing technologies, and protocols. We have analysed and compared the expression of microRNAs, as well as populations of small RNAs derived from repetitive elements. Despite the fact that different libraries display a good correlation between sequencing platforms, qualitative and quantitative variations in the results were found, depending on the protocol used. Thus, when comparing libraries from different biological samples, it is strongly recommended to use the same sequencing platform and protocol in order to ensure the biological relevance of the comparisons.

  13. From DNA sequence to transcriptional behaviour: a quantitative approach.

    Science.gov (United States)

    Segal, Eran; Widom, Jonathan

    2009-07-01

    Complex transcriptional behaviours are encoded in the DNA sequences of gene regulatory regions. Advances in our understanding of these behaviours have been recently gained through quantitative models that describe how molecules such as transcription factors and nucleosomes interact with genomic sequences. An emerging view is that every regulatory sequence is associated with a unique binding affinity landscape for each molecule and, consequently, with a unique set of molecule-binding configurations and transcriptional outputs. We present a quantitative framework based on existing methods that unifies these ideas. This framework explains many experimental observations regarding the binding patterns of factors and nucleosomes and the dynamics of transcriptional activation. It can also be used to model more complex phenomena such as transcriptional noise and the evolution of transcriptional regulation.

  14. Quantitative assessment of deep gas migration in Fennoscandian sites

    Energy Technology Data Exchange (ETDEWEB)

    Delos, Anne; Trinchero, Paolo; Richard, Laurent; Molinero, Jorge (Amphos 21 Consulting S.L., Barcelona (Spain)); Dentz, Marco (IDAEA-CSIC Instituto de Diagnostico Ambiental y Estudios del Agua, Barcelona (Spain)); Pitkaenen, Petteri (Posiva Oy, Olkiluoto, Eurajoki (Finland))

    2010-11-15

    (which, strictly speaking, is infinite). It turns out that its results cannot be reliably used to estimate gas fluxes. They can rather provide an estimate of the effective in situ gas production (i.e. radiogenic production) averaged over the model domain. Such effective in situ gas productions have been computed and discussed for the 3 sites. Helium profiles have been modelled using two different approaches: calibrating the residence time or estimating the release fraction (i.e. the rate of Helium production actually released to the water). Methane and Hydrogen in situ productions have been then determined either setting time equal to the age of formation of the rock or using the value of residence time obtained from the analysis of the Helium profiles. A new analytical solution that can take into account not only the radiogenic production but also the flux from a deep source of helium has been developed. This solution in companion with further field characterization (e.g. isotopic measurements of Helium) provides a powerful tool that allows accounting for the coupled effect of a (limited in space) in situ production and a source occurring at a large depth in the crust or mantle. It is thought that this new analytical solution could be used for future quantitative modelling of gas migration when more data were available

  15. An efficient quantitation method of next-generation sequencing libraries by using MiSeq sequencer.

    Science.gov (United States)

    Katsuoka, Fumiki; Yokozawa, Junji; Tsuda, Kaoru; Ito, Shin; Pan, Xiaoqing; Nagasaki, Masao; Yasuda, Jun; Yamamoto, Masayuki

    2014-12-01

    Library quantitation is a critical step to obtain high data output in Illumina HiSeq sequencers. Here, we introduce a library quantitation method that uses the Illumina MiSeq sequencer designated as quantitative MiSeq (qMiSeq). In this procedure, 96 dual-index libraries, including control samples, are denatured, pooled in equal volume, and sequenced by MiSeq. We found that relative concentration of each library can be determined based on the observed index ratio and can be used to determine HiSeq run condition for each library. Thus, qMiSeq provides an efficient way to quantitate a large number of libraries at a time.
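    The index-ratio calculation described above can be illustrated with a short Python sketch. It assumes that, because the libraries are pooled in equal volume, each library's share of index reads is proportional to its concentration. The follow-up step of choosing pooling volumes for an equimolar pool is an assumption added here for illustration; the abstract only states that the ratios inform the HiSeq run condition. Function names and read counts are hypothetical.

```python
def relative_concentrations(index_counts):
    """Fraction of index reads per library, taken as relative concentration."""
    total = sum(index_counts.values())
    return {lib: count / total for lib, count in index_counts.items()}

def equimolar_volumes(index_counts, base_volume=1.0):
    """Volumes that would equalise library amounts in a new pool.

    The most concentrated library receives `base_volume`; every other
    library receives proportionally more. All counts must be positive.
    """
    fractions = relative_concentrations(index_counts)
    highest = max(fractions.values())
    return {lib: base_volume * highest / f for lib, f in fractions.items()}

# Hypothetical index read counts from a MiSeq run of an equal-volume pool.
counts = {"lib_A": 200, "lib_B": 100, "lib_C": 100}
fractions = relative_concentrations(counts)
volumes = equimolar_volumes(counts)
```

    With these counts, lib_A accounts for half of the reads, so it would be pooled at half the volume of lib_B or lib_C to balance the three libraries.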

  16. Deep homology in the age of next-generation sequencing.

    Science.gov (United States)

    Tschopp, Patrick; Tabin, Clifford J

    2017-02-05

    The principle of homology is central to conceptualizing the comparative aspects of morphological evolution. The distinctions between homologous or non-homologous structures have become blurred, however, as modern evolutionary developmental biology (evo-devo) has shown that novel features often result from modification of pre-existing developmental modules, rather than arising completely de novo. With this realization in mind, the term 'deep homology' was coined, in recognition of the remarkably conserved gene expression during the development of certain animal structures that would not be considered homologous by previous strict definitions. At its core, it can help to formulate an understanding of deeper layers of ontogenetic conservation for anatomical features that lack any clear phylogenetic continuity. Here, we review deep homology and related concepts in the context of a gene expression-based homology discussion. We then focus on how these conceptual frameworks have profited from the recent rise of high-throughput next-generation sequencing. These techniques have greatly expanded the range of organisms amenable to such studies. Moreover, they helped to elevate the traditional gene-by-gene comparison to a transcriptome-wide level. We will end with an outlook on the next challenges in the field and how technological advances might provide exciting new strategies to tackle these questions. This article is part of the themed issue 'Evo-devo in the genomics era, and the origins of morphological diversity'. © 2016 The Author(s).

  17. Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing

    Directory of Open Access Journals (Sweden)

    Wadim L. Matochko

    2013-01-01

    Next-generation sequencing techniques empower selection of ligands from phage-display libraries because they can detect low-abundance clones and quantify changes in the copy numbers of clones without excessive selection rounds. Identification of errors in deep sequencing data is the most critical step in this process because these techniques have error rates >1%. Mechanisms that yield errors in Illumina and other techniques have been proposed, but no reports to date describe error analysis in phage libraries. Our paper focuses on error analysis of 7-mer peptide libraries sequenced by the Illumina method. Low theoretical complexity of this phage library, as compared to complexity of long genetic reads and genomes, allowed us to describe this library using a convenient linear vector and operator framework. We describe a phage library as an N×1 frequency vector n = (n_i), where n_i is the copy number of the i-th sequence and N is the theoretical diversity, that is, the total number of all possible sequences. Any manipulation to the library is an operator acting on n. Selection, amplification, or sequencing could be described as a product of an N×N matrix and a stochastic sampling operator (Sa). The latter is a random diagonal matrix that describes sampling of a library. In this paper, we focus on the properties of Sa and use them to define the sequencing operator (Seq). Sequencing without any bias and errors is Seq = Sa·I_N, where I_N is an N×N unity matrix. Any bias in sequencing changes I_N to a nonunity matrix. We identified a diagonal censorship matrix (CEN), which describes elimination, or statistically significant downsampling, of specific reads during the sequencing process.
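    A toy Python sketch may help make the vector-and-operator picture concrete. It rests on assumptions that are not in the abstract: sampling is modelled as drawing a fixed number of reads with probability proportional to copy number, a diagonal matrix is stored as the list of its diagonal entries, and biased sequencing is taken to be sampling applied after censorship. The function names and the four-clone library are hypothetical.

```python
import random

def censor(n, cen):
    """Apply a diagonal matrix, given by its diagonal `cen`, to the vector n."""
    return [c * x for c, x in zip(cen, n)]

def sample(n, depth, rng):
    """Stochastic sampling: draw `depth` reads, each clone weighted by n."""
    drawn = rng.choices(range(len(n)), weights=n, k=depth)
    counts = [0] * len(n)
    for i in drawn:
        counts[i] += 1
    return counts

def sequence(n, cen, depth, rng):
    """Assumed model of sequencing: sampling applied after censorship."""
    return sample(censor(n, cen), depth, rng)

rng = random.Random(0)
library = [500, 300, 200, 100]  # hypothetical copy numbers n_i
# An all-ones diagonal plays the role of the unity matrix (no bias).
unbiased = sequence(library, [1, 1, 1, 1], 1000, rng)
# A zero on the diagonal censors the third clone completely.
censored = sequence(library, [1, 1, 0, 1], 1000, rng)
```

    In this sketch, a diagonal entry of 0 eliminates a clone from the reads entirely, while an entry between 0 and 1 would downsample it relative to the unbiased case.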

  18. Ultra-deep sequencing of foraminiferal microbarcodes unveils hidden richness of early monothalamous lineages in deep-sea sediments.

    Science.gov (United States)

    Lecroq, Béatrice; Lejzerowicz, Franck; Bachar, Dipankar; Christen, Richard; Esling, Philippe; Baerlocher, Loïc; Østerås, Magne; Farinelli, Laurent; Pawlowski, Jan

    2011-08-09

    Deep-sea floors represent one of the largest and most complex ecosystems on Earth but remain essentially unexplored. The vastness and remoteness of this ecosystem make deep-sea sampling difficult, hampering traditional taxonomic observations and diversity assessment. This problem is particularly true in the case of the deep-sea meiofauna, which largely comprises small-sized, fragile, and difficult-to-identify metazoans and protists. Here, we introduce an ultra-deep sequencing-based metagenetic approach to examine the richness of benthic foraminifera, a principal component of deep-sea meiofauna. We used Illumina sequencing technology to assess foraminiferal richness in 31 unsieved deep-sea sediment samples from five distinct oceanic regions. We sequenced an extremely short fragment (36 bases) of the small subunit ribosomal DNA hypervariable region 37f, which has been shown to accurately distinguish foraminiferal species. In total, we obtained 495,978 unique sequences that were grouped into 1,643 operational taxonomic units, of which about half (841) could be reliably assigned to foraminifera. The vast majority of the operational taxonomic units (nearly 90%) were either assigned to early (ancient) lineages of soft-walled, single-chambered (monothalamous) foraminifera or remained undetermined and yet possibly belong to unknown early lineages. Contrasting with the classical view of multichambered taxa dominating foraminiferal assemblages, our work reflects an unexpected diversity of monothalamous lineages that are as yet unknown using conventional micropaleontological observations. Although we can only speculate about their morphology, the immense richness of deep-sea phylotypes revealed by this study suggests that ultra-deep sequencing can improve understanding of deep-sea benthic diversity considered until now as unknowable based on a traditional taxonomic approach.

  19. Deep Sequencing the MicroRNA Transcriptome in Colorectal Cancer.

    Directory of Open Access Journals (Sweden)

    Kristina Schee

    Colorectal cancer (CRC) is one of the leading causes of cancer related deaths and the search for prognostic biomarkers that might improve treatment decisions is warranted. MicroRNAs (miRNAs) are short non-coding RNA molecules involved in regulating gene expression and have been proposed as possible biomarkers in CRC. In order to characterize the miRNA transcriptome, a large cohort including 88 CRC tumors with long-term follow-up was deep sequenced. 523 mature miRNAs were expressed in our cohort, and they exhibited largely uniform expression patterns across tumor samples. Few associations were found between clinical parameters and miRNA expression, among them, low expression of miR-592 and high expression of miR-10b-5p and miR-615-3p were associated with tumors located in the right colon relative to the left colon and rectum. High expression of miR-615-3p was also associated with poorly differentiated tumors. No prognostic biomarker candidates for overall and metastasis-free survival were identified by applying the LASSO method in a Cox proportional hazards model or univariate Cox. Examination of the five most abundantly expressed miRNAs in the cohort (miR-10a-5p, miR-21-5p, miR-22-3p, miR-143-3p and miR-192-5p) revealed that their collective expression represented 54% of the detected miRNA sequences. Pathway analysis of the target genes regulated by the five most highly expressed miRNAs uncovered a significant number of genes involved in the CRC pathway, including APC, TGFβ and PI3K, thus suggesting that these miRNAs are relevant in CRC.

  20. Illuminating uveitis: metagenomic deep sequencing identifies common and rare pathogens.

    Science.gov (United States)

    Doan, Thuy; Wilson, Michael R; Crawford, Emily D; Chow, Eric D; Khan, Lillian M; Knopp, Kristeene A; O'Donovan, Brian D; Xia, Dongxiang; Hacker, Jill K; Stewart, Jay M; Gonzales, John A; Acharya, Nisha R; DeRisi, Joseph L

    2016-08-25

    Ocular infections remain a major cause of blindness and morbidity worldwide. While prognosis is dependent on the timing and accuracy of diagnosis, the etiology remains elusive in ~50 % of presumed infectious uveitis cases. The objective of this study is to determine if unbiased metagenomic deep sequencing (MDS) can accurately detect pathogens in intraocular fluid samples of patients with uveitis. This is a proof-of-concept study, in which intraocular fluid samples were obtained from five subjects with known diagnoses, and one subject with bilateral chronic uveitis without a known etiology. Samples were subjected to MDS, and results were compared with those from conventional diagnostic tests. Pathogens were identified using a rapid computational pipeline to analyze the non-host sequences obtained from MDS. Unbiased MDS of intraocular fluid produced results concordant with known diagnoses in subjects with (n = 4) and without (n = 1) uveitis. Samples positive for Cryptococcus neoformans, Toxoplasma gondii, and herpes simplex virus 1 as tested by a Clinical Laboratory Improvement Amendments-certified laboratory were correctly identified with MDS. Rubella virus was identified in one case of chronic bilateral idiopathic uveitis. The subject's strain was most closely related to a German rubella virus strain isolated in 1992, one year before he developed a fever and rash while living in Germany. The pattern and the number of viral identified mutations present in the patient's strain were consistent with long-term viral replication in the eye. MDS can identify fungi, parasites, and DNA and RNA viruses in minute volumes of intraocular fluid samples. The identification of chronic intraocular rubella virus infection highlights the eye's role as a long-term pathogen reservoir, which has implications for virus eradication and emerging global epidemics.

  1. A quantitative SMRT cell sequencing method for ribosomal amplicons.

    Science.gov (United States)

    Jones, Bethan M; Kustka, Adam B

    2017-04-01

    Advances in sequencing technologies continue to provide unprecedented opportunities to characterize microbial communities. For example, the Pacific Biosciences Single Molecule Real-Time (SMRT) platform has emerged as a unique approach harnessing DNA polymerase activity to sequence template molecules, enabling long reads at low costs. With the aim to simultaneously classify and enumerate in situ microbial populations, we developed a quantitative SMRT (qSMRT) approach that involves the addition of exogenous standards to quantify ribosomal amplicons derived from environmental samples. The V7-9 regions of 18S SSU rDNA were targeted and quantified from protistan community samples collected in the Ross Sea during the Austral summer of 2011. We used three standards of different length and optimized conditions to obtain accurate quantitative retrieval across the range of expected amplicon sizes, a necessary criterion for analyzing taxonomically diverse 18S rDNA molecules from natural environments. The ability to concurrently identify and quantify microorganisms in their natural environment makes qSMRT a powerful, rapid and cost-effective approach for defining ecosystem diversity and function.

  2. Deep sequencing of the murine olfactory receptor neuron transcriptome.

    Directory of Open Access Journals (Sweden)

    Ninthujah Kanageswaran

    The ability of animals to sense and differentiate among thousands of odorants relies on a large set of olfactory receptors (OR) and a multitude of accessory proteins within the olfactory epithelium (OE). ORs and related signaling mechanisms have been the subject of intensive studies over the past years, but our knowledge regarding olfactory processing remains limited. The recent development of next generation sequencing (NGS) techniques encouraged us to assess the transcriptome of the murine OE. We analyzed RNA from OEs of female and male adult mice and from fluorescence-activated cell sorting (FACS)-sorted olfactory receptor neurons (ORNs) obtained from transgenic OMP-GFP mice. The Illumina RNA-Seq protocol was utilized to generate up to 86 million reads per transcriptome. In OE samples, nearly all OR and trace amine-associated receptor (TAAR) genes involved in the perception of volatile amines were detectably expressed. Other genes known to participate in olfactory signaling pathways were among the 200 genes with the highest expression levels in the OE. To identify OE-specific genes, we compared olfactory neuron expression profiles with RNA-Seq transcriptome data from different murine tissues. By analyzing different transcript classes, we detected the expression of non-olfactory GPCRs in ORNs and established an expression ranking for GPCRs detected in the OE. We also identified other previously undescribed membrane proteins as potential new players in olfaction. The quantitative and comprehensive transcriptome data provide a virtually complete catalogue of genes expressed in the OE and present a useful tool to uncover candidate genes involved in, for example, olfactory signaling, OR trafficking and recycling, and proliferation.

  3. Virus identification in unknown tropical febrile illness cases using deep sequencing.

    Directory of Open Access Journals (Sweden)

    Nathan L Yozwiak

    Dengue virus is an emerging infectious agent that infects an estimated 50-100 million people annually worldwide, yet current diagnostic practices cannot detect an etiologic pathogen in ∼40% of dengue-like illnesses. Metagenomic approaches to pathogen detection, such as viral microarrays and deep sequencing, are promising tools to address emerging and non-diagnosable disease challenges. In this study, we used the Virochip microarray and deep sequencing to characterize the spectrum of viruses present in human sera from 123 Nicaraguan patients presenting with dengue-like symptoms but testing negative for dengue virus. We utilized a barcoding strategy to simultaneously deep sequence multiple serum specimens, generating on average over 1 million reads per sample. We then implemented a stepwise bioinformatic filtering pipeline to remove the majority of human and low-quality sequences to improve the speed and accuracy of subsequent unbiased database searches. By deep sequencing, we were able to detect virus sequence in 37% (45/123) of previously negative cases. These included 13 cases with Human Herpesvirus 6 sequences. Other samples contained sequences with similarity to sequences from viruses in the Herpesviridae, Flaviviridae, Circoviridae, Anelloviridae, Asfarviridae, and Parvoviridae families. In some cases, the putative viral sequences were virtually identical to known viruses, and in others they diverged, suggesting that they may derive from novel viruses. These results demonstrate the utility of unbiased metagenomic approaches in the detection of known and divergent viruses in the study of tropical febrile illness.

  4. Error rates, PCR recombination, and sampling depth in HIV-1 whole genome deep sequencing.

    Science.gov (United States)

    Zanini, Fabio; Brodin, Johanna; Albert, Jan; Neher, Richard A

    2016-12-27

    Deep sequencing is a powerful and cost-effective tool to characterize the genetic diversity and evolution of virus populations. While modern sequencing instruments readily cover viral genomes many thousand fold and very rare variants can in principle be detected, sequencing errors, amplification biases, and other artifacts can limit sensitivity and complicate data interpretation. For this reason, the number of studies using whole genome deep sequencing to characterize viral quasi-species in clinical samples is still limited. We have previously undertaken a large scale whole genome deep sequencing study of HIV-1 populations. Here we discuss the challenges, error profiles, control experiments, and computational test we developed to quantify the accuracy of variant frequency estimation.

  5. KRAS, BRAF, and TP53 deep sequencing for colorectal carcinoma patient diagnostics.

    Science.gov (United States)

    Rechsteiner, Markus; von Teichman, Adriana; Rüschoff, Jan H; Fankhauser, Niklaus; Pestalozzi, Bernhard; Schraml, Peter; Weber, Achim; Wild, Peter; Zimmermann, Dieter; Moch, Holger

    2013-05-01

    In colorectal carcinoma, KRAS (alias Ki-ras) and BRAF mutations have emerged as predictors of resistance to anti-epidermal growth factor receptor antibody treatment and worse patient outcome, respectively. In this study, we aimed to establish a high-throughput deep sequencing workflow according to 454 pyrosequencing technology to cope with the increasing demand for sequence information at medical institutions. A cohort of 81 patients with known KRAS mutation status detected by Sanger sequencing was chosen for deep sequencing. The workflow allowed us to analyze seven amplicons (one BRAF, two KRAS, and four TP53 exons) of nine patients in parallel in one deep sequencing run. Target amplification and variant calling showed reproducible results with input DNA derived from FFPE tissue that ranged from 0.4 to 50 ng with the use of different targets and multiplex identifiers. Equimolar pooling of each amplicon in a deep sequencing run was necessary to counterbalance differences in patient tissue quality. Five BRAF and 49 TP53 mutations with functional consequences were detected. The lowest mutation frequency detected in a patient tumor population was 5% in TP53 exon 5. This low-frequency mutation was successfully verified in a second PCR and deep sequencing run. In summary, our workflow allows us to process 315 targets a week and provides the quality, flexibility, and speed needed to be integrated as standard procedure for mutational analysis in diagnostics.
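
    The throughput figures in this abstract can be checked with a short calculation. The sketch below assumes that a "target" is one amplicon from one patient and that the number of runs per week follows by simple division; both are interpretations rather than statements from the paper.

```python
AMPLICONS_PER_PATIENT = 7   # one BRAF, two KRAS, and four TP53 exons
PATIENTS_PER_RUN = 9        # patients analyzed in parallel in one run
TARGETS_PER_WEEK = 315      # weekly figure quoted in the abstract

# Targets covered by a single deep sequencing run under the assumption above.
targets_per_run = AMPLICONS_PER_PATIENT * PATIENTS_PER_RUN

# Implied number of runs needed to reach the weekly figure.
runs_per_week = TARGETS_PER_WEEK / targets_per_run

print(targets_per_run)  # 63
print(runs_per_week)    # 5.0
```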

  6. Protein sequences bound to mineral surfaces persist into deep time

    OpenAIRE

    Demarchi, Beatrice; Hall, Shaun; Roncal-Herrero, Teresa; Freeman, Colin L.; Woolley, Jos; Crisp, Molly K; Wilson, Julie; Fotakis, Anna Katerina; Fischer, Roman; Kessler, Benedikt M; Jersie-Christensen, Rosa Rakownikow; Olsen, Jesper Velgaard; Haile, James; Thomas, Jessica; Marean, Curtis W.

    2016-01-01

    Proteins persist longer in the fossil record than DNA, but the longevity, survival mechanisms and substrates remain contested. Here, we demonstrate the role of mineral binding in preserving the protein sequence in ostrich (Struthionidae) eggshell, including from the palaeontological sites of Laetoli (3.8 Ma) and Olduvai Gorge (1.3 Ma) in Tanzania. By tracking protein diagenesis back in time we find consistent patterns of preservation, demonstrating authenticity of the surviving sequences. Mol...

  7. Deep amplicon sequencing reveals mixed phytoplasma infection within single grapevine plants

    DEFF Research Database (Denmark)

    Nicolaisen, Mogens; Contaldo, Nicoletta; Makarova, Olga

    2011-01-01

    The diversity of phytoplasmas within single plants has not yet been fully investigated. In this project, deep amplicon sequencing was used to generate 50,926 phytoplasma sequences from 11 phytoplasma-infected grapevine samples from a PCR amplicon in the 5' end of the 16S region. After clustering ...

  9. Protein sequences bound to mineral surfaces persist into deep time

    Science.gov (United States)

    Demarchi, Beatrice; Hall, Shaun; Roncal-Herrero, Teresa; Freeman, Colin L; Woolley, Jos; Crisp, Molly K; Wilson, Julie; Fotakis, Anna; Fischer, Roman; Kessler, Benedikt M; Rakownikow Jersie-Christensen, Rosa; Olsen, Jesper V; Haile, James; Thomas, Jessica; Marean, Curtis W; Parkington, John; Presslee, Samantha; Lee-Thorp, Julia; Ditchfield, Peter; Hamilton, Jacqueline F; Ward, Martyn W; Wang, Chunting Michelle; Shaw, Marvin D; Harrison, Terry; Domínguez-Rodrigo, Manuel; MacPhee, Ross DE; Kwekason, Amandus; Ecker, Michaela; Kolska Horwitz, Liora; Chazan, Michael; Kröger, Roland; Thomas-Oates, Jane; Harding, John H; Cappellini, Enrico; Penkman, Kirsty; Collins, Matthew J

    2016-01-01

    Proteins persist longer in the fossil record than DNA, but the longevity, survival mechanisms and substrates remain contested. Here, we demonstrate the role of mineral binding in preserving the protein sequence in ostrich (Struthionidae) eggshell, including from the palaeontological sites of Laetoli (3.8 Ma) and Olduvai Gorge (1.3 Ma) in Tanzania. By tracking protein diagenesis back in time we find consistent patterns of preservation, demonstrating authenticity of the surviving sequences. Molecular dynamics simulations of struthiocalcin-1 and -2, the dominant proteins within the eggshell, reveal that distinct domains bind to the mineral surface. It is the domain with the strongest calculated binding energy to the calcite surface that is selectively preserved. Thermal age calculations demonstrate that the Laetoli and Olduvai peptides are 50 times older than any previously authenticated sequence (equivalent to ~16 Ma at a constant 10°C). DOI: http://dx.doi.org/10.7554/eLife.17092.001 PMID:27668515

  10. Protein sequences bound to mineral surfaces persist into deep time

    DEFF Research Database (Denmark)

    Demarchi, Beatrice; Hall, Shaun; Roncal-Herrero, Teresa;

    2016-01-01

    ... of Laetoli (3.8 Ma) and Olduvai Gorge (1.3 Ma) in Tanzania. By tracking protein diagenesis back in time we find consistent patterns of preservation, demonstrating authenticity of the surviving sequences. Molecular dynamics simulations of struthiocalcin-1 and -2, the dominant proteins within the eggshell, reveal that distinct domains bind to the mineral surface. It is the domain with the strongest calculated binding energy to the calcite surface that is selectively preserved. Thermal age calculations demonstrate that the Laetoli and Olduvai peptides are 50 times older than any previously authenticated sequence (equivalent to ~16 Ma at a constant 10°C).

  11. Protein sequences bound to mineral surfaces persist into deep time

    DEFF Research Database (Denmark)

    Demarchi, Beatrice; Hall, Shaun; Roncal-Herrero, Teresa

    2016-01-01

    Proteins persist longer in the fossil record than DNA, but the longevity, survival mechanisms and substrates remain contested. Here, we demonstrate the role of mineral binding in preserving the protein sequence in ostrich (Struthionidae) eggshell, including from the palaeontological sites of Laetoli (3.8 Ma) and Olduvai Gorge (1.3 Ma) in Tanzania. By tracking protein diagenesis back in time we find consistent patterns of preservation, demonstrating authenticity of the surviving sequences. Molecular dynamics simulations of struthiocalcin-1 and -2, the dominant proteins within the eggshell...

  12. Characterization of the Melanoma miRNAome by Deep Sequencing.

    Directory of Open Access Journals (Sweden)

    Mitchell S Stark

    BACKGROUND: MicroRNAs (miRNAs) are 18-23 nucleotide non-coding RNAs that regulate gene expression in a sequence-specific manner. Little is known about the repertoire and function of miRNAs in melanoma or the melanocytic lineage. We therefore undertook a comprehensive analysis of the miRNAome in a diverse range of pigment cells including: melanoblasts, melanocytes, congenital nevocytes, acral, mucosal, cutaneous and uveal melanoma cells. METHODOLOGY/PRINCIPAL FINDINGS: We sequenced 12 small RNA libraries using Illumina's Genome Analyzer II platform. This massively parallel sequencing approach of a diverse set of melanoma and pigment cell libraries revealed a total of 539 known mature and mature-star sequences, along with the prediction of 279 novel miRNA candidates, of which 109 were common to 2 or more libraries and 3 were present in all libraries. CONCLUSIONS/SIGNIFICANCE: Some of the novel candidate miRNAs may be specific to the melanocytic lineage and as such could be used as biomarkers to assist in the early detection of distant metastases by measuring the circulating levels in blood. Follow-up studies of the functional roles of these pigment cell miRNAs and the identification of the targets should shed further light on the development and progression of melanoma.

  13. Deep RNA sequencing analysis of readthrough gene fusions in human prostate adenocarcinoma and reference samples

    Directory of Open Access Journals (Sweden)

    Nacu Serban

    2011-01-01

    Background: Readthrough fusions across adjacent genes in the genome, or transcription-induced chimeras (TICs), have been estimated using expressed sequence tag (EST) libraries to involve 4-6% of all genes. Deep transcriptional sequencing (RNA-Seq) now makes it possible to study the occurrence and expression levels of TICs in individual samples across the genome. Methods: We performed single-end RNA-Seq on three human prostate adenocarcinoma samples and their corresponding normal tissues, as well as brain and universal reference samples. We developed two bioinformatics methods to specifically identify TIC events: a targeted alignment method using artificial exon-exon junctions within 200,000 bp from adjacent genes, and genomic alignment allowing splicing within individual reads. We performed further experimental verification and characterization of selected TIC and fusion events using quantitative RT-PCR and comparative genomic hybridization microarrays. Results: Targeted alignment against artificial exon-exon junctions yielded 339 distinct TIC events, including 32 gene pairs with multiple isoforms. The false discovery rate was estimated to be 1.5%. Spliced alignment to the genome was less sensitive, finding only 18% of those found by targeted alignment in 33-nt reads and 59% of those in 50-nt reads. However, spliced alignment revealed 30 cases of TICs with intervening exons, in addition to distant inversions, scrambled genes, and translocations. Our findings increase the catalog of observed TIC gene pairs by 66%. We verified 6 of 6 predicted TICs in all prostate samples, and 2 of 5 predicted novel distant gene fusions, both private events among 54 prostate tumor samples tested. Expression of TICs correlates with that of the upstream gene, which can explain the prostate-specific pattern of some TIC events and the restriction of the SLC45A3-ELK4 e4-e2 TIC to ERG-negative prostate samples, as confirmed in 20 matched prostate tumor and normal

  14. Quantitative modeling of a gene's expression from its intergenic sequence.

    Directory of Open Access Journals (Sweden)

    Md Abul Hassan Samee

    2014-03-01

    Modeling a gene's expression from its intergenic locus and trans-regulatory context is a fundamental goal in computational biology. Owing to the distributed nature of cis-regulatory information and the poorly understood mechanisms that integrate such information, gene locus modeling is a more challenging task than modeling individual enhancers. Here we report the first quantitative model of a gene's expression pattern as a function of its locus. We model the expression readout of a locus in two tiers: (1) combinatorial regulation by transcription factors bound to each enhancer is predicted by a thermodynamics-based model and (2) independent contributions from multiple enhancers are linearly combined to fit the gene expression pattern. The model does not require any prior knowledge about enhancers contributing toward a gene's expression. We demonstrate that the model captures the complex multi-domain expression patterns of anterior-posterior patterning genes in the early Drosophila embryo. Altogether, we model the expression patterns of 27 genes; these include several gap genes, pair-rule genes, and anterior, posterior, trunk, and terminal genes. We find that the model-selected enhancers for each gene overlap strongly with its experimentally characterized enhancers. Our findings also suggest the presence of sequence-segments in the locus that would contribute ectopic expression patterns and hence were "shut down" by the model. We applied our model to identify the transcription factors responsible for forming the stripe boundaries of the studied genes. The resulting network of regulatory interactions exhibits a high level of agreement with known regulatory influences on the target genes. Finally, we analyzed whether and why our assumption of enhancer independence was necessary for the genes we studied. We found a deterioration of expression when binding sites in one enhancer were allowed to influence the readout of another enhancer. Thus, interference

  15. Quantitative biostratigraphy and species evolutionary sequence

    Institute of Scientific and Technical Information of China (English)

    Xu, Guirong

    2001-01-01

  16. Deep Sequencing Analysis of the Ixodes ricinus Haemocytome.

    Directory of Open Access Journals (Sweden)

    Michalis Kotsyfakis

    2015-05-01

    Ixodes ricinus is the main tick vector of the microbes that cause Lyme disease and tick-borne encephalitis in Europe. Pathogens transmitted by ticks have to overcome innate immunity barriers present in tick tissues, including midgut, salivary glands epithelia and the hemocoel. Molecularly, invertebrate immunity is initiated when pathogen recognition molecules trigger serum or cellular signalling cascades leading to the production of antimicrobials, pathogen opsonization and phagocytosis. We presently aimed at identifying hemocyte transcripts from semi-engorged female I. ricinus ticks by mass sequencing a hemocyte cDNA library and annotating immune-related transcripts based on their hemocyte abundance as well as their ubiquitous distribution. De novo assembly of 926,596 pyrosequence reads plus 49,328,982 Illumina reads (148 nt length) from a hemocyte library, together with over 189 million Illumina reads from salivary gland and midgut libraries, generated 15,716 extracted coding sequences (CDS); these are displayed in an annotated hyperlinked spreadsheet format. Read mapping allowed the identification and annotation of tissue-enriched transcripts. A total of 327 transcripts were found significantly overexpressed in the hemocyte libraries, including those coding for scavenger receptors, antimicrobial peptides, pathogen recognition proteins, proteases and protease inhibitors. Vitellogenin and lipid metabolism transcription enrichment suggests fat body components. We additionally annotated ubiquitously distributed transcripts associated with immune function, including immune-associated signal transduction proteins and transcription factors, including the STAT transcription factor. This is the first systems biology approach to describe the genes expressed in the haemocytes of this neglected disease vector. A total of 2,860 coding sequences were deposited to GenBank, increasing to 27,547 the number so far deposited by our previous transcriptome studies

  17. Determining mutant spectra of three RNA viral samples using ultra-deep sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Chen, H

    2012-06-06

    RNA viruses have extremely high mutation rates that enable the virus to adapt to new host environments and even jump from one species to another. As part of a viral transmission study, three viral samples collected from naturally infected animals were sequenced using Illumina paired-end technology at ultra-deep coverage. In order to determine the mutant spectra within the viral quasispecies, it is critical to understand the sequencing error rates and control for false positive calls of viral variants (point mutations). I will estimate the sequencing error rate from two control sequences and characterize the mutant spectra in the natural samples with this error rate.
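
    One common way to separate true variants from sequencing errors, given an error rate estimated from controls, is a one-sided binomial test at each site. The following Python sketch illustrates that idea only; it is not the author's method, and the function names, the example error rate, and the significance threshold are all assumptions.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of P(X = k) for X ~ Binomial(n, p), with 0 < p < 1."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log1p(-p)


def error_tail_probability(variant_reads, coverage, error_rate):
    """P(X >= variant_reads) if every variant read were a sequencing error."""
    total = sum(
        math.exp(log_binom_pmf(k, coverage, error_rate))
        for k in range(variant_reads, coverage + 1)
    )
    return min(1.0, total)


def is_variant(variant_reads, coverage, error_rate, alpha=1e-6):
    """Call a variant when errors alone are an unlikely explanation.

    `alpha` is a hypothetical per-site threshold; a real analysis would
    also correct for the number of sites tested.
    """
    return error_tail_probability(variant_reads, coverage, error_rate) < alpha
```

    Working in log space avoids overflow in the binomial coefficient at ultra-deep coverage, where the coverage at a site can reach many thousands of reads.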

  18. Using Small RNA Deep Sequencing Data to Detect Human Viruses.

    Science.gov (United States)

    Wang, Fang; Sun, Yu; Ruan, Jishou; Chen, Rui; Chen, Xin; Chen, Chengjie; Kreuze, Jan F; Fei, ZhangJun; Zhu, Xiao; Gao, Shan

    2016-01-01

    Small RNA sequencing (sRNA-seq) can be used to detect viruses in infected hosts without the necessity to have any prior knowledge or specialized sample preparation. The sRNA-seq method was initially used for viral detection and identification in plants and then in invertebrates and fungi. However, it is still controversial to use sRNA-seq in the detection of mammalian or human viruses. In this study, we used 931 sRNA-seq runs of data from the NCBI SRA database to detect and identify viruses in human cells or tissues, particularly from some clinical samples. Six viruses including HPV-18, HBV, HCV, HIV-1, SMRV, and EBV were detected from 36 runs of data. Four viruses were consistent with the annotations from the previous studies. HIV-1 was found in clinical samples without the HIV-positive reports, and SMRV was found in Diffuse Large B-Cell Lymphoma cells for the first time. In conclusion, these results suggest the sRNA-seq can be used to detect viruses in mammals and humans.

  19. Using Small RNA Deep Sequencing Data to Detect Human Viruses

    Directory of Open Access Journals (Sweden)

    Fang Wang

    2016-01-01

    Small RNA sequencing (sRNA-seq) can be used to detect viruses in infected hosts without the necessity to have any prior knowledge or specialized sample preparation. The sRNA-seq method was initially used for viral detection and identification in plants and then in invertebrates and fungi. However, it is still controversial to use sRNA-seq in the detection of mammalian or human viruses. In this study, we used 931 sRNA-seq runs of data from the NCBI SRA database to detect and identify viruses in human cells or tissues, particularly from some clinical samples. Six viruses including HPV-18, HBV, HCV, HIV-1, SMRV, and EBV were detected from 36 runs of data. Four viruses were consistent with the annotations from the previous studies. HIV-1 was found in clinical samples without the HIV-positive reports, and SMRV was found in Diffuse Large B-Cell Lymphoma cells for the first time. In conclusion, these results suggest the sRNA-seq can be used to detect viruses in mammals and humans.

  20. Deep sequencing of the vaginal microbiota of women with HIV.

    Directory of Open Access Journals (Sweden)

    Ruben Hummelen

    BACKGROUND: Women living with HIV and co-infected with bacterial vaginosis (BV) are at higher risk for transmitting HIV to a partner or newborn. It is poorly understood which bacterial communities constitute BV or the normal vaginal microbiota among this population and how the microbiota associated with BV responds to antibiotic treatment. METHODS AND FINDINGS: The vaginal microbiota of 132 HIV positive Tanzanian women, including 39 who received metronidazole treatment for BV, were profiled using Illumina to sequence the V6 region of the 16S rRNA gene. Of note, Gardnerella vaginalis and Lactobacillus iners were detected in each sample constituting core members of the vaginal microbiota. Eight major clusters were detected with relatively uniform microbiota compositions. Two clusters dominated by L. iners or L. crispatus were strongly associated with a normal microbiota. The L. crispatus dominated microbiota were associated with low pH, but when L. crispatus was not present, a large fraction of L. iners was required to predict a low pH. Four clusters were strongly associated with BV, and were dominated by Prevotella bivia, Lachnospiraceae, or a mixture of different species. Metronidazole treatment reduced the microbial diversity and perturbed the BV-associated microbiota, but rarely resulted in the establishment of a lactobacilli-dominated microbiota. CONCLUSIONS: Illumina-based microbial profiling enabled high-throughput analyses of microbial samples at a high phylogenetic resolution. The vaginal microbiota among women living with HIV in Sub-Saharan Africa constitutes several profiles associated with a normal microbiota or BV. Recurrence of BV frequently constitutes a different BV-associated profile than before antibiotic treatment.

  1. Next-generation sequencing facilitates quantitative analysis of wild-type and Nrl(-/-) retinal transcriptomes.

    Science.gov (United States)

    Brooks, Matthew J; Rajasimha, Harsha K; Roger, Jerome E; Swaroop, Anand

    2011-01-01

    Next-generation sequencing (NGS) has revolutionized systems-based analysis of cellular pathways. The goals of this study are to compare NGS-derived retinal transcriptome profiling (RNA-seq) to microarray and quantitative reverse transcription polymerase chain reaction (qRT-PCR) methods and to evaluate protocols for optimal high-throughput data analysis. Retinal mRNA profiles of 21-day-old wild-type (WT) and neural retina leucine zipper knockout (Nrl(-/-)) mice were generated by deep sequencing, in triplicate, using Illumina GAIIx. The sequence reads that passed quality filters were analyzed at the transcript isoform level with two methods: Burrows-Wheeler Aligner (BWA) followed by ANOVA (ANOVA) and TopHat followed by Cufflinks. qRT-PCR validation was performed using TaqMan and SYBR Green assays. Using an optimized data analysis workflow, we mapped about 30 million sequence reads per sample to the mouse genome (build mm9) and identified 16,014 transcripts in the retinas of WT and Nrl(-/-) mice with BWA workflow and 34,115 transcripts with TopHat workflow. RNA-seq data confirmed stable expression of 25 known housekeeping genes, and 12 of these were validated with qRT-PCR. RNA-seq data had a linear relationship with qRT-PCR for more than four orders of magnitude and a goodness of fit (R(2)) of 0.8798. Approximately 10% of the transcripts showed differential expression between the WT and Nrl(-/-) retina, with a fold change ≥1.5 and p value <0.05. Altered expression of 25 genes was confirmed with qRT-PCR, demonstrating the high degree of sensitivity of the RNA-seq method. Hierarchical clustering of differentially expressed genes uncovered several as yet uncharacterized genes that may contribute to retinal function. Data analysis with BWA and TopHat workflows revealed a significant overlap yet provided complementary insights in transcriptome profiling. 
Our study represents the first detailed analysis of retinal transcriptomes, with biologic replicates, generated by RNA
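
    The differential-expression criterion quoted in this abstract (fold change ≥1.5 and p value <0.05) can be expressed as a simple filter. The Python sketch below assumes that the fold change is applied symmetrically to up- and down-regulation and that expression values are strictly positive; the gene names and numbers in the usage example are invented for illustration.

```python
def is_differentially_expressed(wt, ko, p_value, min_fold=1.5, max_p=0.05):
    """Apply the fold-change and p-value cutoffs to one transcript.

    `wt` and `ko` are strictly positive expression values for wild-type
    and knockout samples; the fold change is taken in either direction.
    """
    fold = max(wt / ko, ko / wt)
    return fold >= min_fold and p_value < max_p


def filter_transcripts(records, min_fold=1.5, max_p=0.05):
    """Return names of transcripts passing both cutoffs.

    `records` is an iterable of (name, wt, ko, p_value) tuples.
    """
    return [
        name
        for name, wt, ko, p_value in records
        if is_differentially_expressed(wt, ko, p_value, min_fold, max_p)
    ]


# Invented example values, not data from the study.
example = [
    ("geneA", 10.0, 20.0, 0.01),   # 2.0-fold, significant: kept
    ("geneB", 10.0, 12.0, 0.001),  # 1.2-fold: dropped
    ("geneC", 30.0, 10.0, 0.2),    # 3.0-fold but p >= 0.05: dropped
]
print(filter_transcripts(example))  # ['geneA']
```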

  2. Deep RNA Sequencing of the Skeletal Muscle Transcriptome in Swimming Fish

    NARCIS (Netherlands)

    Palstra, A.P.; Beltran, S.; Burgerhout, E.; Brittijn, S.A.; Magnoni, L.J.; Henkel, C.V.; Jansen, A.; Thillart, G.E.E.J.M.; Spaink, H.P.; Planas, J.V.

    2013-01-01

    Deep RNA sequencing (RNA-seq) was performed to provide an in-depth view of the transcriptome of red and white skeletal muscle of exercised and non-exercised rainbow trout (Oncorhynchus mykiss) with the specific objective to identify expressed genes and quantify the transcriptomic effects of swimming

  3. Studies of a Biochemical Factory: Tomato Trichome Deep Expressed Sequence Tag Sequencing and Proteomics

    Science.gov (United States)

    Schilmiller, Anthony L.; Miner, Dennis P.; Larson, Matthew; McDowell, Eric; Gang, David R.; Wilkerson, Curtis; Last, Robert L.

    2010-01-01

    Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces β-caryophyllene and α-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells. PMID:20431087

  4. A simple method for the parallel deep sequencing of full influenza A genomes

    DEFF Research Database (Denmark)

    Kampmann, Marie-Louise; Fordyce, Sarah Louise; Avila Arcos, Maria del Carmen

    2011-01-01

    Given the major threat of influenza A to human and animal health, and its ability to evolve rapidly through mutation and reassortment, tools that enable its timely characterization are necessary to help monitor its evolution and spread. For this purpose, deep sequencing can be a very valuable tool. This study reports a comprehensive method that enables deep sequencing of the complete genomes of influenza A subtypes using the Illumina Genome Analyzer IIx (GAIIx). By using this method, the complete genomes of nine viruses were sequenced in parallel, representing the 2009 pandemic H1N1 virus, H5N1 virus from human and H1N1 virus from swine, on a single lane of a GAIIx flow cell to an average depth of 122-fold. This technique can be applied to cultivated and uncultivated virus.

  5. Simultaneous identification of DNA and RNA viruses present in pig faeces using process-controlled deep sequencing.

    Directory of Open Access Journals (Sweden)

    Jana Sachsenröder

    BACKGROUND: Animal faeces comprise a community of many different microorganisms including bacteria and viruses. Only scarce information is available about the diversity of viruses present in the faeces of pigs. Here we describe a protocol, which was optimized for the purification of the total fraction of viral particles from pig faeces. The genomes of the purified DNA and RNA viruses were simultaneously amplified by PCR and subjected to deep sequencing followed by bioinformatic analyses. The efficiency of the method was monitored using a process control consisting of three bacteriophages (T4, M13 and MS2) with different morphology and genome types. Defined amounts of the bacteriophages were added to the sample and their abundance was assessed by quantitative PCR during the preparation procedure. RESULTS: The procedure was applied to a pooled faecal sample of five pigs. From this sample, 69,613 sequence reads were generated. All of the added bacteriophages were identified by sequence analysis of the reads. In total, 7.7% of the reads showed significant sequence identities with published viral sequences. They mainly originated from bacteriophages (73.9%) and mammalian viruses (23.9%); 0.8% of the sequences showed identities to plant viruses. The most abundant detected porcine viruses were kobuvirus, rotavirus C, astrovirus, enterovirus B, sapovirus and picobirnavirus. In addition, sequences with identities to the chimpanzee stool-associated circular ssDNA virus were identified. Whole genome analysis indicates that this virus, tentatively designated as pig stool-associated circular ssDNA virus (PigSCV), represents a novel pig virus. CONCLUSION: The established protocol enables the simultaneous detection of DNA and RNA viruses in pig faeces including the identification of so far unknown viruses. It may be applied in studies investigating aetiology, epidemiology and ecology of diseases. The implemented process control serves as quality control, ensures

  6. Deep Learning Automates the Quantitative Analysis of Individual Cells in Live-Cell Imaging Experiments.

    Science.gov (United States)

    Van Valen, David A; Kudo, Takamasa; Lane, Keara M; Macklin, Derek N; Quach, Nicolas T; DeFelice, Mialy M; Maayan, Inbal; Tanouchi, Yu; Ashley, Euan A; Covert, Markus W

    2016-11-01

    Live-cell imaging has opened an exciting window into the role cellular heterogeneity plays in dynamic, living systems. A major critical challenge for this class of experiments is the problem of image segmentation, or determining which parts of a microscope image correspond to which individual cells. Current approaches require many hours of manual curation and depend on approaches that are difficult to share between labs. They are also unable to robustly segment the cytoplasms of mammalian cells. Here, we show that deep convolutional neural networks, a supervised machine learning method, can solve this challenge for multiple cell types across the domains of life. We demonstrate that this approach can robustly segment fluorescent images of cell nuclei as well as phase images of the cytoplasms of individual bacterial and mammalian cells from phase contrast images without the need for a fluorescent cytoplasmic marker. These networks also enable the simultaneous segmentation and identification of different mammalian cell types grown in co-culture. A quantitative comparison with prior methods demonstrates that convolutional neural networks have improved accuracy and lead to a significant reduction in curation time. We relay our experience in designing and optimizing deep convolutional neural networks for this task and outline several design rules that we found led to robust performance. We conclude that deep convolutional neural networks are an accurate method that require less curation time, are generalizable to a multiplicity of cell types, from bacteria to mammalian cells, and expand live-cell imaging capabilities to include multi-cell type systems.

  7. Exploring fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing

    Science.gov (United States)

    Zhang, Xiao-Yong; Wang, Guang-Hua; Xu, Xin-Ya; Nong, Xu-Hua; Wang, Jie; Amin, Muhammad; Qi, Shu-Hua

    2016-10-01

    The present study investigated the fungal diversity in four different deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing of the nuclear ribosomal internal transcribed spacer-1 (ITS1). A total of 40,297 fungal ITS1 sequences clustered into 420 operational taxonomic units (OTUs) at 97% sequence similarity, and 170 taxa were recovered from these sediments. Most ITS1 sequences (78%) belonged to the phylum Ascomycota, followed by Basidiomycota (17.3%), Zygomycota (1.5%) and Chytridiomycota (0.8%), and a small proportion (2.4%) belonged to unassigned fungal phyla. Compared with previous studies of fungal diversity in deep-sea sediments based on culture-dependent approaches and clone library analysis, the present results suggest that Illumina sequencing dramatically accelerates the discovery of fungal communities in deep-sea sediments. Furthermore, our results revealed that Sordariomycetes was the most diverse and abundant fungal class in this study, challenging the traditional view that the diversity of Sordariomycetes phylotypes is low in deep-sea environments. In addition, more than 12 taxa, accounting for 21.5% of sequences, have rarely been reported as deep-sea fungi, suggesting that the deep-sea sediments from Okinawa Trough harbor a plethora of fungal communities distinct from those of other deep-sea environments. To our knowledge, this study is the first exploration of the fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing.
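    The clustering of reads into OTUs at 97% similarity can be illustrated with a greedy centroid scheme. This is a minimal sketch under stated assumptions: the abstract does not name the clustering tool, the function names `identity` and `greedy_otus` are hypothetical, and sequences are assumed to be pre-aligned so that identity is a simple position-by-position comparison.

```python
def identity(a, b):
    """Fraction of matching positions between two pre-aligned sequences.

    Positions missing from the shorter sequence count as mismatches.
    """
    if not a and not b:
        return 1.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def greedy_otus(sequences, threshold=0.97):
    """Assign each sequence to the first centroid it matches at >= threshold.

    A sequence that matches no existing centroid founds a new OTU.
    """
    centroids = []
    otus = []
    for seq in sequences:
        for i, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                otus[i].append(seq)
                break
        else:
            centroids.append(seq)
            otus.append([seq])
    return otus
```

    Greedy clustering depends on input order, which is why production tools typically sort reads by abundance first; that step is omitted here.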

  8. Quantitative analysis of patients with celiac disease by video capsule endoscopy: A deep learning method.

    Science.gov (United States)

    Zhou, Teng; Han, Guoqiang; Li, Bing Nan; Lin, Zhizhe; Ciaccio, Edward J; Green, Peter H; Qin, Jing

    2017-06-01

    Celiac disease is one of the most common diseases in the world. Capsule endoscopy is an alternative way to visualize the entire small intestine without invasiveness to the patient. It is useful to characterize celiac disease, but hours are needed to manually analyze the retrospective data of a single patient. Computer-aided quantitative analysis by a deep learning method helps in alleviating the workload during analysis of the retrospective videos. Capsule endoscopy clips from 6 celiac disease patients and 5 controls were preprocessed for training. The frames with a large field of opaque extraluminal fluid or air bubbles were removed automatically by using a pre-selection algorithm. Then the frames were cropped and the intensity was corrected prior to frame rotation in the proposed new method. GoogLeNet was trained with these frames. Then, the clips of capsule endoscopy from 5 additional celiac disease patients and 5 additional control patients were used for testing. The trained GoogLeNet was able to distinguish the frames from capsule endoscopy clips of celiac disease patients vs controls. Quantitative measurement with evaluation of the confidence was developed to assess the severity level of pathology in the subjects. Relying on the evaluation confidence, GoogLeNet achieved 100% sensitivity and specificity for the testing set. The t-test confirmed that the evaluation confidence significantly distinguishes celiac disease patients from controls. Furthermore, it was found that the evaluation confidence may also relate to the severity level of small bowel mucosal lesions. A deep convolutional neural network was established for quantitative measurement of the existence and degree of pathology throughout the small intestine, which may improve computer-aided clinical techniques to assess mucosal atrophy and other etiologies in real-time with videocapsule endoscopy.
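    Sensitivity and specificity, as reported for the testing set, follow directly from a confusion matrix. The sketch below assumes one boolean label per test subject (5 celiac patients and 5 controls, as in the abstract); the function name and the example predictions are illustrative and are not taken from the paper.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for boolean labels (True = disease).

    Assumes at least one positive and one negative in `truth`.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(t and p for t, p in pairs)
    tn = sum((not t) and (not p) for t, p in pairs)
    fn = sum(t and (not p) for t, p in pairs)
    fp = sum((not t) and p for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Five celiac patients followed by five controls.
truth = [True] * 5 + [False] * 5
# Hypothetical predictions: a perfect classifier, and one that misses a patient.
perfect = list(truth)
one_missed = [True] * 4 + [False] * 6
```

    With only ten test subjects, a single misclassified patient would already lower sensitivity to 80%, so the 100% figure is best read alongside the small test-set size.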

  9. Ultra-deep sequencing of intra-host rabies virus populations during cross-species transmission.

    Directory of Open Access Journals (Sweden)

    Monica K Borucki

    2013-11-01

    One of the hurdles to understanding the role of viral quasispecies in RNA virus cross-species transmission (CST) events is the need to analyze a densely sampled outbreak using deep sequencing in order to measure the amount of mutation occurring on a small time scale. In 2009, the California Department of Public Health reported a dramatic increase (350) in the number of gray foxes infected with a rabies virus variant for which striped skunks serve as a reservoir host in Humboldt County. To better understand the evolution of rabies, deep sequencing was applied to 40 unpassaged rabies virus samples from the Humboldt outbreak. For each sample, approximately 11 kb of the 12 kb genome was amplified and sequenced using the Illumina platform. Average coverage was 17,448, and this allowed characterization of the rabies virus population present in each sample at unprecedented depths. Phylogenetic analysis of the consensus sequence data demonstrated that samples clustered according to date (1995 vs. 2009) and geographic location (northern vs. southern). A single amino acid change in the G protein distinguished a subset of northern foxes from a haplotype present in both foxes and skunks, suggesting this mutation may have played a role in the observed increased transmission among foxes in this region. Deep-sequencing data indicated that many genetic changes associated with the CST event occurred prior to 2009, since several nonsynonymous mutations that were present in the consensus sequences of skunk and fox rabies samples obtained from 2003-2010 were present at the sub-consensus level (as rare variants in the viral population) in skunk and fox samples from 1995. These results suggest that analysis of rare variants within a viral population may yield clues to ancestral genomes and identify rare variants that have the potential to be selected for if environmental conditions change.
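    The distinction between consensus and sub-consensus variants can be made concrete with per-site base counts. This is a minimal sketch, not the authors' pipeline: the 50% consensus cut-off, the `min_freq` noise floor of 1%, the function name and the example counts are all assumptions chosen for illustration (the example depth merely echoes the average coverage quoted above).

```python
def classify_variants(base_counts, reference_base, min_freq=0.01):
    """Label non-reference bases at one site as consensus or sub-consensus.

    base_counts: dict mapping base -> read count at a single genome position.
    min_freq: hypothetical noise floor; rarer variants are ignored as
              likely sequencing error.
    Returns a dict mapping base -> (label, frequency).
    """
    depth = sum(base_counts.values())
    calls = {}
    if depth == 0:
        return calls
    for base, count in base_counts.items():
        if base == reference_base:
            continue
        freq = count / depth
        if freq < min_freq:
            continue
        label = "consensus" if freq > 0.5 else "sub-consensus"
        calls[base] = (label, freq)
    return calls


# Illustrative site with depth 17,448: G is a rare variant, T is below the floor.
site = {"A": 17000, "G": 400, "T": 48}
calls = classify_variants(site, reference_base="A")
```

    A variant that is sub-consensus in an early sample and consensus in a later one is the pattern the abstract describes for the 1995 versus 2003-2010 samples.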

  10. MicroRNA identity and abundance in porcine skeletal muscles determined by deep sequencing

    DEFF Research Database (Denmark)

    Nielsen, M; Hansen, J H; Hedegaard, J;

    2010-01-01

    MicroRNAs (miRNA) are short single-stranded RNA molecules that regulate gene expression post-transcriptionally by binding to complementary sequences in the 3' untranslated region (3' UTR) of target mRNAs. MiRNAs participate in the regulation of myogenesis, and identification of the complete set...... of miRNAs expressed in muscles is likely to significantly increase our understanding of muscle growth and development. To determine the identity and abundance of miRNA in porcine skeletal muscle, we applied a deep sequencing approach. This allowed us to identify the sequences and relative expression...... levels of 212 annotated miRNA genes, thereby providing a thorough account of the miRNA transcriptome in porcine muscle tissue. The expression levels displayed a very large range, as reflected by the number of sequence reads, which varied from single counts for rare miRNAs to several million reads...

  11. Prognostic value of deep sequencing method for minimal residual disease detection in multiple myeloma

    Science.gov (United States)

    Lahuerta, Juan J.; Pepin, François; González, Marcos; Barrio, Santiago; Ayala, Rosa; Puig, Noemí; Montalban, María A.; Paiva, Bruno; Weng, Li; Jiménez, Cristina; Sopena, María; Moorhead, Martin; Cedena, Teresa; Rapado, Immaculada; Mateos, María Victoria; Rosiñol, Laura; Oriol, Albert; Blanchard, María J.; Martínez, Rafael; Bladé, Joan; San Miguel, Jesús; Faham, Malek; García-Sanz, Ramón

    2014-01-01

    We assessed the prognostic value of minimal residual disease (MRD) detection in multiple myeloma (MM) patients using a sequencing-based platform in bone marrow samples from 133 MM patients in at least very good partial response (VGPR) after front-line therapy. Deep sequencing was carried out in patients in whom a high-frequency myeloma clone was identified and MRD was assessed using the IGH-VDJH, IGH-DJH, and IGK assays. The results were contrasted with those of multiparametric flow cytometry (MFC) and allele-specific oligonucleotide polymerase chain reaction (ASO-PCR). The applicability of deep sequencing was 91%. Concordance between sequencing and MFC and ASO-PCR was 83% and 85%, respectively. Patients who were MRD– by sequencing had a significantly longer time to tumor progression (TTP) (median 80 vs 31 months; P < .0001) and overall survival (median not reached vs 81 months; P = .02), compared with patients who were MRD+. When stratifying patients by different levels of MRD, the respective TTP medians were: MRD ≥10^−3, 27 months; MRD 10^−3 to 10^−5, 48 months; and MRD <10^−5, 80 months (P = .003 to .0001). Ninety-two percent of VGPR patients were MRD+. In complete response patients, the TTP remained significantly longer for MRD– compared with MRD+ patients (131 vs 35 months; P = .0009). PMID:24646471
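    The three MRD strata can be expressed as a simple threshold function. This is a sketch under stated assumptions: the MRD level is taken to be the clonal fraction as a plain number, the handling of values exactly on a boundary is a guess, and the function and dictionary names are hypothetical. The median TTP values are those quoted in the abstract.

```python
def mrd_stratum(mrd_level):
    """Map an MRD level (clonal fraction) to one of the three reported strata."""
    if mrd_level >= 1e-3:
        return "MRD >= 10^-3"
    if mrd_level >= 1e-5:
        return "MRD 10^-3 to 10^-5"
    return "MRD < 10^-5"


# Median time to tumor progression (months) per stratum, from the abstract.
MEDIAN_TTP_MONTHS = {
    "MRD >= 10^-3": 27,
    "MRD 10^-3 to 10^-5": 48,
    "MRD < 10^-5": 80,
}
```

    For example, a hypothetical patient with a clonal fraction of 2 in 10,000 falls in the middle stratum, whose reported median TTP is 48 months.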

  12. FANSe2: a robust and cost-efficient alignment tool for quantitative next-generation sequencing applications.

    Directory of Open Access Journals (Sweden)

    Chuan-Le Xiao

    Correct and bias-free interpretation of deep sequencing data depends on the complete mapping of all mappable reads to the reference sequence, especially for quantitative RNA-seq applications. Seed-based algorithms are generally slow but robust, while Burrows-Wheeler Transform (BWT)-based algorithms are fast but less robust. To have both advantages, we developed an algorithm, FANSe2, with an iterative mapping strategy based on the statistics of real-world sequencing error distribution to substantially accelerate the mapping without compromising the accuracy. Its sensitivity and accuracy are higher than those of the BWT-based algorithms in tests using both prokaryotic and eukaryotic sequencing datasets. The gene identification results of FANSe2 are experimentally validated, while the previous algorithms have false positives and false negatives. FANSe2 showed remarkably better consistency with the microarray than most other algorithms in terms of gene expression quantification. We implemented a scalable and almost maintenance-free parallelization method that can utilize the computational power of multiple office computers, a novel feature not present in any other mainstream algorithm. With three normal office computers, we demonstrated that FANSe2 mapped an RNA-seq dataset generated from an entire Illumina HiSeq 2000 flowcell (8 lanes, 608 M reads) to the masked human genome within 4.1 hours with higher sensitivity than Bowtie/Bowtie2. FANSe2 thus provides robust accuracy, full indel sensitivity, fast speed, versatile compatibility and economical computational utilization, making it a useful and practical tool for deep sequencing applications. FANSe2 is freely available at http://bioinformatics.jnu.edu.cn/software/fanse2/.

  13. Continuous Distributed Representation of Biological Sequences for Deep Proteomics and Genomics.

    Directory of Open Access Journals (Sweden)

    Ehsaneddin Asgari

    We introduce a new representation and feature extraction method for biological sequences. Named bio-vectors (BioVec) to refer to biological sequences in general with protein-vectors (ProtVec) for proteins (amino-acid sequences) and gene-vectors (GeneVec) for gene sequences, this representation can be widely used in applications of deep learning in proteomics and genomics. In the present paper, we focus on protein-vectors that can be utilized in a wide array of bioinformatics investigations such as family classification, protein visualization, structure prediction, disordered protein identification, and protein-protein interaction prediction. In this method, we adopt artificial neural network approaches and represent a protein sequence with a single dense n-dimensional vector. To evaluate this method, we apply it in classification of 324,018 protein sequences obtained from Swiss-Prot belonging to 7,027 protein families, where an average family classification accuracy of 93%±0.06% is obtained, outperforming existing family classification methods. In addition, we use ProtVec representation to predict disordered proteins from structured proteins. Two databases of disordered sequences are used: the DisProt database as well as a database featuring the disordered regions of nucleoporins rich with phenylalanine-glycine repeats (FG-Nups). Using support vector machine classifiers, FG-Nup sequences are distinguished from structured protein sequences found in Protein Data Bank (PDB) with a 99.8% accuracy, and unstructured DisProt sequences are differentiated from structured DisProt sequences with 100.0% accuracy. These results indicate that by only providing sequence data for various proteins into this model, accurate information about protein structure can be determined. Importantly, this model needs to be trained only once and can then be applied to extract a comprehensive set of information regarding proteins of interest. 
Moreover, this representation can be

  14. Continuous Distributed Representation of Biological Sequences for Deep Proteomics and Genomics.

    Science.gov (United States)

    Asgari, Ehsaneddin; Mofrad, Mohammad R K

    2015-01-01

    We introduce a new representation and feature extraction method for biological sequences. Named bio-vectors (BioVec) to refer to biological sequences in general with protein-vectors (ProtVec) for proteins (amino-acid sequences) and gene-vectors (GeneVec) for gene sequences, this representation can be widely used in applications of deep learning in proteomics and genomics. In the present paper, we focus on protein-vectors that can be utilized in a wide array of bioinformatics investigations such as family classification, protein visualization, structure prediction, disordered protein identification, and protein-protein interaction prediction. In this method, we adopt artificial neural network approaches and represent a protein sequence with a single dense n-dimensional vector. To evaluate this method, we apply it in classification of 324,018 protein sequences obtained from Swiss-Prot belonging to 7,027 protein families, where an average family classification accuracy of 93%±0.06% is obtained, outperforming existing family classification methods. In addition, we use ProtVec representation to predict disordered proteins from structured proteins. Two databases of disordered sequences are used: the DisProt database as well as a database featuring the disordered regions of nucleoporins rich with phenylalanine-glycine repeats (FG-Nups). Using support vector machine classifiers, FG-Nup sequences are distinguished from structured protein sequences found in Protein Data Bank (PDB) with a 99.8% accuracy, and unstructured DisProt sequences are differentiated from structured DisProt sequences with 100.0% accuracy. These results indicate that by only providing sequence data for various proteins into this model, accurate information about protein structure can be determined. Importantly, this model needs to be trained only once and can then be applied to extract a comprehensive set of information regarding proteins of interest. 
Moreover, this representation can be considered as
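    Representing a protein as a single dense vector can be sketched by tokenizing the sequence into k-mers and summing per-k-mer embeddings. The abstract does not state the tokenization; the choice of k = 3, the shifted reading frames, the summation, and the toy two-dimensional embedding table below are all assumptions made for illustration, and the function names are hypothetical.

```python
def shifted_kmers(sequence, k=3):
    """Split a sequence into k lists of non-overlapping k-mers, one per offset."""
    frames = []
    for offset in range(k):
        frame = [sequence[i:i + k]
                 for i in range(offset, len(sequence) - k + 1, k)]
        frames.append(frame)
    return frames


def embed(sequence, kmer_vectors, dim, k=3):
    """Sum the vectors of all k-mers in all frames; unknown k-mers are skipped."""
    total = [0.0] * dim
    for frame in shifted_kmers(sequence, k):
        for kmer in frame:
            vec = kmer_vectors.get(kmer)
            if vec is None:
                continue
            total = [a + b for a, b in zip(total, vec)]
    return total


# Toy embedding table; a real table would be learned from a large corpus.
toy_vectors = {"MKT": [1.0, 0.0], "IAK": [0.0, 2.0]}
protein_vector = embed("MKTAYIAK", toy_vectors, dim=2)
```

    The resulting fixed-length vector can then be passed to any standard classifier, such as the support vector machines mentioned in the abstract.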

  15. Deep Learning Automates the Quantitative Analysis of Individual Cells in Live-Cell Imaging Experiments.

    Directory of Open Access Journals (Sweden)

    David A Van Valen

    2016-11-01

    Live-cell imaging has opened an exciting window into the role cellular heterogeneity plays in dynamic, living systems. A major challenge for this class of experiments is image segmentation, or determining which parts of a microscope image correspond to which individual cells. Current approaches require many hours of manual curation and depend on methods that are difficult to share between labs. They are also unable to robustly segment the cytoplasms of mammalian cells. Here, we show that deep convolutional neural networks, a supervised machine learning method, can solve this challenge for multiple cell types across the domains of life. We demonstrate that this approach can robustly segment fluorescent images of cell nuclei as well as phase contrast images of the cytoplasms of individual bacterial and mammalian cells, without the need for a fluorescent cytoplasmic marker. These networks also enable the simultaneous segmentation and identification of different mammalian cell types grown in co-culture. A quantitative comparison with prior methods demonstrates that convolutional neural networks have improved accuracy and lead to a significant reduction in curation time. We relay our experience in designing and optimizing deep convolutional neural networks for this task and outline several design rules that we found led to robust performance. We conclude that deep convolutional neural networks are an accurate method that requires less curation time, generalizes to a multiplicity of cell types, from bacteria to mammalian cells, and expands live-cell imaging capabilities to include multi-cell type systems.

  16. Deep sequencing analysis of HBV genotype shift and correlation with antiviral efficiency during adefovir dipivoxil therapy.

    Directory of Open Access Journals (Sweden)

    Yuwei Wang

    Viral genotype shift in chronic hepatitis B (CHB) patients during antiviral therapy has been reported, but the underlying mechanism remains elusive. Thirty-eight CHB patients treated with adefovir dipivoxil (ADV) for one year were selected for studying genotype shift by both deep sequencing and Sanger sequencing. Sanger sequencing found that 7.9% of patients showed mixed genotype before ADV therapy. In contrast, all 38 patients showed mixed genotype before ADV treatment by deep sequencing. A mixed genotype rate of 95.5% was also obtained from an additional 200 treatment-naïve CHB patients. Of the 13 patients with genotype shift, the fraction of the minor genotype in 5 patients (38%) increased gradually during the course of ADV treatment. Furthermore, responses to ADV and HBeAg seroconversion were associated with a high rate of genotype shift, suggesting that drug and immune pressure may be key factors inducing genotype shift. Interestingly, patients with genotype C had a significantly higher rate of genotype shift than those with genotype B. In the genotype shift group, ADV treatment induced a marked increase in the genotype B ratio accompanied by a reduction in the genotype C ratio, suggesting genotype C may be more sensitive to ADV than genotype B. Moreover, patients with dominant genotype C may have a better therapeutic effect. Finally, genotype shift was correlated with clinical improvement in terms of ALT. Our findings provide a rational explanation for genotype shift among ADV-treated CHB patients. The genotype and genotype shift might be associated with antiviral efficiency.

  17. Metagenomes obtained by "deep sequencing" - what do they tell about the EBPR communities

    DEFF Research Database (Denmark)

    Albertsen, Mads; Saunders, Aaron Marc; Nielsen, Kåre Lehmann

    and to investigate in detail similarities and differences between the two EBPR communities. Material and Methods DNA extraction from activated sludge from the EBPR wastewater treatment plants Aalborg East and West and the further metagenomic sequencing, assembly and annotation were largely conducted as described...... Albertsen Keywords: Metagenomics; Accumulibacter; Micro-diversity; Enhanced Biological Phosphorus Removal Introduction Metagenomics, or environmental genomics, provides comprehensive information about the entire microbial community of a certain ecosystem, e.g. a wastewater treatment plant. So far......, metagenomic analyses have been hampered by high costs and high level of expertise needed to conduct the investigations, but it is changing now with development of new technologies allowing analyses of billions of DNA sequences (deep-sequencing) and user-friendly pipelines for analyses of the huge data sets...

  18. Ultra-deep sequencing reveals the subclonal structure and genomic evolution of oral squamous cell carcinoma

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin Jakob

    Background: Oral squamous cell carcinoma (OSCC), a subgroup of head and neck squamous cell carcinoma (HNSCC), is primarily caused by alcohol consumption and tobacco use. Recent DNA sequencing studies suggest that HNSCC are very heterogeneous between patients; however the intra-patient subclonal...... structure remains unexplored due to lack of sampling multiple tumor biopsies from each patient. Materials and methods: To examine the clonal structure and describe the genomic cancer evolution we applied whole-exome sequencing combined with targeted ultra-deep sequencing on biopsies from 5 stage IV...... complex subclonal architectures comprising distinct subclones only found in geographically distinct regions of the tumors. The metastatic potential of the tumor is acquired early in the tumor evolution, as indicated by the lymph node sharing the majority of the mutations with the tumor biopsies, while...

  19. Exploring the Mechanisms of Gastrointestinal Cancer Development Using Deep Sequencing Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Matsumoto, Tomonori; Shimizu, Takahiro; Takai, Atsushi; Marusawa, Hiroyuki, E-mail: maru@kuhp.kyoto-u.ac.jp [Department of Gastroenterology and Hepatology, Graduate School of Medicine, Kyoto University, 54 Shogoin-Kawahara-cho, Sakyo-ku, Kyoto 606-8507 (Japan)

    2015-06-15

    Next-generation sequencing (NGS) technologies have revolutionized cancer genomics due to their high throughput sequencing capacity. Reports of the gene mutation profiles of various cancers by many researchers, including international cancer genome research consortia, have increased over recent years. In addition to detecting somatic mutations in tumor cells, NGS technologies enable us to approach the subject of carcinogenic mechanisms from new perspectives. Deep sequencing, a method of optimizing the high throughput capacity of NGS technologies, allows for the detection of genetic aberrations in small subsets of premalignant and/or tumor cells in noncancerous chronically inflamed tissues. Genome-wide NGS data also make it possible to clarify the mutational signatures of each cancer tissue by identifying the precise pattern of nucleotide alterations in the cancer genome, providing new information regarding the mechanisms of tumorigenesis. In this review, we highlight these new methods taking advantage of NGS technologies, and discuss our current understanding of carcinogenic mechanisms elucidated from such approaches.

  20. Sequence-based prediction of protein protein interaction using a deep-learning algorithm.

    Science.gov (United States)

    Sun, Tanlin; Zhou, Bo; Lai, Luhua; Pei, Jianfeng

    2017-05-25

    Protein-protein interactions (PPIs) are critical for many biological processes. It is therefore important to develop accurate high-throughput methods for identifying PPI to better understand protein function, disease occurrence, and therapy design. Though various computational methods for predicting PPI have been developed, their robustness for prediction with external datasets is unknown. Deep-learning algorithms have achieved successful results in diverse areas, but their effectiveness for PPI prediction has not been tested. We used a stacked autoencoder, a type of deep-learning algorithm, to study the sequence-based PPI prediction. The best model achieved an average accuracy of 97.19% with 10-fold cross-validation. The prediction accuracies for various external datasets ranged from 87.99% to 99.21%, which are superior to those achieved with previous methods. To our knowledge, this research is the first to apply a deep-learning algorithm to sequence-based PPI prediction, and the results demonstrate its potential in this field.
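    The 10-fold cross-validation used to report the average accuracy can be sketched as follows. This is a generic illustration, not the authors' code: the function names are hypothetical, the shuffling seed is arbitrary, and `fit_and_score` stands in for whatever model training and scoring routine is being evaluated.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validated_accuracy(n_samples, fit_and_score, k=10, seed=0):
    """Average the score returned by fit_and_score(train, test) over k folds."""
    scores = [fit_and_score(train, test)
              for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)
```

    Cross-validation estimates performance on data drawn from the same distribution as the training set, which is why the abstract reports accuracy on external datasets separately.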

  1. Deep sequence characterisation of a divergent HPIV-4a from an adult with prolonged influenza-like illness

    Directory of Open Access Journals (Sweden)

    Katherine E. Arden

    2015-12-01

    Deep sequencing allowed identification and genomic characterisation of a possible pathogen from an ILI as well as being an important tool to aid future understanding of the linkages between viral genetic variation, transmission and disease prognosis.

  2. Large-Scale and Deep Quantitative Proteome Profiling Using Isobaric Labeling Coupled with Two-Dimensional LC-MS/MS.

    Science.gov (United States)

    Gritsenko, Marina A; Xu, Zhe; Liu, Tao; Smith, Richard D

    2016-01-01

    Comprehensive, quantitative information on abundances of proteins and their posttranslational modifications (PTMs) can potentially provide novel biological insights into disease pathogenesis and therapeutic intervention. Herein, we introduce a quantitative strategy utilizing isobaric stable isotope-labeling techniques combined with two-dimensional liquid chromatography-tandem mass spectrometry (2D-LC-MS/MS) for large-scale, deep quantitative proteome profiling of biological samples or clinical specimens such as tumor tissues. The workflow includes isobaric labeling of tryptic peptides for multiplexed and accurate quantitative analysis, basic reversed-phase LC fractionation and concatenation for reduced sample complexity, and nano-LC coupled to high resolution and high mass accuracy MS analysis for high confidence identification and quantification of proteins. This proteomic analysis strategy has been successfully applied for in-depth quantitative proteomic analysis of tumor samples and can also be used for integrated proteome and PTM characterization, as well as comprehensive quantitative proteomic analysis across samples from large clinical cohorts.
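    The fractionation-and-concatenation step can be illustrated by pooling fractions that elute far apart in the first dimension. This is a sketch of one common pooling pattern; the abstract gives no fraction counts, so the 96 fractions and 24 pools used below, and the function name, are assumptions.

```python
def concatenate_fractions(n_fractions, n_pools):
    """Pool first-dimension fractions by taking every n_pools-th fraction.

    Returns a list of pools, each a list of 1-based fraction numbers.
    """
    pools = [[] for _ in range(n_pools)]
    for fraction in range(n_fractions):
        pools[fraction % n_pools].append(fraction + 1)
    return pools


# Hypothetical layout: 96 collected fractions combined into 24 samples.
pools = concatenate_fractions(96, 24)
```

    Combining early, middle and late fractions in each pool keeps the number of second-dimension runs manageable while spreading peptides across the full gradient of each run.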

  3. Mayday SeaSight: combined analysis of deep sequencing and microarray data.

    Directory of Open Access Journals (Sweden)

    Florian Battke

    Recently emerged deep sequencing technologies offer new high-throughput methods to quantify gene expression, epigenetic modifications and DNA-protein binding. From a computational point of view, the data are very different from those produced by the already established microarray technology, providing a new perspective on the samples under study and complementing microarray gene expression data. Software offering the integrated analysis of data from different technologies is of growing importance as new data emerge in systems biology studies. Mayday is an extensible platform for visual data exploration and interactive analysis and provides many methods for dissecting complex transcriptome datasets. We present Mayday SeaSight, an extension that allows the integration of data from different platforms such as deep sequencing and microarrays. It offers methods for computing expression values from mapped reads and raw microarray data, background correction and normalization, and linking microarray probes to genomic coordinates. It is now possible to use Mayday's wealth of methods to analyze sequencing data and to combine data from different technologies in one analysis.
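    One widely used way to compute an expression value from mapped reads is RPKM (reads per kilobase of transcript per million mapped reads). The abstract does not say which measures Mayday SeaSight implements, so the sketch below is a generic illustration with a hypothetical function name and made-up example numbers.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads on a 2 kb transcript in a 10 M read library.
example_value = rpkm(500, 2_000, 10_000_000)
```

    Dividing by transcript length and library size makes values comparable across genes and samples, a prerequisite for combining sequencing data with microarray intensities.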

  4. Deep sequencing discovery of novel and conserved microRNAs in trifoliate orange (Citrus trifoliata)

    Directory of Open Access Journals (Sweden)

    Yu Huaping

    2010-07-01

    Background: MicroRNAs (miRNAs) play a critical role in post-transcriptional gene regulation and have been shown to control many genes involved in various biological and metabolic processes. There have been extensive studies to discover miRNAs and analyze their functions in model plant species, such as Arabidopsis and rice. Deep sequencing technologies have facilitated identification of species-specific or lowly expressed as well as conserved or highly expressed miRNAs in plants. Results: In this research, we used Solexa sequencing to discover new microRNAs in trifoliate orange (Citrus trifoliata), which is an important rootstock of citrus. A total of 13,106,753 reads representing 4,876,395 distinct sequences were obtained from a short RNA library generated from small RNA extracted from C. trifoliata flower and fruit tissues. Based on sequence similarity and hairpin structure prediction, we found that 156,639 reads representing 63 sequences from 42 highly conserved miRNA families have perfect matches to known miRNAs. We also identified 10 novel miRNA candidates whose precursors were all potentially generated from citrus ESTs. In addition, five miRNA* sequences were also sequenced. These sequences had not previously been described in other plant species, and accumulation of the 10 novel miRNAs was confirmed by qRT-PCR analysis. Potential target genes were predicted for most conserved and novel miRNAs. Moreover, four target genes, including one encoding IRX12 copper ion binding/oxidoreductase and three genes encoding NB-LRR disease resistance protein, have been experimentally verified by detection of the miRNA-mediated mRNA cleavage in C. trifoliata. Conclusion: Deep sequencing of short RNAs from C. trifoliata flowers and fruits identified 10 new potential miRNAs and 42 highly conserved miRNA families, indicating that specific miRNAs exist in C. trifoliata. 
These results show that regulatory miRNAs exist in agronomically important trifoliate orange

  5. Ultra-deep sequencing of VHSV isolates contributes to understanding the role of viral quasispecies.

    Science.gov (United States)

    Schönherz, Anna A; Lorenzen, Niels; Guldbrandtsen, Bernt; Buitenhuis, Bart; Einer-Jensen, Katja

    2016-01-08

    The high mutation rate of RNA viruses enables the generation of a genetically diverse viral population, termed a quasispecies, within a single infected host. This high in-host genetic diversity enables an RNA virus to adapt to a diverse array of selective pressures such as host immune response and switching between host species. The negative-sense, single-stranded RNA virus, viral haemorrhagic septicaemia virus (VHSV), was originally considered an epidemic virus of cultured rainbow trout in Europe, but was later proved to be endemic among a range of marine fish species in the Northern hemisphere. To better understand the nature of a virus quasispecies related to the evolutionary potential of VHSV, a deep-sequencing protocol specific to VHSV was established and applied to 4 VHSV isolates, 2 originating from rainbow trout and 2 from Atlantic herring. Each isolate was subjected to Illumina paired-end shotgun sequencing after PCR amplification and the 11.1 kb genome was successfully sequenced with an average coverage of 0.5-1.9 × 10^6 sequenced copies. Differences in single nucleotide polymorphism (SNP) frequency were detected both within and between isolates, possibly related to their stage of adaptation to host species and host immune reactions. The N, M, P and Nv genes appeared nearly fixed, while genetic variation in the G and L genes demonstrated the presence of diverse genetic populations, particularly in two isolates. The results demonstrate that deep sequencing and analysis methodologies can be useful for future in vivo host adaptation studies of VHSV.

  6. Genetic variation of human papillomavirus type 16 in individual clinical specimens revealed by deep sequencing.

    Directory of Open Access Journals (Sweden)

    Iwao Kukimoto

    Viral genetic diversity within infected cells or tissues, called viral quasispecies, has been mostly studied for RNA viruses, but has also been described among DNA viruses, including human papillomavirus type 16 (HPV16) present in cervical precancerous lesions. However, the extent of HPV genetic variation in cervical specimens, and its involvement in HPV-induced carcinogenesis, remains unclear. Here, we employ deep sequencing to comprehensively analyze genetic variation in the HPV16 genome isolated from individual clinical specimens. Through overlapping full-circle PCR, approximately 8-kb DNA fragments covering the whole HPV16 genome were amplified from HPV16-positive cervical exfoliated cells collected from patients with either low-grade squamous intraepithelial lesion (LSIL) or invasive cervical cancer (ICC). Deep sequencing of the amplified HPV16 DNA enabled de novo assembly of the full-length HPV16 genome sequence for each of 7 specimens (5 LSIL and 2 ICC samples). Subsequent alignment of read sequences to the assembled HPV16 sequence revealed that 2 LSILs and 1 ICC contained nucleotide variations within E6, E1 and the non-coding region between E5 and L2 with mutation frequencies of 0.60% to 5.42%. In transient replication assays, a novel E1 mutant found in ICC, E1 Q381E, showed reduced ability to support HPV16 origin-dependent replication. In addition, partially deleted E2 genes were detected in 1 LSIL sample in a mixed state with the intact E2 gene. Thus, the methods used in this study provide a fundamental framework for investigating the influence of HPV somatic genetic variation on cervical carcinogenesis.

  7. Identification of Retinopathy of Prematurity Related miRNAs in Hyperoxia-Induced Neonatal Rats by Deep Sequencing

    Directory of Open Access Journals (Sweden)

    Ruibin Zhao

    2014-12-01

    Retinopathy of prematurity (ROP) remains a major problem for many preterm infants. MicroRNAs (miRNAs) are a class of small noncoding RNAs that regulate gene expression at the posttranscriptional level and have been studied in many diseases. To understand the roles of miRNAs in ROP model rats, we constructed two small RNA libraries from the plasma of hyperoxia-induced rats and normal controls. Deep sequencing identified 44 down-regulated and 22 up-regulated microRNAs in the hyperoxia-induced rats. Some of the differentially expressed miRNAs were confirmed by quantitative reverse transcription-PCR (qRT-PCR). A total of 594 target genes of the differentially expressed microRNAs were identified using a bioinformatics approach. Functional annotation analysis indicated that a number of pathways might be involved in angiogenesis, cell proliferation and cell differentiation, which might be involved in the genesis and development of ROP. The elevated expression level of the vascular endothelial growth factor (VEGF) protein in the hyperoxia-induced neonatal rats was also confirmed by enzyme linked immunosorbent assay (ELISA). This study provides some insights into the molecular mechanisms that underlie ROP development, thereby aiding the diagnosis and treatment of this disease.

  8. Development of genic-SSR markers by deep transcriptome sequencing in pigeonpea [Cajanus cajan (L.) Millspaugh]

    Directory of Open Access Journals (Sweden)

    Bashasab Fakrudin

    2011-01-01

    Background: Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. Results: In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population. Conclusion: We
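    The SSR screen described in this record (di- to hexanucleotide motifs, homopolymers excluded, total repeat length of at least 18 bp) can be illustrated with a small sketch. This is not the authors' pipeline: the function name `find_ssrs`, the regular-expression approach and the handling of motif redundancy are assumptions made for illustration, and compound repeats are not treated specially.

```python
import re

# Motif-length names used in the abstract (di- to hexanucleotide).
NAMES = {2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def find_ssrs(seq, min_total_bp=18):
    """Return (start, motif, repeat_count, motif_class) for perfect SSRs.

    Illustrative assumptions: perfect tandem repeats only, motif length
    2-6 nt, and a minimum total repeat length of `min_total_bp`.
    """
    seq = seq.upper()
    hits = []
    for k in range(2, 7):
        # A k-mer followed by one or more exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1+" % k)
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip homopolymers and motifs that are repeats of a shorter
            # unit (e.g. ATAT is really the dinucleotide AT).
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            length = m.end() - m.start()
            if length >= min_total_bp:
                hits.append((m.start(), motif, length // k, NAMES[k]))
    return sorted(hits)


if __name__ == "__main__":
    contig = "GG" + "AT" * 10 + "CC" + "A" * 22 + "GC" + "CAG" * 6 + "TT"
    for hit in find_ssrs(contig):
        print(hit)
```

    Scanning each motif length separately keeps the sketch short; a production tool would also merge overlapping hits and report compound repeats.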

  9. Ultra-deep sequencing of VHSV isolates contributes to understanding the role of viral quasispecies

    DEFF Research Database (Denmark)

    Schönherz, Anna A.; Lorenzen, Niels; Guldbrandtsen, Bernt

    2016-01-01

    The high mutation rate of RNA viruses enables the generation of a genetically diverse viral population, termed a quasispecies, within a single infected host. This high in-host genetic diversity enables an RNA virus to adapt to a diverse array of selective pressures such as host immune response....... To better understand the nature of a virus quasispecies related to the evolutionary potential of VHSV, a deep-sequencing protocol specific to VHSV was established and applied to 4 VHSV isolates, 2 originating from rainbow trout and 2 from Atlantic herring. Each isolate was subjected to Illumina paired end...

  10. FANSe: an accurate algorithm for quantitative mapping of large scale sequencing reads.

    Science.gov (United States)

    Zhang, Gong; Fedyunin, Ivan; Kirchner, Sebastian; Xiao, Chuanle; Valleriani, Angelo; Ignatova, Zoya

    2012-06-01

    The most crucial step in data processing from high-throughput sequencing applications is the accurate and sensitive alignment of the sequencing reads to reference genomes or transcriptomes. The accurate detection of insertions and deletions (indels) and errors introduced by the sequencing platform or by misreading of modified nucleotides is essential for the quantitative processing of the RNA-based sequencing (RNA-Seq) datasets and for the identification of genetic variations and modification patterns. We developed a new, fast and accurate algorithm for nucleic acid sequence analysis, FANSe, with adjustable mismatch allowance settings and ability to handle indels to accurately and quantitatively map millions of reads to small or large reference genomes. It is a seed-based algorithm which uses the whole read information for mapping; high sensitivity and low ambiguity are achieved by using short and non-overlapping reads. Furthermore, FANSe uses a hotspot score to prioritize the processing of highly probable matches and implements a modified Smith-Waterman refinement with a reduced scoring matrix to accelerate the calculation without compromising its sensitivity. The FANSe algorithm stably processes datasets from various sequencing platforms, masked or unmasked and small or large genomes. It shows a remarkable coverage of low-abundance mRNAs, which is important for quantitative processing of RNA-Seq datasets.
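    The general idea of seed-based mapping with a mismatch allowance can be sketched as follows. This is a generic illustration and not the FANSe implementation: the names `build_index` and `map_read` are hypothetical, candidate positions are ranked by a simple seed-hit count standing in for the hotspot score, and indels and the Smith-Waterman refinement are left out.

```python
from collections import defaultdict


def build_index(ref, k):
    """Index every k-mer of the reference by its start positions."""
    index = defaultdict(list)
    for i in range(len(ref) - k + 1):
        index[ref[i:i + k]].append(i)
    return index


def map_read(read, ref, index, k, max_mismatch):
    """Return (start, mismatches) of the best placement, or None.

    The read is cut into non-overlapping seeds; each seed hit votes for
    the read start it implies. Candidates are then verified against the
    whole read, most-voted first (a stand-in for a hotspot score).
    """
    votes = defaultdict(int)
    for off in range(0, len(read) - k + 1, k):
        for pos in index.get(read[off:off + k], ()):
            start = pos - off
            if 0 <= start <= len(ref) - len(read):
                votes[start] += 1
    best = None
    for start in sorted(votes, key=lambda s: (-votes[s], s)):
        window = ref[start:start + len(read)]
        mm = sum(a != b for a, b in zip(read, window))
        if mm <= max_mismatch and (best is None or mm < best[1]):
            best = (start, mm)
    return best


if __name__ == "__main__":
    ref = "ACGTTGCATGCCGATAGCTTAGGCTAACGTAGC"
    read = ref[8:24]
    read = read[:5] + "C" + read[6:]  # introduce one substitution
    print(map_read(read, ref, build_index(ref, 4), 4, max_mismatch=2))
```

    Because at least one seed must match exactly, a read is only found when the number of mismatches is smaller than the number of seeds.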

  11. Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data

    DEFF Research Database (Denmark)

    Krøigård, Anne Bruun; Thomassen, Mads; Lænkholm, Anne Vibeke;

    2016-01-01

    a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2...... and Virmid for the detection of single nucleotide mutations and small deletions and insertions. We report a large variation in the number of calls from the nine somatic variant callers on the same sequencing data and highly variable agreement. Sequencing depth had markedly diverse impact on individual...... callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high...

  12. Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data

    DEFF Research Database (Denmark)

    Krøigård, Anne Bruun; Thomassen, Mads; Lænkholm, Anne Vibeke;

    2016-01-01

    Next generation sequencing is extensively applied to catalogue somatic mutations in cancer, in research settings and increasingly in clinical settings for molecular diagnostics, guiding therapy decisions. Somatic variant callers perform paired comparisons of sequencing data from cancer tissue...... a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2...... callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high...

  13. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Whitehead, Timothy A.; Chevalier, Aaron; Song, Yifan; Dreyfus, Cyrille; Fleishman, Sarel J.; De Mattos, Cecilia; Myers, Chris A.; Kamisetty, Hetunandan; Blair, Patrick; Wilson, Ian A.; Baker, David (UWASH); (Scripps); (NRL)

    2012-06-19

    We show that comprehensive sequence-function maps obtained by deep sequencing can be used to reprogram interaction specificity and to leapfrog over bottlenecks in affinity maturation by combining many individually small contributions not detectable in conventional approaches. We use this approach to optimize two computationally designed inhibitors against H1N1 influenza hemagglutinin and, in both cases, obtain variants with subnanomolar binding affinity. The most potent of these, a 51-residue protein, is broadly cross-reactive against all influenza group 1 hemagglutinins, including human H2, and neutralizes H1N1 viruses with a potency that rivals that of several human monoclonal antibodies, demonstrating that computational design followed by comprehensive energy landscape mapping can generate proteins with potential therapeutic utility.

  14. Metatranscriptomic analysis of small RNAs present in soybean deep sequencing libraries

    Directory of Open Access Journals (Sweden)

    Lorrayne Gomes Molina

    2012-01-01

    A large number of small RNAs unrelated to the soybean genome were identified after deep sequencing of soybean small RNA libraries. A metatranscriptomic analysis was carried out to identify the origin of these sequences. Comparative analyses of small interfering RNAs (siRNAs) present in samples collected in open areas corresponding to soybean field plantations and samples from soybean cultivated in greenhouses under a controlled environment were made. Different pathogenic, symbiotic and free-living organisms were identified from samples of both growth systems. They included viruses, bacteria and different groups of fungi. This approach can be useful not only to identify potentially unknown pathogens and pests, but also to understand the relations that soybean plants establish with microorganisms that may affect, directly or indirectly, plant health and crop production.

  15. Detection and characterization of mycoviruses in arbuscular mycorrhizal fungi by deep-sequencing.

    Science.gov (United States)

    Ezawa, Tatsuhiro; Ikeda, Yoji; Shimura, Hanako; Masuta, Chikara

    2015-01-01

    Fungal viruses (mycoviruses) often have a significant impact not only on phenotypic expression of the host fungus but also on higher order biological interactions, e.g., conferring plant stress tolerance via an endophytic host fungus. Arbuscular mycorrhizal (AM) fungi in the phylum Glomeromycota associate with most land plants and supply mineral nutrients to the host plants. So far, little information about mycoviruses has been obtained in the fungi due to their obligate biotrophic nature. Here we provide a technical breakthrough, "two-step strategy" in combination with deep-sequencing, for virological study in AM fungi; dsRNA is first extracted and sequenced using material obtained from highly productive open pot culture, and then the presence of viruses is verified using pure material produced in the in vitro monoxenic culture. This approach enabled us to demonstrate the presence of several viruses for the first time from a glomeromycotan fungus.

  16. Inside the intraterrestrials: The deep biosphere seen through massively parallel sequencing

    Science.gov (United States)

    Biddle, J.

    2009-12-01

    Deeply buried marine sediments may house a large amount of the Earth’s microbial population. Initial studies based on 16S rRNA clone libraries suggest that these sediments contain unique phylotypes of microorganisms, particularly from the archaeal domain. Since this environment is so difficult to study, microbiologists are challenged to find ways to examine these populations remotely. A major approach taken to study this environment uses massively parallel sequencing to examine the inner genetic workings of these microorganisms after the sediment has been drilled. Both metagenomics and tagged amplicon sequencing have been employed on deep sediments, and initial results show that different geographic regions can be differentiated through genomics and also minor populations may cause major geochemical changes.

  17. Ultra-deep and quantitative saliva proteome reveals dynamics of the oral microbiome

    DEFF Research Database (Denmark)

    Grassl, Niklas; Kulak, Nils Alexander; Pichler, Garwin

    2016-01-01

    , disruptions in saliva secretion and changes in the oral microbiome contribute to conditions such as tooth decay and respiratory tract infections. Here we set out to quantitatively map the saliva proteome in great depth with a rapid and in-depth mass spectrometry-based proteomics workflow. METHODS: We used...... with next-generation sequencing data from the Human Microbiome Project as well as a comparison to MALDI-TOF mass spectrometry on microbial cultures revealed strong agreement. The oral microbiome differs between individuals and changes drastically upon eating and tooth brushing. CONCLUSION: Rapid shotgun...... and robust technology can now simultaneously characterize the human and microbiome contributions to the proteome of a body fluid and is therefore a valuable complement to genomic studies. This opens new frontiers for the study of host-pathogen interactions and clinical saliva diagnostics....

  18. Modeling and Quantitative Analysis of GNSS/INS Deep Integration Tracking Loops in High Dynamics

    Directory of Open Access Journals (Sweden)

    Yalong Ban

    2017-09-01

    To meet the requirements of global navigation satellite systems (GNSS) precision applications in high dynamics, this paper describes a study on the carrier phase tracking technology of the GNSS/inertial navigation system (INS) deep integration system. The error propagation models of INS-aided carrier tracking loops are modeled in detail in high dynamics. Additionally, quantitative analysis of carrier phase tracking errors caused by INS error sources is carried out under the uniform high dynamic linear acceleration motion of 100 g. Results show that the major INS error sources, affecting the carrier phase tracking accuracy in high dynamics, include initial attitude errors, accelerometer scale factors, gyro noise and gyro g-sensitivity errors. The initial attitude errors are usually combined with the receiver acceleration to impact the tracking loop performance, which can easily cause the failure of carrier phase tracking. The main INS error factors vary with the vehicle motion direction and the relative position of the receiver and the satellites. The analysis results also indicate that low-cost micro-electro mechanical system (MEMS) inertial measurement units (IMUs) have the ability to maintain GNSS carrier phase tracking in high dynamics.

  19. Pathogen-specific deep sequence-coupled biopanning: A method for surveying human antibody responses

    Science.gov (United States)

    Pascale, Juan M.; Moreno, Brechla; Chackerian, Bryce; Peabody, David S.

    2017-01-01

    Identifying the targets of antibody responses during infection is important for designing vaccines, developing diagnostic and prognostic tools, and understanding pathogenesis. We developed a novel deep sequence-coupled biopanning approach capable of identifying the protein epitopes of antibodies present in human polyclonal serum. Here, we report the adaptation of this approach for the identification of pathogen-specific epitopes recognized by antibodies elicited during acute infection. As a proof-of-principle, we applied this approach to assessing antibodies to Dengue virus (DENV). Using a panel of sera from patients with acute secondary DENV infection, we panned a DENV antigen fragment library displayed on the surface of bacteriophage MS2 virus-like particles and characterized the population of affinity-selected peptide epitopes by deep sequence analysis. Although there was considerable variation in the responses of individuals, we found several epitopes within the Envelope glycoprotein and Non-Structural Protein 1 that were commonly enriched. This report establishes a novel approach for characterizing pathogen-specific antibody responses in human sera, and has future utility in identifying novel diagnostic and vaccine targets. PMID:28152075

  20. Deep sequencing reveals as-yet-undiscovered small RNAs in Escherichia coli

    Directory of Open Access Journals (Sweden)

    Hirano Reiko

    2011-08-01

    Background: In Escherichia coli, approximately 100 regulatory small RNAs (sRNAs) have been identified experimentally and many more have been predicted by various methods. To provide a comprehensive overview of sRNAs, we analysed the low-molecular-weight RNAs of E. coli with deep sequencing, because the regulatory RNAs in bacteria are usually 50-200 nt in length. Results: We discovered 229 novel candidate sRNAs (≥ 50 nt) with computational or experimental evidence of transcription initiation. Among them, the expression of seven intergenic sRNAs and three cis-antisense sRNAs was detected by northern blot analysis. Interestingly, five novel sRNAs are expressed from prophage regions and we note that these sRNAs have several specific characteristics. Furthermore, we conducted an evolutionary conservation analysis of the candidate sRNAs and summarised the data among closely related bacterial strains. Conclusions: This comprehensive screen for E. coli sRNAs using a deep sequencing approach has shown that many as-yet-undiscovered sRNAs are potentially encoded in the E. coli genome. We constructed the Escherichia coli Small RNA Browser (ECSBrowser; http://rna.iab.keio.ac.jp/), which integrates the data for previously identified sRNAs and the novel sRNAs found in this study.

  1. Deep sequencing reveals a global reprogramming of lncRNA transcriptome during EMT.

    Science.gov (United States)

    Liao, Jian-You; Wu, Jue; Wang, Yan-Jie; He, Jie-Hua; Deng, Wei-Xi; Hu, KaiShun; Zhang, Yu-Chan; Zhang, Yin; Yan, Haiyan; Wang, Dan-Lan; Liu, Qiang; Zeng, Mu-Sheng; Phillip Koeffler, H; Song, Erwei; Yin, Dong

    2017-10-01

    Several studies have shown that long non-coding RNAs (lncRNAs) may play an essential role in Epithelial-Mesenchymal Transition (EMT), which is an important step in tumor metastasis; however, little is known about the global change of lncRNA transcriptome during EMT. To investigate how lncRNA transcriptome alterations contribute to EMT progression regulation, we deep-sequenced the whole-transcriptome of MCF10A as the cells underwent TGF-β-induced EMT. Deep-sequencing results showed that the long RNA transcriptome of MCF10A had undergone global changes as early as 8 h after treatment with TGF-β. The expression of 3403 known and novel lncRNAs and 570 known and novel circRNAs was altered during EMT. To identify the key lncRNA-regulator, we constructed the co-expression network and found all junction nodes in the network are lncRNAs. One junction node, RP6-65G23.5, was further verified as a key regulator of EMT. Intriguingly, we identified 216 clusters containing lncRNAs which were located in "gene desert" regions. The expressions of all lncRNAs in these clusters changed concurrently during EMT, strongly suggesting that these clusters might play important roles in EMT. Our study reveals a global reprogramming of the lncRNA transcriptome during EMT and provides clues for the future study of the molecular mechanism of EMT. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. Advanced methylome analysis after bisulfite deep sequencing: an example in Arabidopsis.

    Directory of Open Access Journals (Sweden)

    Huy Q Dinh

    Deep sequencing after bisulfite conversion (BS-Seq) is the method of choice to generate whole genome maps of cytosine methylation at single base-pair resolution. Its application to genomic DNA of Arabidopsis flower bud tissue resulted in the first complete methylome, determining a methylation rate of 6.7% in this tissue. BS-Seq reads were mapped onto an in silico converted reference genome, applying the so-called 3-letter genome method. Here, we present BiSS (Bisulfite Sequencing Scorer), a new method applying Smith-Waterman alignment to map bisulfite-converted reads to a reference genome. In addition, we introduce a comprehensive adaptive error estimate that accounts for sequencing errors, erroneous bisulfite conversion and also wrongly mapped reads. The re-analysis of the Arabidopsis methylome data with BiSS mapped substantially more reads to the genome. As a result, it determines the methylation status of an extra 10% of cytosines and estimates the methylation rate to be 7.7%. We validated the results by individual traditional bisulfite sequencing for selected genomic regions. In addition to predicting the methylation status of each cytosine, BiSS also provides an estimate of the methylation degree at each genomic site. Thus, BiSS explores BS-Seq data more extensively and provides more information for downstream analysis.
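    The 3-letter genome method mentioned in this record (the earlier approach, not BiSS itself) can be sketched in a few lines. The sketch assumes forward-strand reads, exact matching and no sequencing errors; the function names are hypothetical and are not taken from any published tool.

```python
def convert_c_to_t(seq):
    """In silico bisulfite conversion: every C becomes T (3-letter alphabet)."""
    return seq.upper().replace("C", "T")


def map_bisulfite_read(read, ref):
    """Return all start positions where the converted read matches the
    converted reference exactly (forward strand only, no mismatches)."""
    cref, cread = convert_c_to_t(ref), convert_c_to_t(read)
    hits = []
    start = cref.find(cread)
    while start != -1:
        hits.append(start)
        start = cref.find(cread, start + 1)
    return hits


def call_methylation(read, ref, start):
    """Compare the original read to the original reference at `start`.

    At each reference cytosine, a C in the read means the base was
    protected from conversion (methylated); a T means it was converted
    (unmethylated). Other read bases at a reference C are ignored.
    """
    calls = {}
    for i, base in enumerate(read.upper()):
        pos = start + i
        if ref[pos].upper() == "C":
            if base == "C":
                calls[pos] = True
            elif base == "T":
                calls[pos] = False
    return calls


if __name__ == "__main__":
    ref = "GATCCGTACGTTCAG"
    read = "TTCGTATGTT"  # ref[2:12] with the Cs at 3 and 8 converted
    for start in map_bisulfite_read(read, ref):
        print(start, call_methylation(read, ref, start))
```

    Converting both read and reference removes the C/T ambiguity during mapping, at the cost of lower sequence complexity; methylation is then called from the unconverted sequences.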

  3. MicroRNA expression signatures of bladder cancer revealed by deep sequencing.

    Directory of Open Access Journals (Sweden)

    Yonghua Han

    BACKGROUND: MicroRNAs (miRNAs) are a class of small noncoding RNAs that regulate gene expression. They are aberrantly expressed in many types of cancers. In this study, we determined the genome-wide miRNA profiles in bladder urothelial carcinoma by deep sequencing. METHODOLOGY/PRINCIPAL FINDINGS: We detected 656 differentially expressed known human miRNAs and miRNA antisense sequences (miRNA*s) in nine bladder urothelial carcinoma patients by deep sequencing. Many miRNAs and miRNA*s were significantly upregulated or downregulated in bladder urothelial carcinoma compared to matched histologically normal urothelium. hsa-miR-96 was the most significantly upregulated miRNA and hsa-miR-490-5p was the most significantly downregulated one. Upregulated miRNAs were more common than downregulated ones. The hsa-miR-183, hsa-miR-200b ∼ 429, hsa-miR-200c ∼ 141 and hsa-miR-17 ∼ 92 clusters were significantly upregulated. The hsa-miR-143 ∼ 145 cluster was significantly downregulated. hsa-miR-182, hsa-miR-183, hsa-miR-200a, hsa-miR-143 and hsa-miR-195 were evaluated by Real-Time qPCR in a total of fifty-one bladder urothelial carcinoma patients. They were aberrantly expressed in bladder urothelial carcinoma compared to matched histologically normal urothelium (p < 0.001 for each miRNA). CONCLUSIONS/SIGNIFICANCE: To date, this is the first study to determine genome-wide miRNA expression patterns in human bladder urothelial carcinoma by deep sequencing. We found that a collection of miRNAs were aberrantly expressed in bladder urothelial carcinoma compared to matched histologically normal urothelium, suggesting that they might play roles as oncogenes or tumor suppressors in the development and/or progression of this cancer. Our data provide novel insights into cancer biology.

  4. Deep sequencing analysis of the developing mouse brain reveals a novel microRNA

    Directory of Open Access Journals (Sweden)

    Piltz Sandra

    2011-04-01

    Background: MicroRNAs (miRNAs) are small non-coding RNAs that can exert multilevel inhibition/repression at a post-transcriptional or protein synthesis level during disease or development. Characterisation of miRNAs in adult mammalian brains by deep sequencing has been reported previously. However, to date, no small RNA profiling of the developing brain has been undertaken using this method. We have performed deep sequencing and small RNA analysis of a developing (E15.5) mouse brain. Results: We identified the expression of 294 known miRNAs in the E15.5 developing mouse brain, which were mostly represented by let-7 family and other brain-specific miRNAs such as miR-9 and miR-124. We also discovered 4 putative 22-23 nt miRNAs: mm_br_e15_1181, mm_br_e15_279920, mm_br_e15_96719 and mm_br_e15_294354 each with a 70-76 nt predicted pre-miRNA. We validated the 4 putative miRNAs and further characterised one of them, mm_br_e15_1181, throughout embryogenesis. Mm_br_e15_1181 biogenesis was Dicer1-dependent and was expressed in E3.5 blastocysts and E7 whole embryos. Embryo-wide expression patterns were observed at E9.5 and E11.5 followed by a near complete loss of expression by E13.5, with expression restricted to a specialised layer of cells within the developing and early postnatal brain. Mm_br_e15_1181 was upregulated during neurodifferentiation of P19 teratocarcinoma cells. This novel miRNA has been identified as miR-3099. Conclusions: We have generated and analysed the first deep sequencing dataset of small RNA sequences of the developing mouse brain. The analysis revealed a novel miRNA, miR-3099, with potential regulatory effects on early embryogenesis, and involvement in neuronal cell differentiation/function in the brain during late embryonic and early neonatal development.

  5. Quantitative and phylogenetic study of the Deep Sea Archaeal Group in sediments of the arctic mid-ocean spreading ridge

    Directory of Open Access Journals (Sweden)

    Steffen Leth Jørgensen

    2013-10-01

    In marine sediments archaea often constitute a considerable part of the microbial community, of which the Deep Sea Archaeal Group (DSAG) is one of the most predominant. Despite their high abundance no members from this archaeal group have so far been characterized and thus their metabolism is unknown. Here we show that the relative abundance of DSAG marker genes can be correlated with geochemical parameters, allowing prediction of both the potential electron donors and acceptors of these organisms. We estimated the abundance of 16S rRNA genes from Archaea, Bacteria and DSAG in 52 sediment horizons from two cores collected at the slow-spreading Arctic Mid-Ocean Ridge, using qPCR. The results indicate that members of the DSAG make up the entire archaeal population in certain horizons and constitute up to ~50% of the total microbial community. The quantitative data were correlated to 30 different geophysical and geochemical parameters obtained from the same sediment horizons. We observed a significant correlation between the relative abundance of DSAG 16S rRNA genes and the content of organic carbon (p < 0.0001). Further, significant co-variation with iron oxide, and dissolved iron and manganese (all p < 0.0000), indicated a direct or indirect link to iron and manganese cycling. Neither of these parameters correlated with the relative abundance of archaeal or bacterial 16S rRNA genes, nor did any other major electron donor or acceptor measured. Phylogenetic analysis of DSAG 16S rRNA gene sequences reveals three monophyletic lineages with no apparent habitat-specific distribution. In this study we support the hypothesis that members of the DSAG are tightly linked to the content of organic carbon and directly or indirectly involved in the cycling of iron and/or manganese compounds. Further, we provide a molecular tool to assess their abundance in environmental samples and enrichment cultures.

  6. High-throughput sequencing and analysis of the gill tissue transcriptome from the deep-sea hydrothermal vent mussel Bathymodiolus azoricus

    Directory of Open Access Journals (Sweden)

    Gomes Paula

    2010-10-01

    Background: Bathymodiolus azoricus is a deep-sea hydrothermal vent mussel found in association with large faunal communities living in chemosynthetic environments at the bottom of the sea floor near the Azores Islands. Investigation of the exceptional physiological reactions that vent mussels have adopted in their habitat, including responses to environmental microbes, remains a difficult challenge for deep-sea biologists. In an attempt to reveal genes potentially involved in the deep-sea mussel innate immunity we carried out a high-throughput sequence analysis of freshly collected B. azoricus transcriptome using gill tissue as the primary source of immune transcripts given its strategic role in filtering the surrounding waterborne potentially infectious microorganisms. Additionally, a substantial EST data set was produced, from which a comprehensive collection of genes coding for putative proteins was organized in a dedicated database, "DeepSeaVent", the first deep-sea vent animal transcriptome database based on the 454 pyrosequencing technology. Results: A normalized cDNA library from gill tissue was sequenced in a full 454 GS-FLX run, producing 778,996 sequencing reads. Assembly of the high quality reads resulted in 75,407 contigs of which 3,071 were singletons. A total of 39,425 transcripts were conceptually translated into amino-sequences of which 22,023 matched known proteins in the NCBI non-redundant protein database, 15,839 revealed conserved protein domains through InterPro functional classification and 9,584 were assigned with Gene Ontology terms. Queries conducted within the database enabled the identification of genes putatively involved in immune and inflammatory reactions which had not been previously evidenced in the vent mussel. Their physical counterpart was confirmed by semi-quantitative Reverse-Transcription-Polymerase Chain Reactions (RT-PCR) and their RNA transcription level by quantitative PCR (q

  7. Breaking the 1000-gene barrier for Mimivirus using ultra-deep genome and transcriptome sequencing

    Directory of Open Access Journals (Sweden)

    Claverie Jean-Michel

    2011-03-01

    Background: Mimivirus, a giant dsDNA virus infecting Acanthamoeba, is the prototype of the Mimiviridae family, the latest addition to the family of the nucleocytoplasmic large DNA viruses (NCLDVs). Its 1.2 Mb-genome was initially predicted to encode 917 genes. A subsequent RNA-Seq analysis precisely mapped many transcript boundaries and identified 75 new genes. Findings: We now report a much deeper analysis using the SOLiD™ technology combining RNA-Seq of the Mimivirus transcriptome during the infectious cycle (202.4 million reads), and a complete genome re-sequencing (45.3 million reads). This study corrected the genome sequence and identified several single nucleotide polymorphisms. Our results also provided clear evidence of previously overlooked transcription units, including an important RNA polymerase subunit distantly related to Euryarchaea homologues. The total Mimivirus gene count is now 1018, 11% greater than the original annotation. Conclusions: This study highlights the huge progress brought about by ultra-deep sequencing for the comprehensive annotation of virus genomes, opening the door to a complete one-nucleotide resolution level description of their transcriptional activity, and to the realistic modeling of the viral genome expression at the ultimate molecular level. This work also illustrates the need to go beyond bioinformatics-only approaches for the annotation of short protein and non-coding genes in viral genomes.
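    As a quick check of the stated increase, assuming the comparison is against the 917 genes of the original prediction:

```latex
\[
\frac{1018 - 917}{917} \;=\; \frac{101}{917} \;\approx\; 0.110 \quad (\text{about } 11\%)
\]
```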

  8. Mapping vaccinia virus DNA replication origins at nucleotide level by deep sequencing.

    Science.gov (United States)

    Senkevich, Tatiana G; Bruno, Daniel; Martens, Craig; Porcella, Stephen F; Wolf, Yuri I; Moss, Bernard

    2015-09-01

    Poxviruses reproduce in the host cytoplasm and encode most or all of the enzymes and factors needed for expression and synthesis of their double-stranded DNA genomes. Nevertheless, the mode of poxvirus DNA replication and the nature and location of the replication origins remain unknown. A current but unsubstantiated model posits only leading strand synthesis starting at a nick near one covalently closed end of the genome and continuing around the other end to generate a concatemer that is subsequently resolved into unit genomes. The existence of specific origins has been questioned because any plasmid can replicate in cells infected by vaccinia virus (VACV), the prototype poxvirus. We applied directional deep sequencing of short single-stranded DNA fragments enriched for RNA-primed nascent strands isolated from the cytoplasm of VACV-infected cells to pinpoint replication origins. The origins were identified as the switching points of the fragment directions, which correspond to the transition from continuous to discontinuous DNA synthesis. Origins containing a prominent initiation point mapped to a sequence within the hairpin loop at one end of the VACV genome and to the same sequence within the concatemeric junction of replication intermediates. These findings support a model for poxvirus genome replication that involves leading and lagging strand synthesis and is consistent with the requirements for primase and ligase activities as well as earlier electron microscopic and biochemical studies implicating a replication origin at the end of the VACV genome.
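
    The idea of locating origins at the switching points of fragment direction can be illustrated with a minimal sketch. It assumes nascent-strand fragments have already been counted per genomic window and per strand; the window positions, counts, and function names below are hypothetical and are not taken from the study. Which sign of the bias corresponds to leading or lagging synthesis depends on the library protocol, so the sketch only reports where the sign flips.

```python
def strand_bias(forward, reverse):
    """Return (forward - reverse) / (forward + reverse), or 0.0 for an empty window."""
    total = forward + reverse
    return 0.0 if total == 0 else (forward - reverse) / total


def switch_points(windows):
    """Find adjacent windows between which the strand bias changes sign.

    windows: list of (position, forward_count, reverse_count), sorted by position.
    Returns a list of (left_position, right_position) intervals.
    """
    switches = []
    prev_pos, prev_bias = None, 0.0
    for pos, fwd, rev in windows:
        bias = strand_bias(fwd, rev)
        if prev_bias * bias < 0:  # opposite signs: direction switched
            switches.append((prev_pos, pos))
        if bias != 0:  # skip uninformative windows when tracking the last sign
            prev_pos, prev_bias = pos, bias
    return switches


# Hypothetical counts: reverse-strand fragments dominate, then forward-strand ones.
toy_windows = [(0, 5, 95), (100, 10, 90), (200, 20, 80), (300, 85, 15), (400, 90, 10)]
toy_switches = switch_points(toy_windows)
print(toy_switches)
```

    In real data the bias would be noisy, so smoothing or a minimum read count per window would likely be needed before calling a switch.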

  9. Transcriptome and small RNA deep sequencing reveals deregulation of miRNA biogenesis in human glioma.

    Science.gov (United States)

    Moore, Lynette M; Kivinen, Virpi; Liu, Yuexin; Annala, Matti; Cogdell, David; Liu, Xiuping; Liu, Chang-Gong; Sawaya, Raymond; Yli-Harja, Olli; Shmulevich, Ilya; Fuller, Gregory N; Zhang, Wei; Nykter, Matti

    2013-02-01

    Altered expression of oncogenic and tumour-suppressing microRNAs (miRNAs) is widely associated with tumourigenesis. However, the regulatory mechanisms underlying these alterations are poorly understood. We sought to shed light on the deregulation of miRNA biogenesis promoting the aberrant miRNA expression profiles identified in these tumours. Using sequencing technology to perform both whole-transcriptome and small RNA sequencing of glioma patient samples, we examined precursor and mature miRNAs to directly evaluate the miRNA maturation process, and examined expression profiles for genes involved in the major steps of miRNA biogenesis. We found that ratios of mature to precursor forms of a large number of miRNAs increased with the progression from normal brain to low-grade and then to high-grade gliomas. The expression levels of genes involved in each of the three major steps of miRNA biogenesis (nuclear processing, nucleo-cytoplasmic transport, and cytoplasmic processing) were systematically altered in glioma tissues. Survival analysis of an independent data set demonstrated that the alteration of genes involved in miRNA maturation correlates with survival in glioma patients. Direct quantification of miRNA maturation with deep sequencing demonstrated that deregulation of the miRNA biogenesis pathway is a hallmark for glioma genesis and progression.
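
    The mature-to-precursor comparison described above can be sketched as a simple ratio. This is an illustrative sketch only: the abstract does not give the normalization or the exact statistic used, so the pseudocount, the function name, and the example values are assumptions, and the inputs are taken to be abundances already normalized for library size.

```python
def maturation_ratio(mature, precursor, pseudocount=1.0):
    """Ratio of mature to precursor miRNA abundance.

    The pseudocount (an assumption, not from the study) avoids division by zero
    when a precursor is not detected.
    """
    return (mature + pseudocount) / (precursor + pseudocount)


# Hypothetical normalized abundances for one miRNA in three tissue groups.
toy_samples = {
    "normal brain": (100.0, 50.0),
    "low-grade glioma": (200.0, 50.0),
    "high-grade glioma": (400.0, 40.0),
}

toy_ratios = {group: maturation_ratio(m, p) for group, (m, p) in toy_samples.items()}
for group, ratio in toy_ratios.items():
    print(f"{group}: mature/precursor = {ratio:.2f}")
```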

  10. Quantitative analysis of axonal fiber activation evoked by deep brain stimulation via activation density heat maps

    Directory of Open Access Journals (Sweden)

    Christian J. Hartmann

    2015-02-01

    Full Text Available Background: Cortical modulation is likely to be involved in the various therapeutic effects of deep brain stimulation (DBS). However, it is currently difficult to predict the changes of cortical modulation during clinical adjustment of DBS. Therefore, we present a novel quantitative approach to estimate anatomical regions of DBS-evoked cortical modulation. Methods: Four different models of subthalamic nucleus (STN) DBS were created to represent variable electrode placements (model I: dorsal border of the posterolateral STN; model II: central posterolateral STN; model III: central anteromedial STN; model IV: dorsal border of the anteromedial STN). Axonal fibers of passage near each electrode location were reconstructed using probabilistic tractography and modeled using multi-compartment cable models. Stimulation-evoked activation of local axon fibers and corresponding cortical projections were modeled and quantified. Results: Stimulation at the border of the STN (models I and IV) led to a higher degree of fiber activation and associated cortical modulation than stimulation deeply inside the STN (models II and III). A posterolateral target (models I and II) was highly connected to cortical areas representing motor function. Additionally, model I was also associated with strong activation of fibers projecting to the cerebellum. Finally, models III and IV showed a dorsoventral difference of preferentially targeted prefrontal areas (model III: middle frontal gyrus; model IV: inferior frontal gyrus). Discussion: The method described herein allows characterization of cortical modulation across different electrode placements and stimulation parameters. Furthermore, knowledge of anatomical distribution of stimulation-evoked activation targeting cortical regions may help predict efficacy and potential side effects, and therefore can be used to improve the therapeutic effectiveness of individual adjustments in DBS patients.

  11. Analysis of Plasmodium falciparum diversity in natural infections by deep sequencing

    Science.gov (United States)

    Manske, Magnus; Miotto, Olivo; Campino, Susana; Auburn, Sarah; Almagro-Garcia, Jacob; Maslen, Gareth; O’Brien, Jack; Djimde, Abdoulaye; Doumbo, Ogobara; Zongo, Issaka; Ouedraogo, Jean-Bosco; Michon, Pascal; Mueller, Ivo; Siba, Peter; Nzila, Alexis; Borrmann, Steffen; Kiara, Steven M.; Marsh, Kevin; Jiang, Hongying; Su, Xin-Zhuan; Amaratunga, Chanaki; Fairhurst, Rick; Socheat, Duong; Nosten, Francois; Imwong, Mallika; White, Nicholas J.; Sanders, Mandy; Anastasi, Elisa; Alcock, Dan; Drury, Eleanor; Oyola, Samuel; Quail, Michael A.; Turner, Daniel J.; Rubio, Valentin Ruano; Jyothi, Dushyanth; Amenga-Etego, Lucas; Hubbart, Christina; Jeffreys, Anna; Rowlands, Kate; Sutherland, Colin; Roper, Cally; Mangano, Valentina; Modiano, David; Tan, John C.; Ferdig, Michael T.; Amambua-Ngwa, Alfred; Conway, David J.; Takala-Harrison, Shannon; Plowe, Christopher V.; Rayner, Julian C.; Rockett, Kirk A.; Clark, Taane G.; Newbold, Chris I.; Berriman, Matthew; MacInnis, Bronwyn; Kwiatkowski, Dominic P.

    2013-01-01

    Malaria elimination strategies require surveillance of the parasite population for genetic changes that demand a public health response, such as new forms of drug resistance. Here we describe methods for large-scale analysis of genetic variation in Plasmodium falciparum by deep sequencing of parasite DNA obtained from the blood of patients with malaria, either directly or after short term culture. Analysis of 86,158 exonic SNPs that passed genotyping quality control in 227 samples from Africa, Asia and Oceania provides genome-wide estimates of allele frequency distribution, population structure and linkage disequilibrium. By comparing the genetic diversity of individual infections with that of the local parasite population, we derive a metric of within-host diversity that is related to the level of inbreeding in the population. An open-access web application has been established for exploration of regional differences in allele frequency and of highly differentiated loci in the P. falciparum genome. PMID:22722859

  12. Deep HST-WFPC2 photometry of NGC 288. II. The Main Sequence Luminosity Function

    CERN Document Server

    Bellazzini, M; Montegriffo, P; Messineo, M; Monaco, L; Rood, R T; Pecci, Flavio Fusi; Montegriffo, Paolo; Messineo, Maria

    2002-01-01

    The Main Sequence Luminosity Function (LF) of the Galactic globular cluster NGC 288 has been obtained using deep WFPC2 photometry. We have employed a new method to correct for completeness and fully account for bin-to-bin migration due to blending and/or observational scatter. The effect of the presence of binary systems in the final LF is quantified and is found to be negligible. There is a strong indication of the mass segregation of unevolved single stars and clear signs of a depletion of low mass stars in NGC 288 with respect to other clusters. The results are in good agreement with the prediction of theoretical models of the dynamical evolution of NGC 288 that take into account the extreme orbital properties of this cluster.

  13. Metagenomes obtained by "deep sequencing" - what do they tell about the EBPR communities

    DEFF Research Database (Denmark)

    Albertsen, Mads; Saunders, Aaron Marc; Nielsen, Kåre Lehmann

    Mads Albertsen, Aaron M. Saunders, Kåre L. Nielsen and Per H. Nielsen, Department of Biotechnology, Chemistry and Environmental Engineering, Aalborg University, Aalborg, Denmark. Presenting Author: Mads...... on phylogenetic and functional level (Fig. 1). Even though the samples were taken at different times of the year (August vs. December) and from different EBPR plants, they cluster tightly, which may be attributed to the wide range of selection pressures acting on the EBPR communities. These results confirm...... clade I was estimated to 1.6% and II to 1.3% of the total population, and 1.5% and 1.1% in West. As a reference genome exists for clade IIA we used the raw metagenome reads to estimate the Accumulibacter micro-diversity in the metagenomes. This revealed a much greater micro-diversity than observed...

  14. Advancing Eucalyptus genomics: identification and sequencing of lignin biosynthesis genes from deep-coverage BAC libraries

    Directory of Open Access Journals (Sweden)

    Kudrna David

    2011-03-01

    Full Text Available Background: Eucalyptus species are among the most planted hardwoods in the world because of their rapid growth, adaptability and valuable wood properties. The development and integration of genomic resources into breeding practice will be increasingly important in the decades to come. Bacterial artificial chromosome (BAC) libraries are key genomic tools that enable positional cloning of important traits, synteny evaluation, and the development of genome framework physical maps for genetic linkage and genome sequencing. Results: We describe the construction and characterization of two deep-coverage BAC libraries, EG_Ba and EG_Bb, obtained from nuclear DNA fragments of E. grandis (clone BRASUZ1) digested with HindIII and BstYI, respectively. Genome coverages of 17 and 15 haploid genome equivalents were estimated for EG_Ba and EG_Bb, respectively. Both libraries contained large inserts, with average sizes ranging from 135 kb (EG_Bb) to 157 kb (EG_Ba), and very low extra-nuclear genome contamination, providing a probability of finding a single-copy gene ≥ 99.99%. Libraries were screened for the presence of several genes of interest via hybridizations to high-density BAC filters followed by PCR validation. Five selected BAC clones were sequenced and assembled using the Roche GS FLX technology, providing the whole sequence of the E. grandis chloroplast genome, and complete genomic sequences of important lignin biosynthesis genes. Conclusions: The two E. grandis BAC libraries described in this study represent an important milestone for the advancement of Eucalyptus genomics and forest tree research. These BAC resources have a highly redundant genome coverage (> 15×), contain large average inserts and have a very low percentage of clones with organellar DNA or empty vectors. These publicly available BAC libraries are thus suitable for a broad range of applications in genetic and genomic research in Eucalyptus and possibly in related species of Myrtaceae.
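
    The link between genome coverage and the chance of recovering a single-copy gene can be illustrated with a short calculation. The abstract does not say how the ≥ 99.99% figure was derived; the sketch below assumes the common Poisson approximation P = 1 - exp(-c), where c is the coverage in haploid genome equivalents, and the function name is hypothetical.

```python
import math


def prob_locus_in_library(coverage):
    """Poisson approximation: probability that at least one clone covers a given locus."""
    return 1.0 - math.exp(-coverage)


# Coverage estimates reported in the abstract for the two libraries.
library_coverage = {"EG_Ba": 17.0, "EG_Bb": 15.0}

library_prob = {name: prob_locus_in_library(c) for name, c in library_coverage.items()}
for name, p in library_prob.items():
    print(f"{name}: {library_coverage[name]:.0f}x coverage -> P = {p:.7f}")
```

    Under this assumption both libraries exceed the 99.99% threshold, which is consistent with the reported figure.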

  15. Complex Genotype Mixtures Analyzed by Deep Sequencing in Two Different Regions of Hepatitis B Virus.

    Science.gov (United States)

    Caballero, Andrea; Gregori, Josep; Homs, Maria; Tabernero, David; Gonzalez, Carolina; Quer, Josep; Blasi, Maria; Casillas, Rosario; Nieto, Leonardo; Riveiro-Barciela, Mar; Esteban, Rafael; Buti, Maria; Rodriguez-Frias, Francisco

    2015-01-01

    This study assesses the presence and outcome of genotype mixtures in the polymerase/surface and X/preCore regions of the HBV genome in patients with chronic hepatitis B virus (HBV) infection. Thirty samples from ten chronic hepatitis B patients were included. The polymerase/surface and X/preCore regions were analyzed by deep sequencing (UDPS) in the first available sample at diagnosis, a pre-treatment sample, and a sample while under treatment. HBV genotype was determined by phylogenetic analysis. Quasispecies complexity was evaluated by mutation frequency and nucleotide diversity. The polymerase/surface and X/preCore regions were validated for genotyping from 113 GenBank reference sequences. UDPS yielded a median of 10,960 sequences per sample (IQR 16,645) in the polymerase/surface region and 11,595 sequences per sample (IQR 14,682) in X/preCore. Genotype mixtures were more common in X/preCore (90%) than in polymerase/surface (30%) (p […]). By X/preCore genotyping, all samples were genotype A, whereas polymerase/surface yielded genotypes A (80%), D (16.7%), and F (3.3%) (p = 0.036). Genotype changes in polymerase/surface were observed in four patients during natural quasispecies dynamics and in two patients during treatment. There were no genotype changes in X/preCore. Quasispecies complexity was higher in X/preCore than in polymerase/surface (p = 0.004). The results provide evidence of genotype mixtures and differential genotype proportions in the polymerase/surface and X/preCore regions. The genotype dynamics in HBV infection and the different patterns of quasispecies complexity in the HBV genome suggest a new paradigm for HBV genotype classification.
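
    The two complexity measures named above, mutation frequency and nucleotide diversity, can be sketched for a set of aligned haplotypes with read counts. The abstract does not give the exact formulas used in the study, so the definitions below are assumptions based on common usage: mutation frequency counts differences from the dominant haplotype per sequenced nucleotide, and nucleotide diversity averages per-site differences over all pairs of reads. The haplotypes and counts are invented for illustration.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mutation_frequency(haplotypes):
    """Differences from the dominant haplotype, per nucleotide sequenced.

    haplotypes: dict mapping aligned sequence -> read count.
    """
    master = max(haplotypes, key=haplotypes.get)
    total_reads = sum(haplotypes.values())
    mutations = sum(hamming(seq, master) * n for seq, n in haplotypes.items())
    return mutations / (total_reads * len(master))


def nucleotide_diversity(haplotypes):
    """Mean number of differences per site over all pairs of reads."""
    total_reads = sum(haplotypes.values())
    length = len(next(iter(haplotypes)))
    pair_diffs = sum(
        hamming(a, b) * haplotypes[a] * haplotypes[b]
        for a, b in combinations(haplotypes, 2)
    )
    n_pairs = total_reads * (total_reads - 1) / 2
    return pair_diffs / (n_pairs * length)


# Hypothetical quasispecies: three haplotypes of length 8 with read counts.
toy_haplotypes = {"ACGTACGT": 6, "ACGTACGA": 3, "ACGAACGA": 1}
toy_mf = mutation_frequency(toy_haplotypes)
toy_pi = nucleotide_diversity(toy_haplotypes)
print(f"mutation frequency = {toy_mf:.4f}, nucleotide diversity = {toy_pi:.4f}")
```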

  16. Deep sequencing the transcriptome reveals seasonal adaptive mechanisms in a hibernating mammal.

    Directory of Open Access Journals (Sweden)

    Marshall Hampton

    Full Text Available Mammalian hibernation is a complex phenotype involving metabolic rate reduction, bradycardia, profound hypothermia, and a reliance on stored fat that allows the animal to survive for months without food in a state of suspended animation. To determine the genes responsible for this phenotype in the thirteen-lined ground squirrel (Ictidomys tridecemlineatus) we used the Roche 454 platform to sequence mRNA isolated at six points throughout the year from three key tissues: heart, skeletal muscle, and white adipose tissue (WAT). Deep sequencing generated approximately 3.7 million cDNA reads from 18 samples (6 time points × 3 tissues) with a mean read length of 335 bases. Of these, 3,125,337 reads were assembled into 140,703 contigs. Approximately 90% of all sequences were matched to proteins in the human UniProt database. The total number of distinct human proteins matched by ground squirrel transcripts was 13,637 for heart, 12,496 for skeletal muscle, and 14,351 for WAT. Extensive mitochondrial RNA sequences enabled a novel approach of using the transcriptome to construct the complete mitochondrial genome for I. tridecemlineatus. Seasonal and activity-specific changes in mRNA levels that met our stringent false discovery rate cutoff (1.0 × 10^-11) were used to identify patterns of gene expression involving various aspects of the hibernation phenotype. Among these patterns are differentially expressed genes encoding heart proteins AT1A1, NAC1 and RYR2 controlling ion transport required for contraction and relaxation at low body temperatures. Abundant RNAs in skeletal muscle coding ubiquitin pathway proteins ASB2, UBC and DDB1 peak in October, suggesting an increase in muscle proteolysis. Finally, genes in WAT that encode proteins involved in lipogenesis (ACOD, FABP4) are highly expressed in August, but gradually decline in expression during the seasonal transition to lipolysis.

  17. Deep sequencing reveals low incidence of endogenous LINE-1 retrotransposition in human induced pluripotent stem cells.

    Directory of Open Access Journals (Sweden)

    Hubert Arokium

    Full Text Available Long interspersed element-1 (LINE-1 or L1) retrotransposition induces insertional mutations that can result in diseases. It was recently shown that the copy number of L1 and other retroelements is stable in induced pluripotent stem cells (iPSCs). However, by using an engineered reporter construct over-expressing L1, another study suggests that reprogramming activates L1 mobility in iPSCs. Given the potential of human iPSCs in therapeutic applications, it is important to clarify whether these cells harbor somatic insertions resulting from endogenous L1 retrotransposition. Here, we verified L1 expression during and after reprogramming as well as potential somatic insertions driven by the most active human endogenous L1 subfamily (L1Hs). Our results indicate that L1 over-expression is initiated during the reprogramming process and is subsequently sustained in isolated clones. To detect potential somatic insertions in iPSCs caused by L1Hs retrotransposition, we used a novel sequencing strategy. As opposed to conventional sequencing direction, we sequenced from the 3' end of L1Hs to the genomic DNA, thus enabling the direct detection of the polyA tail signature of retrotransposition for verification of true insertions. Deep coverage sequencing thus allowed us to detect seven potential somatic insertions with low read counts from two iPSC clones. Negative PCR amplification in parental cells, presence of a polyA tail and absence from seven L1 germline insertion databases strongly suggested true somatic insertions in iPSCs. Furthermore, these insertions could not be detected in iPSCs by PCR, likely due to low abundance. We conclude that L1Hs retrotransposes at low levels in iPSCs and therefore warrants careful analyses for genotoxic effects.

  18. Deep RNA sequencing of the skeletal muscle transcriptome in swimming fish.

    Directory of Open Access Journals (Sweden)

    Arjan P Palstra

    Full Text Available Deep RNA sequencing (RNA-seq) was performed to provide an in-depth view of the transcriptome of red and white skeletal muscle of exercised and non-exercised rainbow trout (Oncorhynchus mykiss) with the specific objective to identify expressed genes and quantify the transcriptomic effects of swimming-induced exercise. Pubertal autumn-spawning seawater-raised female rainbow trout were rested (n = 10) or swum (n = 10) for 1176 km at 0.75 body-lengths per second in a 6,000-L swim-flume under reproductive conditions for 40 days. Red and white muscle RNA of exercised and non-exercised fish (4 lanes) was sequenced and resulted in 15-17 million reads per lane that, after de novo assembly, yielded 149,159 red and 118,572 white muscle contigs. Most contigs were annotated using an iterative homology search strategy against salmonid ESTs, the zebrafish Danio rerio genome and general Metazoan genes. When selecting for large contigs (>500 nucleotides), a number of novel rainbow trout gene sequences were identified in this study: 1,085 and 1,228 novel gene sequences for red and white muscle, respectively, which included a number of important molecules for skeletal muscle function. Transcriptomic analysis revealed that sustained swimming increased transcriptional activity in skeletal muscle and specifically an up-regulation of genes involved in muscle growth and developmental processes in white muscle. The unique collection of transcripts will contribute to our understanding of red and white muscle physiology, specifically during the long-term reproductive migration of salmonids.

  19. Dysregulation of B Cell Repertoire Formation in Myasthenia Gravis Patients Revealed through Deep Sequencing.

    Science.gov (United States)

    Vander Heiden, Jason A; Stathopoulos, Panos; Zhou, Julian Q; Chen, Luan; Gilbert, Tamara J; Bolen, Christopher R; Barohn, Richard J; Dimachkie, Mazen M; Ciafaloni, Emma; Broering, Teresa J; Vigneault, Francois; Nowak, Richard J; Kleinstein, Steven H; O'Connor, Kevin C

    2017-02-15

    Myasthenia gravis (MG) is a prototypical B cell-mediated autoimmune disease affecting 20-50 people per 100,000. The majority of patients fall into two clinically distinguishable types based on whether they produce autoantibodies targeting the acetylcholine receptor (AChR-MG) or muscle specific kinase (MuSK-MG). The autoantibodies are pathogenic, but whether their generation is associated with broader defects in the B cell repertoire is unknown. To address this question, we performed deep sequencing of the BCR repertoire of AChR-MG, MuSK-MG, and healthy subjects to generate ∼518,000 unique VH and VL sequences from sorted naive and memory B cell populations. AChR-MG and MuSK-MG subjects displayed distinct gene segment usage biases in both VH and VL sequences within the naive and memory compartments. The memory compartment of AChR-MG was further characterized by reduced positive selection of somatic mutations in the VH CDR and altered VH CDR3 physicochemical properties. The VL repertoire of MuSK-MG was specifically characterized by reduced V-J segment distance in recombined sequences, suggesting diminished VL receptor editing during B cell development. Our results identify large-scale abnormalities in both the naive and memory B cell repertoires. Particular abnormalities were unique to either AChR-MG or MuSK-MG, indicating that the repertoires reflect the distinct properties of the subtypes. These repertoire abnormalities are consistent with previously observed defects in B cell tolerance checkpoints in MG, thereby offering additional insight regarding the impact of tolerance defects on peripheral autoimmune repertoires. These collective findings point toward a deformed B cell repertoire as a fundamental component of MG.

  20. Deep sequencing reveals low incidence of endogenous LINE-1 retrotransposition in human induced pluripotent stem cells.

    Science.gov (United States)

    Arokium, Hubert; Kamata, Masakazu; Kim, Sanggu; Kim, Namshin; Liang, Min; Presson, Angela P; Chen, Irvin S

    2014-01-01

    Long interspersed element-1 (LINE-1 or L1) retrotransposition induces insertional mutations that can result in diseases. It was recently shown that the copy number of L1 and other retroelements is stable in induced pluripotent stem cells (iPSCs). However, by using an engineered reporter construct over-expressing L1, another study suggests that reprogramming activates L1 mobility in iPSCs. Given the potential of human iPSCs in therapeutic applications, it is important to clarify whether these cells harbor somatic insertions resulting from endogenous L1 retrotransposition. Here, we verified L1 expression during and after reprogramming as well as potential somatic insertions driven by the most active human endogenous L1 subfamily (L1Hs). Our results indicate that L1 over-expression is initiated during the reprogramming process and is subsequently sustained in isolated clones. To detect potential somatic insertions in iPSCs caused by L1Hs retrotransposition, we used a novel sequencing strategy. As opposed to conventional sequencing direction, we sequenced from the 3' end of L1Hs to the genomic DNA, thus enabling the direct detection of the polyA tail signature of retrotransposition for verification of true insertions. Deep coverage sequencing thus allowed us to detect seven potential somatic insertions with low read counts from two iPSC clones. Negative PCR amplification in parental cells, presence of a polyA tail and absence from seven L1 germline insertion databases strongly suggested true somatic insertions in iPSCs. Furthermore, these insertions could not be detected in iPSCs by PCR, likely due to low abundance. We conclude that L1Hs retrotransposes at low levels in iPSCs and therefore warrants careful analyses for genotoxic effects.

  1. Small RNA Library Cloning Procedure for Deep Sequencing of Specific Endogenous siRNA Classes in Caenorhabditis elegans

    Science.gov (United States)

    Ow, Maria C.; Lau, Nelson C.; Hall, Sarah E.

    2017-01-01

    In recent years, distinct classes of small RNAs ranging in size from ~21 to 26 nucleotides have been discovered and shown to play important roles in a wide array of cellular functions. Because of the abundance of these small RNAs, library preparation from an RNA sample followed by deep sequencing provides the identity and quantity of a particular class of small RNAs. In this chapter we describe a detailed protocol for preparing small RNA libraries for deep sequencing on the Illumina platform from the nematode C. elegans. PMID:24920360

  2. Quantitative assessment of RNA-protein interactions with high-throughput sequencing-RNA affinity profiling.

    Science.gov (United States)

    Ozer, Abdullah; Tome, Jacob M; Friedman, Robin C; Gheba, Dan; Schroth, Gary P; Lis, John T

    2015-08-01

    Because RNA-protein interactions have a central role in a wide array of biological processes, methods that enable a quantitative assessment of these interactions in a high-throughput manner are in great demand. Recently, we developed the high-throughput sequencing-RNA affinity profiling (HiTS-RAP) assay that couples sequencing on an Illumina GAIIx genome analyzer with the quantitative assessment of protein-RNA interactions. This assay is able to analyze interactions between one or possibly several proteins with millions of different RNAs in a single experiment. We have successfully used HiTS-RAP to analyze interactions of the EGFP and negative elongation factor subunit E (NELF-E) proteins with their corresponding canonical and mutant RNA aptamers. Here we provide a detailed protocol for HiTS-RAP that can be completed in about a month (8 d hands-on time). This includes the preparation and testing of recombinant proteins and DNA templates, clustering DNA templates on a flowcell, HiTS and protein binding with a GAIIx instrument, and finally data analysis. We also highlight aspects of HiTS-RAP that can be further improved and points of comparison between HiTS-RAP and two other recently developed methods, quantitative analysis of RNA on a massively parallel array (RNA-MaP) and RNA Bind-n-Seq (RBNS), for quantitative analysis of RNA-protein interactions.

  3. Enhanced methods for unbiased deep sequencing of Lassa and Ebola RNA viruses from clinical and biological samples.

    Science.gov (United States)

    Matranga, Christian B; Andersen, Kristian G; Winnicki, Sarah; Busby, Michele; Gladden, Adrianne D; Tewhey, Ryan; Stremlau, Matthew; Berlin, Aaron; Gire, Stephen K; England, Eleina; Moses, Lina M; Mikkelsen, Tarjei S; Odia, Ikponmwonsa; Ehiane, Philomena E; Folarin, Onikepe; Goba, Augustine; Kahn, S Humarr; Grant, Donald S; Honko, Anna; Hensley, Lisa; Happi, Christian; Garry, Robert F; Malboeuf, Christine M; Birren, Bruce W; Gnirke, Andreas; Levin, Joshua Z; Sabeti, Pardis C

    2014-01-01

    We have developed a robust RNA sequencing method for generating complete de novo assemblies with intra-host variant calls of Lassa and Ebola virus genomes in clinical and biological samples. Our method uses targeted RNase H-based digestion to remove contaminating poly(rA) carrier and ribosomal RNA. This depletion step improves both the quality of data and quantity of informative reads in unbiased total RNA sequencing libraries. We have also developed a hybrid-selection protocol to further enrich the viral content of sequencing libraries. These protocols have enabled rapid deep sequencing of both Lassa and Ebola virus and are broadly applicable to other viral genomics studies.

  4. Improved sequence learning with subthalamic nucleus deep brain stimulation: evidence for treatment-specific network modulation.

    Science.gov (United States)

    Mure, Hideo; Tang, Chris C; Argyelan, Miklos; Ghilardi, Maria-Felice; Kaplitt, Michael G; Dhawan, Vijay; Eidelberg, David

    2012-02-22

    We used a network approach to study the effects of anti-parkinsonian treatment on motor sequence learning in humans. Eight Parkinson's disease (PD) patients with bilateral subthalamic nucleus (STN) deep brain stimulation underwent H₂¹⁵O positron emission tomography (PET) imaging to measure regional cerebral blood flow (rCBF) while they performed kinematically matched sequence learning and movement tasks at baseline and during stimulation. Network analysis revealed a significant learning-related spatial covariance pattern characterized by consistent increases in subject expression during stimulation (p = 0.008, permutation test). The network was associated with increased activity in the lateral cerebellum, dorsal premotor cortex, and parahippocampal gyrus, with covarying reductions in the supplementary motor area (SMA) and orbitofrontal cortex. Stimulation-mediated increases in network activity correlated with concurrent improvement in learning performance (p […]) […] learning performance or network activity. Analysis of learning-related rCBF in network regions revealed improvement in baseline abnormalities with STN stimulation but not levodopa. These effects were most pronounced in the SMA. In this region, a consistent rCBF response to stimulation was observed across subjects and trials (p = 0.01), although the levodopa response was not significant. These findings link the cognitive treatment response in PD to changes in the activity of a specific cerebello-premotor cortical network. Selective modulation of overactive SMA-STN projection pathways may underlie the improvement in learning found with stimulation.

  5. Deep sequencing of Trichomonas vaginalis during the early infection of vaginal epithelial cells and amoeboid transition.

    Science.gov (United States)

    Gould, Sven B; Woehle, Christian; Kusdian, Gary; Landan, Giddy; Tachezy, Jan; Zimorski, Verena; Martin, William F

    2013-08-01

    The human pathogen Trichomonas vaginalis has the largest protozoan genome known, potentially encoding approximately 60,000 proteins. To what degree these genes are expressed is not well known and only a few key transcription factors and promoter domains have been identified. To shed light on the expression capacity of the parasite and transcriptional regulation during phase transitions, we deep sequenced the transcriptomes of the protozoan during two environmental stimuli of the early infection process: exposure to oxygen and contact with vaginal epithelial cells. Eleven 3' fragment libraries from different time points after exposure to oxygen only and in combination with human tissue were sequenced, generating more than 150 million reads which mapped onto 33,157 protein coding genes in total and a core set of more than 20,000 genes represented within all libraries. The data uncover gene family expression regulation in this parasite and give evidence for a concentrated response to the individual stimuli. Oxygen stress primarily reveals the parasite's strategies to deal with oxygen radicals. The exposure of oxygen-adapted parasites to human epithelial cells primarily induces cytoskeletal rearrangement and proliferation, reflecting the rapid morphological transition from spindle shaped flagellates to tissue-feeding and actively dividing amoeboids.

  6. Ultra Deep Sequencing of a Baculovirus Population Reveals Widespread Genomic Variations

    Directory of Open Access Journals (Sweden)

    Aurélien Chateigner

    2015-07-01

    Full Text Available Viruses rely on widespread genetic variation and large population size for adaptation. Large DNA virus populations are thought to harbor little variation though natural populations may be polymorphic. To measure the genetic variation present in a dsDNA virus population, we deep sequenced a natural strain of the baculovirus Autographa californica multiple nucleopolyhedrovirus. With 124,221X average genome coverage of our 133,926 bp long consensus, we could detect low frequency mutations (0.025%). K-means clustering was used to classify the mutations in four categories according to their frequency in the population. We found 60 high frequency non-synonymous mutations under balancing selection distributed in all functional classes. These mutants could alter viral adaptation dynamics, either through competitive or synergistic processes. Lastly, we developed a technique for the delimitation of large deletions in next generation sequencing data. We found that large deletions occur along the entire viral genome, with hotspots located in homologous repeat regions (hrs). Present in 25.4% of the genomes, these deletion mutants presumably require functional complementation to complete their infection cycle. They might thus have a large impact on the fitness of the baculovirus population. Altogether, we found a wide breadth of genomic variation in the baculovirus population, suggesting it has high adaptive potential.
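
    Two quantitative steps in this abstract lend themselves to a small illustration: how many reads support a variant at the stated detection limit, and how mutations might be grouped into four frequency classes with k-means. The sketch below is a plain one-dimensional Lloyd's algorithm with quantile-based starting centers; the abstract does not describe the implementation, any transformation of the frequencies, or the initialization, so those choices and the example frequencies are assumptions.

```python
def expected_variant_reads(depth, frequency):
    """Expected number of reads carrying a variant at a site with the given depth."""
    return depth * frequency


def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm on scalars; centers start at evenly spaced quantiles."""
    data = sorted(values)
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster happens to be empty.
        new_centers = [
            sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)
        ]
        if new_centers == centers:  # assignments are stable
            break
        centers = new_centers
    return centers, clusters


# Depth and detection limit from the abstract: 124,221X and 0.025%.
reads_at_limit = expected_variant_reads(124221, 0.00025)
print(f"about {reads_at_limit:.0f} supporting reads at the detection limit")

# Hypothetical mutation frequencies spanning four orders of magnitude.
toy_freqs = [0.0003, 0.0004, 0.0005, 0.01, 0.012, 0.015,
             0.2, 0.22, 0.25, 0.9, 0.95, 0.97]
toy_centers, toy_clusters = kmeans_1d(toy_freqs, k=4)
for center, members in zip(toy_centers, toy_clusters):
    print(f"center {center:.4f}: {members}")
```

    With real data spanning such a wide range, clustering on log-transformed frequencies might separate the rare classes more cleanly; whether the study did so is not stated.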

  7. MicroRNA repertoire for functional genome research in tilapia identified by deep sequencing.

    Science.gov (United States)

    Yan, Biao; Wang, Zhen-Hua; Zhu, Chang-Dong; Guo, Jin-Tao; Zhao, Jin-Liang

    2014-08-01

    The Nile tilapia (Oreochromis niloticus; Cichlidae) is an economically important species in aquaculture and occupies a prominent position in the aquaculture industry. MicroRNAs (miRNAs) are a class of noncoding RNAs that post-transcriptionally regulate gene expression involved in diverse biological and metabolic processes. To increase the repertoire of miRNAs characterized in tilapia, we used the Illumina/Solexa sequencing technology to sequence a small RNA library using a pooled RNA sample isolated from the different developmental stages of tilapia. Bioinformatic analyses suggest that 197 conserved and 27 novel miRNAs are expressed in tilapia. Sequence alignments indicate that all tested miRNAs and miRNAs* are highly conserved across many species. In addition, we characterized the tissue expression patterns of five miRNAs using real-time quantitative PCR. We found that miR-1/206, miR-7/9, and miR-122 are abundantly expressed in muscle, brain, and liver, respectively, implying a potential role in the regulation of tissue differentiation or the maintenance of tissue identity. Overall, our results expand the number of tilapia miRNAs, and the discovery of miRNAs in the tilapia genome contributes to a better understanding of the role of miRNAs in regulating diverse biological processes.

  8. Photogrammetric quantitative study of habitat and benthic communities of deep Cantabrian Sea hard grounds

    Science.gov (United States)

    Sánchez, F.; Serrano, A.; Ballesteros, M. Gómez

    2009-05-01

    To study the highly complex deep-sea habitats of the Cantabrian Sea and their macro-epibenthic communities, a new towed underwater sled was designed to carry out quantitative visual transects based on photogrammetric analysis. The main objective of the study was to provide a first approach to understanding the correlation between hard substrates, depth and ecology in this region, thereby enabling researchers to determine the extent to which benthic communities depend on physical factors. The results were compared from two areas with different characteristics and methodological problems: one in the central Cantabrian Sea outer shelf (150 m depth), near the head of the Lastres Canyon, and another at the summit of the Le Danois Bank (555 m depth). Two image databases corresponding to two transects were analysed, with every photo being linked to a faunal list and a set of environmental variables. To assess the amount of variation in faunal densities related to the set of habitat environmental characteristics, a redundancy analysis (RDA) was used. The set of environmental variables comprised depth, temperature, salinity, substrate type and seafloor reflectivity. Using the hierarchical classification proposed by EUNIS, three habitats were identified from a Cantabrian Sea shelf visual transect: A4.12—Sponge communities on circalittoral rock (14.5% coverage), A5.35—Circalittoral sandy mud (56.8%) and A5.44—Circalittoral mixed sediments (28.7%). A typical community appeared on the rocky habitat, made up of the yellow coral Dendrophyllia cornigera and the cup sponge Phakellia ventilabrum. On Le Danois Bank, three habitats were identified, and the cnidarians (Caryophyllia smithii and Callogorgia verticillata) and the sponges (Asconema setubalense, Aplysilla sp., hexactinellids) characterized rocky habitats and patchy rock-sand habitats. This study provided groundtruthing for the existing surficial seafloor features and very valuable

  9. A method to prioritize quantitative traits and individuals for sequencing in family-based studies.

    Directory of Open Access Journals (Sweden)

    Kaanan P Shah

    Full Text Available Owing to recent advances in DNA sequencing, it is now technically feasible to evaluate the contribution of rare variation to complex traits and diseases. However, it is still cost prohibitive to sequence the whole genome (or exome) of all individuals in each study. For quantitative traits, one strategy to reduce cost is to sequence individuals in the tails of the trait distribution. However, the next challenge becomes how to prioritize traits and individuals for sequencing since individuals are often characterized for dozens of medically relevant traits. In this article, we describe a new method, the Rare Variant Kinship Test (RVKT), which leverages relationship information in family-based studies to identify quantitative traits that are likely influenced by rare variants. Conditional on nuclear families and extended pedigrees, we evaluate the power of the RVKT via simulation. Not unexpectedly, the power of our method depends strongly on effect size, and to a lesser extent, on the frequency of the rare variant and the number and type of relationships in the sample. As an illustration, we also apply our method to data from two genetic studies in the Old Order Amish, a founder population with extensive genealogical records. Remarkably, we implicate the presence of a rare variant that lowers fasting triglyceride levels in the Heredity and Phenotype Intervention (HAPI) Heart study (p = 0.044), consistent with the presence of a previously identified null mutation in the APOC3 gene that lowers fasting triglyceride levels in HAPI Heart study participants.
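
    The cost-reduction strategy mentioned in this abstract, sequencing only individuals in the tails of a trait distribution, can be sketched as follows. This is a minimal illustration of tail selection only, not the RVKT itself; the function name, the 20% fraction and the trait values are all hypothetical.

```python
from typing import Dict, List, Tuple


def select_tails(trait_by_individual: Dict[str, float],
                 fraction: float = 0.1) -> Tuple[List[str], List[str]]:
    """Return (low_tail, high_tail) individual IDs for one quantitative trait.

    `fraction` is the share of individuals taken from EACH tail
    (an assumed parameter, not something specified in the abstract).
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")
    # Rank individuals from lowest to highest trait value.
    ranked = sorted(trait_by_individual, key=trait_by_individual.get)
    # Take at least one individual per tail.
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k], ranked[-k:]


# Made-up fasting triglyceride values, for illustration only.
toy_trait = {
    "p01": 88, "p02": 140, "p03": 61, "p04": 95, "p05": 210,
    "p06": 102, "p07": 75, "p08": 180, "p09": 99, "p10": 120,
}
low_tail, high_tail = select_tails(toy_trait, fraction=0.2)
print(low_tail, high_tail)
```

    In a family-based design the selection would also need to account for relatedness, which this sketch deliberately ignores.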

  10. Quantitative miRNA expression analysis: comparing microarrays with next-generation sequencing

    DEFF Research Database (Denmark)

    Willenbrock, Hanni; Salomon, Jesper; Søkilde, Rolf

    2009-01-01

    Recently, next-generation sequencing has been introduced as a promising, new platform for assessing the copy number of transcripts, while the existing microarray technology is considered less reliable for absolute, quantitative expression measurements. Nonetheless, so far, results from the two...... technologies have only been compared based on biological data, leading to the conclusion that, although they are somewhat correlated, expression values differ significantly. Here, we use synthetic RNA samples, resembling human microRNA samples, to find that microarray expression measures actually correlate...... better with sample RNA content than expression measures obtained from sequencing data. In addition, microarrays appear highly sensitive and perform equivalently to next-generation sequencing in terms of reproducibility and relative ratio quantification....

  11. Ultra-deep sequencing of mouse mitochondrial DNA: mutational patterns and their origins.

    Directory of Open Access Journals (Sweden)

    Adam Ameur

    2011-03-01

    Full Text Available Somatic mutations of mtDNA are implicated in the aging process, but there is no universally accepted method for their accurate quantification. We have used ultra-deep sequencing to study genome-wide mtDNA mutation load in the liver of normally- and prematurely-aging mice. Mice that are homozygous for an allele expressing a proof-reading-deficient mtDNA polymerase (mtDNA mutator mice) have 10-times-higher point mutation loads than their wildtype siblings. In addition, the mtDNA mutator mice have increased levels of a truncated linear mtDNA molecule, resulting in decreased sequence coverage in the deleted region. In contrast, circular mtDNA molecules with large deletions occur at extremely low frequencies in mtDNA mutator mice and can therefore not drive the premature aging phenotype. Sequence analysis shows that the main proportion of the mutation load in heterozygous mtDNA mutator mice and their wildtype siblings is inherited from their heterozygous mothers, consistent with germline transmission. We found no increase in levels of point mutations or deletions in wildtype C57Bl/6N mice with increasing age, thus questioning the causative role of these changes in aging. In addition, there was no increased frequency of transversion mutations with time in any of the studied genotypes, arguing against oxidative damage as a major cause of mtDNA mutations. Our results from studies of mice thus indicate that most somatic mtDNA mutations occur as replication errors during development and do not result from damage accumulation in adult life.

  12. Reconstructing the dynamics of HIV evolution within hosts from serial deep sequence data.

    Directory of Open Access Journals (Sweden)

    Art F Y Poon

    Full Text Available At the early stage of infection, human immunodeficiency virus (HIV)-1 predominantly uses the CCR5 coreceptor for host cell entry. The subsequent emergence of HIV variants that use the CXCR4 coreceptor in roughly half of all infections is associated with an accelerated decline of CD4+ T-cells and rate of progression to AIDS. The presence of a 'fitness valley' separating CCR5- and CXCR4-using genotypes is postulated to be a biological determinant of whether the HIV coreceptor switch occurs. Using phylogenetic methods to reconstruct the evolutionary dynamics of HIV within hosts enables us to discriminate between competing models of this process. We have developed a phylogenetic pipeline for the molecular clock analysis, ancestral reconstruction, and visualization of deep sequence data. These data were generated by next-generation sequencing of HIV RNA extracted from longitudinal serum samples (median 7 time points) from 8 untreated subjects with chronic HIV infections (Amsterdam Cohort Studies on HIV-1 infection and AIDS). We used the known dates of sampling to directly estimate rates of evolution and to map ancestral mutations to a reconstructed timeline in units of days. HIV coreceptor usage was predicted from reconstructed ancestral sequences using the geno2pheno algorithm. We determined that the first mutations contributing to CXCR4 use emerged about 16 (per subject range 4 to 30) months before the earliest predicted CXCR4-using ancestor, which preceded the first positive cell-based assay of CXCR4 usage by 10 (range 5 to 25) months. CXCR4 usage arose in multiple lineages within 5 of 8 subjects, and ancestral lineages following alternate mutational pathways before going extinct were common. We observed highly patient-specific distributions and time-scales of mutation accumulation, implying that the role of a fitness valley is contingent on the genotype of the transmitted variant.

  13. Reconstructing the dynamics of HIV evolution within hosts from serial deep sequence data.

    Science.gov (United States)

    Poon, Art F Y; Swenson, Luke C; Bunnik, Evelien M; Edo-Matas, Diana; Schuitemaker, Hanneke; van 't Wout, Angélique B; Harrigan, P Richard

    2012-01-01

    At the early stage of infection, human immunodeficiency virus (HIV)-1 predominantly uses the CCR5 coreceptor for host cell entry. The subsequent emergence of HIV variants that use the CXCR4 coreceptor in roughly half of all infections is associated with an accelerated decline of CD4+ T-cells and rate of progression to AIDS. The presence of a 'fitness valley' separating CCR5- and CXCR4-using genotypes is postulated to be a biological determinant of whether the HIV coreceptor switch occurs. Using phylogenetic methods to reconstruct the evolutionary dynamics of HIV within hosts enables us to discriminate between competing models of this process. We have developed a phylogenetic pipeline for the molecular clock analysis, ancestral reconstruction, and visualization of deep sequence data. These data were generated by next-generation sequencing of HIV RNA extracted from longitudinal serum samples (median 7 time points) from 8 untreated subjects with chronic HIV infections (Amsterdam Cohort Studies on HIV-1 infection and AIDS). We used the known dates of sampling to directly estimate rates of evolution and to map ancestral mutations to a reconstructed timeline in units of days. HIV coreceptor usage was predicted from reconstructed ancestral sequences using the geno2pheno algorithm. We determined that the first mutations contributing to CXCR4 use emerged about 16 (per subject range 4 to 30) months before the earliest predicted CXCR4-using ancestor, which preceded the first positive cell-based assay of CXCR4 usage by 10 (range 5 to 25) months. CXCR4 usage arose in multiple lineages within 5 of 8 subjects, and ancestral lineages following alternate mutational pathways before going extinct were common. We observed highly patient-specific distributions and time-scales of mutation accumulation, implying that the role of a fitness valley is contingent on the genotype of the transmitted variant.

  14. Large-Scale and Deep Quantitative Proteome Profiling Using Isobaric Labeling Coupled with Two-Dimensional LC-MS/MS

    Energy Technology Data Exchange (ETDEWEB)

    Gritsenko, Marina A.; Xu, Zhe; Liu, Tao; Smith, Richard D.

    2016-02-12

    Comprehensive, quantitative information on abundances of proteins and their post-translational modifications (PTMs) can potentially provide novel biological insights into disease pathogenesis and therapeutic intervention. Herein, we introduce a quantitative strategy utilizing isobaric stable isotope-labeling techniques combined with two-dimensional liquid chromatography-tandem mass spectrometry (2D-LC-MS/MS) for large-scale, deep quantitative proteome profiling of biological samples or clinical specimens such as tumor tissues. The workflow includes isobaric labeling of tryptic peptides for multiplexed and accurate quantitative analysis, basic reversed-phase LC fractionation and concatenation for reduced sample complexity, and nano-LC coupled to high resolution and high mass accuracy MS analysis for high confidence identification and quantification of proteins. This proteomic analysis strategy has been successfully applied for in-depth quantitative proteomic analysis of tumor samples, and can also be used for integrated proteome and PTM characterization, as well as comprehensive quantitative proteomic analysis across samples from large clinical cohorts.

  15. Whole-genome sequence of Sunxiuqinia dokdonensis DH1T, isolated from deep sub-seafloor sediment in Dokdo Island

    OpenAIRE

    Sooyeon Lim; Dong-Ho Chang; Byoung-Chan Kim

    2016-01-01

    Sunxiuqinia dokdonensis DH1T was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000.

  16. Whole-genome sequence of Sunxiuqinia dokdonensis DH1(T), isolated from deep sub-seafloor sediment in Dokdo Island.

    Science.gov (United States)

    Lim, Sooyeon; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-09-01

    Sunxiuqinia dokdonensis DH1(T) was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000.

  17. Whole-genome sequence of Sunxiuqinia dokdonensis DH1T, isolated from deep sub-seafloor sediment in Dokdo Island

    Directory of Open Access Journals (Sweden)

    Sooyeon Lim

    2016-09-01

    Full Text Available Sunxiuqinia dokdonensis DH1T was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000.

  18. Homology-independent discovery of replicating pathogenic circular RNAs by deep sequencing and a new computational algorithm

    OpenAIRE

    Wu, Qingfa; Wang, Ying; Cao, Mengji; Pantaleo, Vitantonio; Burgyan, Joszef; Li, Wan-Xiang; Ding, Shou-wei

    2012-01-01

    A common challenge in pathogen discovery by deep sequencing approaches is to recognize viral or subviral pathogens in samples of diseased tissue that share no significant homology with a known pathogen. Here we report a homology-independent approach for discovering viroids, a distinct class of free circular RNA subviral pathogens that encode no protein and are known to infect plants only. Our approach involves analyzing the sequences of the total small RNAs of the infected plants obtained by ...

  19. Develop a quantitative understanding of rockmass behaviour near excavations in deep mines, part 1

    CSIR Research Space (South Africa)

    Napier, JAL

    1995-12-01

    Full Text Available Control of the rock mass deformation near deep level stopes and the avoidance of damaging incidents of violent rock failure require a fundamental understanding of rock failure mechanisms. Research work to gain this understanding has been undertaken...

  20. Transcriptome walking: a laboratory-oriented GUI-based approach to mRNA identification from deep-sequenced data

    Directory of Open Access Journals (Sweden)

    French Andrew S

    2012-12-01

    Full Text Available Abstract Background Deep sequencing technology provides efficient and economical production of large numbers of randomly positioned, relatively short, estimates of base identities in DNA molecules. Application of this technology to mRNA samples allows rapid examination of the molecular genetic environment in individual cells or tissues, the transcriptome. However, assembly of such short sequences into complete mRNA creates a challenge that limits the usefulness of the technology, particularly when no, or limited, genomic data is available. Several approaches to this problem have been developed, but there is still no general method to rapidly obtain an mRNA sequence from deep sequence data when a specific molecule, or family of molecules, is of interest. A frequent requirement is to identify specific mRNA molecules from tissues that are being investigated by methods such as electrophysiology, immunocytology and pharmacology. To be widely useful, any approach must be relatively simple to use in the laboratory by operators without extensive statistical or bioinformatics knowledge, and with readily available hardware. Findings An approach was developed that allows de novo assembly of individual mRNA sequences in two linked stages: sequence discovery and sequence completion. Both stages rely on computer assisted, Graphical User Interface (GUI)-guided, user interaction with the data, but proceed relatively efficiently once discovery is complete. The method grows a discovered sequence by repeated passes through the complete raw data in a series of steps, and is hence termed ‘transcriptome walking’. All of the operations required for transcriptome analysis are combined in one program that presents a relatively simple user interface and runs on a standard desktop or laptop computer, but takes advantage of multi-core processors when available. Complete mRNA sequence identifications usually require less than 24 hours. This approach has already

  1. Sequence analysis and quantitative detection of Norwalk-like viruses in cultured oysters of China

    Science.gov (United States)

    Wang, Jun; Tang, Qingjuan; Yue, Zhiqin; Li, Zhaojie; Zhang, Jin; Xue, Changhu

    2008-05-01

    We isolated 4 Norwalk-like virus (NLV)-contaminated oysters from 33 Chinese oysters collected from local commercial sources of Shandong Province. After amplification of the RNA-dependent RNA polymerase (RdRp) region of NLV genomes with RT-PCR, the open reading frame 1 (ORF1) of the RdRp was sequenced and subjected to multiple-sequence alignment. The results showed that NLVs in the four isolates belong to genogroup II. The sequence comparison showed that the similarity between the four Chinese oyster isolates was higher than 99.0%, which indicated that NLVs prevalent in close areas have high homogeneity in genome sequences. In addition, the most conserved sequences between diverse NLVs were used to design primers and TaqMan probes, then the real-time quantitative PCR assay was performed. According to the standard curve of GII NLVs, the original amounts (copies) of NLVs in the positive patient's fecal isolate, the positive Japanese oyster isolate, and the Chinese oyster isolate were 8.9×10^8, 1.25×10^8 and 4.7×10^1, respectively. The detection limit of NLVs was 1×10^1 copies. This study will be helpful for routine diagnosis of NLV pathogens in foods and thus for avoiding food poisoning in the future.
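
    Quantification against a standard curve, as described in this abstract, generally works by fitting threshold cycle (Ct) values of known standards to the log of their copy numbers and inverting the fit for unknown samples. The sketch below assumes a simple linear model, Ct = slope × log10(copies) + intercept; the dilution series and Ct values are made up and are not data from the study.

```python
import math
from typing import Sequence, Tuple


def fit_standard_curve(copies: Sequence[float],
                       ct_values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    if len(copies) != len(ct_values) or len(copies) < 2:
        raise ValueError("need at least two matched standards")
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate starting copy number."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical dilution series of a quantified standard (illustrative only).
standard_copies = [1e2, 1e4, 1e6, 1e8]
standard_ct = [33.0, 26.4, 19.8, 13.2]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
estimated_copies = copies_from_ct(23.1, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, "
      f"copies={estimated_copies:.3g}")
```

    Samples whose Ct falls beyond the lowest standard would be reported as below the detection limit rather than extrapolated.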

  2. Sequence Analysis and Quantitative Detection of Norwalk-like Viruses in Cultured Oysters of China

    Institute of Scientific and Technical Information of China (English)

    WANG Jun; TANG Qingjuan; YUE Zhiqin; LI Zhaojie; ZHANG Jin; XUE Changhu

    2008-01-01

    We isolated 4 Norwalk-like virus (NLV)-contaminated oysters from 33 Chinese oysters collected from local commercial sources of Shandong Province. After amplification of the RNA-dependent RNA polymerase (RdRp) region of NLV genomes with RT-PCR, the open reading frame 1 (ORF1) of the RdRp was sequenced and subjected to multiple-sequence alignment. The results showed that NLVs in the four isolates belong to genogroup II. The sequence comparison showed that the similarity between the four Chinese oyster isolates was higher than 99.0%, which indicated that NLVs prevalent in close areas have high homogeneity in genome sequences. In addition, the most conserved sequences between diverse NLVs were used to design primers and TaqMan probes, then the real-time quantitative PCR assay was performed. According to the standard curve of GII NLVs, the original amounts (copies) of NLVs in the positive patient's fecal isolate, the positive Japanese oyster isolate, and the Chinese oyster isolate were 8.9×10^8, 1.25×10^8 and 4.7×10^1, respectively. The detection limit of NLVs was 1×10^1 copies. This study will be helpful for routine diagnosis of NLV pathogens in foods and thus for avoiding food poisoning in the future.

  3. High-Resolution Hepatitis C Virus Subtyping Using NS5B Deep Sequencing and Phylogeny, an Alternative to Current Methods

    Science.gov (United States)

    Gregori, Josep; Rodríguez-Frias, Francisco; Buti, Maria; Madejon, Antonio; Perez-del-Pulgar, Sofia; Garcia-Cehic, Damir; Casillas, Rosario; Blasi, Maria; Homs, Maria; Tabernero, David; Alvarez-Tejado, Miguel; Muñoz, Jose Manuel; Cubero, Maria; Caballero, Andrea; delCampo, Jose Antonio; Domingo, Esteban; Belmonte, Irene; Nieto, Leonardo; Lens, Sabela; Muñoz-de-Rueda, Paloma; Sanz-Cameno, Paloma; Sauleda, Silvia; Bes, Marta; Gomez, Jordi; Briones, Carlos; Perales, Celia; Sheldon, Julie; Castells, Lluis; Viladomiu, Lluis; Salmeron, Javier; Ruiz-Extremera, Angela; Quiles-Pérez, Rosa; Moreno-Otero, Ricardo; López-Rodríguez, Rosario; Allende, Helena; Romero-Gómez, Manuel; Guardia, Jaume; Esteban, Rafael; Garcia-Samaniego, Javier; Forns, Xavier

    2014-01-01

    Hepatitis C virus (HCV) is classified into seven major genotypes and 67 subtypes. Recent studies have shown that in HCV genotype 1-infected patients, response rates to regimens containing direct-acting antivirals (DAAs) are subtype dependent. Currently available genotyping methods have limited subtyping accuracy. We have evaluated the performance of a deep-sequencing-based HCV subtyping assay, developed for the 454/GS-Junior platform, in comparison with those of two commercial assays (Versant HCV genotype 2.0 and Abbott Real-time HCV Genotype II) and using direct NS5B sequencing as a gold standard (direct sequencing), in 114 clinical specimens previously tested by first-generation hybridization assay (82 genotype 1 and 32 with uninterpretable results). Phylogenetic analysis of deep-sequencing reads matched subtype 1 calling by population Sanger sequencing (69% 1b, 31% 1a) in 81 specimens and identified a mixed-subtype infection (1b/3a/1a) in one sample. Similarly, among the 32 previously indeterminate specimens, identical genotype and subtype results were obtained by direct and deep sequencing in all but four samples with dual infection. In contrast, both Versant HCV Genotype 2.0 and Abbott Real-time HCV Genotype II failed subtype 1 calling in 13 (16%) samples each and were unable to identify the HCV genotype and/or subtype in more than half of the non-genotype 1 samples. We concluded that deep sequencing is more efficient for HCV subtyping than currently available methods and allows qualitative identification of mixed infections and may be more helpful with respect to informing treatment strategies with new DAA-containing regimens across all HCV subtypes. PMID:25378574

  4. Generating quantitative models describing the sequence specificity of biological processes with the stabilized matrix method

    Directory of Open Access Journals (Sweden)

    Sette Alessandro

    2005-05-01

    Full Text Available Abstract Background Many processes in molecular biology involve the recognition of short sequences of nucleic or amino acids, such as the binding of immunogenic peptides to major histocompatibility complex (MHC) molecules. From experimental data, a model of the sequence specificity of these processes can be constructed, such as a sequence motif, a scoring matrix or an artificial neural network. The purpose of these models is two-fold. First, they can provide a summary of experimental results, allowing for a deeper understanding of the mechanisms involved in sequence recognition. Second, such models can be used to predict the experimental outcome for yet untested sequences. In the past we reported the development of a method to generate such models called the Stabilized Matrix Method (SMM). This method has been successfully applied to predicting peptide binding to MHC molecules, peptide transport by the transporter associated with antigen presentation (TAP) and proteasomal cleavage of protein sequences. Results Herein we report the implementation of the SMM algorithm as a publicly available software package. Specific features determining the type of problems the method is most appropriate for are discussed. Advantageous features of the package are: (1) the output generated is easy to interpret, (2) input and output are both quantitative, (3) specific computational strategies to handle experimental noise are built in, (4) the algorithm is designed to effectively handle bounded experimental data, (5) experimental data from randomized peptide libraries and conventional peptides can easily be combined, and (6) it is possible to incorporate pair interactions between positions of a sequence. Conclusion Making the SMM method publicly available enables bioinformaticians and experimental biologists to easily access it, to compare its performance to other prediction methods, and to extend it to other applications.
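
    To make the idea of a scoring matrix with optional pair interactions concrete, the sketch below shows how such a model could score an untested sequence by summing per-position contributions. It is a toy illustration under stated assumptions, not the SMM software package or its API: the function name, the data layout and every numeric value are invented, and no matrix training or noise handling is shown.

```python
from typing import Dict, Optional, Tuple

# matrix[position][residue] -> contribution to the predicted value
ScoringMatrix = Dict[int, Dict[str, float]]
# pair terms keyed by ((position_i, residue_i), (position_j, residue_j))
PairTerms = Dict[Tuple[Tuple[int, str], Tuple[int, str]], float]


def score_sequence(seq: str,
                   matrix: ScoringMatrix,
                   pair_terms: Optional[PairTerms] = None,
                   offset: float = 0.0) -> float:
    """Sum independent per-position scores, plus any matching pair terms."""
    total = offset
    for pos, residue in enumerate(seq):
        # Residues absent from the matrix contribute nothing in this sketch.
        total += matrix.get(pos, {}).get(residue, 0.0)
    if pair_terms:
        for ((i, a), (j, b)), value in pair_terms.items():
            if i < len(seq) and j < len(seq) and seq[i] == a and seq[j] == b:
                total += value
    return total


# Hypothetical 3-residue matrix and one pair interaction (made-up values).
toy_matrix: ScoringMatrix = {
    0: {"A": 0.5, "L": -0.25},
    1: {"K": 1.0},
    2: {"V": -0.5, "L": 0.25},
}
toy_pairs: PairTerms = {((0, "A"), (2, "L")): 0.125}

print(score_sequence("AKL", toy_matrix))             # independent positions only
print(score_sequence("AKL", toy_matrix, toy_pairs))  # with the pair term
```

    Because every contribution is additive, the matrix itself remains easy to read off, which matches the interpretability point made in the abstract.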

  5. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene.

    Science.gov (United States)

    Cunha, Karin Soares; Oliveira, Nathalia Silva; Fausto, Anna Karoline; de Souza, Carolina Cruz; Gros, Audrey; Bandres, Thomas; Idrissi, Yamina; Merlio, Jean-Philippe; de Moura Neto, Rodrigo Soares; Silva, Rosane; Geller, Mauro; Cappellen, David

    2016-12-17

    Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of its large size, existence of highly homologous pseudogenes located throughout the human genome, absence of mutational hotspots, and diversity of mutation types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260 + 1604A > G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations.

  6. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene

    Science.gov (United States)

    Cunha, Karin Soares; Oliveira, Nathalia Silva; Fausto, Anna Karoline; de Souza, Carolina Cruz; Gros, Audrey; Bandres, Thomas; Idrissi, Yamina; Merlio, Jean-Philippe; de Moura Neto, Rodrigo Soares; Silva, Rosane; Geller, Mauro; Cappellen, David

    2016-01-01

    Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of its large size, existence of highly homologous pseudogenes located throughout the human genome, absence of mutational hotspots, and diversity of mutation types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260 + 1604A > G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations. PMID:27999334

  7. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene

    Directory of Open Access Journals (Sweden)

    Karin Soares Cunha

    2016-12-01

    Full Text Available Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of its large size, existence of highly homologous pseudogenes located throughout the human genome, absence of mutational hotspots, and diversity of mutation types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260 + 1604A > G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations.

  8. DeepSNVMiner: a sequence analysis tool to detect emergent, rare mutations in subsets of cell populations

    Directory of Open Access Journals (Sweden)

    T. Daniel Andrews

    2016-05-01

    Background. Massively parallel sequencing technology is being used to sequence highly diverse populations of DNA such as that derived from heterogeneous cell mixtures containing both wild-type and disease-related states. At the core of such molecule tagging techniques is the tagging and identification of sequence reads derived from individual input DNA molecules, which must be first computationally disambiguated to generate read groups sharing common sequence tags, with each read group representing a single input DNA molecule. This disambiguation typically generates huge numbers of read groups, each of which requires additional variant detection analysis steps to be run specific to each read group, thus representing a significant computational challenge. While sequencing technologies for producing these data are approaching maturity, the lack of available computational tools for analysing such heterogeneous sequence data represents an obstacle to the widespread adoption of this technology. Results. Using synthetic data we successfully detect unique variants at dilution levels of 1 in 1,000,000 molecules, and find DeepSNVMiner obtains significantly lower false positive and false negative rates compared to popular variant callers GATK, SAMTools, FreeBayes and LoFreq, particularly as the variant concentration levels decrease. In a dilution series with genomic DNA from two cell lines, we find DeepSNVMiner identifies a known somatic variant when present at concentrations of only 1 in 1,000 molecules in the input material, the lowest concentration amongst all variant callers tested. Conclusions. Here we present DeepSNVMiner, a tool to disambiguate tagged sequence groups and robustly identify sequence variants specific to subsets of starting DNA molecules that may indicate the presence of a disease. DeepSNVMiner is an automated workflow of custom sequence analysis utilities and open source tools able to differentiate somatic DNA variants from
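
    The read-group idea described in this abstract can be illustrated with a minimal Python sketch. It is not DeepSNVMiner's code: the function names, the `min_group_size` and `min_fraction` parameters, and the assumption that reads are already aligned to the reference and of equal length are all hypothetical choices made for illustration. Reads are grouped by their molecular tag, a per-position consensus is built for each group, and a variant is reported only where the consensus of a sufficiently large group disagrees with the reference.

```python
from collections import Counter, defaultdict


def group_reads_by_tag(reads):
    """Group (tag, sequence) pairs into read groups keyed by molecular tag."""
    groups = defaultdict(list)
    for tag, seq in reads:
        groups[tag].append(seq)
    return dict(groups)


def consensus(seqs, min_fraction=0.9):
    """Per-position majority base; 'N' where no base reaches min_fraction."""
    out = []
    for column in zip(*seqs):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count / len(column) >= min_fraction else "N")
    return "".join(out)


def call_variants(reads, reference, min_group_size=3, min_fraction=0.9):
    """Return {tag: [(position, ref_base, alt_base), ...]} per read group.

    Assumes (hypothetically) that every read is pre-aligned to `reference`
    and has the same length, so position i in a read is position i in the
    reference.
    """
    calls = {}
    for tag, seqs in group_reads_by_tag(reads).items():
        if len(seqs) < min_group_size:
            continue  # too few reads to trust a consensus for this molecule
        cons = consensus(seqs, min_fraction)
        diffs = [
            (i, ref_base, alt_base)
            for i, (ref_base, alt_base) in enumerate(zip(reference, cons))
            if alt_base != "N" and alt_base != ref_base
        ]
        if diffs:
            calls[tag] = diffs
    return calls
```

    Requiring agreement within a read group is what lets a single-read sequencing error be separated from a variant that was present in the original molecule: an error affects only some reads in the group, whereas a true variant appears in nearly all of them.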

  9. Targeted deep sequencing improves outcome stratification in chronic myelomonocytic leukemia with low risk cytogenetic features

    Science.gov (United States)

    Palomo, Laura; Garcia, Olga; Arnan, Montse; Xicoy, Blanca; Fuster, Francisco; Cabezón, Marta; Coll, Rosa; Ademà, Vera; Grau, Javier; Jiménez, Maria-José; Pomares, Helena; Marcé, Sílvia; Mallo, Mar; Millá, Fuensanta; Alonso, Esther; Sureda, Anna; Gallardo, David; Feliu, Evarist; Ribera, Josep-Maria; Solé, Francesc; Zamora, Lurdes

    2016-01-01

    Clonal cytogenetic abnormalities are found in 20-30% of patients with chronic myelomonocytic leukemia (CMML), while gene mutations are present in >90% of cases. Patients with low risk cytogenetic features account for 80% of CMML cases and often fall into the low risk categories of CMML prognostic scoring systems, but the outcome differs considerably among them. We performed targeted deep sequencing of 83 myeloid-related genes in 56 CMML patients with low risk cytogenetic features or uninformative conventional cytogenetics (CC) at diagnosis, with the aim to identify the genetic characteristics of patients with a more aggressive disease. Targeted sequencing was also performed in a subset of these patients at time of acute myeloid leukemia (AML) transformation. Overall, 98% of patients harbored at least one mutation. Mutations in cell signaling genes were acquired at time of AML progression. Mutations in ASXL1, EZH2 and NRAS correlated with higher risk features and shorter overall survival (OS) and progression free survival (PFS). Patients with SRSF2 mutations associated with poorer OS, while absence of TET2 mutations (TET2wt) was predictive of shorter PFS. A decrease in OS and PFS was observed as the number of adverse risk gene mutations (ASXL1, EZH2, NRAS and SRSF2) increased. On multivariate analyses, CMML-specific scoring system (CPSS) and presence of adverse risk gene mutations remained significant for OS, while CPSS and TET2wt were predictive of PFS. These results confirm that mutation analysis can add prognostic value to patients with CMML and low risk cytogenetic features or uninformative CC. PMID:27486981

  10. Deep sequencing-based analysis of the anaerobic stimulon in Neisseria gonorrhoeae

    Directory of Open Access Journals (Sweden)

    Clark Virginia L

    2011-01-01

    Background: Maintenance of an anaerobic denitrification system in the obligate human pathogen, Neisseria gonorrhoeae, suggests that an anaerobic lifestyle may be important during the course of infection. Furthermore, mounting evidence suggests that reduction of host-produced nitric oxide has several immunomodulatory effects on the host. However, at this point there have been no studies analyzing the complete gonococcal transcriptome response to anaerobiosis. Here we performed deep sequencing to compare the gonococcal transcriptomes of aerobically and anaerobically grown cells. Using the information derived from this sequencing, we discuss the implications of the robust transcriptional response to anaerobic growth. Results: We determined that 198 chromosomal genes were differentially expressed (~10% of the genome) in response to anaerobic conditions. We also observed a large induction of genes encoded within the cryptic plasmid, pJD1. Validation of RNA-seq data using translational-lacZ fusions or RT-PCR demonstrated the RNA-seq results to be very reproducible. Surprisingly, many genes of prophage origin were induced anaerobically, as well as several transcriptional regulators previously unknown to be involved in anaerobic growth. We also confirmed expression and regulation of a small RNA, likely a functional equivalent of fnrS in the Enterobacteriaceae family. We also determined that many genes found to be responsive to anaerobiosis have also been shown to be responsive to iron and/or oxidative stress. Conclusions: Gonococci will be subject to many forms of environmental stress, including oxygen-limitation, during the course of infection. Here we determined that the anaerobic stimulon in gonococci was larger than previous studies would suggest. Many new targets for future research have been uncovered, and the results derived from this study may have helped to elucidate factors or mechanisms of virulence that may have otherwise been overlooked.

  11. Deep sequencing-based analysis of the anaerobic stimulon in Neisseria gonorrhoeae.

    Science.gov (United States)

    Isabella, Vincent M; Clark, Virginia L

    2011-01-20

    Maintenance of an anaerobic denitrification system in the obligate human pathogen, Neisseria gonorrhoeae, suggests that an anaerobic lifestyle may be important during the course of infection. Furthermore, mounting evidence suggests that reduction of host-produced nitric oxide has several immunomodulatory effects on the host. However, at this point there have been no studies analyzing the complete gonococcal transcriptome response to anaerobiosis. Here we performed deep sequencing to compare the gonococcal transcriptomes of aerobically and anaerobically grown cells. Using the information derived from this sequencing, we discuss the implications of the robust transcriptional response to anaerobic growth. We determined that 198 chromosomal genes were differentially expressed (~10% of the genome) in response to anaerobic conditions. We also observed a large induction of genes encoded within the cryptic plasmid, pJD1. Validation of RNA-seq data using translational-lacZ fusions or RT-PCR demonstrated the RNA-seq results to be very reproducible. Surprisingly, many genes of prophage origin were induced anaerobically, as well as several transcriptional regulators previously unknown to be involved in anaerobic growth. We also confirmed expression and regulation of a small RNA, likely a functional equivalent of fnrS in the Enterobacteriaceae family. We also determined that many genes found to be responsive to anaerobiosis have also been shown to be responsive to iron and/or oxidative stress. Gonococci will be subject to many forms of environmental stress, including oxygen-limitation, during the course of infection. Here we determined that the anaerobic stimulon in gonococci was larger than previous studies would suggest. Many new targets for future research have been uncovered, and the results derived from this study may have helped to elucidate factors or mechanisms of virulence that may have otherwise been overlooked.

  12. Next-Generation Analysis of Deep Sequencing Data: Bringing Light into the Black Box of SELEX Experiments.

    Science.gov (United States)

    Blank, Michael

    2016-01-01

    In silico analysis of next-generation sequencing data (NGS; also termed deep sequencing) derived from in vitro selection experiments enables the analysis of the SELEX procedure (Systematic Evolution of Ligands by EXponential enrichment) in unprecedented depth and improves the identification of aptamers. Besides quality control and optimization of starting libraries, it makes possible advanced screening strategies for difficult targets and early identification of rare but high quality aptamers that are otherwise lost in the in vitro selection experiments. The high information content of sequence data obtained from selection experiments is furthermore useful for subsequent lead optimization.

  13. Distinctive Drug-resistant Mutation Profiles and Interpretations of HIV-1 Proviral DNA Revealed by Deep Sequencing in Reverse Transcriptase

    Institute of Scientific and Technical Information of China (English)

    YIN Qian Qian; SHAO Yi Ming; MA Li Ying; LI Zhen Peng; ZHAO Hai; PAN Dong; WANG Yan; XU Wei Si; XING Hui; FENG Yi; JIANG Shi Bo

    2016-01-01

    Objective: To investigate distinctive features in drug-resistant mutations (DRMs) and interpretations for reverse transcriptase inhibitors (RTIs) between proviral DNA and paired viral RNA in HIV-1-infected patients. Methods: Forty-three HIV-1-infected individuals receiving first-line antiretroviral therapy were recruited to participate in a multicenter AIDS Cohort Study in Anhui and Henan Provinces in China in 2004. Drug resistance genotyping was performed by bulk sequencing and deep sequencing on the plasma and whole blood of 77 samples, respectively. Drug-resistance interpretation was compared between viral RNA and paired proviral DNA. Results: Compared with bulk sequencing, deep sequencing could detect more DRMs and samples with DRMs in both viral RNA and proviral DNA. The mutations M184I and M230I were more prevalent in proviral DNA than in viral RNA (Fisher’s exact test, P ... Conclusion: Compared with viral RNA, distinctive information on DRMs and drug resistance interpretations for proviral DNA could be obtained by deep sequencing, which could provide more detailed and precise information for drug resistance monitoring and the rational design of optimal antiretroviral therapy regimens.

  14. Analysis of Plasmodium falciparum diversity in natural infections by deep sequencing.

    Science.gov (United States)

    Manske, Magnus; Miotto, Olivo; Campino, Susana; Auburn, Sarah; Almagro-Garcia, Jacob; Maslen, Gareth; O'Brien, Jack; Djimde, Abdoulaye; Doumbo, Ogobara; Zongo, Issaka; Ouedraogo, Jean-Bosco; Michon, Pascal; Mueller, Ivo; Siba, Peter; Nzila, Alexis; Borrmann, Steffen; Kiara, Steven M; Marsh, Kevin; Jiang, Hongying; Su, Xin-Zhuan; Amaratunga, Chanaki; Fairhurst, Rick; Socheat, Duong; Nosten, Francois; Imwong, Mallika; White, Nicholas J; Sanders, Mandy; Anastasi, Elisa; Alcock, Dan; Drury, Eleanor; Oyola, Samuel; Quail, Michael A; Turner, Daniel J; Ruano-Rubio, Valentin; Jyothi, Dushyanth; Amenga-Etego, Lucas; Hubbart, Christina; Jeffreys, Anna; Rowlands, Kate; Sutherland, Colin; Roper, Cally; Mangano, Valentina; Modiano, David; Tan, John C; Ferdig, Michael T; Amambua-Ngwa, Alfred; Conway, David J; Takala-Harrison, Shannon; Plowe, Christopher V; Rayner, Julian C; Rockett, Kirk A; Clark, Taane G; Newbold, Chris I; Berriman, Matthew; MacInnis, Bronwyn; Kwiatkowski, Dominic P

    2012-07-19

    Malaria elimination strategies require surveillance of the parasite population for genetic changes that demand a public health response, such as new forms of drug resistance. Here we describe methods for the large-scale analysis of genetic variation in Plasmodium falciparum by deep sequencing of parasite DNA obtained from the blood of patients with malaria, either directly or after short-term culture. Analysis of 86,158 exonic single nucleotide polymorphisms that passed genotyping quality control in 227 samples from Africa, Asia and Oceania provides genome-wide estimates of allele frequency distribution, population structure and linkage disequilibrium. By comparing the genetic diversity of individual infections with that of the local parasite population, we derive a metric of within-host diversity that is related to the level of inbreeding in the population. An open-access web application has been established for the exploration of regional differences in allele frequency and of highly differentiated loci in the P. falciparum genome.

  15. Genomic region operation kit for flexible processing of deep sequencing data.

    Science.gov (United States)

    Ovaska, Kristian; Lyly, Lauri; Sahu, Biswajyoti; Jänne, Olli A; Hautaniemi, Sampsa

    2013-01-01

    Computational analysis of data produced in deep sequencing (DS) experiments is challenging due to large data volumes and requirements for flexible analysis approaches. Here, we present a mathematical formalism based on set algebra for frequently performed operations in DS data analysis to facilitate translation of biomedical research questions to language amenable for computational analysis. With the help of this formalism, we implemented the Genomic Region Operation Kit (GROK), which supports various DS-related operations such as preprocessing, filtering, file conversion, and sample comparison. GROK provides high-level interfaces for R, Python, Lua, and command line, as well as an extension C++ API. It supports major genomic file formats and allows storing custom genomic regions in efficient data structures such as red-black trees and SQL databases. To demonstrate the utility of GROK, we have characterized the roles of two major transcription factors (TFs) in prostate cancer using data from 10 DS experiments. GROK is freely available with a user guide from http://csbi.ltdk.helsinki.fi/grok/.
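
    As an illustration of what set algebra on genomic regions means in practice, the following Python sketch implements union, intersection, and difference on plain lists of regions. It does not use GROK or reproduce its API; the function names and the representation of a region as a half-open `(chrom, start, end)` tuple are assumptions made only for this example.

```python
def normalize(regions):
    """Sort regions and merge those that overlap or touch.

    Each region is a half-open (chrom, start, end) tuple.
    """
    merged = []
    for chrom, start, end in sorted(regions):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            last_chrom, last_start, last_end = merged[-1]
            merged[-1] = (last_chrom, last_start, max(last_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def union(a, b):
    """Positions covered by a or b."""
    return normalize(list(a) + list(b))


def intersection(a, b):
    """Positions covered by both a and b (linear sweep over sorted input)."""
    a, b = normalize(a), normalize(b)
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        (chrom_a, start_a, end_a), (chrom_b, start_b, end_b) = a[i], b[j]
        if chrom_a == chrom_b:
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:
                out.append((chrom_a, start, end))
        # Advance whichever region finishes first in sort order.
        if (chrom_a, end_a) <= (chrom_b, end_b):
            i += 1
        else:
            j += 1
    return out


def difference(a, b):
    """Positions covered by a but not by b."""
    b = normalize(b)
    out = []
    for chrom, start, end in normalize(a):
        cursor = start
        for chrom_b, start_b, end_b in b:
            if chrom_b != chrom or end_b <= cursor or start_b >= end:
                continue  # no overlap with the part of the region still open
            if start_b > cursor:
                out.append((chrom, cursor, start_b))
            cursor = max(cursor, end_b)
        if cursor < end:
            out.append((chrom, cursor, end))
    return out
```

    For example, if `a` holds binding sites of one transcription factor and `b` those of another, `intersection(a, b)` gives the co-bound regions and `difference(a, b)` gives the regions bound only by the first factor. The quadratic loop in `difference` keeps the sketch short; a production tool would use an indexed structure such as the trees mentioned in the abstract.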

  16. Multiple platform assessment of the EGF dependent transcriptome by microarray and deep tag sequencing analysis

    Directory of Open Access Journals (Sweden)

    Iraola Susana

    2011-06-01

    Background: Epidermal Growth Factor (EGF) is a key regulatory growth factor activating many processes relevant to normal development and disease, affecting cell proliferation and survival. Here we use a combined approach to study the EGF dependent transcriptome of HeLa cells by using multiple long oligonucleotide based microarray platforms (from Agilent, Operon, and Illumina) in combination with digital gene expression profiling (DGE) with the Illumina Genome Analyzer. Results: By applying a procedure for cross-platform data meta-analysis based on RankProd and GlobalAncova tests, we establish a well validated gene set with transcript levels altered after EGF treatment. We use this robust gene list to build higher order networks of gene interaction by interconnecting associated networks, supporting and extending the important role of the EGF signaling pathway in cancer. In addition, we find an entirely new set of genes previously unrelated to the currently accepted EGF associated cellular functions. Conclusions: We propose that global genomic cross-validation derived from high content technologies (microarrays or deep sequencing) can be used to generate more reliable datasets. This approach should help to improve the confidence of downstream in silico functional inference analyses based on high content data.

  17. Metagenomes obtained by 'deep sequencing' - what do they tell about the enhanced biological phosphorus removal communities?

    Science.gov (United States)

    Albertsen, Mads; Saunders, Aaron M; Nielsen, Kåre L; Nielsen, Per H

    2013-01-01

    Metagenomics enables studies of the genomic potential of complex microbial communities by sequencing bulk genomic DNA directly from the environment. Knowledge of the genetic potential of a community can be used to formulate and test ecological hypotheses about stability and performance. In this study deep metagenomics and fluorescence in situ hybridization (FISH) were used to study a full-scale wastewater treatment plant with enhanced biological phosphorus removal (EBPR), and the results were compared to an existing EBPR metagenome. EBPR is a widely used process that relies on a complex community of microorganisms to function properly. Insight into community and species level stability and dynamics is valuable for knowledge-driven optimization of the EBPR process. The metagenomes of the EBPR communities were distinct compared to metagenomes of communities from a wide range of other environments, which could be attributed to selection pressures of the EBPR process. The metabolic potential of one of the key microorganisms in the EBPR process, Accumulibacter, was investigated in more detail in the two plants, revealing a potential importance of phage predation on the dynamics of Accumulibacter populations. The results demonstrate that metagenomics can be used as a powerful tool for system wide characterization of the EBPR community as well as for a deeper understanding of the function of specific community members. Furthermore, we discuss and illustrate some of the general pitfalls in metagenomics and stress the need of additional DNA extraction independent information in metagenome studies.

  18. Deep sequencing-based identification of small regulatory RNAs in Synechocystis sp. PCC 6803.

    Science.gov (United States)

    Xu, Wen; Chen, Hui; He, Chen-Liu; Wang, Qiang

    2014-01-01

    Synechocystis sp. PCC 6803 is a genetically tractable model organism for photosynthesis research. The genome of Synechocystis sp. PCC 6803 consists of a circular chromosome and seven plasmids. The importance of small regulatory RNAs (sRNAs) as mediators of a number of cellular processes in bacteria has begun to be recognized. However, little is known regarding sRNAs in Synechocystis sp. PCC 6803. To provide a comprehensive overview of sRNAs in this model organism, the sRNAs of Synechocystis sp. PCC 6803 were analyzed using deep sequencing, and 7,951,189 reads were obtained. High quality mapping reads (6,127,890) were mapped onto the genome and assembled into 16,192 transcribed regions (clusters) based on read overlap. A total number of 5211 putative sRNAs were revealed from the genome and the 4 megaplasmids, and 27 of these molecules, including four from plasmids, were confirmed by RT-PCR. In addition, possible target genes regulated by all of the putative sRNAs identified in this study were predicted by IntaRNA and analyzed for functional categorization and biological pathways, which provided evidence that sRNAs are indeed involved in many different metabolic pathways, including basic metabolic pathways, such as glycolysis/gluconeogenesis, the citrate cycle, fatty acid metabolism and adaptations to environmentally stress-induced changes. The information from this study provides a valuable reservoir for understanding the sRNA-mediated regulation of the complex physiology and metabolic processes of cyanobacteria.

  19. Deep Sequencing of Suppression Subtractive Hybridisation Drought and Recovery Libraries of the Non-model Crop Trifolium repens L.

    Science.gov (United States)

    Bisaga, Maciej; Lowe, Matthew; Hegarty, Matthew; Abberton, Michael; Ravagnani, Adriana

    2017-01-01

    White clover is a short-lived perennial whose persistence is greatly affected by abiotic stresses, particularly drought. The aim of this work was to characterize its molecular response to water deficit and recovery following re-hydration to identify targets for the breeding of tolerant varieties. We created a white clover reference transcriptome of 16,193 contigs by deep sequencing (mean base coverage 387x) four Suppression Subtractive Hybridization (SSH) libraries (a forward and a reverse library for each treatment) constructed from young leaf tissue of white clover at the onset of the response to drought and recovery. Reads from individual libraries were then mapped to the reference transcriptome and processed comparing expression level data. The pipeline generated four robust sets of transcripts induced and repressed in the leaves of plants subjected to water deficit stress (6,937 and 3,142, respectively) and following re-hydration (6,695 and 4,897, respectively). Semi-quantitative polymerase chain reaction was used to verify the expression pattern of 16 genes. The differentially expressed transcripts were functionally annotated and mapped to biological processes and pathways. In agreement with similar studies in other crops, the majority of transcripts up-regulated in response to drought belonged to metabolic processes, such as amino acid, carbohydrate, and lipid metabolism, while transcripts involved in photosynthesis, such as components of the photosystem and the biosynthesis of photosynthetic pigments, were up-regulated during recovery. The data also highlighted the role of raffinose family oligosaccharides (RFOs) and the possible delayed response of the flavonoid pathways in the initial response of white clover to water withdrawal. The work presented in this paper is to our knowledge the first large scale molecular analysis of the white clover response to drought stress and re-hydration. The data generated provide a valuable genomic resource for marker

  20. Transcriptome analysis of the model protozoan, Tetrahymena thermophila, using Deep RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Jie Xiong

    BACKGROUND: The ciliated protozoan Tetrahymena thermophila is a well-studied single-celled eukaryote model organism for cellular and molecular biology. However, the lack of extensive T. thermophila cDNA libraries or a large expressed sequence tag (EST) database limited the quality of the original genome annotation. METHODOLOGY/PRINCIPAL FINDINGS: This RNA-seq study describes the first deep sequencing analysis of the T. thermophila transcriptome during the three major stages of the life cycle: growth, starvation and conjugation. Uniquely mapped reads covered more than 96% of the 24,725 predicted gene models in the somatic genome. More than 1,000 new transcribed regions were identified. The great dynamic range of RNA-seq allowed detection of a nearly six order-of-magnitude range of measurable gene expression orchestrated by this cell. RNA-seq also allowed the first prediction of transcript untranslated regions (UTRs) and an updated (larger) size estimate of the T. thermophila transcriptome: 57 Mb, or about 55% of the somatic genome. Our study identified nearly 1,500 alternative splicing (AS) events distributed over 5.2% of T. thermophila genes. This percentage represents a two order-of-magnitude increase over previous EST-based estimates in Tetrahymena. Evidence of stage-specific regulation of alternative splicing was also obtained. Finally, our study allowed us to completely confirm about 26.8% of the genes originally predicted by the gene finder, to correct coding sequence boundaries and intron-exon junctions for about a third, and to reassign microarray probes and correct earlier microarray data. CONCLUSIONS/SIGNIFICANCE: RNA-seq data significantly improve the genome annotation and provide a fully comprehensive view of the global transcriptome of T. thermophila. To our knowledge, 5.2% of T. thermophila genes with AS is the highest percentage of genes showing AS reported in a unicellular eukaryote. Tetrahymena thus becomes an excellent unicellular

  1. Genome-wide identification of Schistosoma japonicum microRNAs using a deep-sequencing approach.

    Directory of Open Access Journals (Sweden)

    Jian Huang

    BACKGROUND: Human schistosomiasis is one of the most prevalent and serious parasitic diseases worldwide. Schistosoma japonicum is one of the important pathogens of this disease. MicroRNAs (miRNAs) are a large group of non-coding RNAs that play important roles in regulating gene expression and protein translation in animals. Genome-wide identification of miRNAs in a given organism is a critical step to facilitating our understanding of genome organization, genome biology, evolution, and posttranscriptional regulation. METHODOLOGY/PRINCIPAL FINDINGS: We sequenced two small RNA libraries prepared from different stages of the life cycle of S. japonicum, immature schistosomula and mature pairing adults, through a deep DNA sequencing approach, which yielded approximately 12 million high-quality short sequence reads containing a total of approximately 2 million non-redundant tags. Based on a bioinformatics pipeline, we identified 176 new S. japonicum miRNAs, of which some exhibited a differential pattern of expression between the two stages. Although 21 S. japonicum miRNAs are orthologs of known miRNAs within the metazoans, some nucleotides at many positions of Schistosoma miRNAs, such as miR-8, let-7, miR-10, miR-31, miR-92, miR-124, and miR-125, are indeed significantly distinct from other bilaterian orthologs. In addition, both miR-71 and some miR-2 family members in tandem are found to be clustered in a reversal direction model on two genomic loci, and two pairs of novel S. japonicum miRNAs were derived from sense and antisense DNA strands at the same genomic loci. CONCLUSIONS/SIGNIFICANCE: The collection of S. japonicum miRNAs could be used as a new platform to study the genomic structure, gene regulation and networks, evolutionary processes, development, and host-parasite interactions. Some S. japonicum miRNAs and their clusters could represent the ancestral forms of the conserved orthologues and a model for the genesis of novel miRNAs.

  2. Genotypic tropism testing by massively parallel sequencing: qualitative and quantitative analysis

    Directory of Open Access Journals (Sweden)

    Thiele Bernhard

    2011-05-01

    Background: Inferring viral tropism from genotype is a fast and inexpensive alternative to phenotypic testing. While being highly predictive when performed on clonal samples, sensitivity of predicting CXCR4-using (X4) variants drops substantially in clinical isolates. This is mainly attributed to minor variants not detected by standard bulk-sequencing. Massively parallel sequencing (MPS) detects single clones thereby being much more sensitive. Using this technology we wanted to improve genotypic prediction of coreceptor usage. Methods: Plasma samples from 55 antiretroviral-treated patients tested for coreceptor usage with the Monogram Trofile Assay were sequenced with standard population-based approaches. Fourteen of these samples were selected for further analysis with MPS. Tropism was predicted from each sequence with geno2pheno[coreceptor]. Results: Prediction based on bulk-sequencing yielded 59.1% sensitivity and 90.9% specificity compared to the Trofile assay. With MPS, 7600 reads were generated on average per isolate. Minorities of sequences with high confidence in CXCR4-usage were found in all samples, irrespective of phenotype. When using the default false-positive-rate of geno2pheno[coreceptor] (10%), and defining a minority cutoff of 5%, the results were concordant in all but one isolate. Conclusions: The combination of MPS and coreceptor usage prediction results in a fast and accurate alternative to phenotypic assays. The detection of X4-viruses in all isolates suggests that coreceptor usage as well as fitness of minorities is important for therapy outcome. The high sensitivity of this technology in combination with a quantitative description of the viral population may allow implementing meaningful cutoffs for predicting response to CCR5-antagonists in the presence of X4-minorities.
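
    The two thresholds mentioned in this abstract can be combined as in the Python sketch below. It is only an illustration under stated assumptions, not the study's pipeline: it assumes that each read already carries a false-positive-rate (FPR) value from a predictor such as geno2pheno[coreceptor], that a read with an FPR at or below the cutoff is counted as X4, and that a sample is called X4 when the X4 fraction reaches the minority cutoff. All function names are hypothetical. A helper for sensitivity and specificity against a phenotypic reference is included because the abstract reports both.

```python
def x4_fraction(read_fprs, fpr_cutoff=0.10):
    """Fraction of reads predicted X4, i.e. with FPR at or below the cutoff."""
    if not read_fprs:
        raise ValueError("no reads supplied")
    x4_reads = sum(1 for fpr in read_fprs if fpr <= fpr_cutoff)
    return x4_reads / len(read_fprs)


def call_tropism(read_fprs, fpr_cutoff=0.10, minority_cutoff=0.05):
    """Call a sample 'X4' if the X4 read fraction reaches the minority cutoff."""
    fraction = x4_fraction(read_fprs, fpr_cutoff)
    return "X4" if fraction >= minority_cutoff else "R5"


def sensitivity_specificity(predicted, observed, positive="X4"):
    """Sensitivity and specificity of predicted calls against observed ones.

    Assumes both classes occur in `observed`; otherwise a ratio is undefined.
    """
    pairs = list(zip(predicted, observed))
    tp = sum(p == positive and o == positive for p, o in pairs)
    fn = sum(p != positive and o == positive for p, o in pairs)
    tn = sum(p != positive and o != positive for p, o in pairs)
    fp = sum(p == positive and o != positive for p, o in pairs)
    return tp / (tp + fn), tn / (tn + fp)
```

    With these defaults, a sample in which 6 of 100 reads have an FPR of 0.02 would be called X4, while one with only 4 such reads would be called R5. Whether the comparisons should be inclusive is a choice made here for illustration; the abstract does not specify it.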

  3. Identification of microRNA-like RNAs from Curvularia lunata associated with maize leaf spot by bioinformation analysis and deep sequencing.

    Science.gov (United States)

    Liu, Tong; Hu, John; Zuo, Yuhu; Jin, Yazhong; Hou, Jumei

    2016-04-01

    Deep sequencing of small RNAs is a useful tool to identify novel small RNAs that may be involved in fungal growth and pathogenesis. In this study, we used HiSeq deep sequencing to identify 747,487 unique small RNAs from Curvularia lunata. Among these small RNAs were 1012 microRNA-like RNAs (milRNAs), which are similar to other known microRNAs, and 48 potential novel milRNAs without homologs in other organisms were identified using the miRBase© database. We used quantitative PCR to analyze the expression of four of these milRNAs from C. lunata at different developmental stages. The analysis revealed several changes associated with germinating conidia and mycelial growth, suggesting that these milRNAs may play a role in pathogen infection and mycelial growth. Computational analysis predicted a total of 8334 target mRNAs for the 1012 identified milRNAs and 256 target mRNAs for the 48 novel milRNAs. These target mRNAs were also subjected to gene ontology and Kyoto Encyclopedia of Genes and Genomes pathway analysis. To our knowledge, this study is the first report of C. lunata's milRNA profiles. This information will provide a better understanding of pathogen development and infection mechanisms.

  4. Quantitative Modelling of Multiphase Lithospheric Stretching and Deep Thermal History of Some Tertiary Rift Basins in Eastern China

    Institute of Scientific and Technical Information of China (English)

    林畅松; 张燕梅; 李思田; 刘景彦; 仝志刚; 丁孝忠; 李喜臣

    2002-01-01

    The stretching process of some Tertiary rift basins in eastern China is characterized by multiphase rifting. A multiple instantaneous uniform stretching model is proposed in this paper to simulate the formation of the basins as the rifting process cannot be accurately described by a simple (one episode) stretching model. The study shows that the multiphase stretching model, combined with the back-stripping technique, can be used to reconstruct the subsidence history and the stretching process of the lithosphere, and to evaluate the depth to the top of the asthenosphere and the deep thermal evolution of the basins. The calculated results obtained by applying the quantitative model to the episodic rifting process of the Tertiary Qiongdongnan and Yinggehai basins in the South China Sea are in agreement with geophysical data and geological observations. This provides a new method for quantitative evaluation of the geodynamic process of multiphase rifting occurring during the Tertiary in eastern China.
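
    The abstract does not give the model's equations. As a hedged illustration only, the standard bookkeeping for a sequence of n instantaneous uniform stretching episodes with stretching factors β_i (the symbols here are assumptions, not the authors' notation) is that the cumulative crustal stretching factor is the product of the episode factors, so the crust thins from its initial thickness t_c^{(0)} as follows:

```latex
\beta_{\mathrm{total}} = \prod_{i=1}^{n} \beta_i ,
\qquad
t_c^{(n)} = \frac{t_c^{(0)}}{\beta_{\mathrm{total}}}
% Example: two episodes with \beta_1 = 1.5 and \beta_2 = 1.2 give
% \beta_{\mathrm{total}} = 1.8, so a 35 km crust thins to 35 / 1.8 \approx 19.4 km.
```

    This relation covers crustal thinning only. The thermal state of the lithosphere depends additionally on how much cooling occurs between episodes, which is why a multiphase model can give a different subsidence and thermal history than a single episode with the same total stretching.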

  5. Eigenspectra optoacoustic tomography achieves quantitative blood oxygenation imaging deep in tissues

    Science.gov (United States)

    Tzoumas, Stratis; Nunes, Antonio; Olefir, Ivan; Stangl, Stefan; Symvoulidis, Panagiotis; Glasl, Sarah; Bayer, Christine; Multhoff, Gabriele; Ntziachristos, Vasilis

    2016-06-01

    Light propagating in tissue attains a spectrum that varies with location due to wavelength-dependent fluence attenuation, an effect that causes spectral corruption. Spectral corruption has limited the quantification accuracy of optical and optoacoustic spectroscopic methods, and impeded the goal of imaging blood oxygen saturation (sO2) deep in tissues; a critical goal for the assessment of oxygenation in physiological processes and disease. Here we describe light fluence in the spectral domain and introduce eigenspectra multispectral optoacoustic tomography (eMSOT) to account for wavelength-dependent light attenuation, and estimate blood sO2 within deep tissue. We validate eMSOT in simulations, phantoms and animal measurements and spatially resolve sO2 in muscle and tumours, validating our measurements with histology data. eMSOT shows substantial sO2 accuracy enhancement over previous optoacoustic methods, potentially serving as a valuable tool for imaging tissue pathophysiology.

  6. Eigenspectra optoacoustic tomography achieves quantitative blood oxygenation imaging deep in tissues

    Science.gov (United States)

    Tzoumas, Stratis; Nunes, Antonio; Olefir, Ivan; Stangl, Stefan; Symvoulidis, Panagiotis; Glasl, Sarah; Bayer, Christine; Multhoff, Gabriele; Ntziachristos, Vasilis

    2016-01-01

    Light propagating in tissue attains a spectrum that varies with location due to wavelength-dependent fluence attenuation, an effect that causes spectral corruption. Spectral corruption has limited the quantification accuracy of optical and optoacoustic spectroscopic methods, and impeded the goal of imaging blood oxygen saturation (sO2) deep in tissues; a critical goal for the assessment of oxygenation in physiological processes and disease. Here we describe light fluence in the spectral domain and introduce eigenspectra multispectral optoacoustic tomography (eMSOT) to account for wavelength-dependent light attenuation, and estimate blood sO2 within deep tissue. We validate eMSOT in simulations, phantoms and animal measurements and spatially resolve sO2 in muscle and tumours, validating our measurements with histology data. eMSOT shows substantial sO2 accuracy enhancement over previous optoacoustic methods, potentially serving as a valuable tool for imaging tissue pathophysiology. PMID:27358000

  7. Quantitative analysis of gait and balance response to deep brain stimulation in Parkinson's disease

    OpenAIRE

    Mera, Thomas O.; Filipkowski, Danielle E.; Riley, David E.; Whitney, Christina M.; Walter, Benjamin L.; Gunzler, Steven A; Giuffrida, Joseph P

    2012-01-01

    Gait and balance disturbances in Parkinson’s disease (PD) can be debilitating and may lead to increased fall risk. Deep brain stimulation (DBS) is a treatment option once therapeutic benefits from medication are limited due to motor fluctuations and dyskinesia. Optimizing DBS parameters for gait and balance can be significantly more challenging than for other PD motor symptoms. Furthermore, inter-rater reliability of the standard clinical PD assessment scale, Unified Parkinson’s Disease Ratin...

  8. Quantitative and sensitive detection of GNAS mutations causing McCune-Albright syndrome with next generation sequencing.

    Science.gov (United States)

    Narumi, Satoshi; Matsuo, Kumihiro; Ishii, Tomohiro; Tanahashi, Yusuke; Hasegawa, Tomonobu

    2013-01-01

    Somatic activating GNAS mutations cause McCune-Albright syndrome (MAS). Owing to low mutation abundance, mutant-specific enrichment procedures, such as the peptide nucleic acid (PNA) method, are required to detect mutations in peripheral blood. Next generation sequencing (NGS) can analyze millions of PCR amplicons independently and is thus expected to detect low-abundance GNAS mutations quantitatively. In the present study, we aimed to develop an NGS-based method to detect low-abundance somatic GNAS mutations. PCR amplicons encompassing exons 8 and 9 of GNAS, in which most activating mutations occur, were sequenced on the MiSeq instrument. As expected, our NGS-based method could sequence the GNAS locus with very high read depth (approximately 100,000) and low error rate. A serial dilution study using cloned mutant and wildtype DNA samples showed a linear correlation between dilution and measured mutation abundance, indicating that quantification of the mutation is reliable. Using the serially diluted samples, the detection limits of three mutation detection methods (the PNA method, NGS, and combinatory use of PNA and NGS [PNA-NGS]) were determined. The lowest detectable mutation abundance was 1% for the PNA method, 0.03% for NGS and 0.01% for PNA-NGS. Finally, we analyzed 16 MAS patient-derived leukocytic DNA samples with the three methods and compared their mutation detection rates. The mutation detection rates of the PNA method, NGS and PNA-NGS in the 16 patient-derived peripheral blood samples were 56%, 63% and 75%, respectively. In conclusion, NGS can detect somatic activating GNAS mutations quantitatively and sensitively from peripheral blood samples. At present, the PNA-NGS method is likely the most sensitive method to detect low-abundance GNAS mutations.
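
    As a rough illustration of the arithmetic behind these detection limits, the sketch below converts a mutation abundance into the number of mutant reads expected at a given read depth. The function names are hypothetical and do not come from the study; the only figures taken from the abstract are the read depth (approximately 100,000) and the reported detection limits for NGS (0.03%) and PNA-NGS (0.01%). Sequencing error and sampling noise are ignored, so this is a back-of-the-envelope aid and not the authors' analysis.

```python
def expected_mutant_reads(abundance: float, depth: int) -> int:
    """Expected number of mutant reads, rounded to the nearest integer.

    abundance is the mutant allele fraction (0.0003 means 0.03%);
    depth is the total number of reads covering the position.
    """
    if not 0.0 <= abundance <= 1.0:
        raise ValueError("abundance must be a fraction between 0 and 1")
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return round(abundance * depth)


def mutant_fraction(mutant_reads: int, total_reads: int) -> float:
    """Observed mutant allele fraction from raw read counts."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mutant_reads / total_reads


if __name__ == "__main__":
    DEPTH = 100_000  # approximate read depth quoted in the abstract
    # Detection limits quoted in the abstract, written as fractions.
    for method, limit in (("NGS", 0.0003), ("PNA-NGS", 0.0001)):
        print(method, expected_mutant_reads(limit, DEPTH))
```

    Under these assumptions, an abundance of 0.03% at a depth of 100,000 corresponds to about 30 mutant reads, and 0.01% to about 10.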

  9. Quantitative and sensitive detection of GNAS mutations causing McCune-Albright syndrome with next generation sequencing.

    Directory of Open Access Journals (Sweden)

    Satoshi Narumi

    Somatic activating GNAS mutations cause McCune-Albright syndrome (MAS). Owing to low mutation abundance, mutant-specific enrichment procedures, such as the peptide nucleic acid (PNA) method, are required to detect mutations in peripheral blood. Next generation sequencing (NGS) can analyze millions of PCR amplicons independently and is thus expected to detect low-abundance GNAS mutations quantitatively. In the present study, we aimed to develop an NGS-based method to detect low-abundance somatic GNAS mutations. PCR amplicons encompassing exons 8 and 9 of GNAS, in which most activating mutations occur, were sequenced on the MiSeq instrument. As expected, our NGS-based method could sequence the GNAS locus with very high read depth (approximately 100,000) and low error rate. A serial dilution study using cloned mutant and wildtype DNA samples showed a linear correlation between dilution and measured mutation abundance, indicating that quantification of the mutation is reliable. Using the serially diluted samples, the detection limits of three mutation detection methods (the PNA method, NGS, and combinatory use of PNA and NGS [PNA-NGS]) were determined. The lowest detectable mutation abundance was 1% for the PNA method, 0.03% for NGS and 0.01% for PNA-NGS. Finally, we analyzed 16 MAS patient-derived leukocytic DNA samples with the three methods and compared their mutation detection rates. The mutation detection rates of the PNA method, NGS and PNA-NGS in the 16 patient-derived peripheral blood samples were 56%, 63% and 75%, respectively. In conclusion, NGS can detect somatic activating GNAS mutations quantitatively and sensitively from peripheral blood samples. At present, the PNA-NGS method is likely the most sensitive method to detect low-abundance GNAS mutations.

  10. Mitochondrial matR sequences help to resolve deep phylogenetic relationships in rosids

    Directory of Open Access Journals (Sweden)

    Dilcher David L

    2007-11-01

    Background: Rosids are a major clade in the angiosperms containing 13 orders and about one-third of angiosperm species. Recent molecular analyses recognized two major groups (i.e., fabids with seven orders and malvids with three orders). However, phylogenetic relationships within the two groups and among fabids, malvids, and potentially basal rosids including Geraniales, Myrtales, and Crossosomatales remain to be resolved with more data and a broader taxon sampling. In this study, we obtained DNA sequences of the mitochondrial matR gene from 174 species representing 72 families of putative rosids and examined phylogenetic relationships and phylogenetic utility of matR in rosids. We also inferred phylogenetic relationships within the "rosid clade" based on a combined data set of 91 taxa and four genes including matR, two plastid genes (rbcL, atpB), and one nuclear gene (18S rDNA). Results: Comparison of mitochondrial matR and two plastid genes (rbcL and atpB) showed that the synonymous substitution rate in matR was approximately four times slower than those of rbcL and atpB; however, the nonsynonymous substitution rate in matR was relatively high, close to its synonymous substitution rate, indicating that the matR has experienced a relaxed evolutionary history. Analyses of our matR sequences supported the monophyly of malvids and most orders of the rosids. However, fabids did not form a clade; instead, the COM clade of fabids (Celastrales, Oxalidales, Malpighiales, and Huaceae) was sister to malvids. Analyses of the four-gene data set suggested that Geraniales and Myrtales were successively sister to other rosids, and that Crossosomatales were sister to malvids. Conclusion: Compared to plastid genes such as rbcL and atpB, slowly evolving matR produced less homoplasious but not less informative substitutions. Thus, matR appears useful in higher-level angiosperm phylogenetics. Analysis of matR alone identified a novel deep relationship within

  11. Deep sequencing reveals microRNAs predictive of antiangiogenic drug response

    Science.gov (United States)

    García-Donas, Jesús; Beuselinck, Benoit; Inglada-Pérez, Lucía; Graña, Osvaldo; Schöffski, Patrick; Wozniak, Agnieszka; Bechter, Oliver; Apellániz-Ruiz, Maria; Leandro-García, Luis Javier; Esteban, Emilio; Castellano, Daniel E.; González del Alba, Aranzazu; Climent, Miguel Angel; Hernando, Susana; Arranz, José Angel; Morente, Manuel; Pisano, David G.; Robledo, Mercedes

    2016-01-01

    The majority of metastatic renal cell carcinoma (RCC) patients are treated with tyrosine kinase inhibitors (TKI) in first-line treatment; however, a fraction are refractory to these antiangiogenic drugs. MicroRNAs (miRNAs) are regulatory molecules proven to be accurate biomarkers in cancer. Here, we identified miRNAs predictive of progressive disease under TKI treatment through deep sequencing of 74 metastatic clear cell RCC cases uniformly treated with these drugs. Twenty-nine miRNAs were differentially expressed in the tumors of patients who progressed under TKI therapy (P values from 6 × 10^-9 to 3 × 10^-3). Among 6 miRNAs selected for validation in an independent series, the most relevant associations corresponded to miR-1307-3p, miR-155-5p, and miR-221-3p (P = 4.6 × 10^-3, 6.5 × 10^-3, and 3.4 × 10^-2, respectively). Furthermore, a 2-miRNA-based classifier discriminated individuals with progressive disease upon TKI treatment (AUC = 0.75, 95% CI, 0.64–0.85; P = 1.3 × 10^-4) with better predictive value than clinicopathological risk factors commonly used. We also identified miRNAs significantly associated with progression-free survival and overall survival (P = 6.8 × 10^-8 and 7.8 × 10^-7 for top hits, respectively), and 7 overlapped with early progressive disease. In conclusion, this is the first miRNome comprehensive study, to our knowledge, that demonstrates a predictive value of miRNAs for TKI response and provides a new set of relevant markers that can help rationalize metastatic RCC treatment. PMID:27699216

  12. Deep sequencing-based identification of small regulatory RNAs in Synechocystis sp. PCC 6803.

    Directory of Open Access Journals (Sweden)

    Wen Xu

    Synechocystis sp. PCC 6803 is a genetically tractable model organism for photosynthesis research. The genome of Synechocystis sp. PCC 6803 consists of a circular chromosome and seven plasmids. The importance of small regulatory RNAs (sRNAs) as mediators of a number of cellular processes in bacteria has begun to be recognized. However, little is known regarding sRNAs in Synechocystis sp. PCC 6803. To provide a comprehensive overview of sRNAs in this model organism, the sRNAs of Synechocystis sp. PCC 6803 were analyzed using deep sequencing, and 7,951,189 reads were obtained. High quality mapping reads (6,127,890) were mapped onto the genome and assembled into 16,192 transcribed regions (clusters) based on read overlap. A total of 5211 putative sRNAs were revealed from the genome and the 4 megaplasmids, and 27 of these molecules, including four from plasmids, were confirmed by RT-PCR. In addition, possible target genes regulated by all of the putative sRNAs identified in this study were predicted by IntaRNA and analyzed for functional categorization and biological pathways, which provided evidence that sRNAs are indeed involved in many different metabolic pathways, including basic metabolic pathways, such as glycolysis/gluconeogenesis, the citrate cycle, fatty acid metabolism and adaptations to environmentally stress-induced changes. The information from this study provides a valuable reservoir for understanding the sRNA-mediated regulation of the complex physiology and metabolic processes of cyanobacteria.

  13. Identifying conserved and novel microRNAs in developing seeds of Brassica napus using deep sequencing.

    Directory of Open Access Journals (Sweden)

    Ana Paula Körbes

    MicroRNAs (miRNAs) are important post-transcriptional regulators of plant development and seed formation. In Brassica napus, an important edible oil crop, valuable lipids are synthesized and stored in specific seed tissues during embryogenesis. The miRNA transcriptome of B. napus is currently poorly characterized, especially at different seed developmental stages. This work aims to describe the miRNAome of developing seeds of B. napus by identifying plant-conserved and novel miRNAs and comparing miRNA abundance in mature versus developing seeds. Members of 59 miRNA families were detected through a computational analysis of a large number of reads obtained from deep sequencing two small RNA and two RNA-seq libraries of (i) pooled immature developing stages and (ii) mature B. napus seeds. Among these miRNA families, 17 families are currently known to exist in B. napus; additionally 29 families not reported in B. napus but conserved in other plant species were identified by alignment with known plant mature miRNAs. Assembled mRNA-seq contigs allowed for a search of putative new precursors and led to the identification of 13 novel miRNA families. Analysis of miRNA population between libraries reveals that several miRNAs and isomiRNAs have different abundance in developing stages compared to mature seeds. The predicted miRNA target genes encode a broad range of proteins related to seed development and energy storage. This work presents a comparative study of the miRNA transcriptome of mature and developing B. napus seeds and provides a basis for future research on individual miRNAs and their functions in embryogenesis, seed maturation and lipid accumulation in B. napus.

  14. Acyclic identification of aptamers for human alpha-thrombin using over-represented libraries and deep sequencing.

    Directory of Open Access Journals (Sweden)

    Gillian V Kupakuwana

    BACKGROUND: Aptamers are oligonucleotides that bind proteins and other targets with high affinity and selectivity. Twenty years ago elements of natural selection were adapted to in vitro selection in order to distinguish aptamers among randomized sequence libraries. The primary bottleneck in traditional aptamer discovery is multiple cycles of in vitro evolution. METHODOLOGY/PRINCIPAL FINDINGS: We show that over-representation of sequences in aptamer libraries and deep sequencing enables acyclic identification of aptamers. We demonstrated this by isolating a known family of aptamers for human α-thrombin. Aptamers were found within a library containing an average of 56,000 copies of each possible randomized 15mer segment. The high affinity sequences were counted many times above the background in 2-6 million reads. Clustering analysis of sequences with more than 10 counts distinguished two sequence motifs with candidates at high abundance. Motif I contained the previously observed consensus 15mer, Thb1 (46,000 counts), and related variants with mostly G/T substitutions; secondary analysis showed that affinity for thrombin correlated with abundance (K(d) = 12 nM for Thb1). The signal-to-noise ratio for this experiment was roughly 10,000:1 for Thb1. Motif II was unrelated to Thb1 with the leading candidate (29,000 counts) being a novel aptamer against hexose sugars in the storage and elution buffers for Concanavalin A (K(d) = 0.5 µM for α-methyl-mannoside); ConA was used to immobilize α-thrombin. CONCLUSIONS/SIGNIFICANCE: Over-representation together with deep sequencing can dramatically shorten the discovery process, distinguish aptamers having a wide range of affinity for the target, allow an exhaustive search of the sequence space within a simplified library, reduce the quantity of the target required, eliminate cycling artifacts, and should allow multiplexing of sequencing experiments and targets.
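
    The counting step described above can be sketched in a few lines. This is an illustrative outline only, assuming reads are already trimmed to the randomized 15mer segment; the function names and the toy reads are hypothetical, and the clustering into motifs that the authors performed is not reproduced. The only figures taken from the abstract are the segment length (15), the average copy number (56,000), and the count threshold (more than 10).

```python
from collections import Counter
from typing import Dict, Iterable


def library_size(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of a given length over an alphabet."""
    return alphabet_size ** length


def abundant_sequences(reads: Iterable[str], min_count: int = 10) -> Dict[str, int]:
    """Return sequences observed strictly more than min_count times."""
    counts = Counter(reads)
    return {seq: n for seq, n in counts.items() if n > min_count}


if __name__ == "__main__":
    n_15mers = library_size(15)
    print("possible 15mers:", n_15mers)
    # Average copy number per 15mer quoted in the abstract.
    print("molecules at 56,000 copies each:", n_15mers * 56_000)

    # Toy reads, purely hypothetical: one enriched sequence plus background.
    toy_reads = ["ACGTACGTACGTACG"] * 12 + ["TTTTTTTTTTTTTTT"] * 3
    print(abundant_sequences(toy_reads))
```

    A randomized 15mer over four bases has 4^15 = 1,073,741,824 possible sequences, so a few million reads sample only a small fraction of the library; this is why sequences counted well above background stand out.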

  15. Microbial Dark Matter: Unusual intervening sequences in 16S rRNA genes of candidate phyla from the deep subsurface

    Energy Technology Data Exchange (ETDEWEB)

    Jarett, Jessica; Stepanauskas, Ramunas; Kieft, Thomas; Onstott, Tullis; Woyke, Tanja

    2014-03-17

    The Microbial Dark Matter project has sequenced genomes from over 200 single cells from candidate phyla, greatly expanding our knowledge of the ecology, inferred metabolism, and evolution of these widely distributed, yet poorly understood lineages. The second phase of this project aims to sequence an additional 800 single cells from known as well as potentially novel candidate phyla derived from a variety of environments. In order to identify whole genome amplified single cells, screening based on phylogenetic placement of 16S rRNA gene sequences is being conducted. Briefly, derived 16S rRNA gene sequences are aligned to a custom version of the Greengenes reference database and added to a reference tree in ARB using parsimony. In multiple samples from deep subsurface habitats but not from other habitats, a large number of sequences proved difficult to align and therefore to place in the tree. Based on comparisons to reference sequences and structural alignments using SSU-ALIGN, many of these "difficult" sequences appear to originate from candidate phyla, and contain intervening sequences (IVSs) within the 16S rRNA genes. These IVSs are short (39–79 nt) and do not appear to be self-splicing or to contain open reading frames. IVSs were found in the loop regions of stem-loop structures in several different taxonomic groups. Phylogenetic placement of sequences is strongly affected by IVSs; two out of three groups investigated were classified as different phyla after their removal. Based on data from samples screened in this project, IVSs appear to be more common in microbes occurring in deep subsurface habitats, although the reasons for this remain elusive.

  16. Deep Sequencing of Porphyromonas gingivalis and comparative transcriptome analysis of a LuxS mutant

    Directory of Open Access Journals (Sweden)

    Takanoi Hirano

    2012-06-01

    Porphyromonas gingivalis is a major etiological agent of chronic and aggressive forms of periodontal disease. The organism is an asaccharolytic anaerobe and is a constituent of mixed species biofilms in a variety of microenvironments in the oral cavity. P. gingivalis expresses a range of virulence factors over which it exerts tight control. High-throughput sequencing technologies provide the opportunity to relate functional genomics to basic biology. In this study we report qualitative and quantitative RNA-Seq analysis of the transcriptome of P. gingivalis. We have also applied RNA-Seq to the transcriptome of a ΔluxS mutant of P. gingivalis deficient in AI-2-mediated bacterial communication. The transcriptome analysis confirmed the expression of all predicted ORFs for strain ATCC 33277, including 854 hypothetical proteins, and allowed the identification of hitherto unknown transcriptional units. Twelve noncoding RNAs were identified, including 11 small RNAs and one cobalamin riboswitch. Fifty-seven genes were differentially regulated in the LuxS mutant. Addition of exogenous synthetic 4,5-dihydroxy-2,3-pentanedione (DPD, AI-2 precursor) to the ΔluxS mutant culture complemented expression of a subset of genes, indicating that LuxS is involved in both AI-2 signaling and non-signaling dependent systems in P. gingivalis. This work provides an important dataset for future study of P. gingivalis pathophysiology and further defines the LuxS regulon in this oral pathogen.

  17. Deep Sequencing Reveals the Complete Genome and Evidence for Transcriptional Activity of the First Virus-Like Sequences Identified in Aristotelia chilensis (Maqui Berry)

    Directory of Open Access Journals (Sweden)

    Javier Villacreses

    2015-04-01

    Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence as Aristotelia chilensis Virus 1 (AcV1). High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs): ORFs 1 and 2 share 66%–73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV), Petuvirus genus. ORF1 encodes a movement protein (MP); ORF2 a Reverse Transcriptase (RT) and a Ribonuclease H (RNase H) domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs), AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq). Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggest that AcV1 could be a new member of the Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant.

  18. Deep sequencing reveals the complete genome and evidence for transcriptional activity of the first virus-like sequences identified in Aristotelia chilensis (Maqui Berry).

    Science.gov (United States)

    Villacreses, Javier; Rojas-Herrera, Marcelo; Sánchez, Carolina; Hewstone, Nicole; Undurraga, Soledad F; Alzate, Juan F; Manque, Patricio; Maracaja-Coutinho, Vinicius; Polanco, Victor

    2015-04-03

    Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence as Aristotelia chilensis Virus 1 (AcV1). High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs): ORFs 1 and 2 share 66%-73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV), Petuvirus genus. ORF1 encodes a movement protein (MP); ORF2 a Reverse Transcriptase (RT) and a Ribonuclease H (RNase H) domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs), AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq). Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggest that AcV1 could be a new member of the Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant.

  19. Quantitative Analysis of Fundus-Image Sequences Reveals Phase of Spontaneous Venous Pulsations

    Science.gov (United States)

    Moret, Fabrice; Reiff, Charlotte M.; Lagrèze, Wolf A.; Bach, Michael

    2015-01-01

    Purpose: Spontaneous venous pulsation correlates negatively with elevated intracranial pressure and papilledema, and it relates to glaucoma. Yet, its etiology remains unclear. A key element to elucidate its underlying mechanism is the time at which collapse occurs with respect to the heart cycle, but previous reports are contradictory. We assessed this question in healthy subjects using quantitative measurements of both vein diameters and artery lateral displacements; the latter being used as the marker of the ocular systole time. Methods: We recorded 5-second fundus sequences with a near-infrared scanning laser ophthalmoscope in 12 young healthy subjects. The image sequences were coregistered, cleaned from microsaccades, and filtered via a principal component analysis to remove nonpulsatile dynamic features. Time courses of arterial lateral displacement and of diameter at sites of spontaneous venous pulsation or proximal to the disk were retrieved from those image sequences and compared. Results: Four subjects displayed both arterial and venous pulsatile waveforms. On those, we observed venous diameter waveforms differing markedly among the subjects, ranging from a waveform matching the typical intraocular pressure waveform to a close replica of the arterial waveform. Conclusions: The heterogeneity in waveforms and arteriovenous phases suggests that the mechanism governing the venous outflow resistance differs among healthy subjects. Translational relevance: Further characterizations are necessary to understand the heterogeneous mechanisms governing the venous outflow resistance as this resistance is altered in glaucoma and is instrumental when monitoring intracranial hypertension based on fundus observations. PMID:26396929

  20. Deep sequencing of organ- and stage-specific microRNAs in the evolutionarily basal insect Blattella germanica (L.) (Dictyoptera, Blattellidae).

    Directory of Open Access Journals (Sweden)

    Alexandre S Cristino

    BACKGROUND: microRNAs (miRNAs) have been reported as key regulators at post-transcriptional level in eukaryotic cells. In insects, most of the studies have focused on holometabolans while only recently two hemimetabolans (Locusta migratoria and Acyrthosiphon pisum) have had their miRNAs identified. Therefore, the study of the miRNAs of the evolutionarily basal hemimetabolan Blattella germanica may provide valuable insights on the structural and functional evolution of miRNAs. METHODOLOGY/PRINCIPAL FINDINGS: Small RNA libraries of the cockroach B. germanica were built from the whole body of the last instar nymph, and the adult ovaries. The high throughput Solexa sequencing resulted in approximately 11 and 8 million reads for the whole-body and ovaries, respectively. Bioinformatic analyses identified 38 known miRNAs as well as 11 known miRNA*s. We also found 70 miRNA candidates conserved in other insects and 170 candidates specific to B. germanica. The positive correlation between Solexa data and real-time quantitative PCR showed that the number of reads can be used as a quantitative approach. Five novel miRNA precursors were identified and validated by PCR and sequencing. Known miRNAs and novel candidates were also validated by decreasing levels of their expression in dicer-1 RNAi knockdown individuals. The comparison of the two libraries indicates that the whole-body nymph contains more known miRNAs than ovaries, whereas the adult ovaries are enriched with novel miRNA candidates. CONCLUSIONS/SIGNIFICANCE: Our study has identified many known miRNAs and novel miRNA candidates in the basal hemimetabolan insect B. germanica, and most of the specific sequences were found in ovaries. Deep sequencing data reflect miRNA abundance and dicer-1 RNAi assay is shown to be a reliable method for validation of novel miRNAs.
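
    Because the two libraries differ in size (approximately 11 and 8 million reads), raw read counts are usually scaled before abundances are compared across libraries. The sketch below shows one common scaling, reads per million (RPM). It is an assumption that a normalization of this kind is appropriate here; the abstract does not say which one the authors used, and the function name and read counts are hypothetical.

```python
def reads_per_million(read_count: int, library_size: int) -> float:
    """Scale a raw read count to reads per million mapped reads (RPM)."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    if read_count < 0:
        raise ValueError("read_count must be non-negative")
    return read_count * 1_000_000 / library_size


if __name__ == "__main__":
    # Approximate library sizes quoted in the abstract.
    WHOLE_BODY = 11_000_000
    OVARY = 8_000_000

    # Hypothetical raw counts for one miRNA in each library.
    print(reads_per_million(1_100, WHOLE_BODY))
    print(reads_per_million(1_100, OVARY))
```

    The same raw count represents a higher relative abundance in the smaller library, which is the reason for scaling before any cross-library comparison.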

  1. Analytical and Clinical Validation of a Digital Sequencing Panel for Quantitative, Highly Accurate Evaluation of Cell-Free Circulating Tumor DNA

    National Research Council Canada - National Science Library

    Lanman, Richard B; Mortimer, Stefanie A; Zill, Oliver A; Sebisanovic, Dragan; Lopez, Rene; Blau, Sibel; Collisson, Eric A; Divers, Stephen G; Hoon, Dave S B; Kopetz, E Scott; Lee, Jeeyun; Nikolinakos, Petros G; Baca, Arthur M; Kermani, Bahram G; Eltoukhy, Helmy; Talasaz, AmirAli

    2015-01-01

    .... First this method of massively parallel and deep sequencing enables assessment of a comprehensive panel of genomic targets from a single sample, and second, it obviates the need for repeat invasive tissue biopsies. Digital Sequencing...

  2. Rapid and Deep Proteomes by Faster Sequencing on a Benchtop Quadrupole Ultra-High-Field Orbitrap Mass Spectrometer

    DEFF Research Database (Denmark)

    Kelstrup, Christian D; Jersie-Christensen, Rosa R; Batth, Tanveer Singh

    2014-01-01

    per second or up to 600 new peptides sequenced per gradient minute. We identify 4400 proteins from one microgram of HeLa digest using a one hour gradient, which is an approximately 30% improvement compared to previous instrumentation. In addition, we show very deep proteome coverage can be achieved...... in less than 24 hours of analysis time by offline high pH reversed-phase peptide fractionation from which we identify more than 140,000 unique peptide sequences. This is comparable to state-of-the-art multi-day, multi-enzyme efforts. Finally the acquisition methods are evaluated for single...

  3. Insight in genome-wide association of metabolite quantitative traits by exome sequence analyses.

    Science.gov (United States)

    Demirkan, Ayşe; Henneman, Peter; Verhoeven, Aswin; Dharuri, Harish; Amin, Najaf; van Klinken, Jan Bert; Karssen, Lennart C; de Vries, Boukje; Meissner, Axel; Göraler, Sibel; van den Maagdenberg, Arn M J M; Deelder, André M; C 't Hoen, Peter A; van Duijn, Cornelia M; van Dijk, Ko Willems

    2015-01-01

    Metabolite quantitative traits carry great promise for epidemiological studies, and their genetic background has been addressed using Genome-Wide Association Studies (GWAS). Thus far, the role of less common variants has not been exhaustively studied. Here, we set out a GWAS for metabolite quantitative traits in serum, followed by exome sequence analysis to zoom in on putative causal variants in the associated genes. 1H Nuclear Magnetic Resonance (1H-NMR) spectroscopy experiments yielded successful quantification of 42 unique metabolites in 2,482 individuals from The Erasmus Rucphen Family (ERF) study. Heritability of metabolites was estimated by SOLAR. GWAS was performed by linear mixed models, using HapMap imputations. Based on physical vicinity and pathway analyses, candidate genes were screened for coding region variation using exome sequence data. Heritability estimates for metabolites ranged between 10% and 52%. GWAS replicated three known loci at metabolome-wide significance: CPS1 with glycine (P-value = 1.27×10^-32), PRODH with proline (P-value = 1.11×10^-19), SLC16A9 with carnitine level (P-value = 4.81×10^-14) and uncovered a novel association between DMGDH and dimethyl-glycine (P-value = 1.65×10^-19) level. In addition, we found three novel, suggestively significant loci: TNP1 with pyruvate (P-value = 1.26×10^-8), KCNJ16 with 3-hydroxybutyrate (P-value = 1.65×10^-8) and 2p12 locus with valine (P-value = 3.49×10^-8). Exome sequence analysis identified potentially causal coding and regulatory variants located in the genes CPS1, KCNJ2 and PRODH, and revealed allelic heterogeneity for CPS1 and PRODH. Combined GWAS and exome analyses of metabolites detected by high-resolution 1H-NMR is a robust approach to uncover metabolite quantitative trait loci (mQTL), and the likely causative variants in these loci. It is anticipated that insight in the genetics of intermediate phenotypes will provide additional insight into the

  4. Insight in genome-wide association of metabolite quantitative traits by exome sequence analyses.

    Directory of Open Access Journals (Sweden)

    Ayşe Demirkan

    2015-01-01

    Metabolite quantitative traits carry great promise for epidemiological studies, and their genetic background has been addressed using Genome-Wide Association Studies (GWAS). Thus far, the role of less common variants has not been exhaustively studied. Here, we set out a GWAS for metabolite quantitative traits in serum, followed by exome sequence analysis to zoom in on putative causal variants in the associated genes. 1H Nuclear Magnetic Resonance (1H-NMR) spectroscopy experiments yielded successful quantification of 42 unique metabolites in 2,482 individuals from The Erasmus Rucphen Family (ERF) study. Heritability of metabolites was estimated by SOLAR. GWAS was performed by linear mixed models, using HapMap imputations. Based on physical vicinity and pathway analyses, candidate genes were screened for coding region variation using exome sequence data. Heritability estimates for metabolites ranged between 10% and 52%. GWAS replicated three known loci at metabolome-wide significance: CPS1 with glycine (P-value = 1.27×10^-32), PRODH with proline (P-value = 1.11×10^-19), SLC16A9 with carnitine level (P-value = 4.81×10^-14) and uncovered a novel association between DMGDH and dimethyl-glycine (P-value = 1.65×10^-19) level. In addition, we found three novel, suggestively significant loci: TNP1 with pyruvate (P-value = 1.26×10^-8), KCNJ16 with 3-hydroxybutyrate (P-value = 1.65×10^-8) and 2p12 locus with valine (P-value = 3.49×10^-8). Exome sequence analysis identified potentially causal coding and regulatory variants located in the genes CPS1, KCNJ2 and PRODH, and revealed allelic heterogeneity for CPS1 and PRODH. Combined GWAS and exome analyses of metabolites detected by high-resolution 1H-NMR is a robust approach to uncover metabolite quantitative trait loci (mQTL), and the likely causative variants in these loci. It is anticipated that insight in the genetics of intermediate phenotypes will provide additional insight

  5. Quantitative comparison of cortical and deep grey matter in pathological subtypes of unilateral cerebral palsy.

    Science.gov (United States)

    Scheck, Simon M; Pannek, Kerstin; Fiori, Simona; Boyd, Roslyn N; Rose, Stephen E

    2014-10-01

    The aim of this study was to quantify grey matter changes in children with unilateral cerebral palsy (UCP), differentiating between cortical or deep grey matter (CDGM) lesions, periventricular white matter (PWM) lesions, and unilateral and bilateral lesions. In a cross-sectional study we obtained high resolution structural magnetic resonance images from 72 children (41 males, 31 females, mean age 10y 9mo [SD 3y 1mo], range 5y 1mo-17y 1mo) with UCP (33 left, 39 right hemiplegia; Manual Ability Classification System level I n=29, II n=43; Gross Motor Function Classification System level I n=46, II n=26), and 19 children with typical development (CTD; eight males, 11 females, mean age 11y 2mo [SD 2y 7mo], range 7y 8mo-16y 4mo). Images were classified by lesion type and analyzed using voxel-based morphometry (VBM) and subcortical volumetric analysis. Deep grey matter volumes were not significantly different between children with CDGM and PWM lesions, with the thalamus, putamen, and globus pallidus being reduced unilaterally in both groups compared with CTD (p≤0.001). Children with CDGM lesions additionally showed widespread cortical changes involving all lobes using VBM (p<0.01). Children with bilateral lesions had reduced thalamus and putamen volumes bilaterally (p<0.001). The thalamic volume was reduced bilaterally in children with unilateral lesions (p=0.004). Lesions to the PWM cause secondary changes to the deep grey matter structures similar to primary changes seen in CDGM lesions. Despite having a unilateral phenotype, grey matter changes are observed bilaterally, even in children with unilateral lesions. © 2014 Mac Keith Press.

  6. A regression framework incorporating quantitative and negative interaction data improves quantitative prediction of PDZ domain-peptide interaction from primary sequence.

    Science.gov (United States)

    Shao, Xiaojian; Tan, Chris S H; Voss, Courtney; Li, Shawn S C; Deng, Naiyang; Bader, Gary D

    2011-02-01

    Predicting protein interactions involving peptide recognition domains is essential for understanding the many important biological processes they mediate. It is important to consider the binding strength of these interactions to help us construct more biologically relevant protein interaction networks that consider cellular context and competition between potential binders. We developed a novel regression framework that considers both positive (quantitative) and negative (qualitative) interaction data available for mouse PDZ domains to quantitatively predict interactions between PDZ domains, a large peptide recognition domain family, and their peptide ligands using primary sequence information. First, we show that it is possible to learn from existing quantitative and negative interaction data to infer the relative binding strength of interactions involving previously unseen PDZ domains and/or peptides given their primary sequence. Performance was measured using cross-validated hold out testing and testing with previously unseen PDZ domain-peptide interactions. Second, we find that incorporating negative data improves quantitative interaction prediction. Third, we show that sequence similarity is an important prediction performance determinant, which suggests that experimentally collecting additional quantitative interaction data for underrepresented PDZ domain subfamilies will improve prediction. The Matlab code for our SemiSVR predictor and all data used here are available at http://baderlab.org/Data/PDZAffinity.

  7. The bias associated with amplicon sequencing does not affect the quantitative assessment of bacterial community dynamics.

    Directory of Open Access Journals (Sweden)

    Federico M Ibarbalz

    Full Text Available The performance of two sets of primers targeting variable regions of the 16S rRNA gene V1-V3 and V4 was compared in their ability to describe changes of bacterial diversity and temporal turnover in full-scale activated sludge. Duplicate sets of high-throughput amplicon sequencing data of the two 16S rRNA regions shared a collection of core taxa that were observed across a series of twelve monthly samples, although the relative abundance of each taxon was substantially different between regions. A case in point was the changes in the relative abundance of filamentous bacteria Thiothrix, which caused a large effect on diversity indices, but only in the V1-V3 data set. Yet the relative abundance of Thiothrix in the amplicon sequencing data from both regions correlated with the estimation of its abundance determined using fluorescence in situ hybridization. In nonmetric multidimensional analysis, samples were distributed along the first ordination axis according to the sequenced region rather than according to sample identities. The dynamics of microbial communities indicated that V1-V3 and the V4 regions of the 16S rRNA gene yielded comparable patterns of: (1) the changes occurring within the communities along fixed time intervals, (2) the slow turnover of activated sludge communities and (3) the rate of species replacement calculated from the taxa-time relationships. The temperature was the only operational variable that showed significant correlation with the composition of bacterial communities over time for the sets of data obtained with both pairs of primers. In conclusion, we show that despite the bias introduced by amplicon sequencing, the variable regions V1-V3 and V4 can be confidently used for the quantitative assessment of bacterial community dynamics, and provide a proper qualitative account of general taxa in the community, especially when the data are obtained over a convenient time window rather than at a single time point.

  8. The mitochondrial genome sequence of a deep-sea, hydrothermal vent limpet, Lepetodrilus nux, presents a novel vetigastropod gene arrangement.

    Science.gov (United States)

    Nakajima, Yuichi; Shinzato, Chuya; Khalturina, Mariia; Nakamura, Masako; Watanabe, Hiromi; Satoh, Noriyuki; Mitarai, Satoshi

    2016-08-01

    While mitochondrial (mt) genomes are used extensively for comparative and evolutionary genomics, few mt genomes of deep-sea species, including hydrothermal vent species, have been determined. The Genus Lepetodrilus is a major deep-sea gastropod taxon that occurs in various deep-sea ecosystems. Using next-generation sequencing, we determined nearly the complete mitochondrial genome sequence of Lepetodrilus nux, which inhabits hydrothermal vents in the Okinawa Trough. The total length of the mitochondrial genome is 16,353bp, excluding the repeat region. It contains 13 protein-coding genes, 22 tRNA genes, two rRNA genes, and a control region, typical of most metazoan genomes. Compared with other vetigastropod mt genome sequences, L. nux employs a novel mt gene arrangement. Other novel arrangements have been identified in the vetigastropod, Fissurella volcano, and in Chrysomallon squamiferum, a neomphaline gastropod; however, all three gene arrangements are different, and Bayesian inference suggests that each lineage diverged independently. Our findings suggest that vetigastropod mt gene arrangements are more diverse than previously realized.

  9. Quantitative trait loci markers derived from whole genome sequence data increases the reliability of genomic prediction

    DEFF Research Database (Denmark)

    Brøndum, Rasmus Froberg; Su, Guosheng; Janss, Luc

    2015-01-01

    This study investigated the effect on the reliability of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k single nucleotide polymorphism (SNP) array data. The extra markers were selected...... itself. Depending on the trait’s economic weight, 15, 10, or 5 quantitative trait loci (QTL) were selected per trait per breed and 3 to 5 markers were selected to tag each QTL. After removing duplicate markers (same marker selected for more than one trait or breed) and filtering for high pairwise linkage...... was observed for mastitis, but only a 0.5 percentage point increase was seen for fertility. When using a Bayesian model accuracies were generally higher with only 54k data compared with the genomic BLUP approach, but increases in reliability were relatively smaller when QTL markers were included. Results from...

  10. Next-Generation Sequencing Reveals Deep Intronic Cryptic ABCC8 and HADH Splicing Founder Mutations Causing Hyperinsulinism by Pseudoexon Activation

    Science.gov (United States)

    Flanagan, Sarah E.; Xie, Weijia; Caswell, Richard; Damhuis, Annet; Vianey-Saban, Christine; Akcay, Teoman; Darendeliler, Feyza; Bas, Firdevs; Guven, Ayla; Siklar, Zeynep; Ocal, Gonul; Berberoglu, Merih; Murphy, Nuala; O’Sullivan, Maureen; Green, Andrew; Clayton, Peter E.; Banerjee, Indraneel; Clayton, Peter T.; Hussain, Khalid; Weedon, Michael N.; Ellard, Sian

    2013-01-01

    Next-generation sequencing (NGS) enables analysis of the human genome on a scale previously unachievable by Sanger sequencing. Exome sequencing of the coding regions and conserved splice sites has been very successful in the identification of disease-causing mutations, and targeting of these regions has extended clinical diagnostic testing from analysis of fewer than ten genes per phenotype to more than 100. Noncoding mutations have been less extensively studied despite evidence from mRNA analysis for the existence of deep intronic mutations in >20 genes. We investigated individuals with hyperinsulinaemic hypoglycaemia and biochemical or genetic evidence to suggest noncoding mutations by using NGS to analyze the entire genomic regions of ABCC8 (117 kb) and HADH (94 kb) from overlapping ∼10 kb PCR amplicons. Two deep intronic mutations, c.1333-1013A>G in ABCC8 and c.636+471G>T in HADH, were identified. Both are predicted to create a cryptic splice donor site and an out-of-frame pseudoexon. Sequence analysis of mRNA from affected individuals’ fibroblasts or lymphoblastoid cells confirmed mutant transcripts with pseudoexon inclusion and premature termination codons. Testing of additional individuals showed that these are founder mutations in the Irish and Turkish populations, accounting for 14% of focal hyperinsulinism cases and 32% of subjects with HADH mutations in our cohort. The identification of deep intronic mutations has previously focused on the detection of aberrant mRNA transcripts in a subset of disorders for which RNA is readily obtained from the target tissue or ectopically expressed at sufficient levels. Our approach of using NGS to analyze the entire genomic DNA sequence is applicable to any disease. PMID:23273570

  11. Clinical Application of Targeted Deep Sequencing in Solid-Cancer Patients; Utility of Targeted Deep Sequencing for Biomarker-Selected Clinical Trial.

    Science.gov (United States)

    Kim, Seung Tae; Kim, Kyoung-Mee; Kim, Nayoung K D; Park, Joon Oh; Ahn, Soomin; Yun, Jae-Won; Kim, Kyu-Tae; Park, Se Hoon; Park, Peter J; Kim, Hee Cheol; Sohn, Tae Sung; Choi, Dong Il; Cho, Jong Ho; Heo, Jin Seok; Kwon, Wooil; Lee, Hyuk; Min, Byung-Hoon; Hong, Sung No; Park, Young Suk; Lim, Ho Yeong; Kang, Won Ki; Park, Woong-Yang; Lee, Jeeyun

    2017-07-12

    Molecular profiling of actionable mutations in refractory cancer patients has the potential to enable "precision medicine," wherein individualized therapies are guided based on genomic profiling. The molecular-screening program was intended to route participants to different candidate drugs in trials based on clinical-sequencing reports. In this screening program, we used a custom target-enrichment panel consisting of cancer-related genes to interrogate single-nucleotide variants, insertions and deletions, copy number variants, and a subset of gene fusions. From August 2014 through April 2015, 654 patients consented to participate in the program at Samsung Medical Center. Of these patients, 588 passed the quality control process for the 381-gene cancer-panel test, and 418 patients were included in the final analysis as being eligible for any anticancer treatment (127 gastric cancer, 122 colorectal cancer, 62 pancreatic/biliary tract cancer, 67 sarcoma/other cancer, and 40 genitourinary cancer patients). Of the 418 patients, 55 (12%) harbored a biomarker that guided them to a biomarker-selected clinical trial, and 184 (44%) patients harbored at least one genomic alteration that was potentially targetable. This study demonstrated that the panel-based sequencing program resulted in an increased rate of trial enrollment of metastatic cancer patients into biomarker-selected clinical trials. Given the expanding list of biomarker-selected trials, the guidance percentage to matched trials is anticipated to increase. © AlphaMed Press 2017.

  12. FID-SPI pulse sequence for quantitative MRI of fluids in porous media.

    Science.gov (United States)

    Marica, Florea; Goora, Frédéric G; Balcom, Bruce J

    2014-03-01

    MRI has great potential for providing quantitative, spatially resolved information about fluids imbibed in porous media. The pure phase encode SPRITE technique has proven to be a very general method for the generation of density images in porous media; however, low flip-angle RF pulses and broad filter widths, required by short encoding times, yield sub-optimal S/N images. A 1-D phase-encoding sequence for T2* mapping, named FID-SPI, is presented and analyzed in terms of image quality and accuracy of fluid content distribution in porous media. Extension to 2-D and 3-D imaging was straightforward and images of heterogeneous samples are presented. The FID-SPI measurement results in a series of individual T2* weighted images acquired following RF excitation and pulsed phase-encoding gradients. Key to the performance of the FID-SPI method is high quality control of the magnetic field gradient pulse to ensure each FID point has identical spatial encoding. FID-SPI is intended for a quantitative determination of the spatially resolved fluid content in heterogeneous porous media, having the ability to determine the T2* decay for each image pixel. T2* mapping aids in estimation of the local fluid content.

  13. FID-SPI pulse sequence for quantitative MRI of fluids in porous media

    Science.gov (United States)

    Marica, Florea; Goora, Frédéric G.; Balcom, Bruce J.

    2014-03-01

    MRI has great potential for providing quantitative, spatially resolved information about fluids imbibed in porous media. The pure phase encode SPRITE technique has proven to be a very general method for the generation of density images in porous media; however, low flip-angle RF pulses and broad filter widths, required by short encoding times, yield sub-optimal S/N images. A 1-D phase-encoding sequence for T2∗ mapping, named FID-SPI, is presented and analyzed in terms of image quality and accuracy of fluid content distribution in porous media. Extension to 2-D and 3-D imaging was straightforward and images of heterogeneous samples are presented. The FID-SPI measurement results in a series of individual T2∗ weighted images acquired following RF excitation and pulsed phase-encoding gradients. Key to the performance of the FID-SPI method is high quality control of the magnetic field gradient pulse to ensure each FID point has identical spatial encoding. FID-SPI is intended for a quantitative determination of the spatially resolved fluid content in heterogeneous porous media, having the ability to determine the T2∗ decay for each image pixel. T2∗ mapping aids in estimation of the local fluid content.

  14. Homology-independent discovery of replicating pathogenic circular RNAs by deep sequencing and a new computational algorithm.

    Science.gov (United States)

    Wu, Qingfa; Wang, Ying; Cao, Mengji; Pantaleo, Vitantonio; Burgyan, Joszef; Li, Wan-Xiang; Ding, Shou-Wei

    2012-03-06

    A common challenge in pathogen discovery by deep sequencing approaches is to recognize viral or subviral pathogens in samples of diseased tissue that share no significant homology with a known pathogen. Here we report a homology-independent approach for discovering viroids, a distinct class of free circular RNA subviral pathogens that encode no protein and are known to infect plants only. Our approach involves analyzing the sequences of the total small RNAs of the infected plants obtained by deep sequencing with a unique computational algorithm, progressive filtering of overlapping small RNAs (PFOR). Viroid infection triggers production of viroid-derived overlapping siRNAs that cover the entire genome with high densities. PFOR retains viroid-specific siRNAs for genome assembly by progressively eliminating nonoverlapping small RNAs and those that overlap but cannot be assembled into a direct repeat RNA, which is synthesized from circular or multimeric repeated-sequence templates during viroid replication. We show that viroids from the two known families are readily identified and their full-length sequences assembled by PFOR from small RNAs sequenced from infected plants. PFOR analysis of a grapevine library further identified a viroid-like circular RNA 375 nt long that shared no significant sequence homology with known molecules and encoded active hammerhead ribozymes in RNAs of both plus and minus polarities, which presumably self-cleave to release monomer from multimeric replicative intermediates. A potential application of the homology-independent approach for viroid discovery in plant and animal species where RNA replication triggers the biogenesis of siRNAs is discussed.

  15. Eigenspectra optoacoustic tomography achieves quantitative blood oxygenation imaging deep in tissues

    CERN Document Server

    Tzoumas, Stratis; Olefir, Ivan; Stangl, Stefan; Symvoulidis, Panagiotis; Glasl, Sarah; Bayer, Christine; Multhoff, Gabriele; Ntziachristos, Vasilis

    2015-01-01

    Light propagating in tissue attains a spectrum that varies with location due to wavelength-dependent fluence attenuation by tissue optical properties, an effect that causes spectral corruption. Predictions of the spectral variations of light fluence in tissue are challenging since the spatial distribution of optical properties in tissue cannot be resolved in high resolution or with high accuracy by current methods. Spectral corruption has fundamentally limited the quantification accuracy of optical and optoacoustic methods and impeded the long sought-after goal of imaging blood oxygen saturation (sO2) deep in tissues; a critical but still unattainable target for the assessment of oxygenation in physiological processes and disease. We discover a new principle underlying light fluence in tissues, which describes the wavelength dependence of light fluence as an affine function of a few reference base spectra, independently of the specific distribution of tissue optical properties. This finding enables the introd...

  16. Quantitative ultrasound venous valve movement: early diagnosis of deep vein thrombosis

    Science.gov (United States)

    Muhd Suberi, Anis Azwani; Wan Zakaria, Wan Nurshazwani; Tomari, Razali; Ibrahim, Nabilah

    2016-07-01

    The purpose of this paper is to provide an in-depth analysis of a computer-aided system for the early diagnosis of Deep Vein Thrombosis (DVT). Normally, patients are diagnosed with DVT through ultrasound examination after they have a serious complication. Thus, this study proposes a new approach to reduce the risk of recurrent DVT by tracking the venous valve movement behaviour. Inspired by image processing technology, several image processing methods, namely image enhancement, segmentation and morphological operations, have been implemented to improve the image quality for the subsequent tracking procedure. In segmentation, Otsu thresholding provides a significant result in segmenting the valve structure. Subsequently, a morphological dilation method is able to enhance the region shape of the valve distinctly and precisely. Lastly, an image subtraction method is presented and evaluated to track the valve movement. Based on the experimental results, the normal range of valve velocity lies within the range of blood flow velocity (Vb) and occasionally may result in higher values.
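
The three steps named in this abstract (Otsu thresholding, morphological dilation, image subtraction) can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the function names, the 3x3 structuring element, and the use of a pixel-wise difference of two binary masks as the "subtraction" step are all assumptions, and the image enhancement step is omitted.

```python
import numpy as np

def otsu_threshold(img):
    """Return the 8-bit level t that maximizes between-class variance (Otsu)."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = np.where(denom > 0, (mu_total * omega - mu) ** 2 / denom, 0.0)
    return int(np.argmax(sigma_b))

def dilate(mask):
    """Binary dilation with a 3x3 square structuring element (assumed size)."""
    h, w = mask.shape
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def valve_motion(frame_a, frame_b):
    """Segment each uint8 frame, dilate it, and return the changed pixels."""
    masks = [dilate(f > otsu_threshold(f)) for f in (frame_a, frame_b)]
    return masks[0] ^ masks[1]                # difference of the binary masks
```

Calling `valve_motion` on two consecutive greyscale frames returns a boolean mask of pixels whose segmentation changed; how such a mask would be converted to a valve velocity is not described in the abstract and is left out here.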

  17. Massively parallel digital high resolution melt for rapid and absolutely quantitative sequence profiling

    Science.gov (United States)

    Velez, Daniel Ortiz; Mack, Hannah; Jupe, Julietta; Hawker, Sinead; Kulkarni, Ninad; Hedayatnia, Behnam; Zhang, Yang; Lawrence, Shelley; Fraley, Stephanie I.

    2017-02-01

    In clinical diagnostics and pathogen detection, profiling of complex samples for low-level genotypes represents a significant challenge. Advances in speed, sensitivity, and extent of multiplexing of molecular pathogen detection assays are needed to improve patient care. We report the development of an integrated platform enabling the identification of bacterial pathogen DNA sequences in complex samples in less than four hours. The system incorporates a microfluidic chip and instrumentation to accomplish universal PCR amplification, High Resolution Melting (HRM), and machine learning within 20,000 picoliter scale reactions, simultaneously. Clinically relevant concentrations of bacterial DNA molecules are separated by digitization across 20,000 reactions and amplified with universal primers targeting the bacterial 16S gene. Amplification is followed by HRM sequence fingerprinting in all reactions, simultaneously. The resulting bacteria-specific melt curves are identified by Support Vector Machine learning, and individual pathogen loads are quantified. The platform reduces reaction volumes by 99.995% and achieves a greater than 200-fold increase in dynamic range of detection compared to traditional PCR HRM approaches. Type I and II error rates are reduced by 99% and 100% respectively, compared to intercalating dye-based digital PCR (dPCR) methods. This technology could impact a number of quantitative profiling applications, especially infectious disease diagnostics.
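
The abstract does not state how counts of positive reactions are turned into an absolute load. A common assumption for digital assays is that target molecules distribute across reactions following a Poisson law, so the mean copies per reaction can be estimated from the fraction of positive reactions. The sketch below illustrates that assumption only; the function names are hypothetical and it is not claimed to be the authors' method.

```python
import math

def copies_per_reaction(positive, total):
    """Poisson estimate of mean copies per reaction: lambda = -ln(1 - p)."""
    if not 0 <= positive < total:
        # If every reaction is positive the estimate diverges (saturation).
        raise ValueError("need 0 <= positive < total")
    return -math.log(1.0 - positive / total)

def total_copies(positive, total):
    """Estimated number of target molecules loaded across all reactions."""
    return total * copies_per_reaction(positive, total)
```

For example, with 20,000 reactions of which half are positive, the estimate is 20,000 × ln 2 copies, which is more than the raw positive count because some reactions hold more than one molecule.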

  18. Massively parallel digital high resolution melt for rapid and absolutely quantitative sequence profiling

    Science.gov (United States)

    Velez, Daniel Ortiz; Mack, Hannah; Jupe, Julietta; Hawker, Sinead; Kulkarni, Ninad; Hedayatnia, Behnam; Zhang, Yang; Lawrence, Shelley; Fraley, Stephanie I.

    2017-01-01

    In clinical diagnostics and pathogen detection, profiling of complex samples for low-level genotypes represents a significant challenge. Advances in speed, sensitivity, and extent of multiplexing of molecular pathogen detection assays are needed to improve patient care. We report the development of an integrated platform enabling the identification of bacterial pathogen DNA sequences in complex samples in less than four hours. The system incorporates a microfluidic chip and instrumentation to accomplish universal PCR amplification, High Resolution Melting (HRM), and machine learning within 20,000 picoliter scale reactions, simultaneously. Clinically relevant concentrations of bacterial DNA molecules are separated by digitization across 20,000 reactions and amplified with universal primers targeting the bacterial 16S gene. Amplification is followed by HRM sequence fingerprinting in all reactions, simultaneously. The resulting bacteria-specific melt curves are identified by Support Vector Machine learning, and individual pathogen loads are quantified. The platform reduces reaction volumes by 99.995% and achieves a greater than 200-fold increase in dynamic range of detection compared to traditional PCR HRM approaches. Type I and II error rates are reduced by 99% and 100% respectively, compared to intercalating dye-based digital PCR (dPCR) methods. This technology could impact a number of quantitative profiling applications, especially infectious disease diagnostics. PMID:28176860

  19. Massively parallel digital high resolution melt for rapid and absolutely quantitative sequence profiling.

    Science.gov (United States)

    Velez, Daniel Ortiz; Mack, Hannah; Jupe, Julietta; Hawker, Sinead; Kulkarni, Ninad; Hedayatnia, Behnam; Zhang, Yang; Lawrence, Shelley; Fraley, Stephanie I

    2017-02-08

    In clinical diagnostics and pathogen detection, profiling of complex samples for low-level genotypes represents a significant challenge. Advances in speed, sensitivity, and extent of multiplexing of molecular pathogen detection assays are needed to improve patient care. We report the development of an integrated platform enabling the identification of bacterial pathogen DNA sequences in complex samples in less than four hours. The system incorporates a microfluidic chip and instrumentation to accomplish universal PCR amplification, High Resolution Melting (HRM), and machine learning within 20,000 picoliter scale reactions, simultaneously. Clinically relevant concentrations of bacterial DNA molecules are separated by digitization across 20,000 reactions and amplified with universal primers targeting the bacterial 16S gene. Amplification is followed by HRM sequence fingerprinting in all reactions, simultaneously. The resulting bacteria-specific melt curves are identified by Support Vector Machine learning, and individual pathogen loads are quantified. The platform reduces reaction volumes by 99.995% and achieves a greater than 200-fold increase in dynamic range of detection compared to traditional PCR HRM approaches. Type I and II error rates are reduced by 99% and 100% respectively, compared to intercalating dye-based digital PCR (dPCR) methods. This technology could impact a number of quantitative profiling applications, especially infectious disease diagnostics.

  20. Quantitative sequencing of 5-formylcytosine in DNA at single-base resolution

    Science.gov (United States)

    Booth, Michael J.; Marsico, Giovanni; Bachman, Martin; Beraldi, Dario; Balasubramanian, Shankar

    2014-05-01

    Recently, the cytosine modifications 5-hydroxymethylcytosine (5hmC) and 5-formylcytosine (5fC) were found to exist in the genomic deoxyribonucleic acid (DNA) of a wide range of mammalian cell types. It is now important to understand their role in normal biological function and disease. Here we introduce reduced bisulfite sequencing (redBS-Seq), a quantitative method to decode 5fC in DNA at single-base resolution, based on a selective chemical reduction of 5fC to 5hmC followed by bisulfite treatment. After extensive validation on synthetic and genomic DNA, we combined redBS-Seq and oxidative bisulfite sequencing (oxBS-Seq) to generate the first combined genomic map of 5-methylcytosine, 5hmC and 5fC in mouse embryonic stem cells. Our experiments revealed that in certain genomic locations 5fC is present at comparable levels to 5hmC and 5mC. The combination of these chemical methods can quantify and precisely map these three cytosine derivatives in the genome and will help provide insights into their function.
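
The way the chemical methods combine can be shown as simple arithmetic on the fraction of reads called "C" at one site. The sketch assumes the usual read-outs: conventional bisulfite sequencing (BS-Seq) reports 5mC + 5hmC, oxBS-Seq reports 5mC only, and redBS-Seq reports 5mC + 5hmC + 5fC. A conventional BS-Seq track is assumed to be available alongside the two methods named in the abstract, and the function name and the clamping of negative differences to zero are illustrative choices, not taken from the paper.

```python
def cytosine_levels(bs, oxbs, redbs):
    """Estimate 5mC, 5hmC and 5fC fractions at one site by subtraction.

    bs, oxbs, redbs: fraction of reads called 'C' in BS-Seq, oxBS-Seq and
    redBS-Seq respectively (each between 0 and 1).
    """
    levels = {
        "5mC": oxbs,            # only 5mC survives oxidation + bisulfite
        "5hmC": bs - oxbs,      # 5hmC reads as C in BS-Seq but not oxBS-Seq
        "5fC": redbs - bs,      # 5fC reads as C only after reduction
    }
    # Sampling noise can push a difference slightly below zero; clamp it.
    return {name: max(value, 0.0) for name, value in levels.items()}
```

In practice each difference carries the sampling error of two sequencing experiments, so low-coverage sites would need a statistical test rather than a plain subtraction.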

  1. Deep Sequencing Analysis of miRNA Expression in Breast Muscle of Fast-Growing and Slow-Growing Broilers

    Directory of Open Access Journals (Sweden)

    Hongjia Ouyang

    2015-07-01

    Full Text Available Growth performance is an important economic trait in chicken. MicroRNAs (miRNAs) have been shown to play important roles in various biological processes, but their functions in chicken growth are not yet clear. To investigate the function of miRNAs in chicken growth, breast muscle tissues of the two-tail samples (highest and lowest body weight) from Recessive White Rock (WRR) and Xinghua Chickens (XH) were subjected to high-throughput small RNA deep sequencing. In this study, a total of 921 miRNAs were identified, including 733 known mature miRNAs and 188 novel miRNAs. There were 200, 279, 257 and 297 differentially expressed miRNAs in the comparisons of WRRh vs. WRRl, WRRh vs. XHh, WRRl vs. XHl, and XHh vs. XHl, respectively. A total of 22 highly differentially expressed miRNAs (fold change > 2 or < 0.5; p-value < 0.05; q-value < 0.01), which also have abundant expression (read counts > 1000), were found in our comparisons. As far as two analyses (WRRh vs. WRRl, and XHh vs. XHl) are concerned, we found 80 common differentially expressed miRNAs, while 110 miRNAs were found in WRRh vs. XHh and WRRl vs. XHl. Furthermore, 26 common miRNAs were identified among all four comparisons. Four differentially expressed miRNAs (miR-223, miR-16, miR-205a and miR-222b-5p) were validated by quantitative real-time RT-PCR (qRT-PCR). Regulatory networks of interactions among miRNAs and their targets were constructed using integrative miRNA target-prediction and network-analysis. Growth hormone receptor (GHR) was confirmed as a target of miR-146b-3p by dual-luciferase assay and qPCR, indicating that miR-34c, miR-223, miR-146b-3p, miR-21 and miR-205a are key growth-related target genes in the network. These miRNAs are proposed as candidate miRNAs for future studies concerning miRNA-target function on regulation of chicken growth.
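
The selection criteria quoted in this abstract (fold change > 2 or < 0.5, p-value < 0.05, q-value < 0.01, read counts > 1000) amount to a simple filter, and the "common" miRNAs are a set intersection across comparisons. The sketch below shows that logic under stated assumptions: the record layout, the function names, and the rule of applying the same filter to every comparison are hypothetical, and no real data from the study is reproduced.

```python
from dataclasses import dataclass

@dataclass
class MirnaResult:
    name: str
    fold_change: float
    p_value: float
    q_value: float
    read_count: int

def is_highly_differential(r):
    """Apply the thresholds quoted in the abstract to one miRNA record."""
    return (
        (r.fold_change > 2 or r.fold_change < 0.5)
        and r.p_value < 0.05
        and r.q_value < 0.01
        and r.read_count > 1000
    )

def common_hits(*comparisons):
    """Names of miRNAs passing the filter in every comparison given."""
    hit_sets = [
        {r.name for r in comp if is_highly_differential(r)}
        for comp in comparisons
    ]
    return set.intersection(*hit_sets) if hit_sets else set()
```

Each argument to `common_hits` would be the list of records from one comparison (for instance WRRh vs. WRRl); the placeholder names used in any test data are not real miRNA identifiers.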

  2. Deep sequencing extends the diversity of human papillomaviruses in human skin.

    OpenAIRE

    Bzhalava, Davit; Mühr, Laila Sara Arroyo; Lagheden, Camilla; Ekström, Johanna; Forslund, Ola; Dillner, Joakim; Hultin, Emilie

    2014-01-01

    Most viruses in human skin are known to be human papillomaviruses (HPVs). Previous sequencing of skin samples has identified 273 different cutaneous HPV types, including 47 previously unknown types. In the present study, we wished to extend prior studies using deeper sequencing. This deeper sequencing without prior PCR of a pool of 142 whole genome amplified skin lesions identified 23 known HPV types, 3 novel putative HPV types and 4 non-HPV viruses. The complete sequence was obtained for one...

  3. Metavisitor, a Suite of Galaxy Tools for Simple and Rapid Detection and Discovery of Viruses in Deep Sequence Data

    Science.gov (United States)

    Vernick, Kenneth D.

    2017-01-01

    Metavisitor is a software package that allows biologists and clinicians without specialized bioinformatics expertise to detect and assemble viral genomes from deep sequence datasets. The package is composed of a set of modular bioinformatic tools and workflows that are implemented in the Galaxy framework. Using the graphical Galaxy workflow editor, users with minimal computational skills can use existing Metavisitor workflows or adapt them to suit specific needs by adding or modifying analysis modules. Metavisitor works with DNA, RNA or small RNA sequencing data over a range of read lengths and can use a combination of de novo and guided approaches to assemble genomes from sequencing reads. We show that the software has the potential for quick diagnosis as well as discovery of viruses from a vast array of organisms. Importantly, we provide here executable Metavisitor use cases, which increase the accessibility and transparency of the software, ultimately enabling biologists or clinicians to focus on biological or medical questions. PMID:28045932

  4. The Subclonal Structure and Genomic Evolution of Oral Squamous Cell Carcinoma Revealed by Ultra-deep Sequencing

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin Jakob

    Background: Oral squamous cell carcinoma (OSCC), a subgroup of head and neck squamous cell carcinoma (HNSCC), is primarily caused by alcohol consumption and tobacco use. Recent DNA sequencing studies suggest that HNSCC is very heterogeneous between patients; however, the intra-patient subclonal...... structure remains unexplored due to a lack of sampling of multiple tumor biopsies from each patient. Materials and methods: To examine the clonal structure and describe the genomic cancer evolution we applied whole-exome sequencing combined with targeted ultra-deep sequencing on biopsies from 5 stage IV...... complex subclonal architectures comprising distinct subclones only found in geographically distinct regions of the tumors. The metastatic potential of the tumor is acquired early in the tumor evolution, as indicated by the lymph node sharing the majority of the mutations with the tumor biopsies, while...

  5. Deep Proteomics of Mouse Skeletal Muscle Enables Quantitation of Protein Isoforms, Metabolic Pathways, and Transcription Factors*

    Science.gov (United States)

    Deshmukh, Atul S.; Murgia, Marta; Nagaraj, Nagarjuna; Treebak, Jonas T.; Cox, Jürgen; Mann, Matthias

    2015-01-01

    Skeletal muscle constitutes 40% of individual body mass and plays vital roles in locomotion and whole-body metabolism. Proteomics of skeletal muscle is challenging because of highly abundant contractile proteins that interfere with detection of regulatory proteins. Using a state-of-the-art MS workflow and a strategy to map identifications from the C2C12 cell line model to tissues, we identified a total of 10,218 proteins, including skeletal muscle-specific transcription factors like myod1 and myogenin and circadian clock proteins. We obtain absolute abundances for proteins expressed in a muscle cell line and skeletal muscle, which should serve as a valuable resource. Quantitation of protein isoforms of glucose uptake signaling pathways and in glucose and lipid metabolic pathways provides a detailed metabolic map of the cell line compared with tissue. This revealed unexpectedly complex regulation of AMP-activated protein kinase and insulin signaling in muscle tissue at the level of enzyme isoforms. PMID:25616865

  6. Deep proteomics of mouse skeletal muscle enables quantitation of protein isoforms, metabolic pathways and transcription factors

    DEFF Research Database (Denmark)

    Deshmukh, Atul S; Murgia, Marta; Nagaraja, Nagarjuna

    2015-01-01

    expressed in a muscle cell line and skeletal muscle, which should serve as a valuable resource. Quantitation of protein isoforms of glucose uptake signaling pathways and in glucose and lipid metabolic pathways provides a detailed metabolic map of the cell line compared to tissue. This revealed unexpectedly...... complex regulation of AMP-activated protein kinase and insulin signaling in muscle tissue at the level of enzyme isoforms.......Skeletal muscle constitutes 40% of individual body mass and plays vital roles in locomotion and whole-body metabolism. Proteomics of skeletal muscle is challenging due to highly abundant contractile proteins that interfere with detection of regulatory proteins. Using a state-of-the-art mass...

  7. Deep sequencing unearths nuclear mitochondrial sequences under Leber's hereditary optic neuropathy-associated false heteroplasmic mitochondrial DNA variants.

    Science.gov (United States)

    Petruzzella, Vittoria; Carrozzo, Rosalba; Calabrese, Claudia; Dell'Aglio, Rosa; Trentadue, Raffaella; Piredda, Roberta; Artuso, Lucia; Rizza, Teresa; Bianchi, Marzia; Porcelli, Anna Maria; Guerriero, Silvana; Gasparre, Giuseppe; Attimonelli, Marcella

    2012-09-01

    Leber's hereditary optic neuropathy (LHON) is associated with mitochondrial DNA (mtDNA) ND mutations that are mostly homoplasmic. However, these mutations are not sufficient to explain the peculiar features of penetrance and the tissue-specific expression of the disease and are believed to be causative in association with unknown environmental or other genetic factors. Discerning between clear-cut pathogenetic variants, such as those that appear to be heteroplasmic, and less penetrant variants, such as the homoplasmic, remains a challenging issue that we have addressed here using a next-generation sequencing approach. We set up a protocol to quantify MTND5 heteroplasmy levels in a family in which the proband manifests a LHON phenotype. Furthermore, to study this mtDNA haplotype, we applied the cybridization protocol. The results demonstrate that the mutations are mostly homoplasmic, whereas the suspected heteroplasmic feature of the observed mutations is due to the co-amplification of nuclear mitochondrial sequences.

  8. Quantitative analysis of gait and balance response to deep brain stimulation in Parkinson's disease.

    Science.gov (United States)

    Mera, Thomas O; Filipkowski, Danielle E; Riley, David E; Whitney, Christina M; Walter, Benjamin L; Gunzler, Steven A; Giuffrida, Joseph P

    2013-05-01

    Gait and balance disturbances in Parkinson's disease (PD) can be debilitating and may lead to increased fall risk. Deep brain stimulation (DBS) is a treatment option once therapeutic benefits from medication are limited due to motor fluctuations and dyskinesia. Optimizing DBS parameters for gait and balance can be significantly more challenging than for other PD motor symptoms. Furthermore, inter-rater reliability of the standard clinical PD assessment scale, Unified Parkinson's Disease Rating Scale (UPDRS), may introduce bias and wash out important features of gait and balance that may respond differently to PD therapies. Study objectives were to evaluate clinician UPDRS gait and balance scoring inter-rater reliability, UPDRS sensitivity to different aspects of gait and balance, and how kinematic features extracted from motion sensor data respond to stimulation. Forty-two subjects diagnosed with PD were recruited with varying degrees of gait and balance impairment. All subjects had been prescribed dopaminergic medication, and 20 subjects had previously undergone DBS surgery. Subjects performed seven items of the gait and balance subset of the UPDRS while wearing motion sensors on the sternum and each heel and thigh. Inter-rater reliability varied by UPDRS item. Correlation coefficients between at least one kinematic feature and corresponding UPDRS scores were greater than 0.75 for six of the seven items. Kinematic features improved (p... UPDRS items. Despite achieving high correlations with the UPDRS, evaluating individual kinematic features may help address inter-rater reliability issues and rater bias associated with focusing on different aspects of a motor task.

  9. HPV Population Profiling in Healthy Men by Next-Generation Deep Sequencing Coupled with HPV-QUEST.

    Science.gov (United States)

    Yin, Li; Yao, Jin; Chang, Kaifen; Gardner, Brent P; Yu, Fahong; Giuliano, Anna R; Goodenow, Maureen M

    2016-01-25

    Multiple-type human papillomavirus (HPV) infection presents a greater risk for persistence in asymptomatic individuals and may accelerate cancer development. To extend the scope of HPV types defined by probe-based assays, multiplexing deep sequencing of HPV L1, coupled with an HPV-QUEST genotyping server and a bioinformatic pipeline, was established and applied to survey the diversity of HPV genotypes among a subset of healthy men from the HPV in Men (HIM) Multinational Study. Twenty-one HPV genotypes (12 high-risk and 9 low-risk) were detected in the genital area from 18 asymptomatic individuals. A single HPV type, either HPV16, HPV6b or HPV83, was detected in 7 individuals, while coinfection by 2 to 5 high-risk and/or low-risk genotypes was identified in the other 11 participants. In two individuals studied for over one year, HPV16 persisted, while fluctuations of coinfecting genotypes occurred. HPV L1 regions were generally identical between query and reference sequences, although nonsynonymous and synonymous nucleotide polymorphisms of HPV16, 18, 31, 35h, 59, 70, 73, cand85, 6b, 62, 81, 83, cand89 or JEB2 L1 genotypes, mostly unidentified by linear array, were evident. Deep sequencing coupled with HPV-QUEST provides efficient and unambiguous classification of HPV genotypes in multiple-type HPV infection in host ecosystems.

  10. High-resolution deep sequencing reveals biodiversity, population structure, and persistence of HIV-1 quasispecies within host ecosystems

    Directory of Open Access Journals (Sweden)

    Yin Li

    2012-12-01

    Background: Deep sequencing provides the basis for analysis of biodiversity of taxonomically similar organisms in an environment. While deep sequencing has been applied extensively to microbiome studies, population genetics studies of viruses are limited. To define the scope of HIV-1 population biodiversity within infected individuals, a suite of phylogenetic and population genetic algorithms was applied to HIV-1 envelope hypervariable domain 3 (Env V3) within peripheral blood mononuclear cells from a group of perinatally HIV-1 subtype B infected, therapy-naïve children. Results: Biodiversity of HIV-1 Env V3 quasispecies ranged from about 70 to 270 unique sequence clusters across individuals. Viral population structure was organized into a limited number of clusters that included the dominant variants combined with multiple clusters of low frequency variants. Next generation viral quasispecies evolved from low frequency variants at earlier time points through multiple non-synonymous changes in lineages within the evolutionary landscape. Minor V3 variants detected as long as four years after infection co-localized in phylogenetic reconstructions with early transmitting viruses or with subsequent plasma virus circulating two years later. Conclusions: Deep sequencing defines HIV-1 population complexity and structure, reveals the ebb and flow of dominant and rare viral variants in the host ecosystem, and identifies an evolutionary record of low-frequency cell-associated viral V3 variants that persist for years. The bioinformatics pipeline developed for HIV-1 can be applied for biodiversity studies of virome populations in human, animal, or plant ecosystems.

  11. The DEEP2 Galaxy Redshift Survey: The Red Sequence AGN Fraction and its Environment and Redshift Dependence

    CERN Document Server

    Montero-Dorta, Antonio D; Yan, Renbin; Cooper, Michael C; Newman, Jeffery A; Georgakakis, Antonis; Prada, Francisco; Davis, Marc; Nandra, Kirpal; Coil, Alison

    2008-01-01

    We measure the dependence of the AGN fraction on local environment at z~1, using spectroscopic data taken from the DEEP2 Galaxy Redshift Survey, and Chandra X-ray data from the All-Wavelength Extended Groth Strip International Survey (AEGIS). To provide a clean sample of AGN we restrict our analysis to the red sequence population; this also reduces additional colour-environment correlations. We find evidence that high redshift LINERs in DEEP2 tend to favour higher density environments relative to the red population from which they are drawn. In contrast, Seyferts and X-ray selected AGN at z~1 show little (or no) environmental dependencies within the same underlying population. We compare these results with a sample of local AGN drawn from the SDSS. Contrary to the high redshift behaviour, we find that both LINERs and Seyferts in the SDSS show a slowly declining red sequence AGN fraction towards high density environments. Interestingly, at z~1 red sequence Seyferts and LINERs are approximately equally abundant...

  12. Deep sequencing detects very-low-grade somatic mosaicism in the unaffected mother of siblings with nemaline myopathy.

    Science.gov (United States)

    Miyatake, Satoko; Koshimizu, Eriko; Hayashi, Yukiko K; Miya, Kazushi; Shiina, Masaaki; Nakashima, Mitsuko; Tsurusaki, Yoshinori; Miyake, Noriko; Saitsu, Hirotomo; Ogata, Kazuhiro; Nishino, Ichizo; Matsumoto, Naomichi

    2014-07-01

    When an expected mutation in a particular disease-causing gene is not identified in a suspected carrier, it is usually assumed to be due to germline mosaicism. We report here very-low-grade somatic mosaicism in ACTA1 in an unaffected mother of two siblings affected with a neonatal form of nemaline myopathy. The mosaicism was detected by deep resequencing using a next-generation sequencer. We identified a novel heterozygous mutation in ACTA1, c.448A>G (p.Thr150Ala), in the affected siblings. Three-dimensional structural modeling suggested that this mutation may affect polymerization and/or actin's interactions with other proteins. In this family, we expected autosomal dominant inheritance with either parent demonstrating germline or somatic mosaicism. Sanger sequencing identified no mutation. However, further deep resequencing of this mutation on a next-generation sequencer identified very-low-grade somatic mosaicism in the mother: 0.4%, 1.1%, and 8.3% in the saliva, blood leukocytes, and nails, respectively. Our study demonstrates the possibility of very-low-grade somatic mosaicism in suspected carriers, rather than germline mosaicism. Copyright © 2014 Elsevier B.V. All rights reserved.

  13. Comparison of Illumina and 454 deep sequencing in participants failing raltegravir-based antiretroviral therapy.

    Directory of Open Access Journals (Sweden)

    Jonathan Z Li

    The impact of raltegravir-resistant HIV-1 minority variants (MVs) on raltegravir treatment failure is unknown. Illumina sequencing offers greater throughput than 454, but sequence analysis tools for viral sequencing are needed. We evaluated Illumina and 454 for the detection of HIV-1 raltegravir-resistant MVs. A5262 was a single-arm study of raltegravir and darunavir/ritonavir in treatment-naïve patients. Pre-treatment plasma was obtained from 5 participants with raltegravir resistance at the time of virologic failure. A control library was created by pooling integrase clones at predefined proportions. Multiplexed sequencing was performed with Illumina and 454 platforms at comparable costs. Illumina sequence analysis was performed with the novel snp-assess tool and 454 sequencing was analyzed with V-Phaser. Illumina sequencing resulted in significantly higher sequence coverage and a 0.095% limit of detection. Illumina accurately detected all MVs in the control library at ≥0.5% and 7/10 MVs expected at 0.1%. 454 sequencing failed to detect any MVs at 0.1% with 5 false positive calls. For MVs detected in the patient samples by both 454 and Illumina, the correlation in the detected variant frequencies was high (R2 = 0.92, P<0.001). Illumina sequencing detected 2.4-fold greater nucleotide MVs and 2.9-fold greater amino acid MVs compared to 454. The only raltegravir-resistant MV detected was an E138K mutation in one participant by Illumina sequencing, but not by 454. In participants of A5262 with raltegravir resistance at virologic failure, baseline raltegravir-resistant MVs were rarely detected. At comparable costs to 454 sequencing, Illumina demonstrated greater depth of coverage, increased sensitivity for detecting HIV MVs, and fewer false positive variant calls.
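
    The idea of a limit of detection for minority variants can be illustrated with a minimal sketch. It assumes a simple frequency threshold (variant reads divided by total reads at a position) and does not reproduce the statistical model of snp-assess or V-Phaser; the function names and read counts are hypothetical, and only the 0.095% figure comes from the abstract above.

```python
# Hypothetical sketch: flag a minority variant when its read frequency
# reaches a limit of detection (LoD). This is a plain threshold, not the
# method used by snp-assess or V-Phaser.

LIMIT_OF_DETECTION = 0.00095  # 0.095%, the Illumina LoD quoted in the abstract


def variant_frequency(variant_reads, total_reads):
    """Fraction of reads at a position that carry the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= variant_reads <= total_reads:
        raise ValueError("variant_reads must lie between 0 and total_reads")
    return variant_reads / total_reads


def is_detected(variant_reads, total_reads, lod=LIMIT_OF_DETECTION):
    """True when the observed frequency is at or above the LoD."""
    return variant_frequency(variant_reads, total_reads) >= lod


if __name__ == "__main__":
    # Made-up read counts, for illustration only.
    for variant_reads, total_reads in [(12, 10_000), (5, 10_000)]:
        freq = variant_frequency(variant_reads, total_reads)
        print(f"{variant_reads}/{total_reads} = {freq:.4%} ->",
              "detected" if is_detected(variant_reads, total_reads) else "below LoD")
```

    Under this assumption, a variant expected at 0.1% sits only just above the threshold, which is consistent with some, but not all, of the 0.1% control variants being recovered.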

  14. Deep proteomics of mouse skeletal muscle enables quantitation of protein isoforms, metabolic pathways, and transcription factors.

    Science.gov (United States)

    Deshmukh, Atul S; Murgia, Marta; Nagaraj, Nagarjuna; Treebak, Jonas T; Cox, Jürgen; Mann, Matthias

    2015-04-01

    Skeletal muscle constitutes 40% of individual body mass and plays vital roles in locomotion and whole-body metabolism. Proteomics of skeletal muscle is challenging because of highly abundant contractile proteins that interfere with detection of regulatory proteins. Using a state-of-the-art MS workflow and a strategy to map identifications from the C2C12 cell line model to tissues, we identified a total of 10,218 proteins, including skeletal muscle-specific transcription factors like myod1 and myogenin and circadian clock proteins. We obtain absolute abundances for proteins expressed in a muscle cell line and skeletal muscle, which should serve as a valuable resource. Quantitation of protein isoforms of glucose uptake signaling pathways and in glucose and lipid metabolic pathways provides a detailed metabolic map of the cell line compared with tissue. This revealed unexpectedly complex regulation of AMP-activated protein kinase and insulin signaling in muscle tissue at the level of enzyme isoforms. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

  15. Metagenomes obtained by "deep sequencing" - what do they tell about the EBPR communities?

    DEFF Research Database (Denmark)

    Albertsen, Mads; Saunders, Aaron Marc; Nielsen, Kåre Lehmann

    2013-01-01

    . In this study deep metagenomics and fluorescence in situ hybridization (FISH) were used to study a full-scale wastewater treatment plant with enhanced biological phosphorus removal (EBPR) and compared to an existing EBPR metagenome. EBPR is a widely used process that relies on a complex community...

  16. Dynamics of hepatitis B virus quasispecies in association with nucleos(t)ide analogue treatment determined by ultra-deep sequencing.

    Directory of Open Access Journals (Sweden)

    Norihiro Nishijima

    BACKGROUND AND AIMS: Although the advent of ultra-deep sequencing technology allows for the analysis of heretofore-undetectable minor viral mutants, a limited amount of information is currently available regarding the clinical implications of hepatitis B virus (HBV) genomic heterogeneity. METHODS: To characterize the HBV genetic heterogeneity in association with anti-viral therapy, we performed ultra-deep sequencing of full-genome HBV in the liver and serum of 19 patients with chronic viral infection, including 14 therapy-naïve and 5 nucleos(t)ide analogue (NA)-treated cases. RESULTS: Most genomic changes observed in viral variants were single base substitutions and were widely distributed throughout the HBV genome. Four of eight (50%) chronic therapy-naïve HBeAg-negative patients showed a relatively low prevalence of the G1896A pre-core (pre-C) mutant in the liver tissues, suggesting that other mutations were involved in their HBeAg seroconversion. Interestingly, liver tissues in 4 of 5 (80%) of the chronic NA-treated anti-HBe-positive cases had extremely low levels of the G1896A pre-C mutant (0.0%, 0.0%, 0.1%, and 1.1%), suggesting the high sensitivity of the G1896A pre-C mutant to NA. Moreover, various abundances of clones resistant to NA were common in both the liver and serum of treatment-naïve patients, and the proportion of M204VI mutants resistant to lamivudine and entecavir expanded in response to entecavir treatment in the serum of 35.7% (5/14) of patients, suggesting the putative risk of developing drug resistance to NA. CONCLUSION: Our findings illustrate the strong advantage of deep sequencing on viral genome as a tool for dissecting the pathophysiology of HBV infection.

  17. Draft Genome Sequence of the Deep-Sea Bacterium Moritella sp. JT01 and Identification of Biotechnologically Relevant Genes.

    Science.gov (United States)

    Freitas, Robert Cardoso de; Odisi, Estácio Jussie; Kato, Chiaki; da Silva, Marcus Adonai Castro; Lima, André Oliveira de Souza

    2017-07-22

    Deep-sea bacteria can produce various biotechnologically relevant enzymes due to their adaptations to high pressures and low temperatures. To identify such enzymes, we have sequenced the genome of the polycaprolactone-degrading bacterium Moritella sp. JT01, isolated from sediment samples from the Japan Trench (6957 m depth), using an Illumina HiSeq2000 sequencer (12.1 million paired-end reads) and CLC Genomics Workbench (version 6.5.1) for the assembly, resulting in a 4.83-Mb genome (42 scaffolds). The genome was annotated using Rapid Annotation using Subsystem Technology (RAST), Protein Homology/analogY Recognition Engine V 2.0 (PHYRE2), and BLAST2Go, revealing 4439 protein coding sequences and 101 RNAs. Gene products with industrial relevance, such as lipases (three) and esterases (four), were identified and are related to the bacterium's ability to degrade polycaprolactone. The annotation revealed proteins related to deep-sea survival, such as cold-shock proteins (six) and desaturases (three). The presence of secondary metabolite biosynthetic gene clusters suggests that this bacterium could produce nonribosomal peptides, polyunsaturated fatty acids, and bacteriocins. To demonstrate the potential of this genome, a lipase was cloned and introduced into Escherichia coli. The lipase was purified and characterized, showing activity over a wide temperature range (over 50% at 20-60 °C) and pH range (over 80% at pH 6.3 to 9). This enzyme has tolerance to the surfactant action of sodium dodecyl sulfate and shows 30% increased activity when subjected to a working pressure of 200 MPa. The genomic characterization of Moritella sp. JT01 reveals traits associated with survival in the deep sea and their potential uses in biotechnology, as exemplified by the characterized lipase.

  18. High diversity of picornaviruses in rats from different continents revealed by deep sequencing

    DEFF Research Database (Denmark)

    Arn Hansen, Thomas; Mollerup, Sarah; Nguyen, Nam-Phuong

    2016-01-01

    ) collected from two continents by analyzing 2.2 billion next-generation sequencing reads derived from both DNA and RNA. Among other virus families, we found sequences from members of the Picornaviridae to be abundant in the microbiome of all the samples. Here we describe the diversity of the picornavirus...

  19. A Quantitative Visual Mapping and Visualization Approach for Deep Ocean Floor Research

    Science.gov (United States)

    Hansteen, T. H.; Kwasnitschka, T.

    2013-12-01

    Geological fieldwork on the sea floor is still impaired by our inability to resolve features at sub-meter resolution in a quantifiable reference frame and over an area large enough to reveal the context of local observations. In order to overcome these issues, we have developed an integrated workflow of visual mapping techniques leading to georeferenced data sets which we examine using state-of-the-art visualization technology to recreate an effective working style of field geology. We demonstrate a microbathymetrical workflow, which is based on photogrammetric reconstruction of ROV imagery referenced to the acoustic vehicle track. The advantage over established acoustical systems lies in the true three-dimensionality of the data as opposed to the perspective projection from above produced by downward looking mapping methods. A full color texture mosaic derived from the imagery allows studies at resolutions beyond the resolved geometry (usually one order of magnitude below the image resolution) while color gives additional clues, which can only be partly resolved in acoustic backscatter. The creation of a three-dimensional model changes the working style from the temporal domain of a video recording back to the spatial domain of a map. We examine these datasets using a custom-developed immersive virtual visualization environment. The ARENA (Artificial Research Environment for Networked Analysis) features a (lower) hemispherical screen with a diameter of six meters, accommodating up to four scientists at once and thus providing the ability to browse data interactively among a group of researchers. This environment facilitates (1) the development of spatial understanding analogous to on-land outcrop studies, (2) quantitative observations of seafloor morphology and physical parameters of its deposits, (3) more effective formulation and communication of working hypotheses.

  20. STUDY ON THE SEQUENCE STRUCTURE OF BUTADIENE-STYRENE RUBBER BY 13C-NMR METHOD Ⅲ. QUANTITATIVE CHARACTERIZATION OF SEQUENCE STRUCTURE

    Institute of Scientific and Technical Information of China (English)

    CHEN Xiaonong; HU Liping; YAN Baozhen; JIAO Shuke

    1990-01-01

    The quantitative description of the sequence structure of emulsion-processed SBR and solution-processed SBR (by lithium catalyst) was carried out based on their 13C-NMR spectral data. Formulae were derived for calculating the diad concentrations from the peak intensities of the carbon spectra, the average block length, the average number of blocks, and the microstructure composition of the molecular chain. The quantitative results showed that styrene units on the molecular chain tended to attach to trans-1,4 butadiene units. The calculated microstructure was in good agreement with that obtained through IR measurement.

  1. Characterization of the Genomic Diversity of Norovirus in Linked Patients Using a Metagenomic Deep Sequencing Approach

    Science.gov (United States)

    Nasheri, Neda; Petronella, Nicholas; Ronholm, Jennifer; Bidawid, Sabah; Corneau, Nathalie

    2017-01-01

    Norovirus (NoV) is the leading cause of gastroenteritis worldwide. A robust cell culture system does not exist for NoV and therefore detailed characterization of outbreak and sporadic strains relies on molecular techniques. In this study, we employed a metagenomic approach that uses non-specific amplification followed by next-generation sequencing to whole genome sequence NoV genomes directly from clinical samples obtained from 8 linked patients. Enough sequencing depth was obtained for each sample to use a de novo assembly of near-complete genome sequences. The resultant consensus sequences were then used to identify inter-host nucleotide variations that occur after direct transmission, analyze amino acid variations in the major capsid protein, and provide evidence of recombination events. The analysis of intra-host quasispecies diversity was possible due to high coverage-depth. We also observed a linear relationship between NoV viral load in the clinical sample and the number of sequence reads that could be attributed to NoV. The method demonstrated here has the potential for future use in whole genome sequence analyses of other RNA viruses isolated from clinical, environmental, and food specimens. PMID:28197136
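
    The linear relationship between viral load and NoV-attributed read counts reported above could be checked with an ordinary least-squares fit. The following is a minimal sketch under stated assumptions: the abstract gives neither the data nor the scale (raw or log-transformed), so the numbers and the name `fit_line` are hypothetical placeholders.

```python
# Hypothetical sketch: ordinary least-squares fit of y = slope * x + intercept,
# e.g. x = viral load and y = number of reads attributed to the virus.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Placeholder values in arbitrary units; not data from the study.
    viral_load = [1.0, 2.0, 3.0, 4.0]
    nov_reads = [3.0, 5.0, 7.0, 9.0]
    slope, intercept = fit_line(viral_load, nov_reads)
    print(f"reads ~= {slope:.2f} * load + {intercept:.2f}")
```

    With real measurements one would also inspect the residuals before calling the relationship linear.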

  2. Using deep RNA sequencing for the structural annotation of the Laccaria bicolor mycorrhizal transcriptome.

    Directory of Open Access Journals (Sweden)

    Peter E Larsen

    BACKGROUND: Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. METHODOLOGY: We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. CONCLUSIONS: 69% of expressed mycorrhizal JGI "best" gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes. The method described here can be applied to improve gene

  3. Using deep RNA sequencing for the structural annotation of the Laccaria bicolor mycorrhizal transcriptome.

    Energy Technology Data Exchange (ETDEWEB)

    Larsen, P. E.; Trivedi, G.; Sreedasyam, A.; Lu, V.; Podila, G. K.; Collart, F. R.; Biosciences Division; Univ. of Alabama

    2010-07-06

    Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. 69% of expressed mycorrhizal JGI 'best' gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes. The method described here can be applied to improve gene structural annotation in other species, provided

  4. High diversity of picornaviruses in rats from different continents revealed by deep sequencing

    DEFF Research Database (Denmark)

    Arn Hansen, Thomas; Mollerup, Sarah; Nguyen, Nam-Phuong

    2016-01-01

    Outbreaks of zoonotic diseases in humans and livestock are not uncommon, and an important component in containment of such emerging viral diseases is rapid and reliable diagnostics. Such methods are often PCR-based and hence require the availability of sequence data from the pathogen. Rattus......) collected from two continents by analyzing 2.2 billion next-generation sequencing reads derived from both DNA and RNA. Among other virus families, we found sequences from members of the Picornaviridae to be abundant in the microbiome of all the samples. Here we describe the diversity of the picornavirus...

  5. Gene discovery using mutagen-induced polymorphisms and deep sequencing: application to plant disease resistance.

    Science.gov (United States)

    Zhu, Ying; Mang, Hyung-gon; Sun, Qi; Qian, Jun; Hipps, Ashley; Hua, Jian

    2012-09-01

    Next-generation sequencing technologies are accelerating gene discovery by combining multiple steps of mapping and cloning used in the traditional map-based approach into one step using DNA sequence polymorphisms existing between two different accessions/strains/backgrounds of the same species. The existing next-generation sequencing method, like the traditional one, requires the use of a segregating population from a cross of a mutant organism in one accession with a wild-type (WT) organism in a different accession. It therefore could potentially be limited by modification of mutant phenotypes in different accessions and/or by the lengthy process required to construct a particular mapping parent in a second accession. Here we present mapping and cloning of an enhancer mutation with next-generation sequencing on bulked segregants in the same accession using sequence polymorphisms induced by a chemical mutagen. This method complements the conventional cloning approach and makes forward genetics more feasible and powerful in molecularly dissecting biological processes in any organism. The pipeline developed in this study can be used to clone causal genes in the background of single or higher-order mutants and in species with or without sequence information on multiple accessions.

  6. Dissection of the octoploid strawberry genome by deep sequencing of the genomes of Fragaria species.

    Science.gov (United States)

    Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N

    2014-01-01

    Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis, by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ∼200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence, to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species.
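
    The N50 quoted in this abstract is the contig length at which the contigs, sorted from longest to shortest, first cover half of the total assembly length. A minimal Python sketch of that calculation follows; the function name and the contig lengths are hypothetical and are not taken from the FANhybrid_r1.2 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths, for illustration only.
contigs = [100, 200, 300, 400, 500]
print(n50(contigs))
```

    Comparing `2 * running` with `total` keeps the arithmetic in integers, which avoids rounding issues when the total length is odd.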

  7. Mosaic KCNJ2 mutation in Andersen-Tawil syndrome: targeted deep sequencing is useful for the detection of mosaicism.

    Science.gov (United States)

    Hasegawa, K; Ohno, S; Kimura, H; Itoh, H; Makiyama, T; Yoshida, Y; Horie, M

    2015-03-01

    Andersen-Tawil syndrome (ATS) is an inherited disease characterized by ventricular arrhythmias, periodic paralysis, and dysmorphic features. It results from a heterozygous mutation of KCNJ2, but little is known about mosaicism in ATS. We performed genetic analysis of KCNJ2 in 32 ATS probands and their family members and identified KCNJ2 mutations in 25 probands; 20 of these families underwent extensive genetic testing. These tests revealed that seven probands carried de novo mutations while 13 carried inherited mutations from their parents. We then specifically assessed a single proband and the respective family. The proband was a 9-year-old girl who fulfilled the ATS triad and carried an insertion mutation (p.75_76insThr). We determined that the proband's mother carried somatic mosaicism and that the proband's younger brother also carried the ATS phenotype with the same insertion mutation. The mother, who exhibited mosaicism, was asymptomatic, although she exhibited Q(T)U prolongation. Mutant allele frequency was 11% as per TA cloning and 17.3% as per targeted deep sequencing. Our observations suggest that targeted deep sequencing is useful for the detection of mosaicism and that the detection of mosaic mutations in parents of apparently sporadic ATS patients can help in the process of genetic counseling.

  8. Characterization and Development of EST-SSRs by Deep Transcriptome Sequencing in Chinese Cabbage (Brassica rapa L. ssp. pekinensis)

    Directory of Open Access Journals (Sweden)

    Qian Ding

    2015-01-01

    Simple sequence repeats (SSRs) are among the most important markers for population analysis and have been widely used in plant genetic mapping and molecular breeding. Expressed sequence tag-SSR (EST-SSR) markers, located in the coding regions, are potentially more efficient for QTL mapping, gene targeting, and marker-assisted breeding. In this study, we investigated 51,694 nonredundant unigenes, assembled from clean reads from deep transcriptome sequencing with a Solexa/Illumina platform, for identification and development of EST-SSRs in Chinese cabbage. In total, 10,420 EST-SSRs with over 12 bp were identified and characterized, among which 2744 EST-SSRs are new and 2317 are known ones showing polymorphism with previously reported SSRs. A total of 7877 PCR primer pairs for 1561 EST-SSR loci were designed, and primer pairs for twenty-four EST-SSRs were selected for primer evaluation. In nineteen EST-SSR loci (79.2%), amplicons were successfully generated with high quality. Seventeen (89.5%) showed polymorphism in twenty-four cultivars of Chinese cabbage. The polymorphic alleles of each polymorphic locus were sequenced, and the results showed that most polymorphisms were due to variations of SSR repeat motifs. The EST-SSRs identified and characterized in this study have important implications for developing new tools for genetics and molecular breeding in Chinese cabbage.
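
    One simple way to detect perfect SSRs of a minimum length is to scan each unigene for tandem repeats of a short motif. The Python sketch below illustrates the idea only: the 1-6 bp motif range, the reading of the threshold as "at least 12 bp", the function name `find_ssrs`, and the example sequence are all assumptions, and this is not the tool used in the study.

```python
def find_ssrs(seq, min_len=12, max_unit=6):
    """Find perfect tandem repeats; returns (start, motif, length) tuples.

    min_len and max_unit are illustrative defaults, not values from the paper.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    i = 0
    while i < n:
        found = None
        # Try the shortest repeat unit first so "AAAA..." is reported as "A".
        for k in range(1, max_unit + 1):
            j = i + k
            # Extend while the sequence keeps repeating with period k.
            while j < n and seq[j] == seq[j - k]:
                j += 1
            span = (j - i) // k * k  # keep whole repeat units only
            if span >= max(min_len, 2 * k):
                found = (i, seq[i:i + k], span)
                break
        if found:
            hits.append(found)
            i += found[2]  # continue after the reported repeat
        else:
            i += 1
    return hits


# Hypothetical sequence containing a 14 bp (AG)7 repeat.
print(find_ssrs("TTGC" + "AG" * 7 + "CCT"))
```

    Scanning every start position, instead of relying on a single regular-expression pass, avoids missing a repeat whose first bases also look like a shorter motif (for example the leading "AA" of an "AAG" repeat).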

  9. Geochemical features and effects on deep-seated fluids during the May-June 2012 southern Po Valley seismic sequence

    Directory of Open Access Journals (Sweden)

    Francesco Italiano

    2012-10-01

    A periodic sampling of the groundwaters and dissolved and free gases in selected deep wells located in the area affected by the May-June 2012 southern Po Valley seismic sequence has provided insight into seismogenic-induced changes of the local aquifer systems. The results obtained show progressive changes in the fluid geochemistry, allowing it to be established that deep-seated fluids were mobilized during the seismic sequence and reached surface layers along faults and fractures, which generated significant geochemical anomalies. The May-June 2012 seismic swarm (mainshock on May 29, 2012, M 5.8; 7 shocks M >5, about 200 events 3 < M < 5) induced several modifications in the circulating fluids. This study reports the preliminary results obtained for the geochemical features of the waters and gases collected over the epicentral area from boreholes drilled at different depths, thus intercepting water and gases with different origins and circulation. The aim of the investigations was to improve our knowledge of the fluids circulating over the seismic area (e.g. origin, provenance, interactions, mixing of different components, temporal changes). This was achieved by collecting samples from both shallow and deep-drilled boreholes, and then, after the selection of the relevant sites, we looked for temporal changes with mid-to-long-term monitoring activity following a constant sampling rate. This allowed us to gain better insight into the relationships between the fluid circulation and the faulting activity. The sampling sites are listed in Table 1, along with the analytical results of the gas phase. […]

  10. Quantitative trait loci markers derived from whole genome sequence data increases the reliability of genomic prediction.

    Science.gov (United States)

    Brøndum, R F; Su, G; Janss, L; Sahana, G; Guldbrandtsen, B; Boichard, D; Lund, M S

    2015-06-01

    This study investigated the effect on the reliability of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k single nucleotide polymorphism (SNP) array data. The extra markers were selected with the aim of augmenting the custom low-density Illumina BovineLD SNP chip (San Diego, CA) used in the Nordic countries. The single-marker analysis was done breed-wise on all 16 index traits included in the breeding goals for Nordic Holstein, Danish Jersey, and Nordic Red cattle plus the total merit index itself. Depending on the trait's economic weight, 15, 10, or 5 quantitative trait loci (QTL) were selected per trait per breed and 3 to 5 markers were selected to tag each QTL. After removing duplicate markers (same marker selected for more than one trait or breed) and filtering for high pairwise linkage disequilibrium and assaying performance on the array, a total of 1,623 QTL markers were selected for inclusion on the custom chip. Genomic prediction analyses were performed for Nordic and French Holstein and Nordic Red animals using either a genomic BLUP or a Bayesian variable selection model. When using the genomic BLUP model including the QTL markers in the analysis, reliability was increased by up to 4 percentage points for production traits in Nordic Holstein animals, up to 3 percentage points for Nordic Reds, and up to 5 percentage points for French Holstein. Smaller gains of up to 1 percentage point were observed for mastitis, but only a 0.5 percentage point increase was seen for fertility. When using a Bayesian model, accuracies were generally higher with only 54k data compared with the genomic BLUP approach, but increases in reliability were relatively smaller when QTL markers were included. Results from this study indicate that the reliability of genomic prediction can be increased by including markers significant in genome-wide association studies on whole genome

  11. A quantitative assessment of the Hadoop framework for analyzing massively parallel DNA sequencing data.

    Science.gov (United States)

    Siretskiy, Alexey; Sundqvist, Tore; Voznesenskiy, Mikhail; Spjuth, Ola

    2015-01-01

    New high-throughput technologies, such as massively parallel sequencing, have transformed the life sciences into a data-intensive field. The most common e-infrastructure for analyzing this data consists of batch systems that are based on high-performance computing resources; however, the bioinformatics software that is built on this platform does not scale well in the general case. Recently, the Hadoop platform has emerged as an interesting option to address the challenges of increasingly large datasets with distributed storage, distributed processing, built-in data locality, fault tolerance, and an appealing programming methodology. In this work we introduce metrics and report on a quantitative comparison between Hadoop and a single node of conventional high-performance computing resources for the tasks of short read mapping and variant calling. We calculate efficiency as a function of data size and observe that the Hadoop platform is more efficient for biologically relevant data sizes in terms of computing hours for both split and un-split data files. We also quantify the advantages of the data locality provided by Hadoop for NGS problems, and show that a classical architecture with network-attached storage will not scale when computing resources increase in numbers. Measurements were performed using ten datasets of different sizes, up to 100 gigabases, using the pipeline implemented in Crossbow. To make a fair comparison, we implemented an improved preprocessor for Hadoop with better performance for splittable data files. For improved usability, we implemented a graphical user interface for Crossbow in a private cloud environment using the CloudGene platform. All of the code and data in this study are freely available as open source in public repositories. From our experiments we can conclude that the improved Hadoop pipeline scales better than the same pipeline on high-performance computing resources; we also conclude that Hadoop is an economically viable

  12. Location and sequence of muscle onset in deep abdominal muscles measured by different modes of ultrasound imaging.

    Science.gov (United States)

    Westad, Christian; Mork, Paul J; Vasseljen, Ottar

    2010-10-01

    Various modes of ultrasound (US) imaging have been introduced as an alternative to electromyography for determining muscle onset. The purpose of this study was to compare the agreement between US motion-mode (US(m-mode)) and US strain rate (US(SR)) derived from tissue velocity imaging in determining latency time, location and sequence of muscle onset in abdominal muscles using the same data set (contractions). Twenty-four subjects performed four rapid arm flexions in response to a light signal while US recordings were made from the abdominal muscles on the contralateral side. The examined muscles were transversus abdominis (TrA), superficial and deep obliquus internus abdominis (OI(deep) and OI(sup)), and obliquus externus abdominis (OE). The results showed that the two methods detected the first muscle onset on average within 0.1 ms (95% CI; +/-1.4 ms) of each other. US(SR) detected the second muscle onset on average 27 ms after US(m-mode). While US(SR) and US(m-mode) can be used interchangeably to detect the first muscle onset, the location of both first onset and subsequent muscle onsets can be reliably detected by US(SR) only. Furthermore, this study indicates that OI may be functionally subdivided into a superficial and deep region, with onset in OI(deep) occurring on average 53 ms before OI(sup). First onset was detected more frequently in OI than in TrA (65% versus 25% of detected onsets, 10% were equal).

  13. Identification of hepatotropic viruses from plasma using deep sequencing: a next generation diagnostic tool.

    Directory of Open Access Journals (Sweden)

    John Law

    We conducted an unbiased metagenomics survey using plasma from patients with chronic hepatitis B, chronic hepatitis C, autoimmune hepatitis (AIH), non-alcoholic steatohepatitis (NASH), and patients without liver disease (control). RNA and DNA libraries were sequenced from plasma filtrates enriched in viral particles to catalog virus populations. Hepatitis viruses were readily detected at high coverage in patients with chronic viral hepatitis B and C, but only a limited number of sequences resembling other viruses were found. The exception was a library from a patient diagnosed with hepatitis C virus (HCV) infection that contained multiple sequences matching GB virus C (GBV-C). Abundant GBV-C reads were also found in plasma from patients with AIH, whereas Torque teno virus (TTV) was found at high frequency in samples from patients with AIH and NASH. After taxonomic classification of sequences by BLASTn, a substantial fraction in each library, ranging from 35% to 76%, remained unclassified. These unknown sequences were assembled into scaffolds along with virus, phage and endogenous retrovirus sequences and then analyzed by BLASTx against the non-redundant protein database. Nearly the full genome of a heretofore-unknown circovirus was assembled, along with many scaffolds that encoded proteins with similarity to plant, insect and mammalian viruses. The presence of this novel circovirus was confirmed by PCR. BLASTx also identified many polypeptides resembling nucleo-cytoplasmic large DNA virus (NCLDV) proteins. We re-evaluated these alignments with a profile hidden Markov method, HHblits, and observed inconsistencies in the target proteins reported by the different algorithms. This suggests that sequence alignments are insufficient to identify NCLDV proteins, especially when these alignments are only to small portions of the target protein. Nevertheless, we have now established a reliable protocol for the identification of viruses in plasma that can also be

  14. Quantitative recuperation of climatic sequences for the last 200 years in Xingcuo Lake, eastern Tibetan Plateau

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The functional relation between the δ18O values in the shell of gastropod Gyraulus sibirica and the air temperature in the warm half-yearly period, and that between Sr/Ca ratio and the precipitation in the warm half-yearly period were established by calibrating the δ18O and δ13C values, Sr/Ca ratio and Mg/Ca ratio in the shell Gyraulus sibirica, as well as the total organic carbon (TOC) and its δ13C values in the Xingcuo Lake sediment in the eastern Tibetan Plateau. The sequences of air temperature and precipitation in the last 200 years in the region were quantitatively recuperated on this basis. The results showed the following: (i) There was a negative correlativity between Sr/Ca ratio and the precipitation in the warm half-yearly period, its correlation coefficient was 0.86. (ii) There was an obviously positive correlativity between index δ18O and the running average temperature in the warm half-yearly period, its correlation coefficient was 0.89. (iii) Evolution of the air temperature and the precipitation in the last 200 years can be divided into three phases distinctly. The precipitation in the later mid-19th century was 220 mm higher than that today; the air temperature in the warm half-yearly period was 2℃ lower than that of the present. The precipitation in the minimum air temperature period of the early 20th century was below that today by 60 mm, and the air temperature in the warm half-yearly period was 3.4℃ lower than that today. (iv) An evidently warming and drying trend existed in the last five decades.

  15. Mitochondrial genome sequences reveal deep divergences among Anopheles punctulatus sibling species in Papua New Guinea

    Directory of Open Access Journals (Sweden)

    Logue Kyle

    2013-02-01

    Background: Members of the Anopheles punctulatus group (AP group) are the primary vectors of human malaria in Papua New Guinea. The AP group includes 13 sibling species, most of them morphologically indistinguishable. Understanding why only certain species are able to transmit malaria requires a better comprehension of their evolutionary history. In particular, understanding relationships and divergence times among Anopheles species may enable assessing how malaria-related traits (e.g. blood feeding behaviours, vector competence) have evolved. Methods: DNA sequences of 14 mitochondrial (mt) genomes from five AP sibling species and two species of the Anopheles dirus complex of Southeast Asia were sequenced. DNA sequences from all concatenated protein coding genes (10,770 bp) were then analysed using a Bayesian approach to reconstruct phylogenetic relationships and date the divergence of the AP sibling species. Results: Phylogenetic reconstruction using the concatenated DNA sequence of all mitochondrial protein coding genes indicates that the ancestors of the AP group arrived in Papua New Guinea 25 to 54 million years ago and rapidly diverged to form the current sibling species. Conclusion: Through evaluation of newly described mt genome sequences, this study has revealed a divergence among members of the AP group in Papua New Guinea that would significantly predate the arrival of humans in this region, 50 thousand years ago. The divergence observed among the mtDNA sequences studied here may have resulted from reproductive isolation during historical changes in sea-level through glacial minima and maxima. This leads to a hypothesis that the AP sibling species have evolved independently for potentially thousands of generations. This suggests that the evolution of many phenotypes, such as insecticide resistance, will arise independently in each of the AP sibling species studied here.

  16. Whole genome deep sequencing of HIV-1 reveals the impact of early minor variants upon immune recognition during acute infection.

    Directory of Open Access Journals (Sweden)

    Matthew R Henn

    Deep sequencing technologies have the potential to transform the study of highly variable viral pathogens by providing a rapid and cost-effective approach to sensitively characterize rapidly evolving viral quasispecies. Here, we report on a high-throughput whole HIV-1 genome deep sequencing platform that combines 454 pyrosequencing with novel assembly and variant detection algorithms. In one subject we combined these genetic data with detailed immunological analyses to comprehensively evaluate viral evolution and immune escape during the acute phase of HIV-1 infection. The majority of early, low frequency mutations represented viral adaptation to host CD8+ T cell responses, evidence of strong immune selection pressure occurring during the early decline from peak viremia. CD8+ T cell responses capable of recognizing these low frequency escape variants coincided with the selection and evolution of more effective secondary HLA-anchor escape mutations. Frequent, and in some cases rapid, reversion of transmitted mutations was also observed across the viral genome. When located within restricted CD8 epitopes these low frequency reverting mutations were sufficient to prime de novo responses to these epitopes, again illustrating the capacity of the immune response to recognize and respond to low frequency variants. More importantly, rapid viral escape from the most immunodominant CD8+ T cell responses coincided with plateauing of the initial viral load decline in this subject, suggestive of a potential link between maintenance of effective, dominant CD8 responses and the degree of early viremia reduction. We conclude that the early control of HIV-1 replication by immunodominant CD8+ T cell responses may be substantially influenced by rapid, low frequency viral adaptations not detected by conventional sequencing approaches, which warrants further investigation. These data support the critical need for vaccine-induced CD8+ T cell responses to target more

  17. Analytical and Clinical Validation of a Digital Sequencing Panel for Quantitative, Highly Accurate Evaluation of Cell-Free Circulating Tumor DNA: e0140712

    National Research Council Canada - National Science Library

    Richard B Lanman; Stefanie A Mortimer; Oliver A Zill; Dragan Sebisanovic; Rene Lopez; Sibel Blau; Eric A Collisson; Stephen G Divers; Dave S B Hoon; E Scott Kopetz; Jeeyun Lee; Petros G Nikolinakos; Arthur M Baca; Bahram G Kermani; Helmy Eltoukhy

    2015-01-01

    .... First this method of massively parallel and deep sequencing enables assessment of a comprehensive panel of genomic targets from a single sample, and second, it obviates the need for repeat invasive tissue biopsies...

  18. Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations.

    Science.gov (United States)

    Oikonomopoulos, Spyros; Wang, Yu Chang; Djambazian, Haig; Badescu, Dunarel; Ragoussis, Jiannis

    2016-08-24

    To assess the performance of the Oxford Nanopore Technologies MinION sequencing platform, cDNAs from the External RNA Controls Consortium (ERCC) RNA Spike-In mix were sequenced. This mix mimics mammalian mRNA species and consists of 92 polyadenylated transcripts with known concentration. cDNA libraries were generated using a template switching protocol to facilitate the direct comparison between different sequencing platforms. The MinION performance was assessed for its ability to sequence the cDNAs directly with good accuracy in terms of abundance and full length. The abundance of the ERCC cDNA molecules sequenced by MinION agreed with their expected concentration. No length or GC content bias was observed. The majority of cDNAs were sequenced as full length. Additionally, a complex cDNA population derived from a human HEK-293 cell line was sequenced on Illumina HiSeq 2500, PacBio RS II and ONT MinION platforms. We observed that there was a good agreement in the measured cDNA abundance between PacBio RS II and ONT MinION (Pearson r = 0.82, isoforms with length more than 700 bp) and between Illumina HiSeq 2500 and ONT MinION (Pearson r = 0.75). This indicates that the ONT MinION can sequence quantitatively both long and short full length cDNA molecules.
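
    Agreement between platforms in this abstract is summarised by the Pearson correlation of per-transcript abundance. The Python sketch below shows how such a coefficient can be computed; the read counts are hypothetical, and the log2(count + 1) transform is an assumption made for illustration, since the abstract does not state how abundances were scaled.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-transcript read counts on two platforms (not study data).
platform_a = [12, 250, 31, 980, 4, 67]
platform_b = [9, 310, 22, 1200, 7, 40]

# Assumed transform: log2(count + 1) to damp the effect of very abundant transcripts.
log_a = [math.log2(c + 1) for c in platform_a]
log_b = [math.log2(c + 1) for c in platform_b]
print(round(pearson_r(log_a, log_b), 3))
```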

  19. Reproducibility of Illumina platform deep sequencing errors allows accurate determination of DNA barcodes in cells.

    NARCIS (Netherlands)

    Beltman, J.B.; Urbanus, J.; Velds, A.; Rooij, van N.; Rohr, J.C.; Naik, S.H.; Schumacher, T.N.

    2016-01-01

    BACKGROUND Next generation sequencing (NGS) of amplified DNA is a powerful tool to describe genetic heterogeneity within cell populations that can both be used to investigate the clonal structure of cell populations and to perform genetic lineage tracing. For applications in which both abundant and

  20. De Novo Peptide Sequencing: Deep Mining of High-Resolution Mass Spectrometry Data.

    Science.gov (United States)

    Islam, Mohammad Tawhidul; Mohamedali, Abidali; Fernandes, Criselda Santan; Baker, Mark S; Ranganathan, Shoba

    2017-01-01

    High resolution mass spectrometry has revolutionized proteomics over the past decade, resulting in tremendous amounts of data in the form of mass spectra, being generated in a relatively short span of time. The mining of this spectral data for analysis and interpretation though has lagged behind such that potentially valuable data is being overlooked because it does not fit into the mold of traditional database searching methodologies. Although the analysis of spectra by de novo sequences removes such biases and has been available for a long period of time, its uptake has been slow or almost nonexistent within the scientific community. In this chapter, we propose a methodology to integrate de novo peptide sequencing using three commonly available software solutions in tandem, complemented by homology searching, and manual validation of spectra. This simplified method would allow greater use of de novo sequencing approaches and potentially greatly increase proteome coverage leading to the unearthing of valuable insights into protein biology, especially of organisms whose genomes have been recently sequenced or are poorly annotated.

  1. Deep sequencing of uveal melanoma identifies a recurrent mutation in PLCB4

    DEFF Research Database (Denmark)

    Johansson, Peter; Aoude, Lauren G; Wadt, Karin;

    2016-01-01

    -genome or whole-exome sequencing of 28 tumors or primary cell lines. These samples have a low mutation burden, with a mean of 10.6 protein changing mutations per sample (range 0 to 53). As expected for these sun-shielded melanomas the mutation spectrum was not consistent with an ultraviolet radiation signature...

  2. rSW-seq: Algorithm for detection of copy number alterations in deep sequencing data

    Directory of Open Access Journals (Sweden)

    Kim Tae-Min

    2010-08-01

    Background: Recent advances in sequencing technologies have enabled generation of large-scale genome sequencing data. These data can be used to characterize a variety of genomic features, including the DNA copy number profile of a cancer genome. A robust and reliable method for screening chromosomal alterations would allow a detailed characterization of the cancer genome with unprecedented accuracy. Results: We develop a method for identification of copy number alterations in a tumor genome compared to its matched control, based on application of the Smith-Waterman algorithm to single-end sequencing data. In a performance test with simulated data, our algorithm shows >90% sensitivity and >90% precision in detecting a single copy number change that contains approximately 500 reads for the normal sample. With 100-bp reads, this corresponds to a ~50 kb region for 1X genome coverage of the human genome. We further refine the algorithm to develop rSW-seq (recursive Smith-Waterman-seq) to identify alterations in a complex configuration, which are commonly observed in the human cancer genome. To validate our approach, we compare our algorithm with an existing algorithm using simulated and publicly available datasets. We also compare the sequencing-based profiles to microarray-based results. Conclusion: We propose rSW-seq as an efficient method for detecting copy number changes in the tumor genome.
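
    Two points in this abstract lend themselves to a short worked example. First, the region size follows from reads x read length / coverage: 500 reads x 100 bp / 1X = 50,000 bp. Second, a Smith-Waterman-style scan over position-sorted reads can be reduced to finding the contiguous run with the highest cumulative score. The Python sketch below is a simplified illustration under stated assumptions, not the authors' rSW-seq implementation: the labels 'T' (tumor read) and 'N' (normal read), the unit weights, and both function names are hypothetical.

```python
def region_length_bp(n_reads, read_len_bp, coverage):
    """Genomic span expected to yield n_reads at the given coverage."""
    return n_reads * read_len_bp / coverage


def best_gain_segment(labels, w_tumor=1.0, w_normal=1.0):
    """Find the highest-scoring contiguous run of reads.

    labels: iterable of 'T'/'N' for reads sorted by genomic position.
    Tumor reads add w_tumor and normal reads subtract w_normal (assumed
    weights). The running score is reset at zero, as in local alignment.
    Returns (score, start, end) with end exclusive.
    """
    best = (0.0, 0, 0)
    score = 0.0
    start = 0
    for i, label in enumerate(labels):
        score += w_tumor if label == "T" else -w_normal
        if score <= 0:
            # A non-positive prefix can never help; restart after this read.
            score = 0.0
            start = i + 1
        elif score > best[0]:
            best = (score, start, i + 1)
    return best


print(region_length_bp(500, 100, 1))
# Hypothetical read labels: a tumor-enriched run flanked by normal reads.
print(best_gain_segment("NNTTTTNN"))
```

    A candidate loss could be scanned the same way by swapping the roles of the two labels; applying the scan again inside and outside a reported segment is one plausible reading of the "recursive" refinement, though the abstract does not spell out the details.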

  3. Ultra deep sequencing of Listeria monocytogenes sRNA transcriptome revealed new antisense RNAs.

    Directory of Open Access Journals (Sweden)

    Sebastian Behrens

    Listeria monocytogenes, a gram-positive pathogen, and causative agent of listeriosis, has become a widely used model organism for intracellular infections. Recent studies have identified small non-coding RNAs (sRNAs) as important factors for regulating gene expression and pathogenicity of L. monocytogenes. Increased speed and reduced costs of high throughput sequencing (HTS) techniques have made RNA sequencing (RNA-Seq) the state-of-the-art method to study bacterial transcriptomes. We created a large transcriptome dataset of L. monocytogenes containing a total of 21 million reads, using the SOLiD sequencing technology. The dataset contained cDNA sequences generated from L. monocytogenes RNA collected under intracellular and extracellular conditions and additionally was size fractioned into three different size ranges from 150 nt. We report here the identification of nine new sRNA candidates of L. monocytogenes and a reevaluation of known sRNAs of L. monocytogenes EGD-e. Automatic comparison to known sRNAs revealed a high recovery rate of 55%, which was increased to 90% by manual revision of the data. Moreover, thorough classification of known sRNAs shed further light on their possible biological functions. Interestingly, among the newly identified sRNA candidates are antisense RNAs (asRNAs) associated with the housekeeping genes purA, fumC and pgi and potentially their regulation, emphasizing the significance of sRNAs for metabolic adaptation in L. monocytogenes.

  4. Ultra Deep Sequencing of Listeria monocytogenes sRNA Transcriptome Revealed New Antisense RNAs

    Science.gov (United States)

    Behrens, Sebastian; Widder, Stefanie; Mannala, Gopala Krishna; Qing, Xiaoxing; Madhugiri, Ramakanth; Kefer, Nathalie; Mraheil, Mobarak Abu; Rattei, Thomas; Hain, Torsten

    2014-01-01

    Listeria monocytogenes, a gram-positive pathogen, and causative agent of listeriosis, has become a widely used model organism for intracellular infections. Recent studies have identified small non-coding RNAs (sRNAs) as important factors for regulating gene expression and pathogenicity of L. monocytogenes. Increased speed and reduced costs of high throughput sequencing (HTS) techniques have made RNA sequencing (RNA-Seq) the state-of-the-art method to study bacterial transcriptomes. We created a large transcriptome dataset of L. monocytogenes containing a total of 21 million reads, using the SOLiD sequencing technology. The dataset contained cDNA sequences generated from L. monocytogenes RNA collected under intracellular and extracellular conditions and additionally was size fractioned into three different size ranges from 150 nt. We report here the identification of nine new sRNA candidates of L. monocytogenes and a reevaluation of known sRNAs of L. monocytogenes EGD-e. Automatic comparison to known sRNAs revealed a high recovery rate of 55%, which was increased to 90% by manual revision of the data. Moreover, thorough classification of known sRNAs shed further light on their possible biological functions. Interestingly, among the newly identified sRNA candidates are antisense RNAs (asRNAs) associated with the housekeeping genes purA, fumC and pgi and potentially their regulation, emphasizing the significance of sRNAs for metabolic adaptation in L. monocytogenes. PMID:24498259

  5. Identification of a new enamovirus associated with citrus vein enation disease by deep sequencing of small RNAs.

    Science.gov (United States)

    Vives, Mari Carmen; Velázquez, Karelia; Pina, José Antonio; Moreno, Pedro; Guerri, José; Navarro, Luis

    2013-10-01

    To identify the causal agent of citrus vein enation disease, we examined by deep sequencing (Solexa-Illumina) the small RNA (sRNA) fraction from infected and healthy Etrog citron plants. Our results showed that virus-derived sRNAs (vsRNAs): (i) represent about 14.21% of the total sRNA population, (ii) are predominantly of 21 and 24 nucleotides with a biased distribution of their 5' nucleotide and with a clear prevalence of those of (+) polarity, and (iii) derive from the entire viral genome, although a prominent hotspot is present at a 5'-proximal region. Contigs assembled from vsRNAs showed similarity with luteovirus sequences, particularly with Pea enation mosaic virus, the type member of the genus Enamovirus. The genomic RNA (gRNA) sequence of a new virus, provisionally named Citrus vein enation virus (CVEV), was completed and characterized. The CVEV gRNA was found to be single-stranded, positive-sense, with a size of 5,983 nucleotides and five open reading frames. Phylogenetic comparisons based on amino acid signatures of the RNA polymerase and the coat protein clearly classify CVEV within the genus Enamovirus. Dot-blot hybridization and reverse transcription-polymerase chain reaction tests were developed to detect CVEV in plants affected by vein enation disease. CVEV detection by these methods has already been adopted for use in the Spanish citrus quarantine, sanitation, and certification programs.

  6. Identification of representative genes of the central nervous system of the locust, Locusta migratoria manilensis by deep sequencing.

    Science.gov (United States)

    Zhang, Zhengyi; Peng, Zhi-Yu; Yi, Kang; Cheng, Yanbing; Xia, Yuxian

    2012-01-01

    The shortage of available genomic and transcriptomic data hampers the molecular study of the central nervous system (CNS) of the migratory locust, Locusta migratoria manilensis (L.) (Orthoptera: Acrididae). In this study, locust CNS RNA was sequenced by deep sequencing. 41,179 unigenes were obtained with an average length of 570 bp, and 5,519 unigenes were longer than 1,000 bp. Compared with an EST database of another locust species, Schistocerca gregaria Forsskål, 9,069 unigenes were found conserved, while 32,110 unigenes were differentially expressed. A total of 15,895 unigenes were identified, including 644 nervous system relevant unigenes. Among the 25,284 unknown unigenes, 9,482 were found to be specific to the CNS by filtering out the previous ESTs acquired from locust organs excluding the CNS. The locust CNS showed the most matches (18%) with Tribolium castaneum (Herbst) (Coleoptera: Tenebrionidae) sequences. Comprehensive assessment reveals that the database generated in this study is broadly representative of the CNS of adult locust, providing comprehensive gene information at the transcriptional level that could facilitate research of the locust CNS, including various physiological aspects and pesticide target finding.
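
    The unigene counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names and the helper partitions_total are hypothetical, and the numbers are copied from the abstract rather than recomputed from any data. It assumes that the conserved and remaining unigenes, and likewise the identified and unknown unigenes, are meant to partition the full set.

```python
# Counts as quoted in the abstract (not recomputed from sequence data).
total_unigenes = 41_179
longer_than_1kb = 5_519
conserved = 9_069
remaining = 32_110      # the abstract's second group in the EST comparison
identified = 15_895
unknown = 25_284


def partitions_total(total, *parts):
    """Return True if the parts add up exactly to the total."""
    return sum(parts) == total


def percent(part, total):
    """Share of total, in percent, rounded to one decimal place."""
    return round(100 * part / total, 1)


print("conserved + remaining covers all unigenes:",
      partitions_total(total_unigenes, conserved, remaining))
print("identified + unknown covers all unigenes:",
      partitions_total(total_unigenes, identified, unknown))
print("share of unigenes longer than 1,000 bp (%):",
      percent(longer_than_1kb, total_unigenes))
```

    Both partitions add up to the reported total, which suggests the two pairs of figures describe complementary subsets of the same 41,179 unigenes.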

  7. Deep sequencing of the oral microbiome reveals signatures of periodontal disease.

    Directory of Open Access Journals (Sweden)

    Bo Liu

    The oral microbiome, the complex ecosystem of microbes inhabiting the human mouth, harbors several thousands of bacterial types. The proliferation of pathogenic bacteria within the mouth gives rise to periodontitis, an inflammatory disease known to also constitute a risk factor for cardiovascular disease. While much is known about individual species associated with pathogenesis, the system-level mechanisms underlying the transition from health to disease are still poorly understood. Through the sequencing of the 16S rRNA gene and of whole community DNA we provide a glimpse at the global genetic, metabolic, and ecological changes associated with periodontitis in 15 subgingival plaque samples, four from each of two periodontitis patients, and the remaining samples from three healthy individuals. We also demonstrate the power of whole-metagenome sequencing approaches in characterizing the genomes of key players in the oral microbiome, including an unculturable TM7 organism. We reveal the disease microbiome to be enriched in virulence factors, and adapted to a parasitic lifestyle that takes advantage of the disrupted host homeostasis. Furthermore, diseased samples share a common structure that was not found in completely healthy samples, suggesting that the disease state may occupy a narrow region within the space of possible configurations of the oral microbiome. Our pilot study demonstrates the power of high-throughput sequencing as a tool for understanding the role of the oral microbiome in periodontal disease. Despite a modest level of sequencing (~2 lanes Illumina 76 bp PE) and high human DNA contamination (up to ~90%), we were able to partially reconstruct several oral microbes and to preliminarily characterize some systems-level differences between the healthy and diseased oral microbiomes.

  8. High diversity of picornaviruses in rats from different continents revealed by deep sequencing.

    Science.gov (United States)

    Hansen, Thomas Arn; Mollerup, Sarah; Nguyen, Nam-Phuong; White, Nicole E; Coghlan, Megan; Alquezar-Planas, David E; Joshi, Tejal; Jensen, Randi Holm; Fridholm, Helena; Kjartansdóttir, Kristín Rós; Mourier, Tobias; Warnow, Tandy; Belsham, Graham J; Bunce, Michael; Willerslev, Eske; Nielsen, Lars Peter; Vinner, Lasse; Hansen, Anders Johannes

    2016-08-17

    Outbreaks of zoonotic diseases in humans and livestock are not uncommon, and an important component in containment of such emerging viral diseases is rapid and reliable diagnostics. Such methods are often PCR-based and hence require the availability of sequence data from the pathogen. Rattus norvegicus (R. norvegicus) is a known reservoir for important zoonotic pathogens. Transmission may be direct via contact with the animal, for example, through exposure to its faecal matter, or indirectly mediated by arthropod vectors. Here we investigated the viral content in rat faecal matter (n=29) collected from two continents by analyzing 2.2 billion next-generation sequencing reads derived from both DNA and RNA. Among other virus families, we found sequences from members of the Picornaviridae to be abundant in the microbiome of all the samples. Here we describe the diversity of the picornavirus-like contigs including near-full-length genomes closely related to the Boone cardiovirus and Theiler's encephalomyelitis virus. From this study, we conclude that picornaviruses within R. norvegicus are more diverse than previously recognized. The virome of R. norvegicus should be investigated further to assess the full potential for zoonotic virus transmission.

  9. High-throughput, high-fidelity HLA genotyping with deep sequencing.

    Science.gov (United States)

    Wang, Chunlin; Krishnakumar, Sujatha; Wilhelmy, Julie; Babrzadeh, Farbod; Stepanyan, Lilit; Su, Laura F; Levinson, Douglas; Fernandez-Viña, Marcelo A; Davis, Ronald W; Davis, Mark M; Mindrinos, Michael

    2012-05-29

    Human leukocyte antigen (HLA) genes are the most polymorphic in the human genome. They play a pivotal role in the immune response and have been implicated in numerous human pathologies, especially autoimmunity and infectious diseases. Despite their importance, however, they are rarely characterized comprehensively because of the prohibitive cost of standard technologies and the technical challenges of accurately discriminating between these highly related genes and their many alleles. Here we demonstrate a high-resolution and cost-effective methodology to type HLA genes by sequencing, which combines the advantage of long-range amplification, the power of high-throughput sequencing platforms, and a unique genotyping algorithm. We calibrated our method for HLA-A, -B, -C, and -DRB1 genes with both reference cell lines and clinical samples and identified several previously undescribed alleles with mismatches, insertions, and deletions. We have further demonstrated the utility of this method in a clinical setting by typing five clinical samples in an Illumina MiSeq instrument with a 5-d turnaround. Overall, this technology has the capacity to deliver low-cost, high-throughput, and accurate HLA typing by multiplexing thousands of samples in a single sequencing run, which will enable comprehensive disease-association studies with large cohorts. Furthermore, this approach can also be extended to include other polymorphic genes.

  10. New mutations in chronic lymphocytic leukemia identified by target enrichment and deep sequencing.

    Directory of Open Access Journals (Sweden)

    Elena Doménech

    Chronic lymphocytic leukemia (CLL) is a heterogeneous disease without a well-defined genetic alteration responsible for the onset of the disease. Several lines of evidence coincide in identifying stimulatory and growth signals delivered by B-cell receptor (BCR), and co-receptors together with NFkB pathway, as being the driving force in B-cell survival in CLL. However, the molecular mechanism responsible for this activation has not been identified. Based on the hypothesis that BCR activation may depend on somatic mutations of the BCR and related pathways we have performed a complete mutational screening of 301 selected genes associated with BCR signaling and related pathways using massive parallel sequencing technology in 10 CLL cases. Four mutated genes in coding regions (KRAS, SMARCA2, NFKBIE and PRKD3) have been confirmed by capillary sequencing. In conclusion, this study identifies new genes mutated in CLL, all of them in cases with progressive disease, and demonstrates that next-generation sequencing technologies applied to selected genes or pathways of interest are powerful tools for identifying novel mutational changes.

  11. New mutations in chronic lymphocytic leukemia identified by target enrichment and deep sequencing.

    Science.gov (United States)

    Doménech, Elena; Gómez-López, Gonzalo; Gzlez-Peña, Daniel; López, Mar; Herreros, Beatriz; Menezes, Juliane; Gómez-Lozano, Natalia; Carro, Angel; Graña, Osvaldo; Pisano, David G; Domínguez, Orlando; García-Marco, José A; Piris, Miguel A; Sánchez-Beato, Margarita

    2012-01-01

    Chronic lymphocytic leukemia (CLL) is a heterogeneous disease without a well-defined genetic alteration responsible for the onset of the disease. Several lines of evidence coincide in identifying stimulatory and growth signals delivered by B-cell receptor (BCR), and co-receptors together with NFkB pathway, as being the driving force in B-cell survival in CLL. However, the molecular mechanism responsible for this activation has not been identified. Based on the hypothesis that BCR activation may depend on somatic mutations of the BCR and related pathways we have performed a complete mutational screening of 301 selected genes associated with BCR signaling and related pathways using massive parallel sequencing technology in 10 CLL cases. Four mutated genes in coding regions (KRAS, SMARCA2, NFKBIE and PRKD3) have been confirmed by capillary sequencing. In conclusion, this study identifies new genes mutated in CLL, all of them in cases with progressive disease, and demonstrates that next-generation sequencing technologies applied to selected genes or pathways of interest are powerful tools for identifying novel mutational changes.

  12. Single nucleotide polymorphism discovery in rainbow trout by deep sequencing of a reduced representation library

    Directory of Open Access Journals (Sweden)

    Salem Mohamed

    2009-11-01

    Background: To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (SNP) discovery in salmonids. In those strategies, the salmonid semi-tetraploid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time consuming methodology of SNP identification in rainbow trout, therefore not suitable for high throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA) broodstock population. Results: The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads providing an average 6 fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses were genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. In
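
    The coverage and validation figures in this abstract follow from simple ratios, sketched below as a worked check. The variable names and the helper ratio are hypothetical, and the coverage estimate assumes that every read samples exactly one fragment end, which is a simplification; the abstract's "average 6 fold" may reflect filtering that is not described in this record.

```python
# Figures as quoted in the abstract.
reads = 2_000_000
fragment_ends = 300_000        # 150,000 fragments x 2 ends
snps_genotyped = 384
snps_validated = 183
snps_mapped = 167


def ratio(numerator, denominator):
    """Plain ratio with an explicit guard against a zero denominator."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return numerator / denominator


# Naive fold coverage: reads per fragment end (simplifying assumption).
coverage = ratio(reads, fragment_ends)
# Fraction of genotyped putative SNPs that were validated.
validation_rate = ratio(snps_validated, snps_genotyped)
# Fraction of validated SNPs placed on the linkage map.
mapped_fraction = ratio(snps_mapped, snps_validated)

print(f"naive coverage per fragment end: {coverage:.2f}x")
print(f"validation rate: {validation_rate:.1%}")
print(f"validated SNPs placed on the map: {mapped_fraction:.1%}")
```

    The naive coverage comes out slightly above the reported 6 fold, and the validation rate rounds to the reported 48%.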

  13. Discovering novel microRNAs and age-related nonlinear changes in rat brains using deep sequencing.

    Science.gov (United States)

    Yin, Lanxuan; Sun, Yubai; Wu, Jinfeng; Yan, Siyu; Deng, Zhenglu; Wang, Jun; Liao, Shenke; Yin, Dazhong; Li, Guolin

    2015-02-01

    Elucidating the molecular mechanisms of brain aging remains a significant challenge for biogerontologists. The discovery of gene regulation by microRNAs (miRNAs) has added a new dimension for examining this process; however, the full complement of miRNAs involved in brain aging is still not known. In this study, miRNA profiles of young, adult, and old rats were obtained to evaluate molecular changes during aging. High-throughput deep sequencing revealed 547 known and 171 candidate novel miRNAs that were differentially expressed among groups. Unexpectedly, miRNA expression did not decline progressively with advancing age; moreover, genes targeted by age-associated miRNAs were predicted to be involved in biological processes linked to aging and neurodegenerative diseases. These findings provide novel insight into the molecular mechanisms underlying brain aging and a resource for future studies on age-related brain disorders.

  14. Identification of microRNAs Involved in the Host Response to Enterovirus 71 Infection by a Deep Sequencing Approach

    Directory of Open Access Journals (Sweden)

    Lunbiao Cui

    2010-01-01

    The role of microRNA (miRNA) in pathogen-host interactions has been highlighted recently. To identify cellular miRNAs involved in the host response to enterovirus 71 (EV71) infection, we performed a comprehensive miRNA profiling in EV71-infected Hep2 cells through deep sequencing. 64 miRNAs were found whose expression levels changed by more than 2-fold in response to EV71 infection. Gene ontology analysis revealed that many of these mRNAs play roles in neurological process, immune response, and cell death pathways, which are known to be associated with the extreme virulence of EV71. To our knowledge, this is the first study on the alteration of host miRNA expression in response to EV71 infection. Our findings supported the hypothesis that certain miRNAs might be essential in the host-pathogen interactions.

  15. Deep sequencing of mRNA in CD24− and CD24+ mammary carcinoma Mvt1 cell line

    Directory of Open Access Journals (Sweden)

    Ran Rostoker

    2015-09-01

    CD24 is an anchored cell surface marker that is highly expressed in cancer cells (Lee et al., 2009) and its expression is associated with poorer outcome of cancer patients (Kristiansen et al., 2003). Phenotype comparison between two subpopulations derived from the Mvt1 cell line, CD24− cells (with no CD24 cell surface expression) and the CD24+ cells, identified high tumorigenic capacity for the CD24+ cells. In order to reveal the transcripts that support the CD24+ aggressive and invasive phenotype we compared the gene profiles of these two subpopulations. mRNA profiles of CD24− and CD24+ cells were generated by deep sequencing, in triplicate, using an Illumina HiSeq 2500. Here we provide a detailed description of the mRNA-seq analysis from our recent study (Rostoker et al., 2015). The mRNA-seq data have been deposited in the NCBI GEO database (accession number GSE68746).

  16. Qualitative and quantitative event-specific PCR detection methods for oxy-235 canola based on the 3' integration flanking sequence.

    Science.gov (United States)

    Yang, Litao; Guo, Jinchao; Zhang, Haibo; Liu, Jia; Zhang, Dabing

    2008-03-26

    As more genetically modified plant events are approved for commercialization worldwide, the event-specific PCR method has become the key method for genetically modified organism (GMO) identification and quantification. This study reveals the 3' flanking sequence of the exogenous integration of Oxy-235 canola employing thermal asymmetric interlaced PCR (TAIL-PCR). On the basis of the revealed 3' flanking sequence, PCR primers and TaqMan probe were designed and qualitative and quantitative PCR assays were established for Oxy-235 canola. The specificity and the limits of detection (LOD) and quantification (LOQ) of these two PCR assays were validated: the relative LOD of the qualitative PCR assay was as low as 0.1%, and the absolute LOD and LOQ of the quantitative PCR assay were as low as 10 and 20 copies of canola genomic DNA, respectively. Furthermore, ideal quantified results were obtained in the practical canola sample detection. All of the results indicate that the developed qualitative and quantitative PCR methods based on the revealed 3' integration flanking sequence are suitable for GM canola Oxy-235 identification and quantification.

  17. Deep transcriptome sequencing of Pecten maximus hemocytes: a genomic resource for bivalve immunology.

    Science.gov (United States)

    Pauletto, Marianna; Milan, Massimo; Moreira, Rebeca; Novoa, Beatriz; Figueras, Antonio; Babbucci, Massimiliano; Patarnello, Tomaso; Bargelloni, Luca

    2014-03-01

    Pecten maximus, the king scallop, is a bivalve species with important commercial value for both fisheries and aquaculture, traditionally consumed in several European countries. Major problems in larval rearing, however, still limit hatchery-based seed production. High mortalities during early larval stages, likely related to bacterial pathogens, represent the most relevant bottleneck. To address this issue, understanding host defense mechanisms against microbes is extremely important. In this study next-generation RNA-sequencing was carried out on scallop hemocytes. To enrich for immune-related transcripts, cDNA libraries from hemocytes challenged in vivo with inactivated-Vibrio anguillarum and in vitro with pathogen-associated molecular patterns, as well as unchallenged controls, were sequenced yielding 216,444,674 sequence reads. De novo assembly of the scallop hemocyte transcriptome consisted of 73,732 contigs (31% annotated). A total of 934 contigs encoded proteins with a known immune function, grouped into several functional categories. Particular attention was reserved to Toll-like receptors (TLRs), a family of pattern recognition receptors (PRRs) involved in non-self recognition. Through mining the scallop hemocyte transcriptome, at least four TLRs could be identified. The organization of canonical TLR domains demonstrated that single cysteine cluster and multiple cysteine cluster TLRs co-exist in this species. In addition, preliminary data concerning their mRNA level following bacterial challenge suggested that different members of this family could exhibit opposite responses to pathogenic stimuli. Finally, a global analysis of differential expression comparing gene-expression levels in in vitro and in vivo stimulated hemocytes against controls provided evidence on a large set of transcripts involved in the great scallop immune response.

  18. Insights into the genetic structure and diversity of 38 South Asian Indians from deep whole-genome sequencing.

    Directory of Open Access Journals (Sweden)

    Lai-Ping Wong

    2014-05-01

    South Asia possesses a significant amount of genetic diversity due to considerable intergroup differences in culture and language. There have been numerous reports on the genetic structure of Asian Indians, although these have mostly relied on genotyping microarrays or targeted sequencing of the mitochondria and Y chromosomes. Asian Indians in Singapore are primarily descendants of immigrants from Dravidian-language-speaking states in south India, and 38 individuals from the general population underwent deep whole-genome sequencing with a target coverage of 30X as part of the Singapore Sequencing Indian Project (SSIP). The genetic structure and diversity of these samples were compared against samples from the Singapore Sequencing Malay Project and populations in Phase 1 of the 1,000 Genomes Project (1 KGP). SSIP samples exhibited greater intra-population genetic diversity and possessed higher heterozygous-to-homozygous genotype ratio than other Asian populations. When compared against a panel of well-defined Asian Indians, the genetic makeup of the SSIP samples was closely related to South Indians. However, even though the SSIP samples clustered distinctly from the Europeans in the global population structure analysis with autosomal SNPs, eight samples were assigned to mitochondrial haplogroups that were predominantly present in Europeans and possessed higher European admixture than the remaining samples. An analysis of the relative relatedness between SSIP with two archaic hominins (Denisovan, Neanderthal) identified higher ancient admixture in East Asian populations than in SSIP. The data resource for these samples is publicly available and is expected to serve as a valuable complement to the South Asian samples in Phase 3 of 1 KGP.

  19. Deep sequencing of plant and animal DNA contained within traditional Chinese medicines reveals legality issues and health safety concerns.

    Directory of Open Access Journals (Sweden)

    Megan L Coghlan

    Traditional Chinese medicine (TCM) has been practiced for thousands of years, but only within the last few decades has its use become more widespread outside of Asia. Concerns continue to be raised about the efficacy, legality, and safety of many popular complementary alternative medicines, including TCMs. Ingredients of some TCMs are known to include derivatives of endangered, trade-restricted species of plants and animals, and therefore contravene the Convention on International Trade in Endangered Species (CITES) legislation. Chromatographic studies have detected the presence of heavy metals and plant toxins within some TCMs, and there are numerous cases of adverse reactions. It is in the interests of both biodiversity conservation and public safety that techniques are developed to screen medicinals like TCMs. Targeting both the p-loop region of the plastid trnL gene and the mitochondrial 16S ribosomal RNA gene, over 49,000 amplicon sequence reads were generated from 15 TCM samples presented in the form of powders, tablets, capsules, bile flakes, and herbal teas. Here we show that second-generation, high-throughput sequencing (HTS) of DNA represents an effective means to genetically audit organic ingredients within complex TCMs. Comparison of DNA sequence data to reference databases revealed the presence of 68 different plant families and included genera, such as Ephedra and Asarum, that are potentially toxic. Similarly, animal families were identified that include genera that are classified as vulnerable, endangered, or critically endangered, including Asiatic black bear (Ursus thibetanus) and Saiga antelope (Saiga tatarica). Bovidae, Cervidae, and Bufonidae DNA were also detected in many of the TCM samples and were rarely declared on the product packaging. This study demonstrates that deep sequencing via HTS is an efficient and cost-effective way to audit highly processed TCM products and will assist in monitoring their legality and safety

  20. Deep sequencing of plant and animal DNA contained within traditional Chinese medicines reveals legality issues and health safety concerns.

    Science.gov (United States)

    Coghlan, Megan L; Haile, James; Houston, Jayne; Murray, Dáithí C; White, Nicole E; Moolhuijzen, Paula; Bellgard, Matthew I; Bunce, Michael

    2012-01-01

    Traditional Chinese medicine (TCM) has been practiced for thousands of years, but only within the last few decades has its use become more widespread outside of Asia. Concerns continue to be raised about the efficacy, legality, and safety of many popular complementary alternative medicines, including TCMs. Ingredients of some TCMs are known to include derivatives of endangered, trade-restricted species of plants and animals, and therefore contravene the Convention on International Trade in Endangered Species (CITES) legislation. Chromatographic studies have detected the presence of heavy metals and plant toxins within some TCMs, and there are numerous cases of adverse reactions. It is in the interests of both biodiversity conservation and public safety that techniques are developed to screen medicinals like TCMs. Targeting both the p-loop region of the plastid trnL gene and the mitochondrial 16S ribosomal RNA gene, over 49,000 amplicon sequence reads were generated from 15 TCM samples presented in the form of powders, tablets, capsules, bile flakes, and herbal teas. Here we show that second-generation, high-throughput sequencing (HTS) of DNA represents an effective means to genetically audit organic ingredients within complex TCMs. Comparison of DNA sequence data to reference databases revealed the presence of 68 different plant families and included genera, such as Ephedra and Asarum, that are potentially toxic. Similarly, animal families were identified that include genera that are classified as vulnerable, endangered, or critically endangered, including Asiatic black bear (Ursus thibetanus) and Saiga antelope (Saiga tatarica). Bovidae, Cervidae, and Bufonidae DNA were also detected in many of the TCM samples and were rarely declared on the product packaging. This study demonstrates that deep sequencing via HTS is an efficient and cost-effective way to audit highly processed TCM products and will assist in monitoring their legality and safety especially when

  1. Identification of MicroRNAs in Helicoverpa armigera and Spodoptera litura Based on Deep Sequencing and Homology Analysis

    Science.gov (United States)

    Ge, Xie; Zhang, Yong; Jiang, Jianhao; Zhong, Yi; Yang, Xiaonan; Li, Zhiqian; Huang, Yongping; Tan, Anjiang

    2013-01-01

    The current identification of microRNAs (miRNAs) in insects is largely dependent on genome sequences. However, the lack of available genome sequences inhibits the identification of miRNAs in various insect species. In this study, we used a miRNA database of the silkworm Bombyx mori as a reference to identify miRNAs in Helicoverpa armigera and Spodoptera litura using deep sequencing and homology analysis. Because all three species belong to the Lepidoptera, the experiment produced reliable results. Our study identified 97 and 91 conserved miRNAs in H. armigera and S. litura, respectively. Using the genome of B. mori and BAC sequences of H. armigera as references, 1 novel miRNA and 8 novel miRNA candidates were identified in H. armigera, and 4 novel miRNA candidates were identified in S. litura. An evolutionary analysis revealed that most of the identified miRNAs were insect-specific, and more than 20 miRNAs were Lepidoptera-specific. The investigation of the expression patterns of miR-2a, miR-34, miR-2796-3p and miR-11 revealed their potential roles in insect development. miRNA target prediction revealed that conserved miRNA target sites exist in various genes in the 3 species. Conserved miRNA target sites for the Hsp90 gene among the 3 species were validated in the mammalian 293T cell line using a dual-luciferase reporter assay. Our study provides a new approach with which to identify miRNAs in insects lacking genome information and contributes to the functional analysis of insect miRNAs. PMID:23289012

  2. Deep sequencing of subseafloor eukaryotic rRNA reveals active Fungi across marine subsurface provinces.

    Directory of Open Access Journals (Sweden)

    William Orsi

    The deep marine subsurface is a vast habitat for microbial life where cells may live on geologic timescales. Because DNA in sediments may be preserved on long timescales, ribosomal RNA (rRNA) is suggested to be a proxy for the active fraction of a microbial community in the subsurface. During an investigation of eukaryotic 18S rRNA by amplicon pyrosequencing, unique profiles of Fungi were found across a range of marine subsurface provinces including ridge flanks, continental margins, and abyssal plains. Subseafloor fungal populations exhibit statistically significant correlations with total organic carbon (TOC), nitrate, sulfide, and dissolved inorganic carbon (DIC). These correlations are supported by terminal restriction length polymorphism (TRFLP) analyses of fungal rRNA. Geochemical correlations with fungal pyrosequencing and TRFLP data from this geographically broad sample set suggest environmental selection of active Fungi in the marine subsurface. Within the same dataset, ancient rRNA signatures were recovered from plants and diatoms in marine sediments ranging from 0.03 to 2.7 million years old, suggesting that rRNA from some eukaryotic taxa may be much more stable than previously considered in the marine subsurface.

  3. Deep sequencing uncovers numerous small RNAs on all four replicons of the plant pathogen Agrobacterium tumefaciens.

    Science.gov (United States)

    Wilms, Ina; Overlöper, Aaron; Nowrousian, Minou; Sharma, Cynthia M; Narberhaus, Franz

    2012-04-01

    Agrobacterium species are capable of interkingdom gene transfer between bacteria and plants. The genome of Agrobacterium tumefaciens consists of a circular and a linear chromosome, the At-plasmid and the Ti-plasmid, which harbors bacterial virulence genes required for tumor formation in plants. Little is known about promoter sequences and the small RNA (sRNA) repertoire of this and other α-proteobacteria. We used a differential RNA sequencing (dRNA-seq) approach to map transcriptional start sites of 388 annotated genes and operons. In addition, a total number of 228 sRNAs was revealed from all four Agrobacterium replicons. Twenty-two of these were confirmed by independent RNA gel blot analysis and several sRNAs were differentially expressed in response to growth media, growth phase, temperature or pH. One sRNA from the Ti-plasmid was massively induced under virulence conditions. The presence of 76 cis-antisense sRNAs, two of them on the reverse strand of virulence genes, suggests considerable antisense transcription in Agrobacterium. The information gained from this study provides a valuable reservoir for an in-depth understanding of sRNA-mediated regulation of the complex physiology and infection process of Agrobacterium.

  4. Evolutionary Relations of Hexanchiformes Deep-Sea Sharks Elucidated by Whole Mitochondrial Genome Sequences

    Directory of Open Access Journals (Sweden)

    Keiko Tanaka

    2013-01-01

    Hexanchiformes is regarded as a monophyletic taxon, but the morphological and genetic relationships between the five extant species within the order are still uncertain. In this study, we determined the whole mitochondrial DNA (mtDNA) sequences of seven sharks including representatives of the five Hexanchiformes, one squaliform, and one carcharhiniform and inferred the phylogenetic relationships among those species and 12 other Chondrichthyes (cartilaginous fishes) species for which the complete mitogenome is available. The monophyly of Hexanchiformes and its close relation with all other Squaliformes sharks were strongly supported by likelihood and Bayesian phylogenetic analysis of 13,749 aligned nucleotides of 13 protein coding genes and two rRNA genes that were derived from the whole mtDNA sequences of the 19 species. The phylogeny suggested that Hexanchiformes is in the superorder Squalomorphi, Chlamydoselachus anguineus (frilled shark) is the sister species to all other Hexanchiformes, and the relations within Hexanchiformes are well resolved as (Chlamydoselachus, (Notorynchus, (Heptranchias, (Hexanchus griseus, H. nakamurai)))). Based on our phylogeny, we discussed evolutionary scenarios of the jaw suspension mechanism and gill slit numbers that are significant features in the sharks.

  5. Revolution of nephrology research by deep sequencing: ChIP-seq and RNA-seq.

    Science.gov (United States)

    Mimura, Imari; Kanki, Yasuharu; Kodama, Tatsuhiko; Nangaku, Masaomi

    2014-01-01

    The recent and rapid advent of next-generation sequencing (NGS) has made this technology broadly available not only to researchers in various molecular and cellular biology fields but also to those in kidney disease. In this paper, we describe the usage of ChIP-seq (chromatin immunoprecipitation with sequencing) and RNA-seq for sample preparation and interpretation of raw data in the investigation of biological phenomenon in renal diseases. ChIP-seq identifies genome-wide transcriptional DNA-binding sites as well as histone modifications, which are known to regulate gene expression, in the intragenic as well as in the intergenic regions. With regard to RNA-seq, this process analyzes not only the expression level of mRNA but also splicing variants, non-coding RNA, and microRNA on a genome-wide scale. The combination of ChIP-seq and RNA-seq allows the clarification of novel transcriptional mechanisms, which have important roles in various kinds of diseases, including chronic kidney disease. The rapid development of these techniques requires an update on the latest information and methods of NGS. In this review, we highlight the merits and characteristics of ChIP-seq and RNA-seq and discuss the use of the genome-wide analysis in kidney disease.

  6. Identification of Hop stunt viroid infecting Citrus limon in China using small RNAs deep sequencing approach.

    Science.gov (United States)

    Su, Xiu; Fu, Shuai; Qian, Yajuan; Xu, Yi; Zhou, Xueping

    2015-07-07

    The advent of next generation sequencing technology has allowed for significant advances in plant virus discovery, particularly for identification of covert viruses and previously undescribed viruses. The Citrus limon Burm. f. (C. limon) is a small evergreen tree native to Asia, and it is a main lemon producing agricultural cultivar. China is the world's top lemon-producing nation. In this work, lemon samples were collected from southwestern China, where an unknown disease outbreak had caused huge losses in the lemon production industry. Using high-throughput pyrosequencing and the assembly of small RNAs, we showed that the Hop stunt viroid (HSVd) was present in a C. limon leaf sample. The majority of Hop stunt viroid derived siRNAs (HSVd-siRNAs) in C. limon were 21 nucleotides in length, and a nearly equal amount of HSVd-siRNAs originated from the plus-genomic RNA strand as from the complementary strand. A bias of HSVd-siRNAs toward sequences beginning with a 5'-Guanine was observed. Furthermore, hotspot analysis showed that a large amount of HSVd-siRNAs derived from the central and variant domains of the HSVd genome. Our results suggest that C. limon could set up a small RNA-mediated gene silencing response to Hop stunt viroid. Interestingly, based on bioinformatics analysis, our results also suggest that the large amounts of HSVd-siRNAs from central and variant domains might be involved in interference with host gene expression and affect symptom development.

  7. Computational approaches for the analysis of ncRNA through Deep Sequencing techniques

    Directory of Open Access Journals (Sweden)

    Dario Veneziano

    2015-06-01

    Full Text Available The majority of the human transcriptome is defined as non-coding RNA (ncRNA), since only a small fraction of human DNA encodes for proteins, as reported by the ENCODE project. Several distinct classes of ncRNAs, such as transfer RNA (tRNA), microRNA (miRNA), and long non-coding RNA (lncRNA), have been classified, each with its own three-dimensional folding and specific function. As ncRNAs are highly abundant in living organisms and have been discovered to play important roles in many biological processes, there has been an ever increasing need to investigate the entire ncRNAome in further unbiased detail. Recently, the advent of Next-Generation Sequencing (NGS) technologies has substantially increased the throughput of transcriptome studies, allowing an unprecedented investigation of ncRNAs, as regulatory pathways and novel functions involving ncRNAs are now also emerging. The huge amount of transcript data produced by NGS has progressively required the development and implementation of suitable bioinformatics workflows, complemented by knowledge-based approaches, to identify, classify, and evaluate the expression of hundreds of ncRNAs in normal and pathological states, such as cancer. In this mini-review, we present and discuss current bioinformatics advances in the development of such computational approaches to analyze and classify the non-coding RNA component of human transcriptome sequence data obtained from NGS technologies.

  8. Functional characterization of a monoclonal antibody epitope using a lambda phage display-deep sequencing platform

    Science.gov (United States)

    Domina, Maria; Lanza Cariccio, Veronica; Benfatto, Salvatore; Venza, Mario; Venza, Isabella; Borgogni, Erica; Castellino, Flora; Midiri, Angelina; Galbo, Roberta; Romeo, Letizia; Biondo, Carmelo; Masignani, Vega; Teti, Giuseppe; Felici, Franco; Beninati, Concetta

    2016-01-01

    We have recently described a method, named PROFILER, for the identification of antigenic regions preferentially targeted by polyclonal antibody responses after vaccination. To test the ability of the technique to provide insights into the functional properties of monoclonal antibody (mAb) epitopes, we used here a well-characterized epitope of meningococcal factor H binding protein (fHbp), which is recognized by mAb 12C1. An fHbp library, engineered on a lambda phage vector enabling surface expression of polypeptides of widely different length, was subjected to massive parallel sequencing of the phage inserts after affinity selection with the 12C1 mAb. We detected dozens of unique antibody-selected sequences, the most enriched of which (designated as FrC) could largely recapitulate the ability of fHbp to bind mAb 12C1. Computational analysis of the cumulative enrichment of single amino acids in the antibody-selected fragments identified two overrepresented stretches of residues (H248-K254 and S140-G154), whose presence was subsequently found to be required for binding of FrC to mAb 12C1. Collectively, these results suggest that the PROFILER technology can rapidly and reliably identify, in the context of complex conformational epitopes, discrete “hot spots” with a crucial role in antigen-antibody interactions, thereby providing useful clues for the functional characterization of the epitope. PMID:27530334

  9. Deciphering the molecular profile of plaques, memory decline and neuron loss in two mouse models for Alzheimer's disease by deep sequencing.

    Science.gov (United States)

    Bouter, Yvonne; Kacprowski, Tim; Weissmann, Robert; Dietrich, Katharina; Borgers, Henning; Brauß, Andreas; Sperling, Christian; Wirths, Oliver; Albrecht, Mario; Jensen, Lars R; Kuss, Andreas W; Bayer, Thomas A

    2014-01-01

    One of the central research questions on the etiology of Alzheimer's disease (AD) is the elucidation of the molecular signatures triggered by the amyloid cascade of pathological events. Next-generation sequencing allows the identification of genes involved in disease processes in an unbiased manner. We have combined this technique with the analysis of two AD mouse models: (1) The 5XFAD model develops early plaque formation, intraneuronal Aβ aggregation, neuron loss, and behavioral deficits. (2) The Tg4-42 model expresses N-truncated Aβ4-42 and develops neuron loss and behavioral deficits albeit without plaque formation. Our results show that learning and memory deficits in the Morris water maze and fear conditioning tasks in Tg4-42 mice at 12 months of age are similar to the deficits in 5XFAD animals. This suggested that comparative gene expression analysis between the models would allow the dissection of plaque-related and -unrelated disease relevant factors. Using deep sequencing differentially expressed genes (DEGs) were identified and subsequently verified by quantitative PCR. Nineteen DEGs were identified in pre-symptomatic young 5XFAD mice, and none in young Tg4-42 mice. In the aged cohort, 131 DEGs were found in 5XFAD and 56 DEGs in Tg4-42 mice. Many of the DEGs specific to the 5XFAD model belong to neuroinflammatory processes typically associated with plaques. Interestingly, 36 DEGs were identified in both mouse models indicating common disease pathways associated with behavioral deficits and neuron loss.

  10. Human F7 sequence is split into three deep clades that are related to FVII plasma levels.

    Science.gov (United States)

    Sabater-Lleal, Maria; Soria, José Manuel; Bertranpetit, Jaume; Almasy, Laura; Blangero, John; Fontcuberta, Jordi; Calafell, Francesc

    2006-02-01

    It is widely accepted that FVII levels are strongly, consistently, and independently related to cardiovascular risk. These levels are influenced by genetic and environmental factors. Among the genetic factors, only a limited number of polymorphisms in the F7 gene have been reported, and they explain only a small proportion of the genetic variability. Recently, we have accomplished the complete dissection of the F7 quantitative trait locus responsible for all of the genetic variability observed in FVII levels. Now, we present the thorough study of the haplotype organization of F7 DNA sequence variation among individuals and the evolutionary processes that produced this variation, by sequencing 15 kb of genomic DNA sequence from the F7 locus in 40 unrelated individuals (80 chromosomes) from the genetic analysis of idiopathic thrombophilia (GAIT) project as well as four non-human primate species. Our study revealed 49 polymorphisms, of which 39 SNPs were further considered. Genotyping of these DNA variations in the whole family-based GAIT sample helped resolve linkage phases, and a total of 37 distinct haplotypes were identified. Tajima's D was significantly positive in this sample, suggesting balancing selection. This parameter was a reflection of the phylogenetic structure of F7 haplotype, which was deeply split into three well-supported clades or haplogroups, suggesting that functional differences among F7 variants do not depend on a few single-site variations. Moreover, haplogroup 2 was associated with high FVII levels and haplogroup 3 with low levels. In this study, we have for the first time established a clear relation between genotypic variability structure and phenotypic variability of a particular quantitative trait involved in a complex disease.
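
    The abstract above reports a significantly positive Tajima's D. As a minimal sketch of what that statistic measures, the Python below computes D from a list of aligned haplotype strings using the standard Tajima (1989) formula. The function name `tajimas_d` and the toy haplotypes are illustrative assumptions, not the authors' pipeline or data. A sample split into deep clades gives many intermediate-frequency variants, so the mean pairwise difference exceeds the expectation from the number of segregating sites and D comes out positive.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for equal-length aligned haplotype strings (illustrative sketch)."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(
        1 for i in range(length) if len({h[i] for h in haplotypes}) > 1
    )
    if seg_sites == 0:
        return 0.0

    # pi: mean number of pairwise differences.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy data (hypothetical): two deeply split clades versus an excess of singletons.
deep_split = ["AAAA"] * 5 + ["TTTT"] * 5
singletons = ["AAAA"] * 6 + ["TAAA", "ATAA", "AATA", "AAAT"]
print(tajimas_d(deep_split), tajimas_d(singletons))
```

    The sign alone does not identify the cause: population structure or a recent bottleneck can also push D above zero, so the abstract's balancing-selection reading rests on the additional haplogroup evidence it describes.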

  11. Focused Evolution of HIV-1 Neutralizing Antibodies Revealed by Structures and Deep Sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Xueling; Zhou, Tongqing; Zhu, Jiang; Zhang, Baoshan; Georgiev, Ivelin; Wang, Charlene; Chen, Xuejun; Longo, Nancy S.; Louder, Mark; McKee, Krisha; O’Dell, Sijy; Perfetto, Stephen; Schmidt, Stephen D.; Shi, Wei; Wu, Lan; Yang, Yongping; Yang, Zhi-Yong; Yang, Zhongjia; Zhang, Zhenhai; Bonsignori, Mattia; Crump, John A.; Kapiga, Saidi H.; Sam, Noel E.; Haynes, Barton F.; Simek, Melissa; Burton, Dennis R.; Koff, Wayne C.; Doria-Rose, Nicole A.; Connors, Mark; Mullikin, James C.; Nabel, Gary J.; Roederer, Mario; Shapiro, Lawrence; Kwong, Peter D.; Mascola, John R. (Tumaini); (NIH); (Duke); (Kilimanjaro Repro.); (IAVI)

    2013-03-04

    Antibody VRC01 is a human immunoglobulin that neutralizes about 90% of HIV-1 isolates. To understand how such broadly neutralizing antibodies develop, we used x-ray crystallography and 454 pyrosequencing to characterize additional VRC01-like antibodies from HIV-1-infected individuals. Crystal structures revealed a convergent mode of binding for diverse antibodies to the same CD4-binding-site epitope. A functional genomics analysis of expressed heavy and light chains revealed common pathways of antibody-heavy chain maturation, confined to the IGHV1-2*02 lineage, involving dozens of somatic changes, and capable of pairing with different light chains. Broadly neutralizing HIV-1 immunity associated with VRC01-like antibodies thus involves the evolution of antibodies to a highly affinity-matured state required to recognize an invariant viral structure, with lineages defined from thousands of sequences providing a genetic roadmap of their development.

  12. Genome-wide analysis of SRSF10-regulated alternative splicing by deep sequencing of chicken transcriptome

    Directory of Open Access Journals (Sweden)

    Xuexia Zhou

    2014-12-01

    Full Text Available Splicing factor SRSF10 is known to function as a sequence-specific splicing activator that is capable of regulating alternative splicing both in vitro and in vivo. We recently used an RNA-seq approach coupled with bioinformatics analysis to identify the extensive splicing network regulated by SRSF10 in chicken cells. We found that SRSF10 promoted both exon inclusion and exclusion. Functionally, many of the SRSF10-verified alternative exons are linked to pathways of response to external stimulus. Here we describe in detail the experimental design, bioinformatics analysis and GO/pathway enrichment analysis of SRSF10-regulated genes to correspond with our data in the Gene Expression Omnibus with accession number GSE53354. Our data thus provide a resource for studying regulation of alternative splicing in vivo that underlines biological functions of splicing regulatory proteins in cells.

  13. Ultra-deep sequencing reveals the microRNA expression pattern of the human stomach.

    Directory of Open Access Journals (Sweden)

    Ândrea Ribeiro-dos-Santos

    Full Text Available BACKGROUND: While microRNAs (miRNAs) play important roles in tissue differentiation and in maintaining basal physiology, little is known about the miRNA expression levels in stomach tissue. Alterations in the miRNA profile can lead to cell deregulation, which can induce neoplasia. METHODOLOGY/PRINCIPAL FINDINGS: A small RNA library of stomach tissue was sequenced using high-throughput SOLiD sequencing technology. We obtained 261,274 quality reads with perfect matches to the human miRnome, and 42% of known miRNAs were identified. Digital Gene Expression profiling (DGE) was performed based on read abundance and showed that fifteen miRNAs were highly expressed in gastric tissue. Subsequently, the expression of these miRNAs was validated in 10 healthy individuals by RT-PCR, which showed a significant correlation of 83.97% (P<0.05). Six miRNAs showed a low variable pattern of expression (miR-29b, miR-29c, miR-19b, miR-31, miR-148a, miR-451) and could be considered part of the expression pattern of the healthy gastric tissue. CONCLUSIONS/SIGNIFICANCE: This study aimed to validate normal miRNA profiles of human gastric tissue to establish a reference profile for healthy individuals. Determining the regulatory processes acting in the stomach will be important in the fight against gastric cancer, which is the second-leading cause of cancer mortality worldwide.

  14. Deep sequencing of MYC DNA-binding sites in Burkitt lymphoma.

    Directory of Open Access Journals (Sweden)

    Volkhard Seitz

    Full Text Available BACKGROUND: MYC is a key transcription factor involved in central cellular processes such as regulation of the cell cycle, histone acetylation and ribosomal biogenesis. It is overexpressed in the majority of human tumors including aggressive B-cell lymphoma. Especially Burkitt lymphoma (BL) is a highlight example for MYC overexpression due to a chromosomal translocation involving the c-MYC gene. However, no genome-wide analysis of MYC-binding sites by chromatin immunoprecipitation (ChIP) followed by next generation sequencing (ChIP-Seq) has been conducted in BL so far. METHODOLOGY/PRINCIPAL FINDINGS: ChIP-Seq was performed on 5 BL cell lines with a MYC-specific antibody giving rise to 7,054 MYC-binding sites after bioinformatics analysis of a total of approx. 19 million sequence reads. In line with previous findings, binding sites accumulate in gene sets known to be involved in the cell cycle, ribosomal biogenesis, histone acetyltransferase and methyltransferase complexes demonstrating a regulatory role of MYC in these processes. Unexpectedly, MYC-binding sites also accumulate in many B-cell relevant genes. To assess the functional consequences of MYC binding, the ChIP-Seq data were supplemented with siRNA-mediated knock-downs of MYC in BL cell lines followed by gene expression profiling. Interestingly, amongst others, genes involved in the B-cell function were up-regulated in response to MYC silencing. CONCLUSION/SIGNIFICANCE: The 7,054 MYC-binding sites identified by our ChIP-Seq approach greatly extend the knowledge regarding MYC binding in BL and shed further light on the enormous complexity of the MYC regulatory network. Especially our observations that (i) many B-cell relevant genes are targeted by MYC and (ii) MYC down-regulation leads to an up-regulation of B-cell genes highlight an interesting aspect of BL biology.

  15. Deep sequencing and in silico analyses identify MYB-regulated gene networks and signaling pathways in pancreatic cancer.

    Science.gov (United States)

    Azim, Shafquat; Zubair, Haseeb; Srivastava, Sanjeev K; Bhardwaj, Arun; Zubair, Asif; Ahmad, Aamir; Singh, Seema; Khushman, Moh'd; Singh, Ajay P

    2016-06-29

    We have recently demonstrated that the transcription factor MYB can modulate several cancer-associated phenotypes in pancreatic cancer. In order to understand the molecular basis of these MYB-associated changes, we conducted deep-sequencing of transcriptome of MYB-overexpressing and -silenced pancreatic cancer cells, followed by in silico pathway analysis. We identified significant modulation of 774 genes upon MYB-silencing (p networks by in silico analysis. Further analyses placed genes in our RNA sequencing-generated dataset to several canonical signalling pathways, such as cell-cycle control, DNA-damage and -repair responses, p53 and HIF1α. Importantly, we observed downregulation of the pancreatic adenocarcinoma signaling pathway in MYB-silenced pancreatic cancer cells exhibiting suppression of EGFR and NF-κB. Decreased expression of EGFR and RELA was validated by both qPCR and immunoblotting and they were both shown to be under direct transcriptional control of MYB. These observations were further confirmed in a converse approach wherein MYB was overexpressed ectopically in a MYB-null pancreatic cancer cell line. Our findings thus suggest that MYB potentially regulates growth and genomic stability of pancreatic cancer cells via targeting complex gene networks and signaling pathways. Further in-depth functional studies are warranted to fully understand MYB signaling in pancreatic cancer.

  16. Identifying genomic changes associated with insecticide resistance in the dengue mosquito Aedes aegypti by deep targeted sequencing.

    Science.gov (United States)

    Faucon, Frederic; Dusfour, Isabelle; Gaude, Thierry; Navratil, Vincent; Boyer, Frederic; Chandre, Fabrice; Sirisopa, Patcharawan; Thanispong, Kanutcharee; Juntarajumnong, Waraporn; Poupardin, Rodolphe; Chareonviriyaphap, Theeraphap; Girod, Romain; Corbel, Vincent; Reynaud, Stephane; David, Jean-Philippe

    2015-09-01

    The capacity of mosquitoes to resist insecticides threatens the control of diseases such as dengue and malaria. Until alternative control tools are implemented, characterizing resistance mechanisms is crucial for managing resistance in natural populations. Insecticide biodegradation by detoxification enzymes is a common resistance mechanism; however, the genomic changes underlying this mechanism have rarely been identified, precluding individual resistance genotyping. In particular, the role of copy number variations (CNVs) and polymorphisms of detoxification enzymes have never been investigated at the genome level, although they can represent robust markers of metabolic resistance. In this context, we combined target enrichment with high-throughput sequencing for conducting the first comprehensive screening of gene amplifications and polymorphisms associated with insecticide resistance in mosquitoes. More than 760 candidate genes were captured and deep sequenced in several populations of the dengue mosquito Ae. aegypti displaying distinct genetic backgrounds and contrasted resistance levels to the insecticide deltamethrin. CNV analysis identified 41 gene amplifications associated with resistance, most affecting cytochrome P450s overtranscribed in resistant populations. Polymorphism analysis detected more than 30,000 variants and strong selection footprints in specific genomic regions. Combining Bayesian and allele frequency filtering approaches identified 55 nonsynonymous variants strongly associated with resistance. Both CNVs and polymorphisms were conserved within regions but differed across continents, confirming that genomic changes underlying metabolic resistance to insecticides are not universal. By identifying novel DNA markers of insecticide resistance, this study opens the way for tracking down metabolic changes developed by mosquitoes to resist insecticides within and among populations.

  17. Analysis of tumor heterogeneity and cancer gene networks using deep sequencing of MMTV-induced mouse mammary tumors.

    Directory of Open Access Journals (Sweden)

    Christiaan Klijn

    Full Text Available Cancer develops through a multistep process in which normal cells progress to malignant tumors via the evolution of their genomes as a result of the acquisition of mutations in cancer driver genes. The number, identity and mode of action of cancer driver genes, and how they contribute to tumor evolution, are largely unknown. This study deployed the Mouse Mammary Tumor Virus (MMTV) as an insertional mutagen to find both the driver genes and the networks in which they function. Using deep insertion site sequencing we identified around 31,000 retroviral integration sites in 604 MMTV-induced mammary tumors from mice with mammary gland-specific deletion of Trp53, Pten heterozygous knockout mice, or wildtype strains. We identified 18 known common integration sites (CISs) and 12 previously unknown CISs marking new candidate cancer genes. Members of the Wnt, Fgf, Fgfr, Rspo and Pdgfr gene families were commonly mutated in a mutually exclusive fashion. The sequence data we generated also yielded information on the clonality of insertions in individual tumors, allowing us to develop a data-driven model of MMTV-induced tumor development. Insertional mutations near Wnt and Fgf genes mark the earliest "initiating" events in MMTV-induced tumorigenesis, whereas Fgfr genes are targeted later during tumor progression. Our data show that insertional mutagenesis can be used to discover the mutational networks, the timing of mutations, and the genes that initiate and drive tumor evolution.

  18. Deep sequencing and proteomic analysis of the microRNA-induced silencing complex in human red blood cells.

    Science.gov (United States)

    Azzouzi, Imane; Moest, Hansjoerg; Wollscheid, Bernd; Schmugge, Markus; Eekels, Julia J M; Speer, Oliver

    2015-05-01

    During maturation, erythropoietic cells extrude their nuclei but retain their ability to respond to oxidant stress by tightly regulating protein translation. Several studies have reported microRNA-mediated regulation of translation during terminal stages of erythropoiesis, even after enucleation. In the present study, we performed a detailed examination of the endogenous microRNA machinery in human red blood cells using a combination of deep sequencing analysis of microRNAs and proteomic analysis of the microRNA-induced silencing complex. Among the 197 different microRNAs detected, miR-451a was the most abundant, representing more than 60% of all read sequences. In addition, miR-451a and its known target, 14-3-3ζ mRNA, were bound to the microRNA-induced silencing complex, implying their direct interaction in red blood cells. The proteomic characterization of endogenous Argonaute 2-associated microRNA-induced silencing complex revealed 26 cofactor candidates. Among these cofactors, we identified several RNA-binding proteins, as well as motor proteins and vesicular trafficking proteins. Our results demonstrate that red blood cells contain complex microRNA machinery, which might enable immature red blood cells to control protein translation independent of de novo nuclei information.

  19. Identifying genomic changes associated with insecticide resistance in the dengue mosquito Aedes aegypti by deep targeted sequencing

    Science.gov (United States)

    Faucon, Frederic; Dusfour, Isabelle; Gaude, Thierry; Navratil, Vincent; Boyer, Frederic; Chandre, Fabrice; Sirisopa, Patcharawan; Thanispong, Kanutcharee; Juntarajumnong, Waraporn; Poupardin, Rodolphe; Chareonviriyaphap, Theeraphap; Girod, Romain; Corbel, Vincent; Reynaud, Stephane; David, Jean-Philippe

    2015-01-01

    The capacity of mosquitoes to resist insecticides threatens the control of diseases such as dengue and malaria. Until alternative control tools are implemented, characterizing resistance mechanisms is crucial for managing resistance in natural populations. Insecticide biodegradation by detoxification enzymes is a common resistance mechanism; however, the genomic changes underlying this mechanism have rarely been identified, precluding individual resistance genotyping. In particular, the role of copy number variations (CNVs) and polymorphisms of detoxification enzymes have never been investigated at the genome level, although they can represent robust markers of metabolic resistance. In this context, we combined target enrichment with high-throughput sequencing for conducting the first comprehensive screening of gene amplifications and polymorphisms associated with insecticide resistance in mosquitoes. More than 760 candidate genes were captured and deep sequenced in several populations of the dengue mosquito Ae. aegypti displaying distinct genetic backgrounds and contrasted resistance levels to the insecticide deltamethrin. CNV analysis identified 41 gene amplifications associated with resistance, most affecting cytochrome P450s overtranscribed in resistant populations. Polymorphism analysis detected more than 30,000 variants and strong selection footprints in specific genomic regions. Combining Bayesian and allele frequency filtering approaches identified 55 nonsynonymous variants strongly associated with resistance. Both CNVs and polymorphisms were conserved within regions but differed across continents, confirming that genomic changes underlying metabolic resistance to insecticides are not universal. By identifying novel DNA markers of insecticide resistance, this study opens the way for tracking down metabolic changes developed by mosquitoes to resist insecticides within and among populations. PMID:26206155

  20. Deep sequencing reveals the complex and coordinated transcriptional regulation of genes related to grain quality in rice cultivars

    Directory of Open Access Journals (Sweden)

    An Gynheung

    2011-04-01

    Full Text Available Background: Milling yield and eating quality are two important grain quality traits in rice. To identify the genes involved in these two traits, we performed a deep transcriptional analysis of developing seeds using both massively parallel signature sequencing (MPSS) and sequencing-by-synthesis (SBS). Five MPSS and five SBS libraries were constructed from 6-day-old developing seeds of Cypress (high milling yield), LaGrue (low milling yield), Ilpumbyeo (high eating quality), YR15965 (low eating quality), and Nipponbare (control). Results: The transcriptomes revealed by MPSS and SBS had a high correlation coefficient (0.81 to 0.90), and about 70% of the transcripts were commonly identified in both types of the libraries. SBS, however, identified 30% more transcripts than MPSS. Among the highly expressed genes in Cypress and Ilpumbyeo, over 100 conserved cis regulatory elements were identified. Numerous specifically expressed transcription factor (TF) genes were identified in Cypress (282), LaGrue (312), Ilpumbyeo (363), YR15965 (260), and Nipponbare (357). Many key grain quality-related genes (i.e., genes involved in starch metabolism, aspartate amino acid metabolism, storage and allergenic protein synthesis, and seed maturation) that were expressed at high levels underwent alternative splicing and produced antisense transcripts either in Cypress or Ilpumbyeo. Further, a time course RT-PCR analysis confirmed a higher expression level of genes involved in starch metabolism such as those encoding ADP glucose pyrophosphorylase (AGPase) and granule bound starch synthase I (GBSS I) in Cypress than that in LaGrue during early seed development. Conclusion: This study represents the most comprehensive analysis of the developing seed transcriptome of rice available to date. Using two high throughput sequencing methods, we identified many differentially expressed genes that may affect milling yield or eating quality in rice. Many of the identified genes are involved

  1. Small RNA and transcriptome deep sequencing proffers insight into floral gene regulation in Rosa cultivars

    Directory of Open Access Journals (Sweden)

    Kim Jungeun

    2012-11-01

    Full Text Available Background: Roses (Rosa sp.), which belong to the family Rosaceae, are the most economically important ornamental plants, making up 30% of the floriculture market. However, given high demand for roses, rose breeding programs are limited in molecular resources which can greatly enhance and speed breeding efforts. A better understanding of important genes that contribute to important floral development and desired phenotypes will lead to improved rose cultivars. For this study, we analyzed rose miRNAs and the rose flower transcriptome in order to generate a database to expound upon current knowledge regarding regulation of important floral characteristics. A rose genetic database will enable comprehensive analysis of gene expression and regulation via miRNA among different Rosa cultivars. Results: We produced more than 0.5 million reads from expressed sequences, totalling more than 110 million bp. From these, we generated 35,657, 31,434, 34,725, and 39,722 flower unigenes from Rosa hybrid: ‘Vital’, ‘Maroussia’, and ‘Sympathy’ and Rosa rugosa Thunb., respectively. The unigenes were assigned functional annotations, domains, metabolic pathways, Gene Ontology (GO) terms, Plant Ontology (PO) terms, and MIPS Functional Catalogue (FunCat) terms. Rose flower transcripts were compared with genes from whole genome sequences of Rosaceae members (apple, strawberry, and peach) and grape. We also produced approximately 40 million small RNA reads from flower tissue for Rosa, representing 267 unique miRNA tags. Among identified miRNAs, 25 of them were novel and 242 of them were conserved miRNAs. Statistical analyses of miRNA profiles revealed both shared and species-specific miRNAs, which presumably affect flower development and phenotypes. Conclusions: In this study, we constructed a Rose miRNA and transcriptome database, and we analyzed the miRNAs and transcriptome generated from the flower tissues of four Rosa cultivars. The database provides a

  2. Deep Sequencing of HIV-Infected Cells: Insights into Nascent Transcription and Host-Directed Therapy

    Science.gov (United States)

    Peng, Xinxia; Sova, Pavel; Green, Richard R.; Thomas, Matthew J.; Korth, Marcus J.; Proll, Sean; Xu, Jiabao; Cheng, Yanbing; Yi, Kang; Chen, Li; Peng, Zhiyu; Wang, Jun; Palermo, Robert E.

    2014-01-01

    ABSTRACT Polyadenylated mature mRNAs are the focus of standard transcriptome analyses. However, the profiling of nascent transcripts, which often include nonpolyadenylated RNAs, can unveil novel insights into transcriptional regulation. Here, we separately sequenced total RNAs (Total RNAseq) and mRNAs (mRNAseq) from the same HIV-1-infected human CD4+ T cells. We found that many nonpolyadenylated RNAs were differentially expressed upon HIV-1 infection, and we identified 8 times more differentially expressed genes at 12 h postinfection by Total RNAseq than by mRNAseq. These expression changes were also evident by concurrent changes in introns and were recapitulated by later mRNA changes, revealing an unexpectedly significant delay between transcriptional initiation and mature mRNA production early after HIV-1 infection. We computationally derived and validated the underlying regulatory programs, and we predicted drugs capable of reversing these HIV-1-induced expression changes followed by experimental confirmation. Our results show that combined total and mRNA transcriptome analysis is essential for fully capturing the early host response to virus infection and provide a framework for identifying candidate drugs for host-directed therapy against HIV/AIDS. IMPORTANCE In this study, we used mass sequencing to identify genes differentially expressed in CD4+ T cells during HIV-1 infection. To our surprise, we found many differentially expressed genes early after infection by analyzing both newly transcribed unprocessed pre-mRNAs and fully processed mRNAs, but not by analyzing mRNAs alone, indicating a significant delay between transcription initiation and mRNA production early after HIV-1 infection. These results also show that important findings could be missed by the standard practice of analyzing mRNAs alone. We then derived the regulatory mechanisms driving the observed expression changes using integrative computational analyses. Further, we predicted drugs that

  3. The 2007 Nazko, British Columbia, earthquake sequence: Injection of magma deep in the crust beneath the Anahim volcanic belt

    Science.gov (United States)

    Cassidy, J.F.; Balfour, N.; Hickson, C.; Kao, H.; White, Rickie; Caplan-Auerbach, J.; Mazzotti, S.; Rogers, Gary C.; Al-Khoubbi, I.; Bird, A.L.; Esteban, L.; Kelman, M.; Hutchinson, J.; McCormack, D.

    2011-01-01

    On 9 October 2007, an unusual sequence of earthquakes began in central British Columbia about 20 km west of the Nazko cone, the most recent (circa 7200 yr) volcanic center in the Anahim volcanic belt. Within 25 hr, eight earthquakes of magnitude 2.3-2.9 occurred in a region where no earthquakes had previously been recorded. During the next three weeks, more than 800 microearthquakes were located (and many more detected), most at a depth of 25-31 km and within a radius of about 5 km. After about two months, almost all activity ceased. The clear P- and S-wave arrivals indicated that these were high-frequency (volcanic-tectonic) earthquakes and the b value of 1.9 that we calculated is anomalous for crustal earthquakes but consistent with volcanic-related events. Analysis of receiver functions at a station immediately above the seismicity indicated a Moho near 30 km depth. Precise relocation of the seismicity using a double-difference method suggested a horizontal migration at the rate of about 0.5 km/d, with almost all events within the lowermost crust. Neither harmonic tremor nor long-period events were observed; however, some spasmodic bursts were recorded and determined to be colocated with the earthquake hypocenters. These observations are all very similar to a deep earthquake sequence recorded beneath Lake Tahoe, California, in 2003-2004. Based on these remarkable similarities, we interpret the Nazko sequence as an indication of an injection of magma into the lower crust beneath the Anahim volcanic belt. This magma injection fractures rock, producing high-frequency, volcanic-tectonic earthquakes and spasmodic bursts.
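
    The abstract reports a b value of 1.9 but does not say how it was estimated. As an illustration only, the sketch below uses the Aki maximum-likelihood estimator for the Gutenberg-Richter b value, a common choice that is assumed here and not taken from the record. The function name, the completeness magnitude `m_c`, and the `bin_width` correction are hypothetical names introduced for this example.

```python
import math


def b_value_mle(magnitudes, m_c, bin_width=0.0):
    """Aki maximum-likelihood estimate of the Gutenberg-Richter b value.

    magnitudes: iterable of event magnitudes
    m_c:        assumed magnitude of completeness (events below it are ignored)
    bin_width:  magnitude rounding interval, used for the usual half-bin correction
    """
    selected = [m for m in magnitudes if m >= m_c]
    if len(selected) < 2:
        raise ValueError("need at least two events at or above m_c")
    mean_mag = sum(selected) / len(selected)
    denom = mean_mag - (m_c - bin_width / 2.0)
    if denom <= 0:
        raise ValueError("mean magnitude must exceed the corrected m_c")
    # b = log10(e) / (mean(M) - (m_c - bin_width / 2))
    return math.log10(math.e) / denom
```

    A catalogue dominated by small events has a mean magnitude close to `m_c`, which gives a small denominator and therefore a high b value such as the one reported above.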

  4. Systematic Analysis of Small RNAs Associated with Human Mitochondria by Deep Sequencing: Detailed Analysis of Mitochondrial Associated miRNA

    Science.gov (United States)

    Sripada, Lakshmi; Tomar, Dhanendra; Prajapati, Paresh; Singh, Rochika; Singh, Arun Kumar; Singh, Rajesh

    2012-01-01

    Mitochondria are one of the central regulators of many cellular processes beyond their well-established role in energy metabolism. Inter-organellar crosstalk is critical for the optimal function of mitochondria. Many nuclear-encoded proteins and RNAs are imported into mitochondria. The translocation of small RNA (sRNA), including miRNA, to mitochondria and other sub-cellular organelles is still not clear. Here we characterized sRNA, including miRNA, associated with human mitochondria by cellular fractionation and a deep sequencing approach. Mitochondria were purified from HEK293 and HeLa cells for RNA isolation. The sRNA library was generated and sequenced using the Illumina system. The analysis showed the presence of a unique population of sRNA associated with mitochondria, including miRNA. Putative novel miRNAs were characterized from unannotated sRNA sequences. The study showed the association of 428 known and 196 putative novel miRNAs with mitochondria of HEK293 cells, and of 327 known and 13 putative novel miRNAs with mitochondria of HeLa cells. The alignment of sRNA to the mitochondrial genome was also studied. The targets were analyzed using DAVID to classify them in unique networks using GO and KEGG tools. Analysis of the identified targets showed that miRNAs associated with mitochondria regulate critical cellular processes such as RNA turnover, apoptosis, cell cycle and nucleotide metabolism. The six miRNAs (counts >1000) associated with mitochondria of both HEK293 and HeLa cells were validated by RT-qPCR. To our knowledge, this is the first systematic study demonstrating the association of sRNA, including miRNA, with mitochondria that may regulate site-specific turnover of target mRNAs important for mitochondria-related functions. PMID:22984580

  5. Deep sequencing reveals microbiota dysbiosis of tongue coat in patients with liver carcinoma

    Science.gov (United States)

    Lu, Haifeng; Ren, Zhigang; Li, Ang; Zhang, Hua; Jiang, Jianwen; Xu, Shaoyan; Luo, Qixia; Zhou, Kai; Sun, Xiaoli; Zheng, Shusen; Li, Lanjuan

    2016-09-01

    Liver carcinoma (LC) is a common malignancy worldwide, associated with high morbidity and mortality. Characterizing microbiome profiles of the tongue coat may provide useful insights and a potential diagnostic marker for LC patients. Herein, we investigate for the first time the tongue coat microbiome of LC patients with cirrhosis based on 16S ribosomal RNA (rRNA) gene sequencing. After applying strict inclusion and exclusion criteria, 35 early LC patients with cirrhosis and 25 matched healthy subjects were enrolled. Microbiome diversity of the tongue coat in LC patients was significantly increased, as shown by the Shannon, Simpson and Chao 1 indexes. The tongue coat microbiome significantly distinguished LC patients from healthy subjects by principal component analysis. Tongue coat microbial profiles represented 38 operational taxonomic units assigned to 23 different genera, distinguishing LC patients. Linear discriminant analysis (LDA) effect size (LEfSe) revealed significant microbial dysbiosis of tongue coats in LC patients. Strikingly, Oribacterium and Fusobacterium could distinguish LC patients from healthy subjects. LEfSe outputs showed differences between LC patients and healthy subjects in microbial gene functions related to the categories of nickel/iron_transport, amino_acid_transport, energy-producing systems and metabolism. These findings are the first to identify microbiota dysbiosis of the tongue coat in LC patients and may provide a novel, non-invasive potential diagnostic biomarker of LC.
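
    The record names the Shannon, Simpson and Chao 1 indexes without defining them. The sketch below shows their textbook forms computed from a vector of per-OTU read counts. It is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, Shannon uses the natural logarithm, Simpson is given in its 1 - sum(p^2) form, and the input is assumed to contain at least one non-zero count.

```python
import math


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson diversity in the 1 - sum(p_i^2) form (higher means more diverse)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Chao 1 richness estimate from singleton (F1) and doubleton (F2) OTUs."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 > 0:
        return s_obs + f1 * f1 / (2.0 * f2)
    # Bias-corrected form, used when no doubletons are observed.
    return s_obs + f1 * (f1 - 1) / 2.0
```

    Shannon and Simpson respond to both richness and evenness, whereas Chao 1 estimates richness alone from the rarest OTUs, which is why studies often report all three together.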

  6. Deep near-IR variability survey of pre-main-sequence stars in Rho Ophiuchi

    CERN Document Server

    de Oliveira, Catarina Alves

    2008-01-01

    Variability is a common characteristic of pre-main-sequence stars (PMS). Near-IR variability surveys of young stellar objects (YSOs) can probe stellar and circumstellar environments and provide information about the dynamics of the ongoing magnetic and accretion processes. Furthermore, variability can be used as a tool to uncover new cluster members in star formation regions. We hope to achieve the deepest near-IR variability study of YSOs targeting the Rho Ophiuchi cluster. Fourteen epochs of observations were obtained with the Wide Field Camera (WFCAM) at the UKIRT telescope scheduled in a manner that allowed the study of variability on timescales of days, months, and years. Statistical tools, such as the multi-band cross correlation index and the reduced chi-square, were used to disentangle signals of variability from noise. Variability characteristics are compared to existing models of YSOs in order to relate them to physical processes, and then used to select new candidate members of this star-forming r...
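
    The abstract mentions the reduced chi-square as one of the statistics used to separate variability from noise. A minimal sketch of that statistic for a single-band light curve follows. The exact definition used in the survey is not given in the record, so the unweighted mean magnitude and the N - 1 degrees of freedom are assumptions, and the function name is hypothetical.

```python
def reduced_chi_square(mags, errs):
    """Reduced chi-square of a light curve against a constant-brightness model.

    mags: magnitudes measured at each epoch
    errs: 1-sigma photometric uncertainties for the same epochs
    Values near 1 are consistent with noise; values well above 1 suggest variability.
    """
    if len(mags) != len(errs):
        raise ValueError("mags and errs must have the same length")
    n = len(mags)
    if n < 2:
        raise ValueError("need at least two epochs")
    mean_mag = sum(mags) / n  # unweighted mean, an assumption of this sketch
    chi2 = sum(((m - mean_mag) / e) ** 2 for m, e in zip(mags, errs))
    return chi2 / (n - 1)
```

    With fourteen epochs, as in this survey, the statistic has thirteen degrees of freedom under these assumptions.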

  7. Genomic DNA sequences from mastodon and woolly mammoth reveal deep speciation of forest and savanna elephants.

    Directory of Open Access Journals (Sweden)

    Nadin Rohland

    To elucidate the history of living and extinct elephantids, we generated 39,763 bp of aligned nuclear DNA sequence across 375 loci for African savanna elephant, African forest elephant, Asian elephant, the extinct American mastodon, and the woolly mammoth. Our data establish that the Asian elephant is the closest living relative of the extinct mammoth in the nuclear genome, extending previous findings from mitochondrial DNA analyses. We also find that savanna and forest elephants, which some have argued are the same species, are as or more divergent in the nuclear genome as mammoths and Asian elephants, which are considered to be distinct genera, thus resolving a long-standing debate about the appropriate taxonomic classification of the African elephants. Finally, we document a much larger effective population size in forest elephants compared with the other elephantid taxa, likely reflecting species differences in ancient geographic structure and range and differences in life history traits such as variance in male reproductive success.

  8. Deep Sequencing Analyses of Low Density Microbial Communities: Working at the Boundary of Accurate Microbiota Detection

    Science.gov (United States)

    Biesbroek, Giske; Sanders, Elisabeth A. M.; Roeselers, Guus; Wang, Xinhui; Caspers, Martien P. M.; Trzciński, Krzysztof

    2012-01-01

    Introduction: Accurate analyses of microbiota composition of low-density communities (10^3–10^4 bacteria/sample) can be challenging. Background DNA from chemicals and consumables, extraction biases as well as differences in PCR efficiency can significantly interfere with microbiota assessment. This study aimed to establish protocols for accurate microbiota analysis at low microbial density. Methods: To examine possible effects of bacterial density on microbiota analyses we compared microbiota profiles of serially diluted saliva and low-density (nares, nasopharynx) and high-density (oropharynx) upper airway communities in four healthy individuals. DNA was extracted with four different extraction methods (Epicentre Masterpure, Qiagen DNeasy, Mobio Powersoil and a phenol bead-beating protocol combined with Agowa-Mag-mini). Bacterial DNA recovery was analysed by 16S qPCR and microbiota profiles through GS-FLX-Titanium-Sequencing of 16S rRNA gene amplicons spanning the V5–V7 regions. Results: Lower template concentrations significantly impacted microbiota profiling results. With higher dilutions, low abundant species were overrepresented. In samples of <10^5 bacteria per ml, e.g. DNA <1 pg/µl, microbiota profiling deviated from the original sample and other dilutions, showing a significant increase in the taxa Proteobacteria and decrease in Bacteroidetes. In similar low density samples, DNA extraction method determined if DNA levels were below or above 1 pg/µl and, together with lysis preferences per method, had profound impact on microbiota analyses in both relative abundance as well as representation of species. Conclusion: This study aimed to interpret microbiota analyses of low-density communities. Bacterial density seemed to interfere with microbiota analyses at <10^6 bacteria per ml or DNA <1 pg/µl. We therefore recommend this threshold for working with low density materials. This study underlines that bias reduction is crucial for adequate profiling of especially low-density bacterial communities. PMID:22412957

  9. Deep sequencing whole transcriptome exploration of the σE regulon in Neisseria meningitidis.

    Science.gov (United States)

    Huis in 't Veld, Robert Antonius Gerhardus; Willemsen, Antonius Marcellinus; van Kampen, Antonius Hubertus Cornelis; Bradley, Edward John; Baas, Frank; Pannekoek, Yvonne; van der Ende, Arie

    2011-01-01

    Bacteria live in an ever-changing environment and must alter protein expression promptly to adapt to these changes and survive. Specific response genes that are regulated by a subset of alternative σ(70)-like transcription factors have evolved in order to respond to this changing environment. Recently, we have described the existence of a σ(E) regulon including the anti-σ-factor MseR in the obligate human bacterial pathogen Neisseria meningitidis. To unravel the complete σ(E) regulon in N. meningitidis, we sequenced total RNA transcriptional content of wild type meningococci and compared it with that of mseR mutant cells (ΔmseR) in which σ(E) is highly expressed. Eleven coding genes and one non-coding gene were found to be differentially expressed between H44/76 wildtype and H44/76ΔmseR cells. Five of the 6 genes of the σ(E) operon, msrA/msrB, and the gene encoding a pepSY-associated TM helix family protein showed enhanced transcription, whilst aniA encoding a nitrite reductase and nspA encoding the vaccine candidate Neisserial surface protein A showed decreased transcription. Analysis of differential expression in IGRs showed enhanced transcription of a non-coding RNA molecule, identifying a σ(E) dependent small non-coding RNA. Together this constitutes the first complete exploration of an alternative σ-factor regulon in N. meningitidis. The results point to a relatively small regulon, indicative of a strictly defined response consistent with a relatively stable niche, the human throat, where N. meningitidis resides.

  10. Deep sequencing whole transcriptome exploration of the σE regulon in Neisseria meningitidis.

    Directory of Open Access Journals (Sweden)

    Robert Antonius Gerhardus Huis in 't Veld

    Bacteria live in an ever-changing environment and must alter protein expression promptly to adapt to these changes and survive. Specific response genes that are regulated by a subset of alternative σ(70)-like transcription factors have evolved in order to respond to this changing environment. Recently, we have described the existence of a σ(E) regulon including the anti-σ-factor MseR in the obligate human bacterial pathogen Neisseria meningitidis. To unravel the complete σ(E) regulon in N. meningitidis, we sequenced total RNA transcriptional content of wild type meningococci and compared it with that of mseR mutant cells (ΔmseR) in which σ(E) is highly expressed. Eleven coding genes and one non-coding gene were found to be differentially expressed between H44/76 wildtype and H44/76ΔmseR cells. Five of the 6 genes of the σ(E) operon, msrA/msrB, and the gene encoding a pepSY-associated TM helix family protein showed enhanced transcription, whilst aniA encoding a nitrite reductase and nspA encoding the vaccine candidate Neisserial surface protein A showed decreased transcription. Analysis of differential expression in IGRs showed enhanced transcription of a non-coding RNA molecule, identifying a σ(E) dependent small non-coding RNA. Together this constitutes the first complete exploration of an alternative σ-factor regulon in N. meningitidis. The results point to a relatively small regulon, indicative of a strictly defined response consistent with a relatively stable niche, the human throat, where N. meningitidis resides.

  11. Deep sequencing analyses of low density microbial communities: working at the boundary of accurate microbiota detection.

    Directory of Open Access Journals (Sweden)

    Giske Biesbroek

    INTRODUCTION: Accurate analyses of microbiota composition of low-density communities (10^3-10^4 bacteria/sample) can be challenging. Background DNA from chemicals and consumables, extraction biases as well as differences in PCR efficiency can significantly interfere with microbiota assessment. This study was aiming to establish protocols for accurate microbiota analysis at low microbial density. METHODS: To examine possible effects of bacterial density on microbiota analyses we compared microbiota profiles of serial diluted saliva and low (nares, nasopharynx) and high-density (oropharynx) upper airway communities in four healthy individuals. DNA was extracted with four different extraction methods (Epicentre Masterpure, Qiagen DNeasy, Mobio Powersoil and a phenol bead-beating protocol combined with Agowa-Mag-mini). Bacterial DNA recovery was analysed by 16S qPCR and microbiota profiles through GS-FLX-Titanium-Sequencing of 16S rRNA gene amplicons spanning the V5-V7 regions. RESULTS: Lower template concentrations significantly impacted microbiota profiling results. With higher dilutions, low abundant species were overrepresented. In samples of <10^5 bacteria per ml, e.g. DNA <1 pg/µl, microbiota profiling deviated from the original sample and other dilutions showing a significant increase in the taxa Proteobacteria and decrease in Bacteroidetes. In similar low density samples, DNA extraction method determined if DNA levels were below or above 1 pg/µl and, together with lysis preferences per method, had profound impact on microbiota analyses in both relative abundance as well as representation of species. CONCLUSION: This study aimed to interpret microbiota analyses of low-density communities. Bacterial density seemed to interfere with microbiota analyses at <10^6 bacteria per ml or DNA <1 pg/µl. We therefore recommend this threshold for working with low density materials. This study underlines that bias reduction is crucial for adequate

  12. Deep sequencing reveals small RNA characterization of invasive micropapillary carcinomas of the breast.

    Science.gov (United States)

    Li, Shuai; Yang, Cuicui; Zhai, Lili; Zhang, Wenwei; Yu, Jing; Gu, Feng; Lang, Ronggang; Fan, Yu; Gong, Meihua; Zhang, Xiuqing; Fu, Li

    2012-11-01

    Invasive micropapillary carcinoma (IMPC) is an uncommon histological type of breast cancer. IMPC has a special growth pattern and a more aggressive behavior than invasive ductal carcinomas of no special types (IDC-NSTs). microRNAs are a large class of non-coding RNAs involved in the regulation of various biological processes. Here, we analyzed the small RNA transcriptomes of five formalin-fixed paraffin-embedded (FFPE) pure IMPC samples and five FFPE IDC-NSTs samples by means of next-generation sequencing, generating a total of >170,000,000 clean reads. In an unsupervised cluster analysis, differently expressed miRNAs generated a tree with clear distinction between IMPC and IDC-NSTs classes. Paired fresh-frozen and FFPE specimens showed very similar miRNA expression profiles. By means of RT-qPCR, we further investigated miRNA expression in more IMPC (n = 22) and IDC-NSTs (n = 24) FFPE samples and found let-7b, miR-30c, miR-148a, miR-181a, miR-181a*, and miR-181b were significantly differently expressed between the two groups. We also elucidated several features of miRNA in these breast cancer tissues including 5' variability, miRNA editing, and 3' untemplated addition. Our findings will lead to further understanding of the invasive potency of IMPC and gain an insight into the diversity and complexity of small RNA molecules in breast cancer tissues.

  13. Antibody performance in ChIP-sequencing assays: From quality scores of public data sets to quantitative certification

    Science.gov (United States)

    Mendoza-Parra, Marco-Antonio; Saravaki, Vincent; Cholley, Pierre-Etienne; Blum, Matthias; Billoré, Benjamin; Gronemeyer, Hinrich

    2016-01-01

    We have established a certification system for antibodies to be used in chromatin immunoprecipitation assays coupled to massive parallel sequencing (ChIP-seq). This certification comprises a standardized ChIP procedure and the attribution of a numerical quality control indicator (QCi) to biological replicate experiments. The QCi computation is based on a universally applicable quality assessment that quantitates the global deviation of randomly sampled subsets of a ChIP-seq dataset from the original genome-aligned sequence reads. Comparison with a QCi database for >28,000 ChIP-seq assays was used to attribute quality grades (ranging from ‘AAA’ to ‘DDD’) to a given dataset. In the present report we used the numerical QC system to assess the factors influencing the quality of ChIP-seq assays, including the nature of the target, the sequencing depth and the commercial source of the antibody. We have used this approach specifically to certify mono- and polyclonal antibodies obtained from Active Motif directed against the histone modification marks H3K4me3, H3K27ac and H3K9ac for ChIP-seq. The antibodies received the grades AAA to BBC (www.ngs-qc.org). We propose to attribute such quantitative grading to all antibodies carrying the label “ChIP-seq grade”. PMID:27335635

  15. Deep Sea Coral voucher sequence dataset - Identification of deep-sea corals collected during the 2009 - 2014 West Coast Groundfish Bottom Trawl Survey

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Data for this project resides in the West Coast Groundfish Bottom Trawl Survey Database. Deep-sea corals are often components of trawling bycatch, though their...

  16. Draft Genome Sequence of Pseudoalteromonas sp. Strain XI10 Isolated from the Brine-Seawater Interface of Erba Deep in the Red Sea

    KAUST Repository

    Zhang, Guishan

    2016-03-10

    Pseudoalteromonas sp. strain XI10 was isolated from the brine-seawater interface of Erba Deep in the Red Sea, Saudi Arabia. Here, we present the draft genome sequence of strain XI10, a gammaproteobacterium that synthesizes polysaccharides for biofilm formation when grown in liquid culture.

  17. Draft Genome Sequences of Two Thiomicrospira Strains Isolated from the Brine-Seawater Interface of Kebrit Deep in the Red Sea

    KAUST Repository

    Zhang, Guishan

    2016-03-11

    Two Thiomicrospira strains, WB1 and XS5, were isolated from the Kebrit Deep brine-seawater interface in the Red Sea, Saudi Arabia. Here, we present the draft genome sequences of these gammaproteobacteria, which both produce sulfuric acid from thiosulfate in culture.

  18. Identification and expression profiling of Vigna mungo microRNAs from leaf small RNA transcriptome by deep sequencing.

    Science.gov (United States)

    Paul, Sujay; Kundu, Anirban; Pal, Amita

    2014-01-01

    MicroRNAs (miRNAs) represent a class of small non-coding RNA molecules that play a crucial role in post-transcriptional gene regulation. Several conserved and species-specific miRNAs have been characterized to date, predominantly from plant species whose genomes are well characterized. However, information on the variability of these regulatory RNAs in economically important but genetically less characterized crop species is limited. Vigna mungo is an important grain legume, which is grown primarily for its protein-rich edible seeds. miRNAs from this species have not been identified to date due to lack of genome sequence information. To identify miRNAs from V. mungo, a small RNA library was constructed from young leaves. High-throughput Illumina sequencing technology and bioinformatic analysis of the small RNA reads led to the identification of 66 miRNA loci represented by 45 conserved miRNAs belonging to 19 families and eight non-conserved miRNAs belonging to seven families. In addition, 13 novel miRNA candidates in V. mungo were identified. Expression patterns of selected conserved, non-conserved, and novel miRNA candidates have been demonstrated in leaf, stem, and root tissues by quantitative polymerase chain reaction, and potential target genes were predicted for most of the conserved miRNAs. This information offers genomic resources for better understanding of miRNA mediated post-transcriptional gene regulation.

  19. Identification and expression profiling of Vigna mungo microRNAs from leaf small RNA transcriptome by deep sequencing

    Institute of Scientific and Technical Information of China (English)

    Sujay Paul; Anirban Kundu; Amita Pal

    2014-01-01

    MicroRNAs (miRNAs) represent a class of small non-coding RNA molecules that play a crucial role in post-transcriptional gene regulation. Several conserved and species-specific miRNAs have been characterized to date, predominantly from plant species whose genomes are well characterized. However, information on the variability of these regulatory RNAs in economically important but genetically less characterized crop species is limited. Vigna mungo is an important grain legume, which is grown primarily for its protein-rich edible seeds. miRNAs from this species have not been identified to date due to lack of genome sequence information. To identify miRNAs from V. mungo, a small RNA library was constructed from young leaves. High-throughput Illumina sequencing technology and bioinformatic analysis of the small RNA reads led to the identification of 66 miRNA loci represented by 45 conserved miRNAs belonging to 19 families and eight non-conserved miRNAs belonging to seven families. In addition, 13 novel miRNA candidates in V. mungo were identified. Expression patterns of selected conserved, non-conserved, and novel miRNA candidates have been demonstrated in leaf, stem, and root tissues by quantitative polymerase chain reaction, and potential target genes were predicted for most of the conserved miRNAs. This information offers genomic resources for better understanding of miRNA mediated post-transcriptional gene regulation.

  20. Comparison and quantitative verification of mapping algorithms for whole genome bisulfite sequencing

    Science.gov (United States)

    Coupling bisulfite conversion with next-generation sequencing (Bisulfite-seq) enables genome-wide measurement of DNA methylation, but poses unique challenges for mapping. However, despite a proliferation of Bisulfite-seq mapping tools, no systematic comparison of their genomic coverage and quantitat...

  1. Refining transcriptional programs in kidney development by integration of deep RNA-sequencing and array-based spatial profiling

    Directory of Open Access Journals (Sweden)

    Rumballe Bree A

    2011-09-01

    Background: The developing mouse kidney is currently the best-characterized model of organogenesis at a transcriptional level. Detailed spatial maps have been generated for gene expression profiling combined with systematic in situ screening. These studies, however, fall short of capturing the transcriptional complexity arising from each locus due to the limited scope of microarray-based technology, which is largely based on "gene-centric" models. Results: To address this, the polyadenylated RNA and microRNA transcriptomes of the 15.5 dpc mouse kidney were profiled using strand-specific RNA-sequencing (RNA-Seq) to a depth sufficient to complement spatial maps from pre-existing microarray datasets. The transcriptional complexity of RNAs arising from mouse RefSeq loci was catalogued, including 3568 alternatively spliced transcripts and 532 uncharacterized alternate 3' UTRs. Antisense expression for 60% of RefSeq genes was also detected, including uncharacterized non-coding transcripts overlapping the kidney progenitor markers Six2 and Sall1, which were validated by section in situ hybridization. Analysis of genes known to be involved in kidney development, particularly during mesenchymal-to-epithelial transition, showed an enrichment of non-coding antisense transcripts extended along protein-coding RNAs. Conclusion: The resulting resource further refines the transcriptomic cartography of kidney organogenesis by integrating deep RNA sequencing data with locus-based information from previously published expression atlases. The added resolution of RNA-Seq has provided the basis for a transition from classical gene-centric models of kidney development towards more accurate and detailed "transcript-centric" representations, which highlight the extent of transcriptional complexity of genes that direct complex developmental events.

  2. Longitudinal copy number, whole exome and targeted deep sequencing of 'good risk' IGHV-mutated CLL patients with progressive disease.

    Science.gov (United States)

    Rose-Zerilli, M J J; Gibson, J; Wang, J; Tapper, W; Davis, Z; Parker, H; Larrayoz, M; McCarthy, H; Walewska, R; Forster, J; Gardiner, A; Steele, A J; Chelala, C; Ennis, S; Collins, A; Oakes, C C; Oscier, D G; Strefford, J C

    2016-06-01

    The biological features of IGHV-M chronic lymphocytic leukemia responsible for disease progression are still poorly understood. We undertook a longitudinal study close to diagnosis, pre-treatment and post relapse in 13 patients presenting with cMBL or Stage A disease and good-risk biomarkers (IGHV-M genes, no del(17p) or del(11q) and low CD38 expression) who nevertheless developed progressive disease, of whom 10 have required therapy. Using cytogenetics, fluorescence in situ hybridisation, genome-wide DNA methylation and copy number analysis together with whole exome, targeted deep- and Sanger sequencing at diagnosis, we identified mutations in established chronic lymphocytic leukemia driver genes in nine patients (69%), non-coding mutations (PAX5 enhancer region) in three patients and genomic complexity in two patients. Branching evolutionary trajectories predominated (n=9/13), revealing intra-tumoural epi- and genetic heterogeneity and sub-clonal competition before therapy. Of the patients subsequently requiring treatment, two had sub-clonal TP53 mutations that would not be detected by standard methodologies, three qualified for the very-low-risk category defined by integrated mutational and cytogenetic analysis and yet had established or putative driver mutations and one patient developed progressive, therapy-refractory disease associated with the emergence of an IGHV-U clone. These data suggest that extended genomic and immunogenetic screening may have clinical utility in patients with apparent good-risk disease.

  3. Complete characterization of the edited transcriptome of the mitochondrion of Physarum polycephalum using deep sequencing of RNA.

    Science.gov (United States)

    Bundschuh, R; Altmüller, J; Becker, C; Nürnberg, P; Gott, J M

    2011-08-01

    RNAs transcribed from the mitochondrial genome of Physarum polycephalum are heavily edited. The most prevalent editing event is the insertion of single Cs, with Us and dinucleotides also added at specific sites. The existence of insertional editing makes gene identification difficult and localization of editing sites has relied upon characterization of individual cDNAs. We have now determined the complete mitochondrial transcriptome of Physarum using Illumina deep sequencing of purified mitochondrial RNA. We report the first instances of A and G insertions and sites of partial and extragenic editing in Physarum mitochondrial RNAs, as well as an additional 772 C, U and dinucleotide insertions. The notable lack of antisense RNAs in our non-size selected, directional library argues strongly against an RNA-guided editing mechanism. Also of interest are our findings that sites of C to U changes are unedited at a significantly higher frequency than insertional editing sites and that substitutional editing of neighboring sites appears to be coupled. Finally, in addition to the characterization of RNAs from 17 predicted genes, our data identified nine new mitochondrial genes, four of which encode proteins that do not resemble other proteins in the database. Curiously, one of the latter mRNAs contains no editing sites.

  4. Deep sequencing of pyrethroid-resistant bed bugs reveals multiple mechanisms of resistance within a single population.

    Science.gov (United States)

    Adelman, Zach N; Kilcullen, Kathleen A; Koganemaru, Reina; Anderson, Michelle A E; Anderson, Troy D; Miller, Dini M

    2011-01-01

    A frightening resurgence of bed bug infestations has occurred over the last 10 years in the U.S. and current chemical methods have been inadequate for controlling this pest due to widespread insecticide resistance. Little is known about the mechanisms of resistance present in U.S. bed bug populations, making it extremely difficult to develop intelligent strategies for their control. We have identified bed bugs collected in Richmond, VA which exhibit both kdr-type (L925I) and metabolic resistance to pyrethroid insecticides. Using LD(50) bioassays, we determined that resistance ratios for Richmond strain bed bugs were ∼5200-fold to the insecticide deltamethrin. To identify metabolic genes potentially involved in the detoxification of pyrethroids, we performed deep-sequencing of the adult bed bug transcriptome, obtaining more than 2.5 million reads on the 454 titanium platform. Following assembly, analysis of newly identified gene transcripts in both Harlan (susceptible) and Richmond (resistant) bed bugs revealed several candidate cytochrome P450 and carboxylesterase genes which were significantly over-expressed in the resistant strain, consistent with the idea of increased metabolic resistance. These data will accelerate efforts to understand the biochemical basis for insecticide resistance in bed bugs, and provide molecular markers to assist in the surveillance of metabolic resistance.

  5. Deep sequencing of pyrethroid-resistant bed bugs reveals multiple mechanisms of resistance within a single population.

    Directory of Open Access Journals (Sweden)

    Zach N Adelman

    Full Text Available A frightening resurgence of bed bug infestations has occurred over the last 10 years in the U.S. and current chemical methods have been inadequate for controlling this pest due to widespread insecticide resistance. Little is known about the mechanisms of resistance present in U.S. bed bug populations, making it extremely difficult to develop intelligent strategies for their control. We have identified bed bugs collected in Richmond, VA which exhibit both kdr-type (L925I) and metabolic resistance to pyrethroid insecticides. Using LD(50) bioassays, we determined that resistance ratios for Richmond strain bed bugs were ∼5200-fold to the insecticide deltamethrin. To identify metabolic genes potentially involved in the detoxification of pyrethroids, we performed deep-sequencing of the adult bed bug transcriptome, obtaining more than 2.5 million reads on the 454 titanium platform. Following assembly, analysis of newly identified gene transcripts in both Harlan (susceptible) and Richmond (resistant) bed bugs revealed several candidate cytochrome P450 and carboxylesterase genes which were significantly over-expressed in the resistant strain, consistent with the idea of increased metabolic resistance. These data will accelerate efforts to understand the biochemical basis for insecticide resistance in bed bugs, and provide molecular markers to assist in the surveillance of metabolic resistance.

  6. Identification of microRNAs in two species of tomato, Solanum lycopersicum and Solanum habrochaites, by deep sequencing

    Institute of Scientific and Technical Information of China (English)

    FAN Shan-shan; LI Qian-nan; GUO Guang-jun; GAO Jian-chang; WANG Xiao-xuan; GUO Yan-mei; John C Snyder; DU Yong-chen

    2015-01-01

    MicroRNAs (miRNAs) are ~21 nucleotide (nt), endogenous RNAs that regulate gene expression in plants. Increasing evidence suggests that miRNAs play an important role in species-specific development in plants. However, the detailed miRNA profile divergence has not been performed among tomato species. In this study, the small RNA (sRNA) profiles of Solanum lycopersicum cultivar 9706 and Solanum habrochaites species PI 134417 were obtained by deep sequencing. Sixty-three known miRNA families were identified from these two species, of which 39 were common. Further miRNA profile comparison showed that 24 known non-conserved miRNA families were species-specific between these two tomato species. In addition, six conserved miRNA families displayed an apparent divergent expression pattern between the two tomato species. Our results suggested that species-specific, non-conserved miRNAs and divergent expression of conserved miRNAs might contribute to developmental changes and phenotypic variation between the two tomato species. Twenty new miRNAs were also identified in S. lycopersicum. This research significantly increases the number of known miRNA families in tomato and provides the first set of small RNAs in S. habrochaites. It also suggests that miRNAs have an important role in species-specific plant developmental regulation.

  7. Deep sequencing analysis reveals a TMV mutant with a poly(A) tract reduces host defense responses in Nicotiana benthamiana.

    Science.gov (United States)

    Guo, Song; Wong, Sek-Man

    2017-07-15

    Tobacco mosaic virus (TMV) possesses an upstream pseudoknotted domain (UPD), which is important for replication. After substituting the UPD with an internal poly(A) tract (43 nt), a mutant TMV-43A was constructed. TMV-43A replicated slower than TMV and induced a non-lethal mosaic symptom in Nicotiana benthamiana. In this study, deep sequencing was performed to detect the differences in small RNA profiles between TMV- and TMV-43A-infected N. benthamiana. The results showed that TMV-43A produced a smaller amount of virus-derived interfering RNAs (vsiRNAs) than TMV. However, the distributions of vsiRNA generation hotspots in TMV and TMV-43A were similar. Expression of genes related to small RNA biogenesis in TMV-43A-infected N. benthamiana was significantly lower than in TMV-infected plants, which leads to the generation of fewer vsiRNAs. The expression of host defense response genes was up-regulated after TMV infection, as compared to TMV-43A-infected plants. Host defense response to TMV-43A infection was lower than that to TMV. The absence of the UPD might contribute to the reduced host response to TMV-43A. Our study provides valuable information on the role of the UPD in eliciting host response genes after TMV infection in N. benthamiana. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Transmission Bottleneck Size Estimation from Pathogen Deep-Sequencing Data, with an Application to Human Influenza A Virus.

    Science.gov (United States)

    Sobel Leonard, Ashley; Weissman, Daniel B; Greenbaum, Benjamin; Ghedin, Elodie; Koelle, Katia

    2017-07-15

    The bottleneck governing infectious disease transmission describes the size of the pathogen population transferred from the donor to the recipient host. Accurate quantification of the bottleneck size is particularly important for rapidly evolving pathogens such as influenza virus, as narrow bottlenecks reduce the amount of transferred viral genetic diversity and, thus, may decrease the rate of viral adaptation. Previous studies have estimated bottleneck sizes governing viral transmission by using statistical analyses of variants identified in pathogen sequencing data. These analyses, however, did not account for variant calling thresholds and stochastic viral replication dynamics within recipient hosts. Because these factors can skew bottleneck size estimates, we introduce a new method for inferring bottleneck sizes that accounts for these factors. Through the use of a simulated data set, we first show that our method, based on beta-binomial sampling, accurately recovers transmission bottleneck sizes, whereas other methods fail to do so. We then apply our method to a data set of influenza A virus (IAV) infections for which viral deep-sequencing data from transmission pairs are available. We find that the IAV transmission bottleneck size estimates in this study are highly variable across transmission pairs, while the mean bottleneck size of 196 virions is consistent with a previous estimate for this data set. Furthermore, regression analysis shows a positive association between estimated bottleneck size and donor infection severity, as measured by temperature. These results support findings from experimental transmission studies showing that bottleneck sizes across transmission events can be variable and influenced in part by epidemiological factors. IMPORTANCE The transmission bottleneck size describes the size of the pathogen population transferred from the donor to the recipient host and may affect the rate of pathogen adaptation within host populations. Recent

  9. Transmission Bottleneck Size Estimation from Pathogen Deep-Sequencing Data, with an Application to Human Influenza A Virus

    Science.gov (United States)

    Sobel Leonard, Ashley; Weissman, Daniel B.; Greenbaum, Benjamin; Ghedin, Elodie

    2017-01-01

    ABSTRACT The bottleneck governing infectious disease transmission describes the size of the pathogen population transferred from the donor to the recipient host. Accurate quantification of the bottleneck size is particularly important for rapidly evolving pathogens such as influenza virus, as narrow bottlenecks reduce the amount of transferred viral genetic diversity and, thus, may decrease the rate of viral adaptation. Previous studies have estimated bottleneck sizes governing viral transmission by using statistical analyses of variants identified in pathogen sequencing data. These analyses, however, did not account for variant calling thresholds and stochastic viral replication dynamics within recipient hosts. Because these factors can skew bottleneck size estimates, we introduce a new method for inferring bottleneck sizes that accounts for these factors. Through the use of a simulated data set, we first show that our method, based on beta-binomial sampling, accurately recovers transmission bottleneck sizes, whereas other methods fail to do so. We then apply our method to a data set of influenza A virus (IAV) infections for which viral deep-sequencing data from transmission pairs are available. We find that the IAV transmission bottleneck size estimates in this study are highly variable across transmission pairs, while the mean bottleneck size of 196 virions is consistent with a previous estimate for this data set. Furthermore, regression analysis shows a positive association between estimated bottleneck size and donor infection severity, as measured by temperature. These results support findings from experimental transmission studies showing that bottleneck sizes across transmission events can be variable and influenced in part by epidemiological factors. 
    IMPORTANCE The transmission bottleneck size describes the size of the pathogen population transferred from the donor to the recipient host and may affect the rate of pathogen adaptation within host populations.

  10. RNA2DNAlign: nucleotide resolution allele asymmetries through quantitative assessment of RNA and DNA paired sequencing data.

    Science.gov (United States)

    Movassagh, Mercedeh; Alomran, Nawaf; Mudvari, Prakriti; Dede, Merve; Dede, Cem; Kowsari, Kamran; Restrepo, Paula; Cauley, Edmund; Bahl, Sonali; Li, Muzi; Waterhouse, Wesley; Tsaneva-Atanasova, Krasimira; Edwards, Nathan; Horvath, Anelia

    2016-12-15

    We introduce RNA2DNAlign, a computational framework for quantitative assessment of allele counts across paired RNA and DNA sequencing datasets. RNA2DNAlign is based on quantitation of the relative abundance of variant and reference read counts, followed by binomial tests for genotype and allelic status at SNV positions between compatible sequences. RNA2DNAlign detects positions with differential allele distribution, suggesting asymmetries due to regulatory/structural events. Based on the type of asymmetry, RNA2DNAlign outlines positions likely to be implicated in RNA editing, allele-specific expression or loss, somatic mutagenesis or loss-of-heterozygosity (the first three also in a tumor-specific setting). We applied RNA2DNAlign on 360 matching normal and tumor exomes and transcriptomes from 90 breast cancer patients from TCGA. Under high-confidence settings, RNA2DNAlign identified 2038 distinct SNV sites associated with one of the aforementioned asymmetries, the majority of which have not been linked to functionality before. The performance assessment shows very high specificity and sensitivity, due to the corroboration of signals across multiple matching datasets. RNA2DNAlign is freely available from http://github.com/HorvathLab/NGS as a self-contained binary package for 64-bit Linux systems.
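
    The abstract above mentions binomial tests on variant and reference read counts but does not give their details. As an illustration only, a generic exact two-sided binomial test can be sketched as below. The function names and the expected variant fraction of 0.5 (a heterozygous site with balanced expression) are assumptions made for this sketch, not RNA2DNAlign's actual implementation or thresholds.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k variant reads out of n when the variant fraction is p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_test_two_sided(variant_reads, total_reads, expected_fraction=0.5):
    """Exact two-sided p-value: total probability of all outcomes that are
    no more likely than the observed count under the expected fraction.

    expected_fraction=0.5 is an illustrative assumption (balanced heterozygous site).
    """
    observed = binom_pmf(variant_reads, total_reads, expected_fraction)
    tolerance = 1e-12  # guards against floating-point ties
    total = 0.0
    for k in range(total_reads + 1):
        prob = binom_pmf(k, total_reads, expected_fraction)
        if prob <= observed * (1 + tolerance):
            total += prob
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical counts: 18 of 20 RNA reads carry the variant allele.
    print(binom_test_two_sided(18, 20))
    # Hypothetical balanced site: 10 of 20 reads carry the variant allele.
    print(binom_test_two_sided(10, 20))
```

    A small p-value at a position that is heterozygous in DNA would point to an allelic asymmetry in RNA. Which class of event it indicates depends on comparisons across the matched datasets, which this sketch does not attempt.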

  11. Quantitative analysis of polycomb response elements (PREs) at identical genomic locations distinguishes contributions of PRE sequence and genomic environment

    Directory of Open Access Journals (Sweden)

    Okulski Helena

    2011-03-01

    Full Text Available Abstract Background Polycomb/Trithorax response elements (PREs) are cis-regulatory elements essential for the regulation of several hundred developmentally important genes. However, the precise sequence requirements for PRE function are not fully understood, and it is also unclear whether these elements all function in a similar manner. Drosophila PRE reporter assays typically rely on random integration by P-element insertion, but PREs are extremely sensitive to genomic position. Results We adapted the ΦC31 site-specific integration tool to enable systematic quantitative comparison of PREs and sequence variants at identical genomic locations. In this adaptation, a miniwhite (mw) reporter in combination with eye-pigment analysis gives a quantitative readout of PRE function. We compared the Hox PRE Frontabdominal-7 (Fab-7) with a PRE from the vestigial (vg) gene at four landing sites. The analysis revealed that the Fab-7 and vg PREs have fundamentally different properties, both in terms of their interaction with the genomic environment at each site and their inherent silencing abilities. Furthermore, we used the ΦC31 tool to examine the effect of deletions and mutations in the vg PRE, identifying a 106 bp region containing a previously predicted motif (GTGT) that is essential for silencing. Conclusions This analysis showed that different PREs have quantifiably different properties, and that changes in as few as four base pairs have profound effects on PRE function, thus illustrating the power and sensitivity of ΦC31 site-specific integration as a tool for the rapid and quantitative dissection of elements of PRE design.

  12. Quantitative assessment of hematopoietic chimerism by quantitative real-time polymerase chain reaction of sequence polymorphism systems after hematopoietic stem cell transplantation

    Institute of Scientific and Technical Information of China (English)

    QIN Xiao-ying; WANG Jing-zhi; ZHANG Xiao-hui; LI Jin-lan; LI Ling-di; LIU Kai-yan; HUANG Xiao-jun; LI Guo-xuan; QIN Ya-zhen; WANG Yu; WANG Feng-rong; LIU Dai-hong; XU Lan-ping; CHEN Huan; HAN Wei

    2011-01-01

    Background Analysis of changes in recipient and donor hematopoietic cell origin is extremely useful to monitor the effect of hematopoietic stem cell transplantation (HSCT) and sequential adoptive immunotherapy by donor lymphocyte infusions. We developed a sensitive, reliable and rapid real-time PCR method based on sequence polymorphism systems to quantitatively assess the hematopoietic chimerism after HSCT. Methods A panel of 29 selected sequence polymorphism (SP) markers was screened by real-time PCR in 101 HSCT patients with leukemia and other hematological diseases. The chimerism kinetics of bone marrow samples of 8 HSCT patients in remission and relapse situations were followed longitudinally. Results Recipient genotype discrimination was possible in 97.0% (98 of 101) with a mean number of 2.5 (1-7) informative markers per recipient/donor pair. Using serial dilutions of plasmids containing specific SP markers, the linear correlation (r) of 0.99, the slope between -3.2 and -3.7 and the sensitivity of 0.1% were proved reproducible. By this method, it was possible to very accurately detect autologous signals in the range from 0.1% to 30%. The accuracy of the method in the very important range of autologous signals below 5% was extraordinarily high (standard deviation <1.85%), which might significantly improve detection accuracy of changes in autologous signals early in the post-transplantation course of follow-up. The main advantage of the real-time PCR method over short tandem repeat PCR chimerism assays is the absence of PCR competition and plateau biases, with demonstrated greater sensitivity and linearity. Finally, we prospectively analyzed bone marrow samples of 8 patients who received allografts and presented the chimerism kinetics of remission and relapse situations that illustrated the sensitivity level and the promising clinical application of this method. Conclusion This SP-based real-time PCR assay provides a rapid, sensitive, and accurate quantitative
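
    The standard-curve slopes reported in the record above (between -3.2 and -3.7) can be read as amplification efficiencies through the usual relation E = 10^(-1/slope) - 1, where the slope comes from regressing threshold cycle (Ct) on log10 of template quantity. The sketch below is illustrative only: the function names and the dilution data are hypothetical and are not taken from the study.

```python
import math


def standard_curve_slope(log10_quantities, ct_values):
    """Least-squares slope of Ct against log10(template quantity)."""
    n = len(log10_quantities)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_quantities, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    return sxy / sxx


def amplification_efficiency(slope):
    """Efficiency E = 10**(-1/slope) - 1; E = 1.0 means perfect doubling per cycle."""
    if slope >= 0:
        raise ValueError("a qPCR standard curve is expected to have a negative slope")
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    # Hypothetical tenfold dilution series (log10 of relative quantity) and Ct values.
    log_q = [0.0, -1.0, -2.0, -3.0]
    ct = [20.0, 23.4, 26.8, 30.2]
    slope = standard_curve_slope(log_q, ct)
    print(round(slope, 2), round(amplification_efficiency(slope), 3))
    # The two slope bounds quoted in the abstract:
    for s in (-3.2, -3.7):
        print(s, round(amplification_efficiency(s), 3))
```

    Under this relation, a slope near -3.32 corresponds to doubling every cycle, so the quoted range brackets roughly 86% to 105% efficiency.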

  13. NNAlign: A Web-Based Prediction Method Allowing Non-Expert End-User Discovery of Sequence Motifs in Quantitative Peptide Data

    DEFF Research Database (Denmark)

    Andreatta, Massimo; Schafer-Nielsen, Claus; Lund, Ole

    2011-01-01

    to interpret large data sets. We have recently developed a method, NNAlign, which is generally applicable to any biological problem where quantitative peptide data is available. This method efficiently identifies underlying sequence patterns by simultaneously aligning peptide sequences and identifying motifs...

  14. Using whole-genome sequence data to predict quantitative trait phenotypes in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Ulrike Ober

    Full Text Available Predicting organismal phenotypes from genotype data is important for plant and animal breeding, medicine, and evolutionary biology. Genomic-based phenotype prediction has been applied for single-nucleotide polymorphism (SNP) genotyping platforms, but not using complete genome sequences. Here, we report genomic prediction for starvation stress resistance and startle response in Drosophila melanogaster, using ∼2.5 million SNPs determined by sequencing the Drosophila Genetic Reference Panel population of inbred lines. We constructed a genomic relationship matrix from the SNP data and used it in a genomic best linear unbiased prediction (GBLUP) model. We assessed predictive ability as the correlation between predicted genetic values and observed phenotypes by cross-validation, and found a predictive ability of 0.239±0.008 (0.230±0.012) for starvation resistance (startle response). The predictive ability of BayesB, a Bayesian method with internal SNP selection, was not greater than GBLUP. Selection of the 5% SNPs with either the highest absolute effect or variance explained did not improve predictive ability. Predictive ability decreased only when fewer than 150,000 SNPs were used to construct the genomic relationship matrix. We hypothesize that predictive power in this population stems from the SNP-based modeling of the subtle relationship structure caused by long-range linkage disequilibrium and not from population structure or SNPs in linkage disequilibrium with causal variants. We discuss the implications of these results for genomic prediction in other organisms.
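
    The record above does not state how its genomic relationship matrix was computed. One widely used construction (VanRaden's first method) centers the genotype codes by twice the allele frequency and scales by 2Σp(1-p). The sketch below assumes that construction and a 0/1/2 allele-count coding. The function name and the toy genotypes are hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """Build G = Z Z^T / (2 * sum_j p_j (1 - p_j)).

    genotypes: list of rows, one per line/individual, each a list of allele
    counts (0, 1 or 2) per SNP. Z is the genotype matrix centered by 2 * p_j,
    where p_j is the frequency of the counted allele at SNP j.
    Assumption: this follows VanRaden's first method, not necessarily the
    formula used in the study.
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n_ind) for j in range(n_snp)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all SNPs are monomorphic; G is undefined")
    centered = [[row[j] - 2.0 * freqs[j] for j in range(n_snp)] for row in genotypes]
    return [
        [sum(a * b for a, b in zip(centered[i], centered[k])) / scale for k in range(n_ind)]
        for i in range(n_ind)
    ]


if __name__ == "__main__":
    # Toy example: four fully inbred lines (only 0 or 2 copies) at two SNPs.
    toy = [[0, 2], [2, 0], [2, 2], [0, 0]]
    for row in genomic_relationship_matrix(toy):
        print(row)
```

    In a GBLUP model this matrix serves as the covariance structure of the random genetic effects. Fitting the mixed model itself is outside the scope of this sketch.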

  15. Transcriptome profiling analysis of Mactra veneriformis by deep sequencing after exposure to 2,2',4,4'-tetrabromodiphenyl ether

    Science.gov (United States)

    Shi, Pengju; Dong, Shihang; Zhang, Huanjun; Wang, Peiliang; Niu, Zhuang; Fang, Yan

    2017-06-01

    Polybrominated diphenyl ethers (PBDEs) are ubiquitous global pollutants, which are known to have immune, development, reproduction, and endocrine toxicity in aquatic organisms, including bivalves. 2,2',4,4'-Tetrabromodiphenyl ether (BDE-47) is the predominant PBDE congener detected in environmental samples and the tissues of organisms. However, the mechanism of its toxicity remains unclear. In this study, high-throughput sequencing was performed using the clam Mactra veneriformis, a good model for toxicological research, to clarify the transcriptomic response to BDE-47 and the mechanism responsible for the toxicity of BDE-47. The clams were exposed to 5 μg/L BDE-47 for 3 days and the digestive glands were sampled for high-throughput sequencing analysis. We obtained 127648, 154225, and 124985 unigenes by de novo assembly of the control group reads (CG), BDE-47 group reads (BDEG), and control and BDE-47 reads (CG & BDEG), respectively. We annotated 32176 unigenes from the CG & BDEG reads using the NR database. We categorized 24401 unigenes into 25 functional COG clusters and 21749 unigenes were assigned to 259 KEGG pathways. Moreover, 17625 differentially expressed genes (DEGs) were detected, with 10028 upregulated DEGs and 7597 downregulated DEGs. Functional enrichment analysis showed that the DEGs were involved with detoxification, antioxidant defense, immune response, apoptosis, and other functions. The mRNA expression levels of 26 DEGs were verified by quantitative real-time PCR, which demonstrated the high agreement between the two methods. These results provide a good basis for future research using the M. veneriformis model into the mechanism of PBDEs toxicity and molecular biomarkers for BDE-47 pollution. The regulation and interaction of the DEGs would be studied in the future for clarifying the mechanism of PBDEs toxicity.

  16. Structural parameterization and functional prediction of antigenic polypeptome sequences with biological activity through quantitative sequence-activity models (QSAM) by molecular electronegativity edge-distance vector (VMED)

    Institute of Scientific and Technical Information of China (English)

    LI ZhiLiang; WU ShiRong; CHEN ZeCong; YE Nancy; YANG ShengXi; LIAO ChunYang; ZHANG MengJun; YANG Li; MEI Hu; YANG Yan; ZHAO Na; ZHOU Yuan; ZHOU Ping; XIONG Qing; XU Hong; LIU ShuShen; LING ZiHua; CHEN Gang; LI GenRong

    2007-01-01

    Only from the primary structures of peptides, a new set of descriptors called the molecular electronegativity edge-distance vector (VMED) was proposed and applied to describing and characterizing the molecular structures of oligopeptides and polypeptides, based on the electronegativity of each atom or electronic charge index (ECI) of atomic clusters and the bonding distance between atom-pairs. Here, the molecular structures of antigenic polypeptides were well expressed in order to propose the automated technique for the computerized identification of helper T lymphocyte (Th) epitopes. Furthermore, a modified MED vector was proposed from the primary structures of polypeptides, based on the ECI and the relative bonding distance of the fundamental skeleton groups. The side-chains of each amino acid were here treated as a pseudo-atom. The developed VMED was easy to calculate and able to work. Some quantitative model was established for 28 immunogenic or antigenic polypeptides (AGPP) with 14 (1―14) Ad and 14 other restricted activities assigned as "1"(+) and "0"(-), respectively. The latter comprised 6 Ab(15-20), 3 Ak(21-23), 2 Ek(24-26), 2 H-2k(27 and 28) restricted sequences. Good results were obtained with 90% correct classification (only 2 wrong ones for 20 training samples) and 100% correct prediction (none wrong for 8 testing samples); while contrastively 100% correct classification (none wrong for 20 training samples) and 88% correct classification (1 wrong for 8 testing samples). Both stochastic samplings and cross validations were performed to demonstrate good performance. The described method may also be suitable for estimation and prediction of classes I and II for major histocompatibility antigen (MHC) epitope of human. It will be useful in immune identification and recognition of proteins and genes and in the design and development of subunit vaccines. Several quantitative structure activity relationship (QSAR) models were developed for various

  17. RNA deep sequencing reveals novel candidate genes and polymorphisms in boar testis and liver tissues with divergent androstenone levels.

    Directory of Open Access Journals (Sweden)

    Asep Gunawan

    Full Text Available Boar taint is an unpleasant smell and taste of pork meat derived from some entire male pigs. The main causes of boar taint are the two compounds androstenone (5α-androst-16-en-3-one) and skatole (3-methylindole). It is crucial to understand the genetic mechanism of boar taint to select pigs for lower androstenone levels and thus reduce boar taint. The aim of the present study was to investigate transcriptome differences in boar testis and liver tissues with divergent androstenone levels using RNA deep sequencing (RNA-Seq). The total number of reads produced for each testis and liver sample ranged from 13,221,550 to 33,206,723 and 12,755,487 to 46,050,468, respectively. In testis samples 46 genes were differentially regulated whereas 25 genes showed differential expression in the liver. The fold change values ranged from -4.68 to 2.90 in testis samples and -2.86 to 3.89 in liver samples. Differentially regulated genes in high androstenone testis and liver samples were enriched in metabolic processes such as lipid metabolism, small molecule biochemistry and molecular transport. This study provides evidence for transcriptome profile and gene polymorphisms of boars with divergent androstenone level using RNA-Seq technology. Digital gene expression analysis identified candidate genes in flavin monooxygenase family, cytochrome P450 family and hydroxysteroid dehydrogenase family. Moreover, polymorphism and association analysis revealed mutation in IRG6, MX1, IFIT2, CYP7A1, FMO5 and KRT18 genes could be potential candidate markers for androstenone levels in boars. Further studies are required for proving the role of candidate genes to be used in genomic selection against boar taint in pig breeding programs.

  18. A deep sequencing approach to comparatively analyze the transcriptome of lifecycle stages of the filarial worm, Brugia malayi.

    Directory of Open Access Journals (Sweden)

    Young-Jun Choi

    2011-12-01

    Full Text Available BACKGROUND: Developing intervention strategies for the control of parasitic nematodes continues to be a significant challenge. Genomic and post-genomic approaches play an increasingly important role for providing fundamental molecular information about these parasites, thus enhancing basic as well as translational research. Here we report a comprehensive genome-wide survey of the developmental transcriptome of the human filarial parasite Brugia malayi. METHODOLOGY/PRINCIPAL FINDINGS: Using deep sequencing, we profiled the transcriptome of eggs and embryos, immature (≤3 days of age) and mature microfilariae (MF), third- and fourth-stage larvae (L3 and L4), and adult male and female worms. Comparative analysis across these stages provided a detailed overview of the molecular repertoires that define and differentiate distinct lifecycle stages of the parasite. Genome-wide assessment of the overall transcriptional variability indicated that the cuticle collagen family and those implicated in molting exhibit noticeably dynamic stage-dependent patterns. Of particular interest was the identification of genes displaying sex-biased or germline-enriched profiles due to their potential involvement in reproductive processes. The study also revealed discrete transcriptional changes during larval development, namely those accompanying the maturation of MF and the L3 to L4 transition that are vital in establishing successful infection in mosquito vectors and vertebrate hosts, respectively. CONCLUSIONS/SIGNIFICANCE: Characterization of the transcriptional program of the parasite's lifecycle is an important step toward understanding the developmental processes required for the infectious cycle. We find that the transcriptional program has a number of stage-specific pathways activated during worm development. In addition to advancing our understanding of transcriptome dynamics, these data will aid in the study of genome structure and organization by facilitating

  19. MicroRNA deep sequencing reveals chamber-specific miR-208 family expression patterns in the human heart.

    Science.gov (United States)

    Kakimoto, Yu; Tanaka, Masayuki; Kamiguchi, Hiroshi; Hayashi, Hideki; Ochiai, Eriko; Osawa, Motoki

    2016-05-15

    Heart chamber-specific mRNA expression patterns have been extensively studied, and dynamic changes have been reported in many cardiovascular diseases. MicroRNAs (miRNAs) are also important regulators of normal cardiac development and functions that generally suppress gene expression at the posttranscriptional level. Recent focus has been placed on circulating miRNAs as potential biomarkers for cardiac disorders. However, miRNA expression levels in human normal hearts have not been thoroughly studied, and chamber-specific miRNA expression signatures in particular remain unclear. We performed miRNA deep sequencing on human paired left atria (LA) and ventricles (LV) under normal physiologic conditions. Among 438 miRNAs, miR-1 was the most abundant in both chambers, representing 21% of the miRNAs in LA and 26% in LV. A total of 25 miRNAs were differentially expressed between LA and LV; 14 were upregulated in LA, and 11 were highly expressed in LV. Notably, the miR-208 family in particular showed prominent chamber specificity; miR-208a-3p and miR-208a-5p were abundant in LA, whereas miR-208b-3p and miR-208b-5p were preferentially expressed in LV. Subsequent real-time polymerase chain reaction analysis validated the predominant expression of miR-208a in LA and miR-208b in LV. Human atrial and ventricular tissues display characteristic miRNA expression signatures under physiological conditions. Notably, miR-208a and miR-208b show significant chamber-specificity as do their host genes, α-MHC and β-MHC, which are mainly expressed in the atria and ventricles, respectively. These findings might also serve to enhance our understanding of cardiac miRNAs and various heart diseases. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  20. Deep sequencing of the murine Igh repertoire reveals complex regulation of non-random V gene rearrangement frequencies

    Science.gov (United States)

    Choi, Nancy M.; Loguercio, Salvatore; Verma-Gaur, Jiyoti; Degner, Stephanie C.; Torkamani, Ali; Su, Andrew I.; Oltz, Eugene M.; Artyomov, Maxim; Feeney, Ann J.

    2013-01-01

    A diverse antibody repertoire is formed through the rearrangement of V, D, and J segments at the immunoglobulin heavy chain (Igh) loci. The C57BL/6 murine Igh locus has over 100 functional VH gene segments that can recombine to a rearranged DJH. While the non-random usage of VH genes is well documented, it is not clear what elements determine recombination frequency. To answer this question we conducted deep sequencing of 5′-RACE products of the Igh repertoire in pro-B cells, amplified in an unbiased manner. ChIP-seq results for several histone modifications and RNA polymerase II binding, RNA-seq for sense and antisense non-coding germline transcripts, and proximity to CTCF and Rad21 sites were compared to the usage of individual V genes. Computational analyses assessed the relative importance of these various accessibility elements. These elements divide the Igh locus into four epigenetically and transcriptionally distinct domains, and our computational analyses reveal different regulatory mechanisms for each region. Proximal V genes are relatively devoid of active histone marks and non-coding RNA in general, but having a CTCF site near their RSS is critical, suggesting that being positioned near the base of the chromatin loops is important for rearrangement. In contrast, distal V genes have higher levels of histone marks and non-coding RNA, which may compensate for their poorer RSSs and for being distant from CTCF sites. Thus, the Igh locus has evolved a complex system for the regulation of V(D)J rearrangement that is different for each of the four domains that comprise this locus. PMID:23898036

  1. Modeling sequence-specific polymers using anisotropic coarse-grained sites allows quantitative comparison with experiment

    CERN Document Server

    Haxton, Thomas K; Zuckermann, Ronald N; Whitelam, Stephen

    2014-01-01

    Certain sequences of peptoid polymers (synthetic analogs of peptides) assemble into bilayer nanosheets via a nonequilibrium assembly pathway of adsorption, compression, and collapse at an air-water interface. As with other large-scale dynamic processes in biology and materials science, understanding the details of this supramolecular assembly process requires a modeling approach that captures behavior on a wide range of length and time scales, from those on which individual sidechains fluctuate to those on which assemblies of polymers evolve. Here we demonstrate that a new coarse-grained modeling approach is accurate and computationally efficient enough to do so. Our approach uses only a minimal number of coarse-grained sites, but retains independently fluctuating orientational degrees of freedom for each site. These orientational degrees of freedom allow us to accurately parameterize both bonded and nonbonded interactions, and to generate all-atom configurations with sufficient accuracy to perform atomic sca...

  2. Feasibility of noninvasive quantitative measurements of intrarenal R2′ in humans using an asymmetric spin echo echo planar imaging sequence.

    Science.gov (United States)

    Zhang, Xiaodong; Zhang, Yudong; Yang, Xuedong; Wang, Xiaoying; An, Hongyu; Zhang, Jue; Fang, Jing

    2013-01-01

    The purpose of this study was to demonstrate the feasibility of an asymmetric spin echo (ASE) single-shot echo planar imaging (EPI) sequence for the noninvasive quantitative measurement of intrarenal R2′ in humans within 20 s. The reproducibility of R2′ measurements with the ASE-EPI sequence was assessed in nine healthy young subjects in repeated studies conducted over three consecutive days. Moreover, we also evaluated whether the ASE-EPI sequence-measured R2′ reflected the intrarenal oxygenation changes induced by furosemide in another group of normal human subjects (n = 10). Different flow attenuation gradients (b = 0, 40 and 80 s/mm²) were utilized to examine the impact of the intravascular signal contribution on the estimation of intrarenal R2′. In the absence of flow dephasing gradients (b = 0 s/mm²), the computed coefficient of variation (CV) of R2′ was 21.31 ± 4.52%, and the estimated R2′ value decreased slightly, but not statistically significantly (p > 0.05), after the administration of furosemide in the medullary region. However, CV of R2′ was much smaller in the presence of flow dephasing gradients (9.68 ± 3.58% with b = 40 s/mm² and 10.50 ± 3.62% with b = 80 s/mm²). Moreover, a significant reduction in R2′ in the renal medulla was obtained (p R2′ measurements did not differ between the b = 40 s/mm² and b = 80 s/mm² scans, suggesting that small diffusion gradients were sufficient to minimize the intravascular signal contribution. In summary, we have demonstrated that renal R2′ can be obtained rapidly using an ASE-EPI sequence. The measurement was highly reproducible and reflected the expected intrarenal oxygenation changes induced by furosemide.
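    The reproducibility figures in this abstract are coefficients of variation. The abstract does not say which estimator was used, so the following is only a minimal sketch, assuming CV is the sample standard deviation of repeated measurements divided by their mean. The function name and the readings are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100.0


# Hypothetical repeated readings (1/s) for one subject on three consecutive days.
repeated_readings = [9.0, 10.0, 11.0]
cv_percent = coefficient_of_variation(repeated_readings)
```

    A lower CV across the repeated sessions corresponds to better day-to-day reproducibility, which is how the b = 40 and b = 80 results are compared with b = 0 above.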

  3. deepBase v2.0: identification, expression, evolution and function of small RNAs, LncRNAs and circular RNAs from deep-sequencing data.

    Science.gov (United States)

    Zheng, Ling-Ling; Li, Jun-Hao; Wu, Jie; Sun, Wen-Ju; Liu, Shun; Wang, Ze-Lin; Zhou, Hui; Yang, Jian-Hua; Qu, Liang-Hu

    2016-01-04

    Small non-coding RNAs (e.g. miRNAs) and long non-coding RNAs (e.g. lincRNAs and circRNAs) are emerging as key regulators of various cellular processes. However, only a very small fraction of these enigmatic RNAs have been well functionally characterized. In this study, we describe deepBase v2.0 (http://biocenter.sysu.edu.cn/deepBase/), an updated platform, to decode evolution, expression patterns and functions of diverse ncRNAs across 19 species. deepBase v2.0 has been updated to provide the most comprehensive collection of ncRNA-derived small RNAs generated from 588 sRNA-Seq datasets. Moreover, we developed a pipeline named lncSeeker to identify 176 680 high-confidence lncRNAs from 14 species. Temporal and spatial expression patterns of various ncRNAs were profiled. We identified approximately 24 280 primate-specific, 5193 rodent-specific lncRNAs, and 55 highly conserved lncRNA orthologs between human and zebrafish. We annotated 14 867 human circRNAs, 1260 of which are orthologous to mouse circRNAs. By combining expression profiles and functional genomic annotations, we developed lncFunction web-server to predict the function of lncRNAs based on protein-lncRNA co-expression networks. This study is expected to provide considerable resources to facilitate future experimental studies and to uncover ncRNA functions.

  4. Ultra-deep T cell receptor sequencing reveals the complexity and intratumour heterogeneity of T cell clones in renal cell carcinomas.

    Science.gov (United States)

    Gerlinger, Marco; Quezada, Sergio A; Peggs, Karl S; Furness, Andrew J S; Fisher, Rosalie; Marafioti, Teresa; Shende, Vishvesh H; McGranahan, Nicholas; Rowan, Andrew J; Hazell, Steven; Hamm, David; Robins, Harlan S; Pickering, Lisa; Gore, Martin; Nicol, David L; Larkin, James; Swanton, Charles

    2013-12-01

    The recognition of cancer cells by T cells can impact upon prognosis and be exploited for immunotherapeutic approaches. This recognition depends on the specific interaction between antigens displayed on the surface of cancer cells and the T cell receptor (TCR), which is generated by somatic rearrangements of TCR α- and β-chains (TCRb). Our aim was to assess whether ultra-deep sequencing of the rearranged TCRb in DNA extracted from unfractionated clear cell renal cell carcinoma (ccRCC) samples can provide insights into the clonality and heterogeneity of intratumoural T cells in ccRCCs, a tumour type that can display extensive genetic intratumour heterogeneity (ITH). For this purpose, DNA was extracted from two to four tumour regions from each of four primary ccRCCs and was analysed by ultra-deep TCR sequencing. In parallel, tumour infiltration by CD4, CD8 and Foxp3 regulatory T cells was evaluated by immunohistochemistry and correlated with TCR-sequencing data. A polyclonal T cell repertoire with 367-16 289 (median 2394) unique TCRb sequences was identified per tumour region. The frequencies of the 100 most abundant T cell clones/tumour were poorly correlated between most regions (Pearson correlation coefficient, -0.218 to 0.465). 3-93% of these T cell clones were not detectable across all regions. Thus, the clonal composition of T cell populations can be heterogeneous across different regions of the same ccRCC. T cell ITH was higher in tumours pretreated with an mTOR inhibitor, which could suggest that therapy can influence adaptive tumour immunity. These data show that ultra-deep TCR-sequencing technology can be applied directly to DNA extracted from unfractionated tumour samples, allowing novel insights into the clonality of T cell populations in cancers. These were polyclonal and displayed ITH in ccRCC. TCRb sequencing may shed light on mechanisms of cancer immunity and the efficacy of immunotherapy approaches.

  5. Ultra-Deep Sequencing Analysis of the Hepatitis A Virus 5'-Untranslated Region among Cases of the Same Outbreak from a Single Source

    Science.gov (United States)

    Wu, Shuang; Nakamoto, Shingo; Kanda, Tatsuo; Jiang, Xia; Nakamura, Masato; Miyamura, Tatsuo; Shirasawa, Hiroshi; Sugiura, Nobuyuki; Takahashi-Nakaguchi, Azusa; Gonoi, Tohru; Yokosuka, Osamu

    2014-01-01

    Hepatitis A virus (HAV) is a causative agent of acute viral hepatitis for which an effective vaccine has been developed. Here we describe ultra-deep pyrosequences (UDPSs) of HAV 5'-untranslated region (5'UTR) among cases of the same outbreak, which arose from a single source, associated with a revolving sushi bar. We determined the reference sequence from HAV-derived clone from an attendant by the Sanger method. Sixteen UDPSs from this outbreak and one from another sporadic case were compared with this reference. Nucleotide errors yielded a UDPS error rate of hepatitis A and HAV 5'UTR substitutions. It might be more interesting to perform ultra-deep sequencing of full length HAV genome in order to reveal possible unknown genomic determinants associated with disease severity. Further studies will be needed. PMID:24396287

  6. Quantitative and qualitative differences in celiac disease epitopes among durum wheat varieties identified through deep RNA-amplicon sequencing

    NARCIS (Netherlands)

    Salentijn, E.M.J.; Esselink, D.G.; Goryunova, S.V.; Meer, van der I.M.; Gilissen, L.J.W.J.; Smulders, M.J.M.

    2013-01-01

    Background - Wheat gluten is important for the industrial quality of bread wheat (Triticum aestivum L.) and durum wheat (T. turgidum L.). Gluten proteins are also the source of immunogenic peptides that can trigger a T cell reaction in celiac disease (CD) patients, leading to inflammatory responses

  7. Ultrasensitive quantitation of human papillomavirus type 16 E6 oncogene sequences by nested real time PCR

    Directory of Open Access Journals (Sweden)

    López-Revilla Rubén

    2010-05-01

    Background: We have developed an ultrasensitive method based on conventional PCR preamplification followed by nested amplification through real time PCR (qPCR) in the presence of the DNA intercalating agent EvaGreen. Results: Amplification mixtures calibrated with a known number of pHV101 copies carrying a 645 base pair (bp)-long insert of the human papillomavirus type 16 (HPV16) E6 oncogene were used to generate the E6-1 amplicon of 645 bp by conventional PCR and then the E6-2 amplicon of 237 bp by nested qPCR. Direct and nested qPCR mixtures for E6-2 amplification corresponding to 2.5 × 10^2 to 2.5 × 10^6 initial pHV101 copies had threshold cycle (Ct) values in the ranges of 18.7-29.0 and 10.0-25.0, respectively. The Ct of qPCR mixtures prepared with 1/50 volumes of preamplified mixtures containing 50 ng of DNA of the SiHa cell line (derived from an invasive cervical cancer with one HPV16 genome per cell) was 19.9. Thermal fluorescence extinction profiles of E6-2 amplicons generated from pHV101 and SiHa DNA were identical, with a peak at 85.5°C. Conclusions: Our method, based on conventional preamplification for 15 cycles, increased the sensitivity of nested qPCR for the quantitation of the E6 viral oncogene 10,750-fold and confirmed that the SiHa cell line contains one E6-HPV16 copy per cell.

  8. Complete Genome Sequence of the Hyperthermophilic Archaeon Pyrococcus sp. Strain ST04, Isolated from a Deep-Sea Hydrothermal Sulfide Chimney on the Juan de Fuca Ridge

    Science.gov (United States)

    Jung, Jong-Hyun; Lee, Ju-Hoon; Holden, James F.; Seo, Dong-Ho; Shin, Hakdong; Kim, Hae-Yeong; Kim, Wooki; Ryu, Sangryeol

    2012-01-01

    Pyrococcus sp. strain ST04 is a hyperthermophilic, anaerobic, and heterotrophic archaeon isolated from a deep-sea hydrothermal sulfide chimney on the Endeavour Segment of the Juan de Fuca Ridge in the northeastern Pacific Ocean. To further understand the distinct characteristics of this archaeon at the genome level (polysaccharide utilization at high temperature and ATP generation by a Na+ gradient), the genome of strain ST04 was completely sequenced and analyzed. Here, we present the complete genome sequence analysis results of Pyrococcus sp. ST04 and report the major findings from the genome annotation, with a focus on its saccharolytic and metabolite production potential. PMID:22843576

  9. Higher and lower-level relationships of the deep-sea fish order Alepocephaliformes (Teleostei: Otocephala) inferred from whole mitogenome sequences

    DEFF Research Database (Denmark)

    Poulsen, Jan Yde; Møller, Peter Rask; Lavoué, Sébastien

    2009-01-01

    Fishes of the order Alepocephaliformes, slickheads and tubeshoulders, constitute a group of deep-sea fishes poorly known in respect to most areas of their biology and systematics. Morphological studies have found alepocephaliform fishes to display a mosaic of synapomorphic and symplesiomorphic...... are alepocephaliforms and unambiguously aligned sequences were subjected to partitioned maximum likelihood and Bayesian analyses. Results from the present study support Alepocephaliformes as a genetically distinct otocephalan order as sister clade to Ostariophysi (mostly freshwater fishes comprising Gonorynchiformes...

  10. Using scores of amino acid topological descriptors for quantitative sequence-mobility modeling of peptides based on support vector machine

    Institute of Scientific and Technical Information of China (English)

    LIANG Guizhao; YANG Shanbin; ZHOU Yuan; ZHOU Peng; LI Zhiliang

    2006-01-01

    Scores of amino acid topological descriptors (SATD) derived from principal component analysis of a matrix of 1262 structural variables related to 23 amino acids were employed to express the structure of 125 peptides of different lengths. Quantitative sequence-mobility models (QSMMs) were constructed using partial least squares (PLS) and support vector machine (SVM), respectively. As new amino acid scales, SATD, which include plentiful information related to biological activity, were easily manipulated. SVM gave better results than PLS, which indicated that SVM offers robust stability and excellent predictive ability for electrophoretic mobilities. These results show that there is a wide prospect for the applications of SATD and SVM regression in QSMMs.

  11. Combining real-time PCR and next-generation DNA sequencing to provide quantitative comparisons of fungal aerosol populations

    Science.gov (United States)

    Dannemiller, Karen C.; Lang-Yona, Naama; Yamamoto, Naomichi; Rudich, Yinon; Peccia, Jordan

    2014-02-01

    We examined fungal communities associated with the PM10 mass of Rehovot, Israel outdoor air samples collected in the spring and fall seasons. Fungal communities were described by 454 pyrosequencing of the internal transcribed spacer (ITS) region of the fungal ribosomal RNA encoding gene. To allow for a more quantitative comparison of fungal exposure in humans, the relative abundance values of specific taxa were transformed to absolute concentrations through multiplying these values by the sample's total fungal spore concentration (derived from universal fungal qPCR). Next, the sequencing-based absolute concentrations for Alternaria alternata, Cladosporium cladosporioides, Epicoccum nigrum, and Penicillium/Aspergillus spp. were compared to taxon-specific qPCR concentrations for A. alternata, C. cladosporioides, E. nigrum, and Penicillium/Aspergillus spp. derived from the same spring and fall aerosol samples. Results of these comparisons showed that the absolute concentration values generated from pyrosequencing were strongly associated with the concentration values derived from taxon-specific qPCR (for all four species, p 0.70). The correlation coefficients were greater for species present in higher concentrations. Our microbial aerosol population analyses demonstrated that fungal diversity (number of fungal operational taxonomic units) was higher in the spring compared to the fall (p = 0.02), and principal coordinate analysis showed distinct seasonal differences in taxa distribution (ANOSIM p = 0.004). Among genera containing allergenic and/or pathogenic species, the absolute concentrations of Alternaria, Aspergillus, Fusarium, and Cladosporium were greater in the fall, while Cryptococcus, Penicillium, and Ulocladium concentrations were greater in the spring. The transformation of pyrosequencing fungal population relative abundance data to absolute concentrations can improve next-generation DNA sequencing-based quantitative aerosol exposure assessment.
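    The transformation described in this abstract multiplies each taxon's relative abundance by the sample's total fungal spore concentration. A minimal sketch of that arithmetic follows; the function name, the taxa fractions, and the total concentration are illustrative assumptions, not values from the study.

```python
def to_absolute_concentrations(relative_abundance, total_concentration):
    """Scale per-taxon relative abundances (fractions summing to 1) by a total concentration."""
    total_fraction = sum(relative_abundance.values())
    if abs(total_fraction - 1.0) > 1e-6:
        raise ValueError("relative abundances should sum to 1")
    return {
        taxon: fraction * total_concentration
        for taxon, fraction in relative_abundance.items()
    }


# Hypothetical sequencing-derived fractions for one aerosol sample.
relative = {
    "Alternaria alternata": 0.5,
    "Cladosporium cladosporioides": 0.25,
    "other": 0.25,
}
# Hypothetical total fungal spore concentration from universal fungal qPCR (spores per m^3).
total_spores_per_m3 = 1000.0

absolute = to_absolute_concentrations(relative, total_spores_per_m3)
```

    The resulting per-taxon values are in the same units as the qPCR total, which is what makes a direct comparison against taxon-specific qPCR concentrations possible.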

  12. An efficient strategy of screening for pathogens in wild-caught ticks and mosquitoes by reusing small RNA deep sequencing data.

    Directory of Open Access Journals (Sweden)

    Lu Zhuang

    This paper explored our hypothesis that the sRNA (18 ∼ 30 bp) deep sequencing technique can be used as an efficient strategy to identify microorganisms other than viruses, such as prokaryotic and eukaryotic pathogens. In the study, the clean reads derived from the sRNA deep sequencing data of wild-caught ticks and mosquitoes were compared against the NCBI nucleotide collection (non-redundant nt) database using Blastn. The blast results were then analyzed with in-house Python scripts. An empirical formula was proposed to identify the putative pathogens. Results showed that not only viruses but also prokaryotic and eukaryotic species of interest can be screened out, and these were subsequently confirmed with experiments. Specifically, a novel Rickettsia spp. was indicated to exist in Haemaphysalis longicornis ticks collected in Beijing. Our study demonstrated that the reuse of sRNA deep sequencing data has the potential to trace the origin of pathogens or discover novel agents of emerging/re-emerging infectious diseases.

  13. Meta-analysis of quantitative pleiotropic traits for next-generation sequencing with multivariate functional linear models.

    Science.gov (United States)

    Chiu, Chi-Yang; Jung, Jeesun; Chen, Wei; Weeks, Daniel E; Ren, Haobo; Boehnke, Michael; Amos, Christopher I; Liu, Aiyi; Mills, James L; Ting Lee, Mei-Ling; Xiong, Momiao; Fan, Ruzong

    2017-02-01

    To analyze next-generation sequencing data, multivariate functional linear models are developed for a meta-analysis of multiple studies to connect genetic variant data to multiple quantitative traits adjusting for covariates. The goal is to take advantage of both meta-analysis and pleiotropic analysis in order to improve power and to carry out a unified association analysis of multiple studies and multiple traits of complex disorders. Three types of approximate F-distributions based on Pillai-Bartlett trace, Hotelling-Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants. Simulation analysis is performed to evaluate false-positive rates and power of the proposed tests. The proposed methods are applied to analyze lipid traits in eight European cohorts. It is shown that it is more advantageous to perform multivariate analysis than univariate analysis in general, and it is more advantageous to perform meta-analysis of multiple studies instead of analyzing the individual studies separately. The proposed models require individual observations. The value of the current paper can be seen at least for two reasons: (a) the proposed methods can be applied to studies that have individual genotype data; (b) the proposed methods can be used as a criterion for future work that uses summary statistics to build test statistics to meta-analyze the data.
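    The three multivariate statistics named in this abstract have standard textbook definitions in terms of a hypothesis sums-of-squares matrix H and an error matrix E. The sketch below computes only those statistics for the two-trait (2 × 2) case in plain Python. It does not reproduce the paper's functional linear models or its approximate F-distributions, and the matrices and helper names are hypothetical.

```python
def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2x2 matrix."""
    d = det2(m)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def add2(a, b):
    """Element-wise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def trace2(m):
    """Trace of a 2x2 matrix."""
    return m[0][0] + m[1][1]


def manova_statistics(h, e):
    """Pillai-Bartlett trace, Hotelling-Lawley trace and Wilks's Lambda from H and E."""
    total = add2(h, e)
    return {
        "pillai_bartlett_trace": trace2(matmul2(h, inv2(total))),  # tr(H (H+E)^-1)
        "hotelling_lawley_trace": trace2(matmul2(h, inv2(e))),     # tr(H E^-1)
        "wilks_lambda": det2(e) / det2(total),                     # |E| / |H+E|
    }


# Hypothetical hypothesis (H) and error (E) matrices for two quantitative traits.
H = [[2.0, 0.0], [0.0, 1.0]]
E = [[2.0, 0.0], [0.0, 3.0]]
stats = manova_statistics(H, E)
```

    Larger Pillai-Bartlett and Hotelling-Lawley traces, and a smaller Wilks's Lambda, indicate stronger evidence of association; each is then referred to its own approximate F-distribution.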

  14. Discovery of MicroRNAs associated with myogenesis by deep sequencing of serial developmental skeletal muscles in pigs.

    Directory of Open Access Journals (Sweden)

    Xinhua Hou

    MicroRNAs (miRNAs) are short, single-stranded non-coding RNAs that repress their target genes by binding their 3' UTRs. These RNAs play critical roles in myogenesis. To gain knowledge about miRNAs involved in the regulation of myogenesis, porcine longissimus muscles were collected from 18 developmental stages (33-, 40-, 45-, 50-, 55-, 60-, 65-, 70-, 75-, 80-, 85-, 90-, 95-, 100- and 105-day post-gestation fetuses, 0 and 10-day postnatal piglets and adult pigs) to identify miRNAs using Solexa sequencing technology. We detected 197 known miRNAs and 78 novel miRNAs according to comparison with known miRNAs in the miRBase (release 17.0) database. Moreover, variations in sequence length and single nucleotide polymorphisms were also observed in 110 known miRNAs. Expression analysis of the 11 most abundant miRNAs was conducted using quantitative PCR (qPCR) in eleven tissues (longissimus muscles, leg muscles, heart, liver, spleen, lung, kidney, stomach, small intestine and colon), and the results revealed that ssc-miR-378, ssc-miR-1 and ssc-miR-206 were abundantly expressed in skeletal muscles. During skeletal muscle development, the expression level of ssc-miR-378 was low at 33 days post-coitus (dpc), increased at 65 and 90 dpc, peaked at postnatal day 0, and finally declined and maintained a comparatively stable level. This expression profile suggested that ssc-miR-378 was a new candidate miRNA for myogenesis and participated in skeletal muscle development in pigs. Target prediction and KEGG pathway analysis suggested that bone morphogenetic protein 2 (BMP2) and mitogen-activated protein kinase 1 (MAPK1), both of which were relevant to proliferation and differentiation, might be the potential targets of miR-378. Luciferase activities of reporter vectors containing the 3'UTR of porcine BMP2 or MAPK1 were downregulated by miR-378, which suggested that miR-378 probably regulated myogenesis through the regulation of these two genes.

  15. Quantitative evaluation of Porphyromonas gingivalis before and after non-surgical periodontal treatment in deep pockets of patients with aggressive periodontitis

    Directory of Open Access Journals (Sweden)

    Kadkhoda Z.

    2004-08-01

    Statement of Problem: Eliminating Porphyromonas gingivalis (P. gingivalis) from the subgingival area is necessary for successful treatment outcomes in patients with aggressive periodontitis (AP). Purpose: The aim of this study was to evaluate the efficacy of non-surgical treatment in reducing the bacterial population in deep pockets. Materials and Methods: In this randomized clinical trial, we evaluated the effect of non-surgical therapy on the P. gingivalis count in deep pockets of patients with aggressive periodontitis who had at least one P. gingivalis-positive deep pocket (>5 mm) in each quadrant. In the first stage of non-surgical treatment, intra-pocket irrigation with chlorhexidine was performed after scaling and root planing for all patients. In the second stage (one week later), antibiotics (amoxicillin and metronidazole) were prescribed for ten days. At baseline and at one, six and twelve weeks after the beginning of therapy, microbial samples, plaque index (PI), bleeding on probing (BOP) index and probing pocket depth (PPD) were recorded. Results: At one and six weeks after treatment, the P. gingivalis colony count and all clinical indices differed significantly from baseline. At 12 weeks after therapy, however, only PI and PPD differed significantly from baseline; the colony count and BOP were reduced, but the reduction was not statistically significant. Conclusion: In the present study, our non-surgical strategy for eliminating P. gingivalis and achieving clinical improvement was successful in the short term, but three months after therapy the disease recurred in some patients.

  16. Quantitative and Depth-Resolved Investigation of Deep-Level Defects in InGaN/GaN Heterostructures

    Energy Technology Data Exchange (ETDEWEB)

    Armstrong, Andrew [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Crawford, Mary H. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Koleske, Daniel D. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2010-12-24

    Deep-level defects in In0.17Ga0.83N/In0.02Ga0.98N/p-GaN:Mg heterostructures were studied using deep-level optical spectroscopy (DLOS). Depth-resolved DLOS was achieved by exploiting the polarization-induced electric fields to discriminate among defects located in the In0.17Ga0.83N and the In0.02Ga0.98N regions. Growth conditions for the InxGa1-xN layers were nominally the same as those in InGaN/GaN multi-quantum-well (MQW) structures, so the defect states reported here are expected to be active in MQW regions. Thus, this work provides important insight into defects that are likely to influence MQW radiative efficiency. In0.17Ga0.83N-related bandgap states were observed at Ev + 1.60 eV and Ev + 2.59 eV, where Ev is the valence-band maximum, compared with levels at Ev + 1.85 eV, Ev + 2.51 eV, and Ev + 3.30 eV in the In0.02Ga0.98N region. A lighted capacitance–voltage technique was used to determine the areal density of deep states. The possible origins of the associated defects are considered along with their potential roles in light-emitting diodes.

  17. Quantitative sampling of nanobiota (microbiota) of the deep-sea benthos—III. The bathyal San Diego trough

    Science.gov (United States)

    Burnett, Bryan R.

    1981-07-01

    Nanobiota (microbiota) from the 1200-m bottom of the San Diego Trough were sampled in 5-mm layers to approx. 100 mm deep in the sediment. Unlike the macrofauna, the nanobiota were relatively uniformly distributed to at least 60 and perhaps to 100-mm depths. However, there is probably a thin surface film richer both in numbers and protoplasm volume (biovolume) than the sediment layer immediately below. Yeast-like cells were the predominant nanobiotal organisms, typically constituting over 70% of the biovolume of the sediment. Yeast-like cells may occupy part of the decomposer niche normally occupied by bacteria in marine sediments from shallower depths.

  18. Quantitative, small-scale, fluorophore-assisted carbohydrate electrophoresis implemented on a capillary electrophoresis-based DNA sequence analyzer.

    Science.gov (United States)

    Murray, Sarah; McKenzie, Marian; Butler, Ruth; Baldwin, Samantha; Sutton, Kevin; Batey, Ian; Timmerman-Vaughan, Gail M

    2011-06-15

    Fluorophore-assisted carbohydrate electrophoresis (FACE) is an analytical method for characterizing carbohydrate chain length that has been applied to neutral, charged, and N-linked oligosaccharides and that has been implemented using diverse separation platforms, including polyacrylamide gel electrophoresis and capillary electrophoresis. In this article, we describe three substantial improvements to FACE: (i) reducing the amount of starch and APTS required in labeling reactions and systematically analyzing the effect of altering the starch and 8-amino-1,3,6-pyrenetrisulfonic acid (APTS) concentrations on the reproducibility of the FACE peak area distributions; (ii) implementing FACE on a multiple capillary DNA sequencer (an ABI 3130xl), enabling higher throughput than is possible on other separation platforms; and (iii) developing a protocol for producing quantitative output of peak heights and areas using genetic marker analysis software. The results of a designed experiment to determine the effect of decreasing both the starch and fluorophore concentrations on the sensitivity and reproducibility of FACE electrophoregrams are presented. Analysis of the peak area distributions of the FACE electrophoregrams identified the labeling reaction conditions that resulted in the smallest variances in the peak area distributions while retaining strong fluorescence signals from the capillary-based DNA sequencer.

  19. Identification and characterization of microRNAs by deep-sequencing in Hyalomma anatolicum anatolicum (Acari: Ixodidae) ticks.

    Science.gov (United States)

    Luo, Jin; Liu, Guang-Yuan; Chen, Ze; Ren, Qiao-Yun; Yin, Hong; Luo, Jian-Xun; Wang, Hui

    2015-06-15

    Hyalomma anatolicum anatolicum (H.a. anatolicum) (Acari: Ixodidae) ticks are globally distributed ectoparasites with veterinary and medical importance. These ticks not only weaken animals by sucking their blood but also transmit different species of parasitic protozoans. Multiple factors influence these parasitic infections including miRNAs, which are non-coding, small regulatory RNA molecules essential for the complex life cycle of parasites. To identify and characterize miRNAs in H.a. anatolicum, we developed an integrative approach combining deep sequencing, bioinformatics and real-time PCR analysis. Here we report the use of this approach to identify miRNA expression, family distribution, and nucleotide characteristics, and discovered novel miRNAs in H.a. anatolicum. The result showed that miR-1-3p, miR-275-3p, and miR-92a were expressed abundantly. There was a strong bias on miRNA, family members, and nucleotide compositions at certain positions in H.a. anatolicum miRNA. Uracil was the dominant nucleotide, particularly at positions 1, 6, 16, and 18, which were located approximately at the beginning, middle, and end of conserved miRNAs. Analysis of the conserved miRNAs indicated that miRNAs in H.a. anatolicum were concentrated along three diverse phylogenetic branches of bilaterians, insects and coelomates. Two possible roles for the use of miRNA in H.a. anatolicum could be presumed based on its parasitic life cycle: to maintain a large category of miRNA families of different animals, and/or to preserve stringent conserved seed regions with active changes in other places of miRNAs mainly in the middle and the end regions. These might help the parasite to undergo its complex life style in different hosts and adapt more readily to the host changes. The present study represents the first large scale characterization of H.a. anatolicum miRNAs, which could further the understanding of the complex biology of this zoonotic parasite, as well as initiate miRNA studies

  20. Quantitative analysis of hydrogen sites and occupancy in deep mantle hydrous wadsleyite using single crystal neutron diffraction

    Science.gov (United States)

    Purevjav, Narangoo; Okuchi, Takuo; Tomioka, Naotaka; Wang, Xiaoping; Hoffmann, Christina

    2016-01-01

    Evidence from seismological and mineralogical studies increasingly indicates that water from the oceans has been transported to the deep earth to form water-bearing dense mantle minerals. Wadsleyite [(Mg, Fe2+)2SiO4] has been identified as one of the most important host minerals incorporating this type of water, which is capable of storing the entire mass of the oceans as a hidden reservoir. To understand the effects of such water on the physical properties and chemical evolution of Earth’s interior, it is essential to determine where in the crystal structure the hydration occurs and which chemical bonds are altered and weakened after hydration. Here, we conduct a neutron time-of-flight single-crystal Laue diffraction study on hydrous wadsleyite. Single crystals were grown under pressure to a size suitable for the experiment and with physical qualities representative of wet, deep mantle conditions. The results of this neutron single crystal diffraction study unambiguously demonstrate the method of hydrogen incorporation into the wadsleyite, which is qualitatively different from that of its denser polymorph, ringwoodite, in the wet mantle. The difference is a vital clue towards understanding why these dense mantle minerals show distinctly different softening behaviours after hydration. PMID:27725749

  1. Quantitative analysis of hydrogen sites and occupancy in deep mantle hydrous wadsleyite using single crystal neutron diffraction

    Science.gov (United States)

    Purevjav, Narangoo; Okuchi, Takuo; Tomioka, Naotaka; Wang, Xiaoping; Hoffmann, Christina

    2016-10-01

    Evidence from seismological and mineralogical studies increasingly indicates that water from the oceans has been transported to the deep earth to form water-bearing dense mantle minerals. Wadsleyite [(Mg, Fe2+)2SiO4] has been identified as one of the most important host minerals incorporating this type of water, which is capable of storing the entire mass of the oceans as a hidden reservoir. To understand the effects of such water on the physical properties and chemical evolution of Earth’s interior, it is essential to determine where in the crystal structure the hydration occurs and which chemical bonds are altered and weakened after hydration. Here, we conduct a neutron time-of-flight single-crystal Laue diffraction study on hydrous wadsleyite. Single crystals were grown under pressure to a size suitable for the experiment and with physical qualities representative of wet, deep mantle conditions. The results of this neutron single crystal diffraction study unambiguously demonstrate the method of hydrogen incorporation into the wadsleyite, which is qualitatively different from that of its denser polymorph, ringwoodite, in the wet mantle. The difference is a vital clue towards understanding why these dense mantle minerals show distinctly different softening behaviours after hydration.

  2. Discovery and characterization of artifactual mutations in deep coverage targeted capture sequencing data due to oxidative DNA damage during sample preparation.

    Science.gov (United States)

    Costello, Maura; Pugh, Trevor J; Fennell, Timothy J; Stewart, Chip; Lichtenstein, Lee; Meldrim, James C; Fostel, Jennifer L; Friedrich, Dennis C; Perrin, Danielle; Dionne, Danielle; Kim, Sharon; Gabriel, Stacey B; Lander, Eric S; Fisher, Sheila; Getz, Gad

    2013-04-01

    As researchers begin probing deep coverage sequencing data for increasingly rare mutations and subclonal events, the fidelity of next generation sequencing (NGS) laboratory methods will become increasingly critical. Although error rates for sequencing and polymerase chain reaction (PCR) are well documented, the effects that DNA extraction and other library preparation steps could have on downstream sequence integrity have not been thoroughly evaluated. Here, we describe the discovery of novel C > A/G > T transversion artifacts found at low allelic fractions in targeted capture data. Characteristics such as sequencer read orientation and presence in both tumor and normal samples strongly indicated a non-biological mechanism. We identified the source as oxidation of DNA during acoustic shearing in samples containing reactive contaminants from the extraction process. We show generation of 8-oxoguanine (8-oxoG) lesions during DNA shearing, present analysis tools to detect oxidation in sequencing data and suggest methods to reduce DNA oxidation through the introduction of antioxidants. Further, informatics methods are presented to confidently filter these artifacts from sequencing data sets. Though only seen in a low percentage of reads in affected samples, such artifacts could have profoundly deleterious effects on the ability to confidently call rare mutations, and eliminating other possible sources of artifacts should become a priority for the research community.

  3. Deep learning

    CERN Document Server

    Goodfellow, Ian; Courville, Aaron

    2016-01-01

    Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language proces...

  4. Phylogenetic and genome-wide deep-sequencing analyses of canine parvovirus reveal co-infection with field variants and emergence of a recent recombinant strain.

    Science.gov (United States)

    Pérez, Ruben; Calleros, Lucía; Marandino, Ana; Sarute, Nicolás; Iraola, Gregorio; Grecco, Sofia; Blanc, Hervé; Vignuzzi, Marco; Isakov, Ofer; Shomron, Noam; Carrau, Lucía; Hernández, Martín; Francia, Lourdes; Sosa, Katia; Tomás, Gonzalo; Panzera, Yanina

    2014-01-01

    Canine parvovirus (CPV), a fast-evolving single-stranded DNA virus, comprises three antigenic variants (2a, 2b, and 2c) with different frequencies and genetic variability among countries. The contribution of co-infection and recombination to the genetic variability of CPV is far from being fully elucidated. Here we took advantage of a natural CPV population, recently formed by the convergence of divergent CPV-2c and CPV-2a strains, to study co-infection and recombination. Complete sequences of the viral coding region of CPV-2a and CPV-2c strains from 40 samples were generated and analyzed using phylogenetic tools. Two samples showed co-infection and were further analyzed by deep sequencing. The sequence profile of one of the samples revealed the presence of CPV-2c and CPV-2a strains that differed at 29 nucleotides. The other sample included a minor CPV-2a strain (13.3% of the viral population) and a major recombinant strain (86.7%). The recombinant strain arose from inter-genotypic recombination between CPV-2c and CPV-2a strains within the VP1/VP2 gene boundary. Our findings highlight the importance of deep-sequencing analysis to provide a better understanding of CPV molecular diversity.

  5. Phylogenetic and genome-wide deep-sequencing analyses of canine parvovirus reveal co-infection with field variants and emergence of a recent recombinant strain.

    Directory of Open Access Journals (Sweden)

    Ruben Pérez

    Canine parvovirus (CPV), a fast-evolving single-stranded DNA virus, comprises three antigenic variants (2a, 2b, and 2c) with different frequencies and genetic variability among countries. The contribution of co-infection and recombination to the genetic variability of CPV is far from being fully elucidated. Here we took advantage of a natural CPV population, recently formed by the convergence of divergent CPV-2c and CPV-2a strains, to study co-infection and recombination. Complete sequences of the viral coding region of CPV-2a and CPV-2c strains from 40 samples were generated and analyzed using phylogenetic tools. Two samples showed co-infection and were further analyzed by deep sequencing. The sequence profile of one of the samples revealed the presence of CPV-2c and CPV-2a strains that differed at 29 nucleotides. The other sample included a minor CPV-2a strain (13.3% of the viral population) and a major recombinant strain (86.7%). The recombinant strain arose from inter-genotypic recombination between CPV-2c and CPV-2a strains within the VP1/VP2 gene boundary. Our findings highlight the importance of deep-sequencing analysis to provide a better understanding of CPV molecular diversity.

  6. Finding the needle in the haystack: differentiating "identical" twins in paternity testing and forensics by ultra-deep next generation sequencing.

    Science.gov (United States)

    Weber-Lehmann, Jacqueline; Schilling, Elmar; Gradl, Georg; Richter, Daniel C; Wiehler, Jens; Rolf, Burkhard

    2014-03-01

    Monozygotic (MZ) twins are considered to be genetically identical, and therefore cannot be differentiated using standard forensic DNA testing. Here we describe how identification of extremely rare mutations by ultra-deep next generation sequencing can solve such cases. We sequenced DNA from sperm samples of two twins and from a blood sample of the child of one twin. Bioinformatics analysis revealed five single nucleotide polymorphisms (SNPs) present in the twin father and the child, but not in the twin uncle. The SNPs were confirmed by classical Sanger sequencing. Our results give experimental evidence for the hypothesis that rare mutations will occur early after the human blastocyst has split into two, the origin of twins, and that such mutations will be carried on into somatic tissue and the germline. The method provides a way to resolve paternity and forensic cases involving monozygotic twins as alleged fathers or originators of DNA traces.

  7. Magnetic Resonance Elastography of the Liver: Qualitative and Quantitative Comparison of Gradient Echo and Spin Echo Echoplanar Imaging Sequences.

    Science.gov (United States)

    Wagner, Mathilde; Besa, Cecilia; Bou Ayache, Jad; Yasar, Temel Kaya; Bane, Octavia; Fung, Maggie; Ehman, Richard L; Taouli, Bachir

    2016-09-01

    The aim of this study was to compare 2-dimensional (2D) gradient recalled echo (GRE) and 2D spin echo echoplanar imaging (SE-EPI) magnetic resonance elastography (MRE) sequences of the liver in terms of image quality and quantitative liver stiffness (LS) measurement. This prospective study involved 50 consecutive subjects (male/female, 33/17; mean age, 58 years) who underwent liver magnetic resonance imaging at 3.0 T including 2 MRE sequences, 2D GRE, and 2D SE-EPI (acquisition time 56 vs 16 seconds, respectively). Image quality scores were assessed by 2 independent observers based on wave propagation and organ coverage on the confidence map (range, 0-15). A third observer measured LS on stiffness maps (in kilopascal). Mean LS values, regions of interest size (based on confidence map), and image quality scores between SE-EPI and GRE-MRE were compared using paired nonparametric Wilcoxon test. Reproducibility of LS values between the 2 sequences was assessed using intraclass coefficient correlation, coefficient of variation, and Bland-Altman limits of agreement. T2* effect on image quality was assessed using partial Spearman correlation. There were 4 cases of failure with GRE-MRE and none with SE-EPI-MRE. Image quality scores and region of interest size were significantly higher using SE-EPI-MRE versus GRE-MRE (P < 0.0001 for both measurements and observers). Liver stiffness measurements were not significantly different between the 2 sequences (3.75 ± 1.87 kPa vs 3.55 ± 1.51 kPa, P = 0.062), were significantly correlated (intraclass coefficient correlation, 0.909), and had excellent reproducibility (coefficient of variation, 10.2%; bias, 0.023; Bland-Altman limits of agreement, -1.19; 1.66 kPa). Image quality scores using GRE-MRE were significantly correlated with T2* while there was no correlation for SE-EPI-MRE. Our data suggest that SE-EPI-MRE may be a better alternative to GRE-MRE. The diagnostic performance of SE-EPI-MRE for detection of liver fibrosis needs
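    The agreement statistics named in this abstract (bias and Bland-Altman limits of agreement) can be illustrated with a minimal sketch. It assumes the common definition of the limits as the mean paired difference plus or minus 1.96 sample standard deviations. The function name `bland_altman` and the stiffness values are hypothetical and are not data from the study.

```python
import statistics


def bland_altman(first, second):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Assumes the usual definition: limits = mean difference +/- 1.96 * SD
    of the paired differences (sample SD, n - 1 denominator).
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical liver stiffness values in kPa for two sequences (illustrative only).
se_epi = [2.1, 3.4, 4.0, 5.2, 2.8]
gre = [2.0, 3.6, 3.7, 5.0, 2.9]

bias, lower, upper = bland_altman(se_epi, gre)
print(f"bias = {bias:.3f} kPa, limits of agreement = ({lower:.3f}, {upper:.3f}) kPa")
```

    Narrow limits centred near zero would indicate that the two sequences can be used interchangeably for stiffness measurement.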

  8. The behavioral satiety sequence in pigeons (Columba livia). Description and development of a method for quantitative analysis.

    Science.gov (United States)

    Spudeit, William Anderson; Sulzbach, Natalia Saretta; Bittencourt, Myla de A; Duarte, Anita Maurício Camillo; Liang, Hua; Lino-de-Oliveira, Cilene; Marino-Neto, José

    2013-10-01

    The postprandial event known as the specific dynamic action is an evolutionarily conserved physiological set of metabolic responses to feeding. Its behavioral counterpart, a sequence of drinking, maintenance (e.g., grooming) and sleep-like behaviors known as the behavioral satiety sequence (BSS), has been thoroughly described in rodents and has enabled the refined evaluation of potential appetite modifiers. However, the presence and attributes of a BSS have not been systematically studied in non-mammalian species. Here, we describe the BSS induced in pigeons (Columba livia) by 1) the presentation of a palatable seed mixture (SM) food to free-feeding animals (SM+FF condition) and 2) re-feeding after a 24-h fasting period (FD24h+SM), which was examined by continuous behavioral recording for 2h. We then compare these patterns to those observed in free-feeding (FF) animals. A set of graphic representations and indexes, drawn from these behaviors (latency, time-to-peak, inter-peak intervals and the first intersection between feeding curves and those of other BSS-typical behaviors) were used to describe the temporal structure and sequential relationships between the pigeon's BSS components. Cramér-von Mises-based statistical procedures and bootstrapping-based methods to compare pairs of complex behavioral curves were described and used for comparisons among the behavioral profiles during the free-feeding recordings and after fasting- and SM-induced BSS. FD24h+SM- and SM+FF-induced feeding were consistently followed by a similar sequence of increased bouts of drinking, followed by preening and then sleep, which were significantly different from that of FF birds. The sequential and temporal patterns of the pigeon's BSS were not affected by differences in food intake or by dissimilarity in motivational content of feeding stimuli. The present data indicated that a BSS pattern can be reliably evoked in the pigeon, in a chronological succession and sequence that strongly

  9. Transcriptional Slippage and RNA Editing Increase the Diversity of Transcripts in Chloroplasts: Insight from Deep Sequencing of Vigna radiata Genome and Transcriptome.

    Directory of Open Access Journals (Sweden)

    Ching-Ping Lin

    We performed deep sequencing of the nuclear and organellar genomes of three mungbean genotypes: Vigna radiata ssp. sublobata TC1966, V. radiata var. radiata NM92 and the recombinant inbred line RIL59 derived from a cross between TC1966 and NM92. Moreover, we performed deep sequencing of the RIL59 transcriptome to investigate transcript variability. The mungbean chloroplast genome has a quadripartite structure including a pair of inverted repeats separated by two single copy regions. A total of 213 simple sequence repeats were identified in the chloroplast genomes of NM92 and RIL59; 78 single nucleotide variants and nine indels were discovered in comparing the chloroplast genomes of TC1966 and NM92. Analysis of the mungbean chloroplast transcriptome revealed mRNAs that were affected by transcriptional slippage and RNA editing. Transcriptional slippage frequency was positively correlated with the length of simple sequence repeats of the mungbean chloroplast genome (R² = 0.9911). In total, 41 C-to-U editing sites were found in 23 chloroplast genes and in one intergenic spacer. No editing site that swapped U to C was found. A combination of bioinformatics and experimental methods revealed that the plastid-encoded RNA polymerase-transcribed genes psbF and ndhA are affected by transcriptional slippage in mungbean and in main lineages of land plants, including three dicots (Glycine max, Brassica rapa, and Nicotiana tabacum), two monocots (Oryza sativa and Zea mays), two gymnosperms (Pinus taeda and Ginkgo biloba) and one moss (Physcomitrella patens). Transcript analysis of the rps2 gene showed that transcriptional slippage could affect transcripts at single sequence repeat regions with poly-A runs. It showed that transcriptional slippage together with incomplete RNA editing may cause sequence diversity of transcripts in chloroplasts of land plants.

  10. Arthropod phylogenetics in light of three novel millipede (Myriapoda: Diplopoda) mitochondrial genomes with comments on the appropriateness of mitochondrial genome sequence data for inferring deep level relationships.

    Directory of Open Access Journals (Sweden)

    Michael S Brewer

    BACKGROUND: Arthropods are the most diverse group of eukaryotic organisms, but their phylogenetic relationships are poorly understood. Herein, we describe three mitochondrial genomes representing orders of millipedes for which complete genomes had not been characterized. Newly sequenced genomes are combined with existing data to characterize the protein coding regions of myriapods and to attempt to reconstruct the evolutionary relationships within the Myriapoda and Arthropoda. RESULTS: The newly sequenced genomes are similar to previously characterized millipede sequences in terms of synteny and length. Unique translocations occurred within the newly sequenced taxa, including one half of the Appalachioria falcifera genome, which is inverted with respect to other millipede genomes. Across myriapods, amino acid conservation levels are highly dependent on the gene region. Additionally, individual loci varied in the level of amino acid conservation. Overall, most gene regions showed low levels of conservation at many sites. Attempts to reconstruct the evolutionary relationships suffered from questionable relationships and low support values. Analyses of phylogenetic informativeness show the lack of signal deep in the trees (i.e., genes evolve too quickly). As a result, the myriapod tree resembles previously published results but lacks convincing support, and, within the arthropod tree, well established groups were recovered as polyphyletic. CONCLUSIONS: The novel genome sequences described herein provide useful genomic information concerning millipede groups that had not been investigated. Taken together with existing sequences, the variety of compositions and evolution of myriapod mitochondrial genomes are shown to be more complex than previously thought. Unfortunately, the use of mitochondrial protein-coding regions in deep arthropod phylogenetics appears problematic, a result consistent with previously published studies. Lack of phylogenetic

  11. Quantitative Single-letter Sequencing: a method for simultaneously monitoring numerous known allelic variants in single DNA samples

    Directory of Open Access Journals (Sweden)

    Duborjal Hervé

    2008-02-01

    Background: Pathogens such as fungi, bacteria and especially viruses, are highly variable even within an individual host, intensifying the difficulty of distinguishing and accurately quantifying numerous allelic variants co-existing in a single nucleic acid sample. The majority of currently available techniques are based on real-time PCR or primer extension and often require multiplexing adjustments that impose a practical limitation of the number of alleles that can be monitored simultaneously at a single locus. Results: Here, we describe a novel method that allows the simultaneous quantification of numerous allelic variants in a single reaction tube and without multiplexing. Quantitative Single-letter Sequencing (QSS) begins with a single PCR amplification step using a pair of primers flanking the polymorphic region of interest. Next, PCR products are submitted to single-letter sequencing with a fluorescently-labelled primer located upstream of the polymorphic region. The resulting monochromatic electropherogram shows numerous specific diagnostic peaks, attributable to specific variants, signifying their presence/absence in the DNA sample. Moreover, peak fluorescence can be quantified and used to estimate the frequency of the corresponding variant in the DNA population. Using engineered allelic markers in the genome of Cauliflower mosaic virus, we reliably monitored six different viral genotypes in DNA extracted from infected plants. Evaluation of the intrinsic variance of this method, as applied to both artificial plasmid DNA mixes and viral genome populations, demonstrates that QSS is a robust and reliable method of detection and quantification for variants with a relative frequency of between 0.05 and 1. Conclusion: This simple method is easily transferable to many other biological systems and questions, including those involving high throughput analysis, and can be performed in any laboratory since it does not require specialized

  12. Establishment of quantitative sequencing and filter contact vial bioassay for monitoring pyrethroid resistance in the common bed bug, Cimex lectularius.

    Science.gov (United States)

    Seong, Keon Mook; Lee, Da-Young; Yoon, Kyong Sup; Kwon, Deok Ho; Kim, Heung Chul; Klein, Terry A; Clark, J Marshall; Lee, Si Hyeock

    2010-07-01

    Two point mutations (V419L and L925I) in the voltage-sensitive sodium channel alpha-subunit gene have been identified in deltamethrin-resistant bed bugs. A quantitative sequencing (QS) protocol was developed to establish a population-based genotyping method as a molecular resistance-monitoring tool based on the frequency of the two mutations. The nucleotide signal ratio at each mutation site was generated from sequencing chromatograms and plotted against the corresponding resistance allele frequency. Frequency prediction equations were generated from the plots by linear regression, and the signal ratios were shown to highly correlate with resistance allele frequencies (r² > 0.9928). As determined by QS, neither mutation was found in a bed bug population collected in 1993. Populations collected in recent years (2007-2009), however, exhibited completely or nearly saturating L925I mutation frequencies and highly variable frequencies of the V419L mutation. In addition to QS, the filter contact vial bioassay (FCVB) method was established and used to determine the baseline susceptibility and resistance of bed bugs to deltamethrin and lambda-cyhalothrin. A pyrethroid-resistant strain showed >9,375- and 6,990-fold resistance to deltamethrin and lambda-cyhalothrin, respectively. Resistance allele frequencies in different bed bug populations predicted by QS correlated well with the FCVB results, confirming the roles of the two mutations in pyrethroid resistance. Taken together, employment of QS in conjunction with FCVB should greatly facilitate the detection and monitoring of pyrethroid-resistant bed bugs in the field. The advantages of FCVB as an on-site resistance-monitoring tool are discussed.
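    The calibration step described in this abstract (signal ratio plotted against known allele frequency, then a regression line used for prediction) can be sketched as follows. This is only an illustration under stated assumptions: the signal ratio is regressed on the allele frequency by ordinary least squares and the fitted line is inverted to predict a frequency. The function names and all numbers are hypothetical and are not the study's equations or data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def predict_frequency(signal_ratio, slope, intercept):
    """Invert the calibration line and clip the result to [0, 1]."""
    freq = (signal_ratio - intercept) / slope
    return min(1.0, max(0.0, freq))


# Hypothetical standards: known resistance-allele frequencies and the
# signal ratios read from their chromatograms (illustrative values only).
standard_freqs = [0.0, 0.25, 0.5, 0.75, 1.0]
signal_ratios = [0.04, 0.28, 0.51, 0.73, 0.97]

slope, intercept = fit_line(standard_freqs, signal_ratios)
print("r^2 of calibration:", r_squared(standard_freqs, signal_ratios, slope, intercept))
print("predicted frequency for ratio 0.60:", predict_frequency(0.60, slope, intercept))
```

    Clipping to the interval from 0 to 1 is a design choice of this sketch, so that noise in a chromatogram cannot produce an impossible frequency estimate.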

  13. An improved method for quantitatively measuring the sequences of total organic carbon and black carbon in marine sediment cores

    Science.gov (United States)

    Xu, Xiaoming; Zhu, Qing; Zhou, Qianzhi; Liu, Jinzhong; Yuan, Jianping; Wang, Jianghai

    2017-04-01

    Understanding the global carbon cycle is critical to uncover the mechanisms of global warming and remediate its adverse effects on human activities. Organic carbon in marine sediments is an indispensable part of the global carbon reservoir in global carbon cycling. Evaluating such a reservoir calls for quantitative studies of marine carbon burial, which closely depend on quantifying total organic carbon and black carbon in marine sediment cores and subsequently on obtaining their high-resolution temporal sequences. However, the conventional methods for detecting the contents of total organic carbon or black carbon cannot resolve the following specific difficulties, i.e., (1) a very limited amount of each subsample versus the diverse analytical items, (2) a low and fluctuating recovery rate of total organic carbon or black carbon versus the reproducibility of carbon data, and (3) a large number of subsamples versus the rapid batch measurements. In this work, by (i) adopting customized disposable ceramic crucibles with micropore-controlled ability, (ii) developing self-made or customized facilities for the procedures of acidification and chemothermal oxidation, and (iii) optimizing procedures and the carbon-sulfur analyzer, we have built a novel Wang-Xu-Yuan method (the WXY method) for measuring the contents of total organic carbon or black carbon in marine sediment cores, which includes the procedures of pretreatment, weighing, acidification, chemothermal oxidation and quantification, and can fully meet the requirements of establishing their high-resolution temporal sequences in terms of recovery, experimental efficiency, accuracy and reliability of the measurements, and homogeneity of samples. In particular, the use of disposable ceramic crucibles evidently simplifies the experimental scenario, which further results in very high recovery rates for total organic carbon and black carbon. This new technique may provide a significant support for

  14. Deep Eutectic Solvents as Novel and Effective Extraction Media for Quantitative Determination of Ochratoxin A in Wheat and Derived Products

    Directory of Open Access Journals (Sweden)

    Luca Piemontese

    2017-01-01

    An unprecedented, environmentally friendly, and faster method for the determination of Ochratoxin A (OTA) (a mycotoxin produced by several species of Aspergillus and Penicillium and largely widespread in nature) in wheat and derived products has, for the first time, been set up and validated using choline chloride (ChCl)-based deep eutectic solvents (DESs) (e.g., ChCl/glycerol (1:2) and ChCl/urea (1:2) up to 40% (w/w) water) as privileged, green, and biodegradable extraction solvents. This also reduces worker exposure to toxic chemicals. Results are comparable to those obtained using conventional, hazardous and volatile organic solvents (VOCs) typical of the standard and official methods. OTA recovery from spiked durum wheat samples, in particular, was up to 89% versus 93% using the traditional acetonitrile-water mixture with a repeatability of the results (RSDr) of 7%. Compatibility of the DES mixture with the antibodies of the immunoaffinity column was excellent as it was able to retain up to 96% of the OTA. Recovery and repeatability for durum wheat, bread crumbs, and biscuits proved to be within the specifications required by the current European Commission (EC) regulation. Good results in terms of accuracy and precision were achieved with mean recoveries between 70% (durum wheat) and 88% (bread crumbs) and an RSDr between 2% (biscuits) and 7% (bread).

  15. Identification of genetic risk variants for deep vein thrombosis by multiplexed next-generation sequencing of 186 hemostatic/pro-inflammatory genes

    Directory of Open Access Journals (Sweden)

    Lotta Luca A

    2012-02-01

    Background: Next-generation DNA sequencing is opening new avenues for genetic association studies in common diseases that, like deep vein thrombosis (DVT), have a strong genetic predisposition still largely unexplained by currently identified risk variants. In order to develop sequencing and analytical pipelines for the application of next-generation sequencing to complex diseases, we conducted a pilot study sequencing the coding area of 186 hemostatic/proinflammatory genes in 10 Italian cases of idiopathic DVT and 12 healthy controls. Results: A molecular-barcoding strategy was used to multiplex DNA target capture and sequencing, while retaining individual sequence information. Genomic libraries with barcode sequence-tags were pooled (in pools of 8 or 16 samples) and enriched for target DNA sequences. Sequencing was performed on ABI SOLiD-4 platforms. We produced > 12 gigabases of raw sequence data to sequence at high coverage (average: 42X) the 700-kilobase target area in 22 individuals. A total of 1876 high-quality genetic variants were identified (1778 single nucleotide substitutions and 98 insertions/deletions). Annotation on databases of genetic variation and human disease mutations revealed several novel, potentially deleterious mutations. We tested 576 common variants in a case-control association analysis, carrying the top-5 associations over to replication in up to 719 DVT cases and 719 controls. We also conducted an analysis of the burden of nonsynonymous variants in coagulation factor and anticoagulant genes. We found an excess of rare missense mutations in anticoagulant genes in DVT cases compared to controls and an association for a missense polymorphism of FGA (rs6050; p = 1.9 × 10^-5, OR 1.45; 95% CI, 1.22-1.72; after replication in > 1400 individuals). Conclusions: We implemented a barcode-based strategy to efficiently multiplex sequencing of hundreds of candidate genes in several individuals. In the relatively small dataset of

  16. Whole Genome Re-Sequencing Identifies a Quantitative Trait Locus Repressing Carbon Reserve Accumulation during Optimal Growth in Chlamydomonas reinhardtii.

    Science.gov (United States)

    Goold, Hugh Douglas; Nguyen, Hoa Mai; Kong, Fantao; Beyly-Adriano, Audrey; Légeret, Bertrand; Billon, Emmanuelle; Cuiné, Stéphan; Beisson, Fred; Peltier, Gilles; Li-Beisson, Yonghua

    2016-05-04

    Microalgae have emerged as a promising source for biofuel production. Massive oil and starch accumulation in microalgae is possible, but occurs mostly when biomass growth is impaired. The molecular networks underlying the negative correlation between growth and reserve formation are not known. Thus isolation of strains capable of accumulating carbon reserves during optimal growth would be highly desirable. To this end, we screened an insertional mutant library of Chlamydomonas reinhardtii for alterations in oil content. A mutant accumulating five times more oil and twice more starch than wild-type during optimal growth was isolated and named constitutive oil accumulator 1 (coa1). Growth in photobioreactors under highly controlled conditions revealed that the increase in oil and starch content in coa1 was dependent on light intensity. Genetic analysis and DNA hybridization pointed to a single insertional event responsible for the phenotype. Whole genome re-sequencing identified in coa1 a >200 kb deletion on chromosome 14 containing 41 genes. This study demonstrates that, 1), the generation of algal strains accumulating higher reserve amount without compromising biomass accumulation is feasible; 2), light is an important parameter in phenotypic analysis; and 3), a chromosomal region (Quantitative Trait Locus) acts as suppressor of carbon reserve accumulation during optimal growth.

  17. Quantitative analysis of herpes virus sequences from normal tissue and fibropapillomas of marine turtles with real-time PCR

    Science.gov (United States)

    Quackenbush, S.L.; Casey, R.N.; Murcek, R.J.; Paul, T.A.; Work, T.M.; Limpus, C.J.; Chaves, A.; duToit, L.; Perez, J.V.; Aguirre, A.A.; Spraker, T.R.; Horrocks, J.A.; Vermeer, L.A.; Balazs, G.S.; Casey, J.W.

    2001-01-01

    Quantitative real-time PCR has been used to measure fibropapilloma-associated turtle herpesvirus (FPTHV) pol DNA loads in fibropapillomas, fibromas, and uninvolved tissues of green, loggerhead, and olive ridley turtles from Hawaii, Florida, Costa Rica, Australia, Mexico, and the West Indies. The viral DNA loads from tumors obtained from terminal animals were relatively homogenous (range 2-20 copies/cell), whereas DNA copy numbers from biopsied tumors and skin of otherwise healthy turtles displayed a wide variation (range 0.001-170 copies/cell) and may reflect the stage of tumor development. FPTHV DNA loads in tumors were 2.5-4.5 logs higher than in uninvolved skin from the same animal regardless of geographic location, further implying a role for FPTHV in the etiology of fibropapillomatosis. Although FPTHV pol sequences amplified from tumors are highly related to each other, single signature amino acid substitutions distinguish the Australia/Hawaii, Mexico/Costa Rica, and Florida/Caribbean groups.

  18. Whole Genome Re-Sequencing Identifies a Quantitative Trait Locus Repressing Carbon Reserve Accumulation during Optimal Growth in Chlamydomonas reinhardtii

    Science.gov (United States)

    Goold, Hugh Douglas; Nguyen, Hoa Mai; Kong, Fantao; Beyly-Adriano, Audrey; Légeret, Bertrand; Billon, Emmanuelle; Cuiné, Stéphan; Beisson, Fred; Peltier, Gilles; Li-Beisson, Yonghua

    2016-01-01

    Microalgae have emerged as a promising source for biofuel production. Massive oil and starch accumulation in microalgae is possible, but occurs mostly when biomass growth is impaired. The molecular networks underlying the negative correlation between growth and reserve formation are not known. Thus isolation of strains capable of accumulating carbon reserves during optimal growth would be highly desirable. To this end, we screened an insertional mutant library of Chlamydomonas reinhardtii for alterations in oil content. A mutant accumulating five times more oil and twice more starch than wild-type during optimal growth was isolated and named constitutive oil accumulator 1 (coa1). Growth in photobioreactors under highly controlled conditions revealed that the increase in oil and starch content in coa1 was dependent on light intensity. Genetic analysis and DNA hybridization pointed to a single insertional event responsible for the phenotype. Whole genome re-sequencing identified in coa1 a >200 kb deletion on chromosome 14 containing 41 genes. This study demonstrates that, 1), the generation of algal strains accumulating higher reserve amount without compromising biomass accumulation is feasible; 2), light is an important parameter in phenotypic analysis; and 3), a chromosomal region (Quantitative Trait Locus) acts as suppressor of carbon reserve accumulation during optimal growth. PMID:27141848

  19. Quantifying evidence for candidate gene polymorphisms: Bayesian analysis combining sequence-specific and quantitative trait loci colocation information.

    Science.gov (United States)

    Ball, Roderick D

    2007-12-01

    We calculate posterior probabilities for candidate genes as a function of genomic location. Posterior probabilities for quantitative trait loci (QTL) presence in a small interval are calculated using a Bayesian model-selection approach based on the Bayesian information criterion (BIC) and used to combine QTL colocation information with sequence-specific evidence, e.g., from differential expression and/or association studies. Our method takes into account uncertainty in estimation of number and locations of QTL and estimated map position. Posterior probabilities for QTL presence were calculated for simulated data with n = 100, 300, and 1200 QTL progeny and compared with interval mapping and composite-interval mapping. Candidate genes that mapped to QTL regions had substantially larger posterior probabilities. Among candidates with a given Bayes factor, those that map near a QTL are more promising for further investigation with association studies and functional testing or for use in marker-aided selection. The BIC is shown to correspond very closely to Bayes factors for linear models with a nearly noninformative Zellner prior for the simulated QTL data with n ≥ 100. It is shown how to modify the BIC to use a subjective prior for the QTL effects.
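
    As a rough illustration of BIC-based model selection in general (not the author's actual procedure), the sketch below turns BIC values for a set of competing models into approximate posterior model probabilities. It assumes the common convention BIC = -2 ln L + k ln n, under which lower values are better and the posterior weight of a model is proportional to its prior times exp(-BIC/2). The function name `posterior_from_bic` and the example BIC values are hypothetical.

```python
import math

def posterior_from_bic(bic_values, priors=None):
    """Approximate posterior model probabilities from BIC values.

    Assumes BIC = -2 ln L + k ln n (lower is better), so that
    P(model | data) is roughly proportional to prior * exp(-BIC / 2).
    """
    n_models = len(bic_values)
    if priors is None:
        # Assumption: equal prior probability for every model.
        priors = [1.0 / n_models] * n_models
    # Subtract the smallest BIC before exponentiating for numerical stability.
    best = min(bic_values)
    weights = [p * math.exp(-0.5 * (b - best)) for b, p in zip(bic_values, priors)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical BIC values for three candidate QTL models.
bics = [210.0, 204.0, 206.5]
post = posterior_from_bic(bics)
for bic, prob in zip(bics, post):
    print(f"BIC={bic:6.1f}  approx. posterior={prob:.3f}")
```

    Passing a non-uniform `priors` list is one simple way to express prior information about which models are plausible; the abstract's modification of the BIC for a subjective prior on QTL effects is a different and more specific adjustment that this sketch does not attempt.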

  20. Noise estimation in infrared image sequences: a tool for the quantitative evaluation of the effectiveness of registration algorithms.

    Science.gov (United States)

    Agostini, Valentina; Delsanto, Silvia; Knaflitz, Marco; Molinari, Filippo

    2008-07-01

    Dynamic infrared imaging has been proposed in literature as an adjunctive technique to mammography in breast cancer diagnosis. It is based on the acquisition of hundreds of consecutive thermal images with a frame rate ranging from 50 to 200 frames/s, followed by the harmonic analysis of temperature time series at each image pixel. However, the temperature fluctuation due to blood perfusion, which is the signal of interest, is small compared to the signal fluctuation due to subject movements. Hence, before extracting the time series describing temperature fluctuations, it is fundamental to realign the thermal images to attenuate motion artifacts. In this paper, we describe a method for the quantitative evaluation of any kind of feature-based registration algorithm on thermal image sequences, provided that an estimation of local velocities of reference points on the skin is available. As an example of evaluation of a registration algorithm, we report the evaluation of the SNR improvement obtained by applying a nonrigid piecewise linear algorithm.

  1. Quantitative measurements of alternating finger tapping in Parkinson's disease correlate with UPDRS motor disability and reveal the improvement in fine motor control from medication and deep brain stimulation.

    Science.gov (United States)

    Taylor Tavares, Ana Lisa; Jefferis, Gregory S X E; Koop, Mandy; Hill, Bruce C; Hastie, Trevor; Heit, Gary; Bronte-Stewart, Helen M

    2005-10-01

    The Unified Parkinson's Disease Rating Scale (UPDRS) is the primary outcome measure in most clinical trials of Parkinson's disease (PD) therapeutics. Each subscore of the motor section (UPDRS III) compresses a wide range of motor performance into a coarse-grained scale from 0 to 4; the assessment of performance can also be subjective. Quantitative digitography (QDG) is an objective, quantitative assessment of digital motor control using a computer-interfaced musical keyboard. In this study, we show that the kinematics of a repetitive alternating finger-tapping (RAFT) task using QDG correlate with the UPDRS motor score, particularly with the bradykinesia subscore, in 33 patients with PD. We show that dopaminergic medication and an average of 9.5 months of bilateral subthalamic nucleus deep brain stimulation (B-STN DBS) significantly improve UPDRS and QDG scores but may have different effects on certain kinematic parameters. This study substantiates the use of QDG to measure motor outcome in trials of PD therapeutics and shows that medication and B-STN DBS both improve fine motor control.

  2. Genome sequence of Haloplasma contractile, an unusual contractile bacterium from a deep-sea anoxic brine lake.

    KAUST Repository

    Antunes, Andre

    2011-09-01

    We present the draft genome of Haloplasma contractile, isolated from a deep-sea brine and representing a new order between Firmicutes and Mollicutes. Its complex morphology with contractile protrusions might be strongly influenced by the presence of seven MreB/Mbl homologs, which appears to be the highest copy number ever reported.

  3. Genome sequence of Halorhabdus tiamatea, the first archaeon isolated from a deep-sea anoxic brine lake.

    KAUST Repository

    Antunes, Andre

    2011-09-01

    We present the draft genome of Halorhabdus tiamatea, the first member of the Archaea ever isolated from a deep-sea anoxic brine. Genome comparison with Halorhabdus utahensis revealed some striking differences, including a marked increase in genes associated with transmembrane transport and putative genes for a trehalose synthase and a lactate dehydrogenase.

  4. OCT structure, COB location and magmatic type of the S Angolan & SE Brazilian margins from integrated quantitative analysis of deep seismic reflection and gravity anomaly data

    Science.gov (United States)

    Cowie, Leanne; Kusznir, Nick; Horn, Brian

    2014-05-01

    Integrated quantitative analysis using deep seismic reflection data and gravity inversion has been applied to the S Angolan and SE Brazilian margins to determine OCT structure, COB location and magmatic type. Knowledge of these margin parameters is of critical importance for understanding rifted continental margin formation processes and in evaluating petroleum systems in deep-water frontier oil and gas exploration. The OCT structure, COB location and magmatic type of the S Angolan and SE Brazilian rifted continental margins are much debated; exhumed and serpentinised mantle have been reported at these margins. Gravity anomaly inversion, incorporating a lithosphere thermal gravity anomaly correction, has been used to determine Moho depth, crustal basement thickness and continental lithosphere thinning. Residual Depth Anomaly (RDA) analysis has been used to investigate OCT bathymetric anomalies with respect to expected oceanic bathymetries and subsidence analysis has been used to determine the distribution of continental lithosphere thinning. These techniques have been validated for profiles Lusigal 12 and ISE-01 on the Iberian margin. In addition a joint inversion technique using deep seismic reflection and gravity anomaly data has been applied to the ION-GXT BS1-575 SE Brazil and ION-GXT CS1-2400 S Angola deep seismic reflection lines. The joint inversion method solves for coincident seismic and gravity Moho in the time domain and calculates the lateral variations in crustal basement densities and velocities along the seismic profiles. Gravity inversion, RDA and subsidence analysis along the ION-GXT BS1-575 profile, which crosses the Sao Paulo Plateau and Florianopolis Ridge of the SE Brazilian margin, predict the COB to be located SE of the Florianopolis Ridge. Integrated quantitative analysis shows no evidence for exhumed mantle on this margin profile. The joint inversion technique predicts oceanic crustal thicknesses of between 7 and 8 km thickness with

  5. Implications of spatial and temporal development of the aftershock sequence for the Mw 8.3 June 9, 1994 Deep Bolivian Earthquake

    Science.gov (United States)

    Myers, Stephen C.; Wallace, Terry C.; Beck, Susan L.; Silver, Paul G.; Zandt, George; Vandecar, John; Minaya, Estela

    On June 9, 1994 the Mw 8.3 Bolivia earthquake (636 km depth) occurred in a region which had not experienced significant, deep seismicity for at least 30 years. The mainshock and aftershocks were recorded in Bolivia on the BANJO and SEDA broadband seismic arrays and on the San Calixto Network. We used the joint hypocenter determination method to determine the relative location of the aftershocks. We have identified no foreshocks and 89 aftershocks (m > 2.2) for the 20-day period following the mainshock. The frequency of aftershock occurrence decreased rapidly, with only one or two aftershocks per day occurring after day two. The temporal decay of aftershock activity is similar to shallow aftershock sequences, but the number of aftershocks is two orders of magnitude less. Additionally, a mb ∼6, apparently triggered earthquake occurred just 10 minutes after the mainshock about 330 km east-southeast of the mainshock at a depth of 671 km. The aftershock sequence occurred north and east of the mainshock and extends to a depth of 665 km. The aftershocks define a slab striking N68°W and dipping 45°NE. The strike, dip, and location of the aftershock zone are consistent with this seismicity being confined within the downward extension of the subducted Nazca plate. The location and orientation of the aftershock sequence indicate that the subducted Nazca plate bends between the NNW striking zone of deep seismicity in western Brazil and the N-S striking zone of seismicity in central Bolivia. A tear in the deep slab is not necessitated by the data. A subset of the aftershock hypocenters cluster along a subhorizontal plane near the depth of the mainshock, favoring a horizontal fault plane. The horizontal dimensions of the mainshock [Beck et al., this issue; Silver et al., 1995] and slab defined by the aftershocks are approximately equal, indicating that the mainshock ruptured through the slab.

  6. Tracking the evolution of multiple in vitro hepatitis C virus replicon variants under protease inhibitor selection pressure by 454 deep sequencing.

    Science.gov (United States)

    Verbinnen, Thierry; Van Marck, Herwig; Vandenbroucke, Ina; Vijgen, Leen; Claes, Marijke; Lin, Tse-I; Simmen, Kenneth; Neyts, Johan; Fanning, Gregory; Lenz, Oliver

    2010-11-01

    Resistance to hepatitis C virus (HCV) inhibitors targeting viral enzymes has been observed in in vitro replicon studies and during clinical trials. The factors determining the emergence of resistance and the changes in the viral quasispecies population under selective pressure are not fully understood. To assess the dynamics of variants emerging in vitro under various selective pressures with TMC380765, a potent macrocyclic HCV NS3/4A protease inhibitor, HCV genotype 1b replicon-containing cells were cultured in the presence of a low, high, or stepwise-increasing TMC380765 concentration(s). HCV replicon RNA from representative samples thus obtained was analyzed using (i) population, (ii) clonal, and (iii) 454 deep sequencing technologies. Depending on the concentration of TMC380765, distinct mutational patterns emerged. In particular, culturing with low concentrations resulted in the selection of low-level resistance mutations (F43S and A156G), whereas high concentrations resulted in the selection of high-level resistance mutations (A156V, D168V, and D168A). Clonal and 454 deep sequencing analysis of the replicon RNA allowed the identification of low-frequency preexisting mutations possibly contributing to the mutational pattern that emerged. Stepwise-increasing TMC380765 concentrations resulted in the emergence and disappearance of multiple replicon variants in response to the changing selection pressure. Moreover, two different codons for the wild-type amino acids were observed at certain NS3 positions within one population of replicons, which may contribute to the emerging mutational patterns. Deep sequencing technologies enabled the study of minority variants present in the HCV quasispecies population present at baseline and during antiviral drug pressure, giving new insights into the dynamics of resistance acquisition by HCV.

  7. Deep sequencing of ESTs from nacreous and prismatic layer producing tissues and a screen for novel shell formation-related genes in the pearl oyster.

    Directory of Open Access Journals (Sweden)

    Shigeharu Kinoshita

    BACKGROUND: Despite its economic importance, we have a limited understanding of the molecular mechanisms underlying shell formation in pearl oysters, wherein the calcium carbonate crystals, nacre and prism, are formed in a highly controlled manner. We constructed comprehensive expressed gene profiles in the shell-forming tissues of the pearl oyster Pinctada fucata and identified novel shell formation-related gene candidates. PRINCIPAL FINDINGS: We employed the GS FLX 454 system and constructed transcriptome data sets from pallial mantle and pearl sac, which form the nacreous layer, and from the mantle edge, which forms the prismatic layer in P. fucata. We sequenced 260477 reads and obtained 29682 unique sequences. We also screened novel nacreous and prismatic gene candidates by a combined analysis of sequence and expression data sets, and identified various genes encoding lectin, protease, protease inhibitors, lysine-rich matrix protein, and secreting calcium-binding proteins. We also examined the expression of known nacreous and prismatic genes in our EST library and identified novel isoforms with tissue-specific expressions. CONCLUSIONS: We constructed EST data sets from the nacre- and prism-producing tissues in P. fucata and found 29682 unique sequences containing novel gene candidates for nacreous and prismatic layer formation. This is the first report of deep sequencing of ESTs in the shell-forming tissues of P. fucata and our data provide a powerful tool for a comprehensive understanding of the molecular mechanisms of molluscan biomineralization.

  8. Detection and identification of human Plasmodium species with real-time quantitative nucleic acid sequence-based amplification

    Directory of Open Access Journals (Sweden)

    Kager Piet A

    2006-10-01

    Background: Decisions concerning malaria treatment depend on identification of the species causing disease. Microscopy is most frequently used, but at low parasitaemia (Plasmodium antigen detection do often not allow for species discrimination as microscopy does, but also become insensitive at Methods: This paper reports the development of sensitive and specific real-time Quantitative Nucleic Acid Sequence Based Amplification (real-time QT-NASBA) assays, based on the small-subunit 18S rRNA gene, to identify the four human Plasmodium species. Results: The lower detection limit of the assay is 100 – 1000 molecules in vitro RNA for all species, which corresponds to 0.01 – 0.1 parasite per diagnostic sample (i.e. 50 μl of processed blood). The real-time QT-NASBA was further evaluated using 79 clinical samples from malaria patients: i.e. 11 Plasmodium falciparum, 37 Plasmodium vivax, seven Plasmodium malariae, four Plasmodium ovale and 20 mixed infections. The initial diagnosis of 69 out of the 79 samples was confirmed with the developed real-time QT-NASBA. Re-analysis of seven available original slides resolved five mismatches. Three of those were initially identified as P. malariae mono-infection, but after re-reading the slides P. falciparum was found, confirming the real-time QT-NASBA result. The other two slides were of poor quality not allowing true species identification. The remaining five discordant results could not be explained by microscopy, but may be due to extreme low numbers of parasites present in the samples. In addition, 12 Plasmodium berghei isolates from mice and 20 blood samples from healthy donors did not show any reaction in the assay. Conclusion: Real-time QT-NASBA is a very sensitive and specific technique with a detection limit of 0.1 Plasmodium parasite per diagnostic sample (50 μl of blood) and can be used for the detection, identification and quantitative measurement of low parasitaemia of Plasmodium species, thus

  9. Identification of microRNAs from Amur grape (Vitis amurensis Rupr.) by deep sequencing and analysis of microRNA variations with bioinformatics

    Directory of Open Access Journals (Sweden)

    Wang Chen

    2012-03-01

    Background: MicroRNA (miRNA) is a class of functional non-coding small RNA with 19-25 nucleotides in length, while Amur grape (Vitis amurensis Rupr.) is an important wild fruit crop with the strongest cold resistance among the Vitis species, is used as an excellent breeding parent for grapevine, and has elicited growing interest in wine production. To date, there is a relatively large number of grapevine miRNAs (vv-miRNAs) from cultivated grapevine varieties such as Vitis vinifera L. and hybrids of V. vinifera and V. labrusca, but there is no report on miRNAs from Vitis amurensis Rupr., a wild grapevine species. Results: A small RNA library from Amur grape was constructed and Solexa technology used to perform deep sequencing of the library followed by subsequent bioinformatics analysis to identify new miRNAs. In total, 126 conserved miRNAs belonging to 27 miRNA families were identified, and 34 known but non-conserved miRNAs were also found. Significantly, 72 new potential Amur grape-specific miRNAs were discovered. The sequences of these new potential va-miRNAs were further validated through miR-RACE, and accumulation of 18 new va-miRNAs in seven tissues of grapevines confirmed by real time RT-PCR (qRT-PCR) analysis. The expression levels of va-miRNAs in flowers and berries were found to be basically consistent in identity to those from deep sequenced sRNAs libraries of combined corresponding tissues. We also describe the conservation and variation of va-miRNAs using miR-SNPs and miR-LDs during plant evolution based on comparison of orthologous sequences, and further reveal that the number and sites of miR-SNP in diverse miRNA families exhibit distinct divergence. Finally, 346 target genes for the new miRNAs were predicted and they include a number of Amur grape stress tolerance genes and many genes regulating anthocyanin synthesis and sugar metabolism. Conclusions: Deep sequencing of short RNAs from Amur grape flowers and berries identified 72

  10. Pilot study on molecular quantitation and sequencing of endometrial cytokines gene expression and their effect on the outcome of in vitro fertilization (IVF) cycle

    Directory of Open Access Journals (Sweden)

    D. Sabry

    2014-09-01

    Human trophoblast invasion and differentiation are essential for successful pregnancy outcome. The molecular mechanisms, however, are poorly understood. Interleukin (IL)-11, a cytokine, regulates endometrial epithelial cell adhesion. Leukemia inhibitory factor (LIF) is one of the key cytokines in the regulation of embryo implantation. The present study aimed to assess the levels of LIF, IL-11, and IL-11 α receptor gene expression in the endometrium of women undergoing IVF and correlate their levels with the IVF pregnancy outcome. Also, the study aimed to detect any mutation in these three genes among IVF pregnant and non-pregnant women versus control menstrual blood of fertile women. Endometrial tissue biopsies were taken from 15 women undergoing IVF on the day of oocyte retrieval. The quantitative expression of IL-11, IL-11Rα, and LIF genes was assessed by real-time PCR and PCR products were sequenced. Menstrual blood from 10 fertile women was used as control to compare the DNA sequence versus DNA sequence of the studied genes in endometrial biopsies. LH, FSH, and E2 were assessed for enrolled patients by ELISA. Endometrial thickness was also assessed by pelvic ultrasonography. No significant difference was detected between quantitative expression of the three studied genes and pregnancy IVF outcome. Although DNA sequence changes were found in IL-11 and LIF genes of women with negative pregnancy IVF outcome compared to women with positive pregnancy IVF outcome, no DNA sequence changes were detected for IL-11Rα. Other studied parameters (e.g., age, LH, FSH, E2, and endometrial thickness) showed no significant differences or correlation with quantitative expression of the three studied genes. Data suggested that there were no significant differences between quantitative expression of IL-11, IL-11Rα, and LIF genes and the IVF pregnancy outcome. The present study may reveal that changes in IL-11 and LIF genes sequence may contribute in

  11. Ultra-deep sequencing reveals high prevalence and broad structural diversity of hepatitis B surface antigen mutations in a global population.

    Science.gov (United States)

    Gencay, Mikael; Hübner, Kirsten; Gohl, Peter; Seffner, Anja; Weizenegger, Michael; Neofytos, Dionysios; Batrla, Richard; Woeste, Andreas; Kim, Hyon-Suk; Westergaard, Gaston; Reinsch, Christine; Brill, Eva; Thu Thuy, Pham Thi; Hoang, Bui Huu; Sonderup, Mark; Spearman, C Wendy; Pabinger, Stephan; Gautier, Jérémie; Brancaccio, Giuseppina; Fasano, Massimo; Santantonio, Teresa; Gaeta, Giovanni B; Nauck, Markus; Kaminski, Wolfgang E

    2017-01-01

    The diversity of the hepatitis B surface antigen (HBsAg) has a significant impact on the performance of diagnostic screening tests and the clinical outcome of hepatitis B infection. Neutralizing or diagnostic antibodies against the HBsAg are directed towards its highly conserved major hydrophilic region (MHR), in particular towards its "a" determinant subdomain. Here, we explored, on a global scale, the genetic diversity of the HBsAg MHR in a large, multi-ethnic cohort of randomly selected subjects with HBV infection from four continents. A total of 1553 HBsAg positive blood samples of subjects originating from 20 different countries across Africa, America, Asia and central Europe were characterized for amino acid variation in the MHR. Using highly sensitive ultra-deep sequencing, we found 72.8% of the successfully sequenced subjects (n = 1391) demonstrated amino acid sequence variation in the HBsAg MHR. This indicates that the global variation frequency in the HBsAg MHR is threefold higher than previously reported. The majority of the amino acid mutations were found in the HBV genotypes B (28.9%) and C (25.4%). Collectively, we identified 345 distinct amino acid mutations in the MHR. Among these, we report 62 previously unknown mutations, which extends the worldwide pool of currently known HBsAg MHR mutations by 22%. Importantly, topological analysis identified the "a" determinant upstream flanking region as the structurally most diverse subdomain of the HBsAg MHR. The highest prevalence of "a" determinant region mutations was observed in subjects from Asia, followed by the African, American and European cohorts, respectively. Finally, we found that more than half (59.3%) of all HBV subjects investigated carried multiple MHR mutations. Together, this worldwide ultra-deep sequencing based genotyping study reveals that the global prevalence and structural complexity of variation in the hepatitis B surface antigen have, to date, been significantly underappreciated.

  12. Identification of SNPs in barley (Hordeum vulgare L.) by deep sequencing of six reduced representation libraries

    Institute of Scientific and Technical Information of China (English)

    Ganggang Guo; Dawa Dondup; Lisha Zhang; Sha Hu; Xingmiao Yuan; Jing Zhang

    2014-01-01

    High-density genetic markers are required for genotyping and linkage mapping in identifying genes from crops with complex genomes, such as barley. As the most common variation, single nucleotide polymorphisms (SNPs) are suitable for accurate genotyping by using next-generation sequencing (NGS) technology. Reduced representation libraries (RRLs) of five barley accessions and one mutant were sequenced using NGS technology for SNP discovery. Twenty million short reads were generated and the proportion of repetitive sequences was reduced by more than 56%. A total of 6061 SNPs were identified, and 451 were mapped to the draft sequence of the barley genome with pairing reads. Eleven SNPs were validated using length polymorphic allele-specific PCR markers.

  13. Sensitive Next-Generation Sequencing Method Reveals Deep Genetic Diversity of HIV-1 in the Democratic Republic of the Congo

    Science.gov (United States)

    Wilkinson, Eduan; Vallari, Ana; McArthur, Carole; Sthreshley, Larry; Brennan, Catherine A.; Cloherty, Gavin; de Oliveira, Tulio

    2017-01-01

    ABSTRACT As the epidemiological epicenter of the human immunodeficiency virus (HIV) pandemic, the Democratic Republic of the Congo (DRC) is a reservoir of circulating HIV strains exhibiting high levels of diversity and recombination. In this study, we characterized HIV specimens collected in two rural areas of the DRC between 2001 and 2003 to identify rare strains of HIV. The env gp41 region was sequenced and characterized for 172 HIV-positive specimens. The env sequences were predominantly subtype A (43.02%), but 7 other subtypes (33.14%), 20 circulating recombinant forms (CRFs; 11.63%), and 20 unclassified (11.63%) sequences were also found. Of the rare and unclassified subtypes, 18 specimens were selected for next-generation sequencing (NGS) by a modified HIV-switching mechanism at the 5′ end of the RNA template (SMART) method to obtain full-genome sequences. NGS produced 14 new complete genomes, which included pure subtype C (n = 2), D (n = 1), F1 (n = 1), H (n = 3), and J (n = 1) genomes. The two subtype C genomes and one of the subtype H genomes branched basal to their respective subtype branches but had no evidence of recombination. The remaining 6 genomes were complex recombinants of 2 or more subtypes, including subtypes A1, F, G, H, J, and K and unclassified fragments, including one subtype CRF25 isolate, which branched basal to all CRF25 references. Notably, all recombinant subtype H fragments branched basal to the H clade. Spatial-geographical analysis indicated that the diverse sequences identified here did not expand globally. The full-genome and subgenomic sequences identified in our study population significantly increase the documented diversity of the strains involved in the continually evolving HIV-1 pandemic. IMPORTANCE Very little is known about the ancestral HIV-1 strains that founded the global pandemic, and very few complete genome sequences are available from patients in the Congo Basin, where HIV-1 expanded early in the global pandemic

  14. Transcriptome analysis of the mud crab (Scylla paramamosain) by 454 deep sequencing: assembly, annotation, and marker discovery.

    Directory of Open Access Journals (Sweden)

    Hongyu Ma

    In this study, we reported the characterization of the first transcriptome of the mud crab (Scylla paramamosain). Pooled cDNAs of four tissue types from twelve wild individuals were sequenced using the Roche 454 FLX platform. Analysis performed included de novo assembly of transcriptome sequences, functional annotation, and molecular marker discovery. A total of 1,314,101 high quality reads with an average length of 411 bp were generated by 454 sequencing on a mixed cDNA library. De novo assembly of these 1,314,101 reads produced 76,778 contigs (consisting of 818,154 reads) with 5.4-fold average sequencing coverage. The remaining 495,947 reads were singletons. A total of 78,268 unigenes were identified based on sequence similarity with known proteins (E≤0.00001) in UniProt and non-redundant protein databases. Meanwhile, 44,433 sequences were identified (E≤0.00001) using a BLASTN search against the NCBI nucleotide database. Gene Ontology (GO) analysis indicated that biosynthetic process, cell part, and ion binding were the most abundant terms in biological process, cellular component, and molecular function categories, respectively. Kyoto Encyclopedia of Genes and Genome (KEGG) pathway analysis revealed that 4,878 unigenes were distributed in 281 different pathways. In addition, 19,011 microsatellites and 37,063 potential single nucleotide polymorphisms were detected from the transcriptome of S. paramamosain. Finally, thirty polymorphic microsatellite markers were developed and used to assess genetic diversity of a wild population of S. paramamosain. So far, existing sequence resources for S. paramamosain are extremely limited. The present study provides a characterization of transcriptome from multiple tissues and individuals, as well as an assessment of genetic diversity of a wild population. These sequence resources will facilitate the investigation of population genetic diversity, the development of genetic maps, and the conduct of molecular marker
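
    The read counts quoted in this abstract can be cross-checked with simple arithmetic. The short sketch below uses only the figures stated above; the variable names are illustrative, and the total base count is only approximate because the 411 bp average read length is a rounded value.

```python
# Figures taken from the abstract above.
total_reads = 1_314_101        # high quality 454 reads
reads_in_contigs = 818_154     # reads assembled into the 76,778 contigs
singletons = 495_947           # reads left unassembled
mean_read_length_bp = 411      # reported average read length (rounded)

# Reads in contigs plus singletons should account for every read.
accounted_for = reads_in_contigs + singletons
print("all reads accounted for:", accounted_for == total_reads)

# Share of reads that ended up in contigs.
fraction_assembled = reads_in_contigs / total_reads
print(f"fraction of reads in contigs: {fraction_assembled:.1%}")

# Approximate sequencing yield implied by read count and mean length.
approx_total_bases = total_reads * mean_read_length_bp
print(f"approximate yield: {approx_total_bases / 1e6:.0f} Mb")
```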

  15. Expansion of the Knockdown Resistance Frequency Map for Human Head Lice (Phthiraptera: Pediculidae) in the United States Using Quantitative Sequencing.

    Science.gov (United States)

    Gellatly, Kyle J; Krim, Sarah; Palenchar, Daniel J; Shepherd, Katie; Yoon, Kyong Sup; Rhodes, Christopher J; Lee, Si Hyeock; Marshall Clark, J

    2016-05-01

    Pediculosis is a prevalent parasitic infestation of humans, which is increasing due, in part, to the selection of lice resistant to either the pyrethrins or pyrethroid insecticides by the knockdown resistance (kdr) mechanism. To determine the extent and magnitude of the kdr-type mutations responsible for this resistance, lice were collected from 138 collection sites in 48 U.S. states from 22 July 2013 to 11 May 2015 and analyzed by quantitative sequencing. Previously published data were used for comparisons of the changes in the frequency of the kdr-type mutations over time. Mean percent resistance allele frequency (mean % RAF) values across the three mutation loci were determined from each collection site. The overall mean % RAF (±SD) for all analyzed lice was 98.3 ± 10%. 132/138 sites (95.6%) had a mean % RAF of 100%, five sites (3.7%) had intermediate values, and only a single site had no mutations (0.0%). Forty-two states (88%) had a mean % RAF of 100%. The frequencies of kdr-type mutations did not differ regardless of the human population size that the lice were collected from, indicating a uniformly high level of resistant alleles. The loss of efficacy of the Nix formulation (Prestige Brand, Tarrytown, NY) from 1998 to 2013 was correlated to the increase in kdr-type mutations. These data provide a plausible reason for the decrease in the effectiveness of permethrin in the Nix formulation, which is the parallel increase of kdr-type mutations in lice over time. © The Authors 2016. Published by Oxford University Press on behalf of Entomological Society of America.
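
    The per-site summary described above (a mean % RAF taken across the three mutation loci, then a count of sites at 100%) can be sketched as follows. This is a minimal illustration only: the site names and per-locus values are made-up placeholders, not data from the study, and the helper name `mean_raf` is hypothetical.

```python
from statistics import mean

def mean_raf(locus_rafs):
    """Mean percent resistance allele frequency across the mutation loci of one site."""
    return mean(locus_rafs)

# Hypothetical per-site % RAF values at three kdr-type mutation loci.
sites = {
    "site_A": [100.0, 100.0, 100.0],
    "site_B": [80.0, 90.0, 100.0],
    "site_C": [0.0, 0.0, 0.0],
}

site_means = {name: mean_raf(values) for name, values in sites.items()}
fully_resistant = [name for name, m in site_means.items() if m == 100.0]

for name, m in site_means.items():
    print(f"{name}: mean % RAF = {m:.1f}")
print(f"sites at 100%: {len(fully_resistant)} of {len(sites)}")

# The same proportion calculation applied to the counts reported in the abstract.
print(f"132 of 138 sites = {132 / 138:.2%}")
```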

  16. Deep-sequencing of microRNA associated with Alzheimer’s disease in biological fluids: From biomarker discovery to diagnostic practice

    Directory of Open Access Journals (Sweden)

    Lesley Cheng

    2013-08-01

    Diagnostic tools for neurodegenerative diseases such as Alzheimer's disease (AD) currently involve subjective neuropsychological testing and specialised brain imaging techniques. While definitive diagnosis requires a pathological brain evaluation at autopsy, neurodegenerative changes are believed to begin years before the clinical presentation of cognitive decline. Therefore, there is an essential need for reliable biomarkers to aid in the early detection of disease in order to implement preventative strategies. MicroRNAs (miRNAs) are small non-coding RNA species that are involved in post-transcriptional gene regulation. Expression levels of miRNAs have potential as diagnostic biomarkers as they are known to circulate and tissue specific profiles can be identified in a number of bodily fluids such as plasma, CSF and urine. Recent developments in deep sequencing technology present a viable approach to develop biomarker discovery pipelines in order to profile microRNA signatures in bodily fluids specific to neurodegenerative diseases. Here we review the potential use of microRNA deep sequencing in biomarker identification from biological fluids and its translation into clinical practice.

  17. Deep sequencing of voodoo lily (Amorphophallus konjac): an approach to identify relevant genes involved in the synthesis of the hemicellulose glucomannan.

    Science.gov (United States)

    Gille, Sascha; Cheng, Kun; Skinner, Mary E; Liepman, Aaron H; Wilkerson, Curtis G; Pauly, Markus

    2011-09-01

    A Roche 454 cDNA deep sequencing experiment was performed on a developing corm of Amorphophallus konjac, also known as voodoo lily. The dominant storage polymer in the corm of this plant is the polysaccharide glucomannan, a hemicellulose known to exist in the cell walls of higher plants and a major component of plant biomass derived from softwoods. A total of 246 mega base pairs of sequence data was obtained, from which 4,513 distinct contigs were assembled. Within this voodoo lily expressed sequence tag collection, genes representing the carbohydrate-related pathway of glucomannan biosynthesis were identified, including sucrose metabolism, nucleotide sugar conversion pathways for the formation of activated precursors, and a putative glucomannan synthase. In vivo expression of the putative glucomannan synthase and subsequent in vitro activity assays unambiguously demonstrate that the enzyme does indeed have glucomannan mannosyl- and glucosyl transferase activities. Based on the expressed sequence tag analysis, hitherto unknown pathways for the synthesis of GDP-glucose, a necessary precursor for glucomannan biosynthesis, could be proposed. Moreover, the results highlight transcriptional bottlenecks for the synthesis of this hemicellulose.

  18. Deep Sequencing Analysis of RNAs from Citrus Plants Grown in a Citrus Sudden Death-Affected Area Reveals Diverse Known and Putative Novel Viruses

    Directory of Open Access Journals (Sweden)

    Emilyn E. Matsumura

    2017-04-01

    Citrus sudden death (CSD) has caused the death of approximately four million orange trees in a very important citrus region in Brazil. Although its etiology is still not completely clear, symptoms and distribution of affected plants indicate a viral disease. In a search for viruses associated with CSD, we have performed a comparative high-throughput sequencing analysis of the transcriptome and small RNAs from CSD-symptomatic and -asymptomatic plants using the Illumina platform. The data revealed mixed infections that included Citrus tristeza virus (CTV) as the most predominant virus, followed by the Citrus sudden death-associated virus (CSDaV), Citrus endogenous pararetrovirus (CitPRV) and two putative novel viruses tentatively named Citrus jingmen-like virus (CJLV) and Citrus virga-like virus (CVLV). The deep sequencing analyses were sensitive enough to differentiate two genotypes of both viruses previously associated with CSD-affected plants: CTV and CSDaV. Our data also showed a putative association of the CSD-symptomatic plants with a specific CSDaV genotype and a likely association with CitPRV as well, whereas the two putative novel viruses appeared to be more associated with CSD-asymptomatic plants. This is the first high-throughput sequencing-based study of the viral sequences present in CSD-affected citrus plants, and it generated valuable information for further CSD studies.

  19. Quantitative measurements on the paleo-weathering intensity of the loess-soil sequences and implication on paleomonsoon

    Institute of Scientific and Technical Information of China (English)

    HAO Qingzhen

    2001-01-01

  20. Whole transcriptome RNA sequencing data from blood leukocytes derived from Parkinson's disease patients prior to and following deep brain stimulation treatment

    Directory of Open Access Journals (Sweden)

    Lilach Soreq

    2015-03-01

    Recent evidence demonstrates the power of RNA sequencing (RNA-Seq) for identifying valuable and urgently needed blood biomarkers and advancing both early and accurate detection of neurological diseases, in particular Parkinson's disease (PD). RNA sequencing technology enables non-biased, high-throughput, probe-independent inspection of expression data with high coverage, as well as both quantification of global transcript levels and detection of expressed exons and junctions, given a sufficient sequencing depth (coverage). However, the analysis of sequencing data frequently presents a bottleneck. Tools for quantification of alternative splicing from sequenced libraries hardly exist at the present time, and methods that support multiple sequencing platforms are especially lacking. Here, we describe in detail a whole RNA-Seq transcriptome dataset produced from blood leukocytes of PD patients. The samples were taken prior to and following deep brain stimulation (DBS) treatment, while on stimulation and following 1 h of complete electrical stimulation cessation, and from healthy control volunteers. We describe in detail the methodology applied for analyzing the RNA-Seq data, including differential expression of long noncoding RNAs (lncRNAs). We also provide details of the corresponding analysis of in-depth splice isoform data from junction and exon reads, with the use of the software AltAnalyze. Both the RNA-Seq raw data (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE42608) and the analyzed data (https://www.synapse.org/#!Synapse:syn2805267) may prove valuable for the detection of novel blood biomarkers for PD.

  1. Ultra-deep Illumina sequencing accurately identifies MHC class IIb alleles and provides evidence for copy number variation in the guppy (Poecilia reticulata).

    Science.gov (United States)

    Lighten, Jackie; van Oosterhout, Cock; Paterson, Ian G; McMullan, Mark; Bentzen, Paul

    2014-07-01

    We address the bioinformatic issue of accurately separating amplified genes of the major histocompatibility complex (MHC) from artefacts generated during high-throughput sequencing workflows. We fit observed ultra-deep sequencing depths (hundreds to thousands of sequences per amplicon) of allelic variants to expectations from genetic models of copy number variation (CNV). We provide a simple, accurate and repeatable method for genotyping multigene families, evaluating our method via analyses of 209 bp of MHC class IIb exon 2 in guppies (Poecilia reticulata). Genotype repeatability for resequenced individuals (N = 49) was high (100%) within the same sequencing run. However, repeatability dropped to 83.7% between independent runs, either because of lower mean amplicon sequencing depth in the initial run or random PCR effects. This highlights the importance of fully independent replicates. Significant improvements in genotyping accuracy were made by greatly reducing type I genotyping error (i.e. accepting an artefact as a true allele), which may occur with the low-depth allele validation thresholds used by previous methods. Only a small amount (4.9%) of type II error (i.e. rejecting a genuine allele as an artefact) was detected through fully independent sequencing runs. We observed 1-6 alleles per individual, and evidence of sharing of alleles across loci. Variation in the total number of MHC class II loci among individuals, both among and within populations, was also observed, and some genotypes appeared to be partially hemizygous; total allelic dosage added up to an odd number of allelic copies. Collectively, these observations provide evidence of MHC CNV and its complex basis in natural populations.

  2. A Method for Amplicon Deep Sequencing of Drug Resistance Genes in Plasmodium falciparum Clinical Isolates from India.

    Science.gov (United States)

    Rao, Pavitra N; Uplekar, Swapna; Kayal, Sriti; Mallick, Prashant K; Bandyopadhyay, Nabamita; Kale, Sonal; Singh, Om P; Mohanty, Akshaya; Mohanty, Sanjib; Wassmer, Samuel C; Carlton, Jane M

    2016-06-01

    A major challenge to global malaria control and elimination is early detection and containment of emerging drug resistance. Next-generation sequencing (NGS) methods provide the resolution, scalability, and sensitivity required for high-throughput surveillance of molecular markers of drug resistance. We have developed an amplicon sequencing method on the Ion Torrent PGM platform for targeted resequencing of a panel of six Plasmodium falciparum genes implicated in resistance to first-line antimalarial therapy, including artemisinin combination therapy, chloroquine, and sulfadoxine-pyrimethamine. The protocol was optimized using 12 geographically diverse P. falciparum reference strains and successfully applied to multiplexed sequencing of 16 clinical isolates from India. The sequencing results from the reference strains showed 100% concordance with previously reported drug resistance-associated mutations. Single-nucleotide polymorphisms (SNPs) in clinical isolates revealed a number of known resistance-associated mutations and other nonsynonymous mutations that have not been implicated in drug resistance. SNP positions containing multiple allelic variants were used to identify three clinical samples containing mixed genotypes indicative of multiclonal infections. The amplicon sequencing protocol has been designed for the benchtop Ion Torrent PGM platform and can be operated with minimal bioinformatics infrastructure, making it ideal for use in countries that are endemic for the disease to facilitate routine large-scale surveillance of the emergence of drug resistance and to ensure continued success of the malaria treatment policy.

  3. SU-F-BRB-02: Towards Quantitative Clinical Decision On Deep Inspiration Breath Hold (DIBH) Or Prone for Left-Sided Breast Irradiation

    Energy Technology Data Exchange (ETDEWEB)

    Lin, H; Gao, Y [Rensselaer Polytechnic Institute, Troy, NY (United States); Liu, T [Memorial Sloan Kettering West Harrison, West Harrison, New York (United States); Gelblum, D; Ho, A [Memorial Sloan Kettering Cancer Center, New York City, New York (United States); Powell, S [Memorial Sloan Kettering Cancer Center, New York, NY (United States); Tang, X [Memorial Sloan Kettering Cancer Center, West Harrison, NY (United States); Xu, X [Rensselaer Polytechnic Inst., Troy, NY (United States)

    2015-06-15

    Purpose: To develop quantitative clinical guidelines for choosing between supine Deep Inspiratory Breath Hold (DIBH) and prone free-breathing treatments for breast patients, we applied 3D deformable phantoms in Monte Carlo simulations to predict the corresponding dose to the Organs at Risk (OARs). Methods: The RPI Adult Female phantom (two selected cup sizes: A and D) was used to represent the female patient, and it was simulated using the MCNP6 Monte Carlo code. Doses to OARs were investigated for supine DIBH and prone treatments, considering two breast sizes. The fluence maps of the 6-MV opposed tangential fields were exported. In the Monte Carlo simulation, the fluence maps allow each simulated photon to be weighted in the final dose calculation. The relative error of all dose calculations was kept below 5% by simulating 3 × 10^7 photons for each projection. Results: In terms of dosimetric accuracy, the RPI Adult Female phantom with cup size D in DIBH positioning matched a DIBH treatment plan of the patient. Based on the simulation results, for the cup size D phantom, prone positioning reduced the cardiac dose and the dose to other OARs, while the cup size A phantom benefited more from DIBH positioning. Comparing simulation results for the cup size A and D phantoms, dose to OARs was generally higher for the large breast size due to increased scattering arising from a larger portion of the body in the primary beam. The lower dose registered for the heart in the large breast phantom in prone positioning was due to the increased distance between the heart and the primary beam when the breast was pendulous. Conclusion: Our 3D deformable phantom appears to be an excellent tool to predict dose to the OARs for the supine DIBH and prone positions, which might help quantitative clinical decisions. Further investigation will be conducted. National Institutes of Health R01EB015478.

  4. Pooled deep sequencing of Plasmodium falciparum isolates: an efficient and scalable tool to quantify prevailing malaria drug-resistance genotypes.

    Science.gov (United States)

    Taylor, Steve M; Parobek, Christian M; Aragam, Nash; Ngasala, Billy E; Mårtensson, Andreas; Meshnick, Steven R; Juliano, Jonathan J

    2013-12-15

    Molecular surveillance for drug-resistant malaria parasites requires reliable, timely, and scalable methods. These data may be efficiently produced by genotyping parasite populations using second-generation sequencing (SGS). We designed and validated a SGS protocol to quantify mutant allele frequencies in the Plasmodium falciparum genes dhfr and dhps in mixed isolates. We applied this new protocol to field isolates from children and compared it to standard genotyping using Sanger sequencing. The SGS protocol accurately quantified dhfr and dhps allele frequencies in a mixture of parasite strains. Using SGS of DNA that was extracted and then pooled from individual isolates, we estimated mutant allele frequencies that were closely correlated to those estimated by Sanger sequencing (correlations, >0.98). The SGS protocol obviated most molecular steps in conventional methods and is cost saving for parasite populations >50. This SGS genotyping method efficiently and reproducibly estimates parasite allele frequencies within populations of P. falciparum for molecular epidemiologic studies.
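
    As a rough illustration of the quantity this record describes, the sketch below estimates allele frequencies at a single locus from the allele calls of reads in a pooled sample. It is a minimal sketch under stated assumptions and is not the authors' protocol: the function name `allele_frequencies`, the allele labels and the read counts are all hypothetical.

```python
from collections import Counter


def allele_frequencies(allele_calls):
    """Return the fraction of reads supporting each allele at one locus.

    `allele_calls` is a hypothetical list with one allele label per read
    covering the locus of interest in the pooled sample.
    """
    counts = Counter(allele_calls)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads cover this locus")
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical pile-up: 180 reads carry the mutant allele, 20 the wild type.
pooled_calls = ["mutant"] * 180 + ["wild_type"] * 20
freqs = allele_frequencies(pooled_calls)

for allele, freq in sorted(freqs.items()):
    print(f"{allele}: {freq:.3f}")
```

    In practice, read-level estimates like this would still need the kind of validation against known strain mixtures that the abstract reports.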

  5. Ultra-deep sequencing detects ovarian cancer cells in peritoneal fluid and reveals somatic TP53 mutations in noncancerous tissues.

    Science.gov (United States)

    Krimmel, Jeffrey D; Schmitt, Michael W; Harrell, Maria I; Agnew, Kathy J; Kennedy, Scott R; Emond, Mary J; Loeb, Lawrence A; Swisher, Elizabeth M; Risques, Rosa Ana

    2016-05-24

    Current sequencing methods are error-prone, which precludes the identification of low frequency mutations for early cancer detection. Duplex sequencing is a sequencing technology that decreases errors by scoring mutations present only in both strands of DNA. Our aim was to determine whether duplex sequencing could detect extremely rare cancer cells present in peritoneal fluid from women with high-grade serous ovarian carcinomas (HGSOCs). These aggressive cancers are typically diagnosed at a late stage and are characterized by TP53 mutations and peritoneal dissemination. We used duplex sequencing to analyze TP53 mutations in 17 peritoneal fluid samples from women with HGSOC and 20 from women without cancer. The tumor TP53 mutation was detected in 94% (16/17) of peritoneal fluid samples from women with HGSOC (frequency as low as 1 mutant per 24,736 normal genomes). Additionally, we detected extremely low frequency TP53 mutations (median mutant fraction 1/13,139) in peritoneal fluid from nearly all patients with and without cancer (35/37). These mutations were mostly deleterious, clustered in hotspots, increased with age, and were more abundant in women with cancer than in controls. The total burden of TP53 mutations in peritoneal fluid distinguished cancers from controls with 82% sensitivity (14/17) and 90% specificity (18/20). Age-associated, low frequency TP53 mutations were also found in 100% of peripheral blood samples from 15 women with and without ovarian cancer (none with hematologic disorder). Our results demonstrate the ability of duplex sequencing to detect rare cancer cells and provide evidence of widespread, low frequency, age-associated somatic TP53 mutation in noncancerous tissue.
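
    The percentages quoted in this abstract follow directly from the stated counts. The short sketch below recomputes them; the helper name `percent` is an assumption introduced here for illustration, and only the counts themselves come from the abstract.

```python
def percent(hits: int, total: int) -> float:
    """Return hits/total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * hits / total


# Counts taken from the abstract above.
detection = percent(16, 17)    # HGSOC samples in which the tumor mutation was found
sensitivity = percent(14, 17)  # cancers correctly separated by TP53 mutation burden
specificity = percent(18, 20)  # controls correctly separated by TP53 mutation burden

print(f"detection   = {detection:.1f}%")
print(f"sensitivity = {sensitivity:.1f}%")
print(f"specificity = {specificity:.1f}%")
```

    Rounded to whole percentages, these values agree with the 94%, 82% and 90% reported in the abstract.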

  6. Global Transcriptome Analysis of the Tentacle of the Jellyfish Cyanea capillata Using Deep Sequencing and Expressed Sequence Tags: Insight into the Toxin- and Degenerative Disease-Related Transcripts.

    Directory of Open Access Journals (Sweden)

    Guoyan Liu

    Jellyfish contain diverse toxins and other bioactive components. However, large-scale identification of novel toxins and bioactive components from jellyfish has been hampered by the low efficiency of traditional isolation and purification methods. We performed de novo transcriptome sequencing of the tentacle tissue of the jellyfish Cyanea capillata. A total of 51,304,108 reads were obtained and assembled into 50,536 unigenes. Of these, 21,357 unigenes had homologues in public databases, but the remaining unigenes had no significant matches due to the limited sequence information available and species-specific novel sequences. Functional annotation of the unigenes also revealed general gene expression profile characteristics in the tentacle of C. capillata. A primary goal of this study was to identify putative toxin transcripts. As expected, we screened many transcripts encoding proteins similar to several well-known toxin families including phospholipases, metalloproteases, serine proteases and serine protease inhibitors. In addition, some transcripts also resembled molecules with potential toxic activities, including cnidarian CfTX-like toxins with hemolytic activity, plancitoxin-1, venom toxin-like peptide-6, histamine-releasing factor, neprilysin, dipeptidyl peptidase 4, vascular endothelial growth factor A, angiotensin-converting enzyme-like and endothelin-converting enzyme 1-like proteins. Most of these molecules have not been previously reported in jellyfish. Interestingly, we also characterized a number of transcripts with similarities to proteins relevant to several degenerative diseases, including Huntington's, Alzheimer's and Parkinson's diseases. This is the first description of degenerative disease-associated genes in jellyfish. We obtained a well-categorized and annotated transcriptome of C. capillata tentacle that will be an important and valuable resource for further understanding of jellyfish at the molecular level and information

  7. Identification of Circulating miRNAs in a Mouse Model of Nerve Allograft Transplantation under FK506 Immunosuppression by Illumina Small RNA Deep Sequencing

    Directory of Open Access Journals (Sweden)

    Shao-Chun Wu

    2015-01-01

    Background. This study aimed to establish the expression profile of circulating microRNAs (miRNAs) during nerve allotransplantation in the presence and absence of FK506 immunosuppression. Methods. A 1 cm BALB/c donor sciatic nerve graft was transplanted into the sciatic nerve gaps created in recipient C57BL/6 mice with or without daily FK506 immunosuppression [1 mg/(kg·d)]. At 3, 7, and 14 d after nerve allotransplantation, serum samples were collected for miRNA expression analysis by Illumina small RNA deep sequencing. Results. Sequence analysis showed that the dominant size of circulating small RNAs after nerve allotransplantation was 22 nucleotides, followed by 23-nucleotide sequences. Nine upregulated circulating miRNAs (let-7e-5p, miR-101a-3p, miR-151-5p, miR-181a-5p, miR-204-5p, miR-340-5p, miR-381-3p, miR-411-5p, miR-9-5p, and miR-219-2-3p) were identified at 3 d, but none was identified at 7 or 14 d. Among them, miR-9-5p had the highest fold-change of >50-fold, followed by miR-340-5p with 38.8-fold. The presence of these nine miRNAs was not significant at 7 and 14 d after nerve allotransplantation with or without immunosuppression, showing that these miRNAs are not ideal biomarkers for monitoring rejection of deep-buried nerve allografts, a response usually observed later. Conclusions. We identified nine upregulated circulating miRNAs, which may have a biological function, particularly during the early stages after nerve allotransplantation under FK506 immunosuppression.

  8. Recurrence of the 'deep-intronic' FGG IVS6-320A>T mutation causing quantitative fibrinogen deficiency in the Italian population of Veneto.

    Science.gov (United States)

    Platè, Manuela; Duga, Stefano; Castaman, Giancarlo; Rodeghiero, Francesco; Asselta, Rosanna

    2009-07-01

    Quantitative fibrinogen deficiency is a rare bleeding disorder characterized by abnormally low levels of fibrinogen in plasma, generally due to mutations in one of the three fibrinogen genes: FGA, FGB, and FGG, coding for the A alpha, B beta, and gamma chains, respectively. Although the partial defect (hypofibrinogenemia) is due to mutations occurring in the heterozygous state, homozygosity or compound heterozygosity for the same genetic defects gives rise to the more severe afibrinogenemia. Mutations responsible for these conditions are scattered throughout the three fibrinogen genes, with only a few sites representing relative mutational hot spots. In this study, we report the identification of the FGG IVS6-320A>T mutation in an Italian hypofibrinogenemic patient from Veneto (a region of North-Eastern Italy). This 'deep-intronic' mutation, which would go unnoticed by conventional mutational screening strategies, was previously reported in an afibrinogenemic family from Vicenza (a province of Veneto). The geographic clustering of patients carrying the FGG IVS6-320A>T mutation and the results of haplotype analysis suggest the existence of a common founder. This information will be useful to direct future genetic screenings in patients coming from the same geographic area.

  9. Deep wide-field imaging down to the oldest main sequence turn-offs in the Sculptor dwarf spheroidal galaxy

    NARCIS (Netherlands)

    de Boer, T. J. L.; Tolstoy, E.; Saha, A.; Olsen, K.; Irwin, M. J.; Battaglia, G.; Hill, V.; Shetrone, M. D.; Fiorentino, G.; Cole, A.

    2011-01-01

    We present wide-field photometry of resolved stars in the nearby Sculptor dwarf spheroidal galaxy using CTIO/MOSAIC, going down to the oldest main sequence turn-off. The accurately flux calibrated wide field colour-magnitude diagrams can be used to constrain the ages of different stellar populations

  10. Characterization of attached bacterial populations in deep granitic groundwater from the Stripa research mine by 16S rRNA gene sequencing and scanning electron microscopy.

    Science.gov (United States)

    Ekendahl, S; Arlinger, J; Ståhl, F; Pedersen, K

    1994-07-01

    This paper presents the molecular characterization of attached bacterial populations growing in slowly flowing artesian groundwater from deep crystalline bed-rock of the Stripa mine, south central Sweden. Bacteria grew on glass slides in laminar flow reactors connected to the anoxic groundwater flowing up through tubing from two levels of a borehole, 812-820 m and 970-1240 m. The glass slides were collected, the bacterial DNA was extracted and the 16S rRNA genes were amplified by PCR using primers matching universally conserved positions 519-536 and 1392-1405. The resulting PCR fragments were subsequently cloned and sequenced. The sequences were compared with each other and with 16S rRNA gene sequences in the EMBL database. Three major groups of bacteria were found. Signature bases placed the clones in the appropriate systematic groups. All belonged to the proteobacterial groups beta and gamma. One group was found only at the 812-820 m level, where it constituted 63% of the sequenced clones, whereas the second group existed almost exclusively at the 970-1240 m level, where it constituted 83% of the sequenced clones. The third group was equally distributed between the levels. A few other bacteria were also found. None of the 16S rRNA genes from the dominant bacteria showed more than 88% similarity to any of the others, and none of them resembled anything in the database by more than 96%. Temperature did not seem to have any effect on species composition at the deeper level. SEM images showed rods appearing in microcolonies.(ABSTRACT TRUNCATED AT 250 WORDS)

  11. Deep Illumina-based shotgun sequencing reveals dietary effects on the structure and function of the fecal microbiome of growing kittens.

    Directory of Open Access Journals (Sweden)

    Oliver Deusch

    Previously, we demonstrated that dietary protein:carbohydrate ratio dramatically affects the fecal microbial taxonomic structure of kittens using targeted 16S gene sequencing. The present study, using the same fecal samples, applied deep Illumina shotgun sequencing to identify the diet-associated functional potential and analyze taxonomic changes of the feline fecal microbiome. Fecal samples from kittens fed one of two diets differing in protein and carbohydrate content (high-protein, low-carbohydrate, HPLC; and moderate-protein, moderate-carbohydrate, MPMC) were collected at 8, 12 and 16 weeks of age (n = 6 per group). A total of 345.3 gigabases of sequence were generated from 36 samples, with 99.75% of annotated sequences identified as bacterial. At the genus level, 26% and 39% of reads were annotated for HPLC- and MPMC-fed kittens, with HPLC-fed cats showing greater species richness and microbial diversity. Two phyla, ten families and fifteen genera were responsible for more than 80% of the sequences at each taxonomic level for both diet groups, consistent with the previous taxonomic study. Significantly different abundances between diet groups were observed for 324 genera (56% of all genera identified), demonstrating widespread diet-induced changes in microbial taxonomic structure. Diversity was not affected over time. Functional analysis identified 2,013 putative enzyme function groups that differed (p<0.000007) between the two dietary groups and were associated with 194 pathways, which formed five discrete clusters based on average relative abundance. Of those, ten contained more (p<0.022) enzyme functions with significant diet effects than expected by chance. Six pathways were related to amino acid biosynthesis and metabolism, linking changes in dietary protein with functional differences of the gut microbiome. These data indicate that feline feces-derived microbiomes have large structural and functional differences relating to the dietary

  13. Genotyping-by-sequencing markers facilitate the identification of quantitative trait loci controlling resistance to Penicillium expansum in Malus sieversii.

    Science.gov (United States)

    Norelli, John L; Wisniewski, Michael; Fazio, Gennaro; Burchard, Erik; Gutierrez, Benjamin; Levin, Elena; Droby, Samir

    2017-01-01

    Blue mold caused by Penicillium expansum is the most important postharvest disease of apple worldwide and results in significant financial losses. There are no defined sources of resistance to blue mold in domesticated apple. However, resistance has been described in wild Malus sieversii accessions, including plant introduction (PI)613981. The objective of the present study was to identify the genetic loci controlling resistance to blue mold in this accession. We describe the first quantitative trait loci (QTL) reported in the Rosaceae tribe Maleae conditioning resistance to P. expansum on genetic linkage group 3 (qM-Pe3.1) and linkage group 10 (qM-Pe10.1). These loci were identified in a M.× domestica 'Royal Gala' X M. sieversii PI613981 family (GMAL4593) based on blue mold lesion diameter seven days post-inoculation in mature, wounded apple fruit inoculated with P. expansum. Phenotypic analyses were conducted in 169 progeny over a four year period. PI613981 was the source of the resistance allele for qM-Pe3.1, a QTL with a major effect on blue mold resistance, accounting for 27.5% of the experimental variability. The QTL mapped from 67.3 to 74 cM on linkage group 3 of the GMAL4593 genetic linkage map. qM-Pe10.1 mapped from 73.6 to 81.8 cM on linkage group 10. It had less of an effect on resistance, accounting for 14% of the experimental variation. 'Royal Gala' was the primary contributor to the resistance effect of this QTL. However, resistance-associated alleles in both parents appeared to contribute to the least square mean blue mold lesion diameter in an additive manner at qM-Pe10.1. A GMAL4593 genetic linkage map composed of simple sequence repeats and 'Golden Delicious' single nucleotide polymorphism markers was able to detect qM-Pe10.1, but failed to detect qM-Pe3.1. The subsequent addition of genotyping-by-sequencing markers to the linkage map provided better coverage of the PI613981 genome on linkage group 3 and facilitated discovery of qM-Pe3.1. A DNA

  14. [Sequence of venous blood flow alterations in patients after recently endured acute thrombosis of lower-limb deep veins based on the findings of ultrasonographic duplex scanning].

    Science.gov (United States)

    Tarkovskiĭ, A A; Zudin, A M; Aleksandrova, E S

    2009-01-01

    This study investigated the sequence of venous blood flow alterations during the year following acute thrombosis of the lower-limb deep veins, using the standard technique of ultrasonographic duplex scanning. A total of 32 patients aged 24 to 62 years presenting with new-onset acute phlebothrombosis were followed up. All patients were examined sequentially at 2 days, 3 weeks, 3 months, 6 months and 12 months after the first clinical signs of the disease. The parameters assessed were the patency of the deep veins and the condition of the valvular apparatus of the deep, superficial and communicant veins. In the first stage of phlebohaemodynamic alterations after the thrombosis, i.e., during the acute period of the disease, seven (21.9%) patients had already developed valvular insufficiency of the communicant veins of the cms, manifesting as a horizontal veno-venous reflux; 6 months later, this was observed in all the patients examined (100%). The second stage, which coincided with recanalization of the thrombotic masses in the deep veins, was characterized by valvular insufficiency of the deep veins, manifesting as a deep vertical veno-venous reflux, which was found in 56.3% of the examined subjects at month six after the onset of the disease and in 93.8% of the patients after 12 months. Recanalization of thrombotic masses began 3 months after the onset of thrombosis in twelve (37.5%) patients, and after 12 months it had occurred in all the patients (100%), eventually ending in complete restoration of the patency of the affected

  15. Construction and sequence sampling of deep-coverage, large-insert BAC libraries for three model lepidopteran species

    Directory of Open Access Journals (Sweden)

    Zhao Shaying

    2009-06-01

    Full Text Available Abstract Background: Manduca sexta, Heliothis virescens, and Heliconius erato represent three widely used insect model species for genomic and fundamental studies in Lepidoptera. Large-insert BAC libraries of these insects are critical resources for many molecular studies, including physical mapping and genome sequencing, but have not been available to date. Results: We report the construction and characterization of six large-insert BAC libraries for the three species and sampling sequence analysis of the genomes. The six BAC libraries were constructed with two restriction enzymes, two libraries for each species, and each has an average clone insert size ranging from 152–175 kb. We estimated that the genome coverage of each library ranged from 6–9 ×, with the two combined libraries of each species being equivalent to 13.0–16.3 × haploid genomes. The genome coverage, quality and utility of the libraries were further confirmed by library screening using 6–8 putative single-copy probes. To provide a first glimpse into these genomes, we sequenced and analyzed the BAC ends of ~200 clones randomly selected from the libraries of each species. The data revealed that the genomes are AT-rich, contain relatively small fractions of repeat elements with a majority belonging to the category of low complexity repeats, and are more abundant in retro-elements than DNA transposons. Among the species, the H. erato genome is somewhat more abundant in repeat elements and simple repeats than those of M. sexta and H. virescens. The BLAST analysis of the BAC end sequences suggested that the evolution of the three genomes is widely varied, with the genome of H. virescens being the most conserved as a typical lepidopteran, whereas both genomes of H. erato and M. sexta appear to have evolved significantly, resulting in a higher level of species- or evolutionary lineage-specific sequences. Conclusion: The high-quality and large-insert BAC libraries of the insects, together
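
    The genome coverage figures quoted in the record above follow from a simple ratio: number of clones times mean insert size, divided by haploid genome size. The sketch below illustrates that arithmetic only. The clone count and genome size are hypothetical placeholders, since the abstract gives neither; only the insert size is chosen to fall inside the reported 152–175 kb range.

```python
def library_coverage(num_clones: int, mean_insert_kb: float, genome_size_mb: float) -> float:
    """Return BAC library coverage in haploid genome equivalents.

    coverage = (number of clones * mean insert size) / haploid genome size
    """
    if genome_size_mb <= 0:
        raise ValueError("genome_size_mb must be positive")
    # Convert the genome size from Mb to kb so the units cancel.
    return num_clones * mean_insert_kb / (genome_size_mb * 1000.0)


# Hypothetical inputs (NOT from the abstract): 20,000 clones, 500 Mb genome.
coverage = library_coverage(num_clones=20_000, mean_insert_kb=160.0, genome_size_mb=500.0)
print(f"Estimated coverage: {coverage:.1f}x")
```

    Under this definition, coverages of two libraries built from the same genome add, which matches the way the abstract reports combined coverage per species.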

  16. Identification of MiRNA from eggplant (Solanum melongena L.) by small RNA deep sequencing and their response to Verticillium dahliae infection.

    Directory of Open Access Journals (Sweden)

    Liu Yang

    Full Text Available MiRNAs are a class of non-coding small RNAs that play important roles in the regulation of gene expression. Although plant miRNAs have been extensively studied in model systems, less is known in other plants with limited genome sequence data, including eggplant (Solanum melongena L.). To identify miRNAs in eggplant and their response to Verticillium dahliae infection, a fungal pathogen for which clear understanding of infection mechanisms and effective cure methods are currently lacking, we deep-sequenced two small RNA (sRNA) libraries prepared from mock-infected and infected seedlings of eggplants. Specifically, 30,830,792 reads produced 7,716,328 unique miRNAs representing 99 known miRNA families that have been identified in other plant species. Two novel putative miRNAs were predicted with eggplant ESTs. The potential targets of the identified known and novel miRNAs were also predicted based on sequence homology search. It was observed that the length distribution of obtained sRNAs and the expression of 6 miRNA families were obviously different between the two libraries. These results provide a framework for further analysis of miRNAs and their role in regulating plant response to fungal infection and Verticillium wilt in particular.

  17. The utility of diversity profiling using Illumina 18S rRNA gene amplicon deep sequencing to detect and discriminate Toxoplasma gondii among the cyst-forming coccidia.

    Science.gov (United States)

    Cooper, Madalyn K; Phalen, David N; Donahoe, Shannon L; Rose, Karrie; Šlapeta, Jan

    2016-01-30

    Next-generation sequencing (NGS) has the capacity to screen a single DNA sample and detect pathogen DNA from thousands of host DNA sequence reads, making it a versatile and informative tool for investigation of pathogens in diseased animals. The technique is effective and labor saving in the initial identification of pathogens, and will complement conventional diagnostic tests to associate the candidate pathogen with a disease process. In this report, we investigated the utility of the diversity profiling NGS approach using Illumina small subunit ribosomal RNA (18S rRNA) gene amplicon deep sequencing to detect Toxoplasma gondii in previously confirmed cases of toxoplasmosis. We then tested the diagnostic approach with species-specific PCR genotyping, histopathology and immunohistochemistry of toxoplasmosis in a Risso's dolphin (Grampus griseus) to systematically characterise the disease and associate causality. We show that the Euk7A/Euk570R primer set targeting the V1-V3 hypervariable region of the 18S rRNA gene can be used as a species-specific assay for cyst-forming coccidia and discriminate T. gondii. Overall, the approach is cost-effective and improves diagnostic decision support by narrowing the differential diagnosis list with more certainty than was previously possible. Furthermore, it supplements the limitations of cryptic protozoan morphology and surpasses the need for species-specific PCR primer combinations.

  18. Analytical and Clinical Validation of a Digital Sequencing Panel for Quantitative, Highly Accurate Evaluation of Cell-Free Circulating Tumor DNA

    Science.gov (United States)

    Zill, Oliver A.; Sebisanovic, Dragan; Lopez, Rene; Blau, Sibel; Collisson, Eric A.; Divers, Stephen G.; Hoon, Dave S. B.; Kopetz, E. Scott; Lee, Jeeyun; Nikolinakos, Petros G.; Baca, Arthur M.; Kermani, Bahram G.; Eltoukhy, Helmy; Talasaz, AmirAli

    2015-01-01

    Next-generation sequencing of cell-free circulating solid tumor DNA addresses two challenges in contemporary cancer care. First, this method of massively parallel and deep sequencing enables assessment of a comprehensive panel of genomic targets from a single sample, and second, it obviates the need for repeat invasive tissue biopsies. Digital Sequencing™ is a novel method for high-quality sequencing of circulating tumor DNA simultaneously across a comprehensive panel of over 50 cancer-related genes with a simple blood test. Here we report the analytic and clinical validation of the gene panel. Analytic sensitivity down to 0.1% mutant allele fraction is demonstrated via serial dilution studies of known samples. Near-perfect analytic specificity (> 99.9999%) enables complete coverage of many genes without the false positives typically seen with traditional sequencing assays at mutant allele frequencies or fractions below 5%. We compared digital sequencing of plasma-derived cell-free DNA to tissue-based sequencing on 165 consecutive matched samples from five outside centers in patients with stage III-IV solid tumor cancers. Clinical sensitivity of plasma-derived NGS was 85.0%, comparable to 80.7% sensitivity for tissue. The assay success rate on 1,000 consecutive samples in clinical practice was 99.8%. Digital sequencing of plasma-derived DNA is indicated in advanced cancer patients to prevent repeated invasive biopsies when the initial biopsy is inadequate, unobtainable for genomic testing, or uninformative, or when the patient’s cancer has progressed despite treatment. Its clinical utility is derived from reduction in the costs, complications and delays associated with invasive tissue biopsies for genomic testing. PMID:26474073

  19. Deep sequencing of 16S rRNA gene to efficiently assess bacterial richness and the rare biosphere in soils

    OpenAIRE

    Terrat, Sébastien; Dequiedt, Samuel; Bachar, Dipankar; Christen, Richard; Mougel, Christophe; Lelievre, Mélanie; Maron, Pierre-Alain; Plassart, Pierre; Wincker, Patrick; Cruaud, C; Jolivet, Claudy; Arrouays, Dominique; Bispo, Antonio; Lemanceau, Philippe; Ranjard, Lionel

    2012-01-01

    Soil is one of the most important reservoirs of microbiological diversity on our planet and, above all, one of the last bastions of such biodiversity. Over the past two decades, numerous molecular tools have been developed to accurately assess this huge diversity directly from soil DNA. In this context, the 16S rRNA gene is widely used to study microbial community taxonomic diversity, as it can be easily amplified by PCR, and large databases are available relating obtained sequences to bacteria...

  20. The venom gland transcriptome of Latrodectus tredecimguttatus revealed by deep sequencing and cDNA library analysis.

    Directory of Open Access Journals (Sweden)

    Quanze He

    Full Text Available Latrodectus tredecimguttatus, commonly known as the black widow spider, is well known for its dangerous bite. Although its venom has been characterized extensively, some fundamental questions about its molecular composition remain unanswered. The limited transcriptome and genome data available prevent further understanding of spider venom at the molecular level. In the present study, we combined next-generation sequencing and conventional DNA sequencing to construct a venom gland transcriptome of the spider L. tredecimguttatus, which resulted in the identification of 9,666 and 480 high-confidence proteins among 34,334 de novo sequences and 1,024 cDNA sequences, respectively, by assembly, translation, filtering, quantification and annotation. Extensive functional analyses of these proteins indicated that mRNAs involved in RNA transport and spliceosome, protein translation, processing and transport were highly enriched in the venom gland, which is consistent with the specific function of venom glands, namely the production of toxins. Furthermore, we identified 146 toxin-like proteins forming 12 families, including 6 new families in this spider, in which α-LTX-Lt1a family2 is identified for the first time as a subfamily of the α-LTX-Lt1a family. The toxins were classified according to their bioactivities into five categories that functioned in a coordinated way. Few ion channels were expressed in venom gland cells, suggesting a possible mechanism of protection from the attack of their own toxins. The present study provides a gland transcriptome profile and extends our understanding of the toxinome of spiders and coordination mechanism for toxin production in protein expression quantity.

  1. Genome-wide discovery and differential regulation of conserved and novel microRNAs in chickpea via deep sequencing

    Science.gov (United States)

    Jain, Mukesh; Chevala, VVS Narayana; Garg, Rohini

    2014-01-01

    MicroRNAs (miRNAs) are essential components of complex gene regulatory networks that orchestrate plant development. Although several genomic resources have been developed for the legume crop chickpea, miRNAs have not been discovered until now. For genome-wide discovery of miRNAs in chickpea (Cicer arietinum), we sequenced the small RNA content from seven major tissues/organs employing Illumina technology. About 154 million reads were generated, which represented more than 20 million distinct small RNA sequences. We identified a total of 440 conserved miRNAs in chickpea based on sequence similarity with known miRNAs in other plants. In addition, 178 novel miRNAs were identified using a miRDeep pipeline with plant-specific scoring. Some of the conserved and novel miRNAs with significant sequence similarity were grouped into families. The chickpea miRNAs targeted a wide range of mRNAs involved in diverse cellular processes, including transcriptional regulation (transcription factors), protein modification and turnover, signal transduction, and metabolism. Our analysis revealed several miRNAs with differential spatial expression. Many of the chickpea miRNAs were expressed in a tissue-specific manner. The conserved and differential expression of members of the same miRNA family in different tissues was also observed. Some of the same family members were predicted to target different chickpea mRNAs, which suggested the specificity and complexity of miRNA-mediated developmental regulation. This study, for the first time, reveals a comprehensive set of conserved and novel miRNAs along with their expression patterns and putative targets in chickpea, and provides a framework for understanding regulation of developmental processes in legumes. PMID:25151616

  2. Global characterization of the root transcriptome of a wild species of rice, Oryza longistaminata, by deep sequencing

    Directory of Open Access Journals (Sweden)

    Reinhold-Hurek Barbara

    2010-12-01

    Full Text Available Abstract Background: Oryza longistaminata, an AA genome type (2n = 24), originates from Africa and is closely related to Asian cultivated rice (O. sativa L.). It contains various valuable traits with respect to tolerance to biotic and abiotic stress, QTLs with agronomically important traits and high ability to use nitrogen efficiently (NUE). However, only limited genomic or transcriptomic data of O. longistaminata are currently available. Results: In this study we present the first comprehensive characterization of the O. longistaminata root transcriptome using 454 pyrosequencing. One sequencing run using a normalized cDNA library from O. longistaminata roots adapted to low N conditions generated 337,830 reads, which assembled into 41,189 contigs and 30,178 singletons. By similarity search against protein databases, putative functions were assigned to over 34,510 uni-ESTs. Comparison with ESTs derived from cultivated rice collections revealed expressed genes across different plant species; however, 16.7% of the O. longistaminata ESTs had not been detected as expressed in O. sativa. Additionally, 15.7% had no significant similarity to known sequences. RT-PCR and Southern blot analyses confirmed the expression of selected novel transcripts in O. longistaminata. Conclusion: Our results show that one run using a Genome Sequencer FLX from 454 Life Science/Roche generates sufficient genomic information for adequate de novo assembly of a large number of transcripts in a wild rice species, O. longistaminata. The generated sequence data are publicly available and will facilitate gene discovery in O. longistaminata and rice functional genomic studies. The large number of novel ESTs suggests different metabolic activity in O. longistaminata roots in comparison to O. sativa roots.

  3. In-depth cDNA Library Sequencing Provides Quantitative Gene Expression Profiling in Cancer Biomarker Discovery

    Institute of Scientific and Technical Information of China (English)

    Wanling Yang; Dingge Ying; Yu-Lung Lau

    2009-01-01

    procedures may allow detection of many expression features for less abundant gene variants. With the reduction of sequencing cost and the emergence of new-generation sequencing technology, in-depth sequencing of cDNA pools or libraries may represent a better and more powerful tool in gene expression profiling and cancer biomarker detection. We also propose using sequence-specific subtraction to remove hundreds of the most abundant housekeeping genes to increase sequencing depth without affecting the relative expression ratio of other genes, as transcripts from as few as 300 most abundantly expressed genes constitute about 20% of the total transcriptome. In-depth sequencing also represents a unique advantage of detecting unknown forms of transcripts, such as alternative splicing variants, fusion genes, and regulatory RNAs, as well as detecting mutations and polymorphisms that may play important roles in disease pathogenesis.
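
    The depth argument in the record above can be made concrete with a short calculation. This is a minimal sketch under one assumption not stated in the abstract: the total number of reads is fixed, and reads freed by subtraction are redistributed across the remaining transcripts in proportion to their abundance, so relative expression ratios are unchanged. The function name is illustrative.

```python
def depth_gain(removed_fraction: float) -> float:
    """Fold increase in effective depth for the remaining transcripts.

    If a fraction f of the transcriptome is subtracted before sequencing and
    the total read count stays the same, every remaining transcript receives
    1 / (1 - f) times as many reads as before.
    """
    if not 0.0 <= removed_fraction < 1.0:
        raise ValueError("removed_fraction must be in [0, 1)")
    return 1.0 / (1.0 - removed_fraction)


# About 20% of the transcriptome comes from the ~300 most abundant genes
# (figure taken from the abstract above).
gain = depth_gain(0.20)
print(f"Effective depth gain: {gain:.2f}-fold")
```

    Because every remaining transcript is scaled by the same factor, the ratio between any two of them is preserved, which is the property the abstract relies on.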

  4. Deep Sequencing of Mixed Total DNA without Barcodes Allows Efficient Assembly of Highly Plastic Ascidian Mitochondrial Genomes

    Science.gov (United States)

    Rubinstein, Nimrod D.; Feldstein, Tamar; Shenkar, Noa; Botero-Castro, Fidel; Griggio, Francesca; Mastrototaro, Francesco; Delsuc, Frédéric; Douzery, Emmanuel J.P.; Gissi, Carmela; Huchon, Dorothée

    2013-01-01

    Ascidians or sea squirts form a diverse group within chordates, which includes a few thousand members of marine sessile filter-feeding animals. Their mitochondrial genomes are characterized by particularly high evolutionary rates and rampant gene rearrangements. This extreme variability complicates standard polymerase chain reaction (PCR) based techniques for molecular characterization studies, and consequently only a few complete Ascidian mitochondrial genome sequences are available. Using the standard PCR and Sanger sequencing approach, we produced the mitochondrial genome of Ascidiella aspersa only after a great effort. In contrast, we produced five additional mitogenomes (Botrylloides aff. leachii, Halocynthia spinosa, Polycarpa mytiligera, Pyura gangelion, and Rhodosoma turcicum) with a novel strategy, consisting in sequencing the pooled total DNA samples of these five species using one Illumina HiSeq 2000 flow cell lane. Each mitogenome was efficiently assembled in a single contig using de novo transcriptome assembly, as de novo genome assembly generally performed poorly for this task. Each of the new six mitogenomes presents a different and novel gene order, showing that no syntenic block has been conserved at the ordinal level (in Stolidobranchia and in Phlebobranchia). Phylogenetic analyses support the paraphyly of both Ascidiacea and Phlebobranchia, with Thaliacea nested inside Phlebobranchia, although the deepest nodes of the Phlebobranchia–Thaliacea clade are not well resolved. The strategy described here thus provides a cost-effective approach to obtain complete mitogenomes characterized by a highly plastic gene order and a fast nucleotide/amino acid substitution rate. PMID:23709623

  5. Genome re-sequencing of semi-wild soybean reveals a complex Soja population structure and deep introgression.

    Directory of Open Access Journals (Sweden)

    Jie Qiu

    Full Text Available Semi-wild soybean is a unique type of soybean that retains both wild and domesticated characteristics, which provides an important intermediate type for understanding the evolution of the subgenus Soja population in the Glycine genus. In this study, a semi-wild soybean line (Maliaodou) and a wild line (Lanxi 1) collected from the lower Yangtze regions were deeply sequenced while nine other semi-wild lines were sequenced to a 3-fold genome coverage. Sequence analysis revealed that (1) no independent phylogenetic branch covering all 10 semi-wild lines was observed in the Soja phylogenetic tree; (2) besides two distinct subpopulations of wild and cultivated soybean in the Soja population structure, all semi-wild lines were mixed with some wild lines into a subpopulation rather than an independent one or an intermediate transition type of soybean domestication; (3) high heterozygous rates (0.19-0.49) were observed in several semi-wild lines; and (4) over 100 putative selective regions were identified by selective sweep analysis, including those related to the development of seed size. Our results suggested a hybridization origin for the semi-wild soybean, which makes a complex Soja population structure.

  6. Genome re-sequencing of semi-wild soybean reveals a complex Soja population structure and deep introgression.

    Science.gov (United States)

    Qiu, Jie; Wang, Yu; Wu, Sanling; Wang, Ying-Ying; Ye, Chu-Yu; Bai, Xuefei; Li, Zefeng; Yan, Chenghai; Wang, Weidi; Wang, Ziqiang; Shu, Qingyao; Xie, Jiahua; Lee, Suk-Ha; Fan, Longjiang

    2014-01-01

    Semi-wild soybean is a unique type of soybean that retains both wild and domesticated characteristics, which provides an important intermediate type for understanding the evolution of the subgenus Soja population in the Glycine genus. In this study, a semi-wild soybean line (Maliaodou) and a wild line (Lanxi 1) collected from the lower Yangtze regions were deeply sequenced while nine other semi-wild lines were sequenced to a 3-fold genome coverage. Sequence analysis revealed that (1) no independent phylogenetic branch covering all 10 semi-wild lines was observed in the Soja phylogenetic tree; (2) besides two distinct subpopulations of wild and cultivated soybean in the Soja population structure, all semi-wild lines were mixed with some wild lines into a subpopulation rather than an independent one or an intermediate transition type of soybean domestication; (3) high heterozygous rates (0.19-0.49) were observed in several semi-wild lines; and (4) over 100 putative selective regions were identified by selective sweep analysis, including those related to the development of seed size. Our results suggested a hybridization origin for the semi-wild soybean, which makes a complex Soja population structure.

  7. Identification of novel microRNAs in primates by using the synteny information and small RNA deep sequencing data.

    Science.gov (United States)

    Yuan, Zhidong; Liu, Hongde; Nie, Yumin; Ding, Suping; Yan, Mingli; Tan, Shuhua; Jin, Yuanchang; Sun, Xiao

    2013-10-16

    Current technologies that are used for genome-wide microRNA (miRNA) prediction are mainly based on the BLAST tool. They often produce a large number of false positives. Here, we describe an effective approach for identifying orthologous pre-miRNAs in several primates based on syntenic information. Some of them have been validated by small RNA high-throughput sequencing data. This approach uses the synteny information and experimentally validated miRNAs of human, and incorporates currently available algorithms and tools to identify the pre-miRNAs in five other primates. First, we identified 929 potential pre-miRNAs in the marmoset, in which miRNAs have not yet been reported. Then, we predicted the miRNAs in other primates, and we successfully re-identified most of the published miRNAs and found 721, 979, 650 and 639 new potential pre-miRNAs in chimpanzee, gorilla, orangutan and rhesus macaque, respectively. Furthermore, the miRNA transcriptome in the four primates has been re-analyzed and some novel predicted miRNAs have been supported by the small RNA sequencing data. Finally, we analyzed the potential functions of those validated miRNAs and explored the regulatory elements and transcription factors of some validated miRNA genes of interest. The results show that our approach can effectively identify novel miRNAs and that some miRNAs supported by small RNA sequencing data may play roles in the nervous system.

  8. Deep sequencing of mixed total DNA without barcodes allows efficient assembly of highly plastic ascidian mitochondrial genomes.

    Science.gov (United States)

    Rubinstein, Nimrod D; Feldstein, Tamar; Shenkar, Noa; Botero-Castro, Fidel; Griggio, Francesca; Mastrototaro, Francesco; Delsuc, Frédéric; Douzery, Emmanuel J P; Gissi, Carmela; Huchon, Dorothée

    2013-01-01

    Ascidians or sea squirts form a diverse group within chordates, which includes a few thousand members of marine sessile filter-feeding animals. Their mitochondrial genomes are characterized by particularly high evolutionary rates and rampant gene rearrangements. This extreme variability complicates standard polymerase chain reaction (PCR) based techniques for molecular characterization studies, and consequently only a few complete Ascidian mitochondrial genome sequences are available. Using the standard PCR and Sanger sequencing approach, we produced the mitochondrial genome of Ascidiella aspersa only after a great effort. In contrast, we produced five additional mitogenomes (Botrylloides aff. leachii, Halocynthia spinosa, Polycarpa mytiligera, Pyura gangelion, and Rhodosoma turcicum) with a novel strategy, consisting in sequencing the pooled total DNA samples of these five species using one Illumina HiSeq 2000 flow cell lane. Each mitogenome was efficiently assembled in a single contig using de novo transcriptome assembly, as de novo genome assembly generally performed poorly for this task. Each of the new six mitogenomes presents a different and novel gene order, showing that no syntenic block has been conserved at the ordinal level (in Stolidobranchia and in Phlebobranchia). Phylogenetic analyses support the paraphyly of both Ascidiacea and Phlebobranchia, with Thaliacea nested inside Phlebobranchia, although the deepest nodes of the Phlebobranchia-Thaliacea clade are not well resolved. The strategy described here thus provides a cost-effective approach to obtain complete mitogenomes characterized by a highly plastic gene order and a fast nucleotide/amino acid substitution rate.

  9. DASAF: An R Package for Deep Sequencing-Based Detection of Fetal Autosomal Abnormalities from Maternal Cell-Free DNA

    Directory of Open Access Journals (Sweden)

    Baohong Liu

    2016-01-01

    Full Text Available Background. With the development of massively parallel sequencing (MPS), noninvasive prenatal diagnosis using maternal cell-free DNA is fast becoming the preferred method of fetal chromosomal abnormality detection, due to its inherent high accuracy and low risk. Typically, MPS data is parsed to calculate a risk score, which is used to predict whether a fetal chromosome is normal or not. Although there are several highly sensitive and specific MPS data-parsing algorithms, there are currently no tools that implement these methods. Results. We developed an R package, detection of autosomal abnormalities for fetus (DASAF), that implements the three most popular trisomy detection methods: the standard Z-score (STDZ) method, the GC correction Z-score (GCCZ) method, and the internal reference Z-score (IRZ) method, together with one subchromosome abnormality identification method (SCAZ). Conclusions. With the cost of DNA sequencing declining and with advances in personalized medicine, the demand for noninvasive prenatal testing will undoubtedly increase, which will in turn trigger an increase in the tools available for subsequent analysis. DASAF is a user-friendly tool, implemented in R, that supports identification of whole-chromosome as well as subchromosome abnormalities, based on maternal cell-free DNA sequencing data after genome mapping.
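
    As an illustration of the kind of risk score the record above describes, the sketch below computes a generic standard Z-score: the fraction of mapped reads on a chromosome in a test sample is compared with the mean and standard deviation of that fraction in euploid reference samples. This is not DASAF's code or API (DASAF is an R package, and its exact STDZ definition is not given in the abstract); the function names, the example fractions, and the Z > 3 cut-off are all assumptions made for illustration.

```python
from statistics import mean, stdev


def chromosome_fraction(read_counts: dict, chrom: str) -> float:
    """Fraction of all mapped reads that fall on one chromosome."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no mapped reads")
    return read_counts[chrom] / total


def standard_z_score(test_fraction: float, reference_fractions: list) -> float:
    """Z-score of a test sample against euploid reference samples."""
    mu = mean(reference_fractions)
    sigma = stdev(reference_fractions)  # sample standard deviation
    if sigma == 0:
        raise ValueError("reference fractions have zero variance")
    return (test_fraction - mu) / sigma


# Hypothetical chr21 read fractions; none of these values come from the abstract.
reference = [0.0130, 0.0131, 0.0129, 0.0130, 0.0130]
test_sample = 0.0136

z = standard_z_score(test_sample, reference)
# A cut-off of 3 is a common convention, assumed here rather than taken from DASAF.
print(f"Z = {z:.2f}, flagged = {z > 3}")
```

    A GC-corrected variant would adjust the read counts for GC content before computing the fractions; that step is omitted here.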

  10. A deep sequencing analysis of transcriptomes and the development of EST-SSR markers in mungbean (Vigna radiata)

    Indian Academy of Sciences (India)

    CHANGYOU LIU; BAOJIE FAN; ZHIMIN CAO; QIUZHU SU; YAN WANG; ZHIXIAO ZHANG; JING WU; JING TIAN

    2016-09-01

    Mungbean (Vigna radiata L. Wilczek) is one of the most important leguminous food crops in Asia. We employed Illumina paired-end sequencing to analyse transcriptomes of three different mungbean genotypes. A total of 38.3–39.8 million paired-end reads with a length of 73 bp were generated. The pooled reads from the three libraries were assembled into 56,471 transcripts. Following a cluster analysis, 43,293 unigenes were obtained with an average length of 739 bp and N50 length of 1176 bp. Of the unigenes, 34,903 (80.6%) had significant similarity to known proteins in the NCBI nonredundant protein database (Nr), while 21,450 (58.4%) had BLAST hits in the Swiss-Prot database (E-value < 10⁻⁵). Further, 1245 differentially expressed genes were detected among the three mungbean genotypes. In addition, we identified 3788 expressed sequence tag-simple sequence repeat (EST-SSR) motifs that could be used as potential molecular markers. Among 320 tested loci, 310 (96.5%) yielded amplification products, and 151 (47.0%) exhibited polymorphisms among six mungbean accessions. These transcriptome data and mungbean EST-SSRs could serve as a valuable resource for novel gene discovery and the marker-assisted selective breeding of this specie
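
    The N50 length reported in the record above is a standard assembly statistic: the length L such that sequences of length L or longer contain at least half of all assembled bases. A minimal sketch of that definition follows; the input lengths are toy values, not the mungbean unigenes.

```python
def n50(lengths: list) -> int:
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and their lengths are
    accumulated; N50 is the length at which the running total first reaches
    half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy example: total = 1500, half = 750; 500 + 400 = 900 reaches it.
print(n50([100, 200, 300, 400, 500]))
```

    N50 is weighted toward long sequences, which is why it (1176 bp in the abstract) exceeds the plain average length (739 bp).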

  11. Bioinformatic prediction, deep sequencing of microRNAs and expression analysis during phenotypic plasticity in the pea aphid, Acyrthosiphon pisum

    Directory of Open Access Journals (Sweden)

    Leterme Nathalie

    2010-05-01

    Full Text Available Abstract Background: Post-transcriptional regulation in eukaryotes can operate through microRNA (miRNA)-mediated gene silencing. MiRNAs are small (18-25 nucleotides) non-coding RNAs that play a crucial role in the regulation of gene expression in eukaryotes. In insects, miRNAs have been shown to be involved in multiple mechanisms such as embryonic development, tissue differentiation, metamorphosis or circadian rhythm. Insect miRNAs have been identified in different species belonging to five orders: Coleoptera, Diptera, Hymenoptera, Lepidoptera and Orthoptera. Results: We developed high-throughput Solexa sequencing and bioinformatic analyses of the genome of the pea aphid Acyrthosiphon pisum in order to identify the first miRNAs from a hemipteran insect. By combining these methods we identified 149 miRNAs including 55 conserved and 94 new miRNAs. Moreover, we investigated the regulation of these miRNAs in different alternative morphs of the pea aphid by analysing the expression of miRNAs across the switch of reproduction mode. Pea aphid microRNA sequences have been posted to miRBase (http://microrna.sanger.ac.uk/sequences/). Conclusions: Our study has identified candidates as putative regulators involved in reproductive polyphenism in aphids and opens new avenues for further functional analyses.

  12. Deep Sequencing of the Trypanosoma cruzi GP63 Surface Proteases Reveals Diversity and Diversifying Selection among Chronic and Congenital Chagas Disease Patients

    Science.gov (United States)

    Llewellyn, Martin S.; Messenger, Louisa A.; Luquetti, Alejandro O.; Garcia, Lineth; Torrico, Faustino; Tavares, Suelene B. N.; Cheaib, Bachar; Derome, Nicolas; Delepine, Marc; Baulard, Céline; Deleuze, Jean-Francois; Sauer, Sascha; Miles, Michael A.

    2015-01-01

    Background Chagas disease results from infection with the diploid protozoan parasite Trypanosoma cruzi. T. cruzi is highly genetically diverse, and multiclonal infections in individual hosts are common, but little studied. In this study, we explore T. cruzi infection multiclonality in the context of age, sex and clinical profile among a cohort of chronic patients, as well as paired congenital cases from Cochabamba, Bolivia and Goias, Brazil using amplicon deep sequencing technology. Methodology/Principal Findings A 450bp fragment of the trypomastigote TcGP63I surface protease gene was amplified and sequenced across 70 chronic and 22 congenital cases on the Illumina MiSeq platform. In addition, a second, mitochondrial target, ND5, was sequenced across the same cohort of cases. Several million reads were generated, and sequencing read depths were normalized within patient cohorts (Goias chronic, n = 43; Goias congenital, n = 2; Bolivia chronic, n = 27; Bolivia congenital, n = 20). Among chronic cases, analyses of variance indicated no clear correlation between intra-host sequence diversity and age, sex or symptoms, while principal coordinate analyses showed no clustering by symptoms between patients. Between congenital pairs, we found evidence for the transmission of multiple sequence types from mother to infant, as well as widespread instances of novel genotypes in infants. Finally, non-synonymous to synonymous (dn:ds) nucleotide substitution ratios among sequences of TcGP63Ia and TcGP63Ib subfamilies within each cohort provided powerful evidence of strong diversifying selection at this locus. Conclusions/Significance Our results shed light on the diversity of parasite DTUs within each patient, as well as the extent to which parasite strains pass between mother and foetus in congenital cases. Although we were unable to find any evidence that parasite diversity accumulates with age in our study cohorts, putative diversifying selection within members of the TcGP63I

  13. Deep transcriptome-sequencing and proteome analysis of the hydrothermal vent annelid Alvinella pompejana identifies the CvP-bias as a robust measure of eukaryotic thermostability

    Directory of Open Access Journals (Sweden)

    Holder Thomas

    2013-01-01

    Abstract Background Alvinella pompejana is an annelid worm that inhabits deep-sea hydrothermal vent sites in the Pacific Ocean. Living at a depth of approximately 2500 meters, these worms experience extreme environmental conditions, including high temperature and pressure as well as high levels of sulfide and heavy metals. A. pompejana is one of the most thermotolerant metazoans, making this animal a subject of great interest for studies of eukaryotic thermoadaptation. Results In order to complement existing EST resources we performed deep sequencing of the A. pompejana transcriptome. We identified several thousand novel protein-coding transcripts, nearly doubling the sequence data for this annelid. We then performed an extensive survey of previously established prokaryotic thermoadaptation measures to search for global signals of thermoadaptation in A. pompejana in comparison with mesophilic eukaryotes. In an orthologous set of 457 proteins, we found that the best indicator of thermoadaptation was the difference in frequency of charged versus polar residues (CvP-bias), which was highest in A. pompejana. CvP-bias robustly distinguished prokaryotic thermophiles from prokaryotic mesophiles, as well as the thermophilic fungus Chaetomium thermophilum from mesophilic eukaryotes. Experimental values for thermophilic proteins supported higher CvP-bias as a measure of thermal stability when compared to their mesophilic orthologs. Proteome-wide mean CvP-bias also correlated with the body temperatures of homeothermic birds and mammals. Conclusions Our work extends the transcriptome resources for A. pompejana and identifies the CvP-bias as a robust and widely applicable measure of eukaryotic thermoadaptation. Reviewer This article was reviewed by Sándor Pongor, L. Aravind and Anthony M. Poole.
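
    The CvP-bias described above is a simple per-sequence statistic, so a short sketch may help make it concrete. The abstract only says "difference in frequency of charged versus polar residues"; the residue sets used here (D, E, K, R as charged; N, Q, S, T as polar) and the function name `cvp_bias` are assumptions for illustration, not taken from the record.

```python
# Assumed residue classes; the abstract does not list them explicitly.
CHARGED = frozenset("DEKR")  # Asp, Glu, Lys, Arg
POLAR = frozenset("NQST")    # Asn, Gln, Ser, Thr


def cvp_bias(protein: str) -> float:
    """Percentage of charged residues minus percentage of polar residues."""
    # Keep letters only, so whitespace or gap characters are ignored.
    residues = [aa for aa in protein.upper() if aa.isalpha()]
    if not residues:
        raise ValueError("empty protein sequence")
    charged = sum(aa in CHARGED for aa in residues)
    polar = sum(aa in POLAR for aa in residues)
    return 100.0 * (charged - polar) / len(residues)


def mean_cvp_bias(proteins: list[str]) -> float:
    """Unweighted mean CvP-bias over a set of proteins (e.g. an ortholog set)."""
    return sum(cvp_bias(p) for p in proteins) / len(proteins)
```

    Under these assumptions, a higher value for one organism's ortholog set than for another's would point in the direction the abstract reports for thermophiles; whether the authors weight by protein length is not stated.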

  14. Integrative analysis of deep sequencing data identifies estrogen receptor early response genes and links ATAD3B to poor survival in breast cancer.

    Directory of Open Access Journals (Sweden)

    Kristian Ovaska

    Identification of responsive genes to an extra-cellular cue enables characterization of pathophysiologically crucial biological processes. Deep sequencing technologies provide a powerful means to identify responsive genes, which creates a need for computational methods able to analyze dynamic and multi-level deep sequencing data. To answer this need we introduce here a data-driven algorithm, SPINLONG, which is designed to search for genes that match the user-defined hypotheses or models. SPINLONG is applicable to various experimental setups measuring several molecular markers in parallel. To demonstrate the SPINLONG approach, we analyzed ChIP-seq data reporting PolII, estrogen receptor α (ERα), H3K4me3 and H2A.Z occupancy at five time points in the MCF-7 breast cancer cell line after estradiol stimulus. We obtained 777 ERα early responsive genes and compared the biological functions of the genes having ERα binding within 20 kb of the transcription start site (TSS) to genes without such a binding site. Our results show that the non-genomic action of ERα via the MAPK pathway, instead of direct ERα binding, may be responsible for early cell responses to ERα activation. Our results also indicate that the ERα responsive genes triggered by the genomic pathway are transcribed faster than those without ERα binding sites. The survival analysis of the 777 ERα responsive genes with 150 primary breast cancer tumors and in two independent validation cohorts indicated the ATAD3B gene, which does not have an ERα binding site within 20 kb of its TSS, to be significantly associated with poor patient survival.
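
    The split of genes by "ERα binding within 20 kb of the TSS" can be illustrated with a small interval test. This is a minimal sketch, not the SPINLONG algorithm: it assumes each binding site is reduced to a single coordinate on the same chromosome as the gene, and the function name and inputs are hypothetical.

```python
from bisect import bisect_left


def has_binding_near_tss(tss: int, sorted_peaks: list[int],
                         window: int = 20_000) -> bool:
    """True if any peak lies within `window` bp of the TSS (inclusive).

    `sorted_peaks` must hold peak coordinates for one chromosome in
    ascending order; that precondition is assumed, not checked.
    """
    # First peak at or after the left edge of the window.
    i = bisect_left(sorted_peaks, tss - window)
    # It counts only if it does not pass the right edge.
    return i < len(sorted_peaks) and sorted_peaks[i] <= tss + window


def split_genes_by_binding(gene_tss: dict[str, int],
                           sorted_peaks: list[int],
                           window: int = 20_000):
    """Partition gene names into (with_binding, without_binding)."""
    with_site, without_site = [], []
    for gene, tss in gene_tss.items():
        target = with_site if has_binding_near_tss(tss, sorted_peaks, window) else without_site
        target.append(gene)
    return with_site, without_site
```

    A real analysis would key peaks by chromosome and might measure distance from peak summits or peak edges; the abstract does not say which.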

  16. Identification of Novel MicroRNAs in Primates by Using the Synteny Information and Small RNA Deep Sequencing Data

    Directory of Open Access Journals (Sweden)

    Xiao Sun

    2013-10-01

    Current technologies that are used for genome-wide microRNA (miRNA) prediction are mainly based on the BLAST tool. They often produce a large number of false positives. Here, we describe an effective approach for identifying orthologous pre-miRNAs in several primates based on syntenic information. Some of them have been validated by small RNA high throughput sequencing data. This approach uses the synteny information and experimentally validated miRNAs of human, and incorporates currently available algorithms and tools to identify the pre-miRNAs in five other primates. First, we identified 929 potential pre-miRNAs in the marmoset, in which miRNAs have not yet been reported. Then, we predicted the miRNAs in other primates, and we successfully re-identified most of the published miRNAs and found 721, 979, 650 and 639 new potential pre-miRNAs in chimpanzee, gorilla, orangutan and rhesus macaque, respectively. Furthermore, the miRNA transcriptomes in the four primates have been re-analyzed and some novel predicted miRNAs have been supported by the small RNA sequencing data. Finally, we analyzed the potential functions of those validated miRNAs and explored the regulatory elements and transcription factors of some validated miRNA genes of interest. The results show that our approach can effectively identify novel miRNAs and that some miRNAs supported by small RNA sequencing data may play roles in the nervous system.

  17. Deep Sequencing Reveals Novel Genetic Variants in Children with Acute Liver Failure and Tissue Evidence of Impaired Energy Metabolism.

    Science.gov (United States)

    Valencia, C Alexander; Wang, Xinjian; Wang, Jin; Peters, Anna; Simmons, Julia R; Moran, Molly C; Mathur, Abhinav; Husami, Ammar; Qian, Yaping; Sheridan, Rachel; Bove, Kevin E; Witte, David; Huang, Taosheng; Miethke, Alexander G

    2016-01-01

    The etiology of acute liver failure (ALF) remains elusive in almost half of affected children. We hypothesized that inherited mitochondrial and fatty acid oxidation disorders were occult etiological factors in patients with idiopathic ALF and impaired energy metabolism. Twelve patients with elevated blood molar lactate/pyruvate ratio and indeterminate etiology were selected from a retrospective cohort of 74 subjects with ALF because their fixed and frozen liver samples were available for histological, ultrastructural, molecular and biochemical analysis. A customized next-generation sequencing panel for 26 genes associated with mitochondrial and fatty acid oxidation defects revealed mutations and sequence variants in five subjects. Variants involved the genes ACAD9, POLG, POLG2, DGUOK, and RRM2B; the latter not previously reported in subjects with ALF. The explanted livers of the patients with heterozygous, truncating insertion mutations in RRM2B showed patchy micro- and macrovesicular steatosis, decreased mitochondrial DNA (mtDNA) content <30% of controls, and reduced respiratory chain complex activity; both patients had good post-transplant outcome. One infant with severe lactic acidosis was found to carry two heterozygous variants in ACAD9, which was associated with isolated complex I deficiency and diffuse hypergranular hepatocytes. The two subjects with heterozygous variants of unknown clinical significance in POLG and DGUOK developed ALF following drug exposure. Their hepatocytes displayed abnormal mitochondria by electron microscopy. Targeted next generation sequencing and correlation with histological, ultrastructural and functional studies on liver tissue in children with elevated lactate/pyruvate ratio expand the spectrum of genes associated with pediatric ALF.

  18. Role of IL-17 Pathways in Immune Privilege: A RNA Deep Sequencing Analysis of the Mice Testis Exposure to Fluoride

    Science.gov (United States)

    Huo, Meijun; Han, Haijun; Sun, Zilong; Lu, Zhaojing; Yao, Xinglei; Wang, Shaolin; Wang, Jundong

    2016-01-01

    We sequenced RNA transcripts from the testicles of healthy male mice, divided into a control group with distilled water and two experimental groups with 50 and 100 mg/L NaF in drinking water for 56 days. Bowtie/Tophat were used to align 50-bp paired-end reads into transcripts, Cufflinks to measure the relative abundance of each transcript and IPA to analyze RNA-Sequencing data. In the 100 mg/L NaF-treated group, four pathways related to IL-17, TGF-β and other cellular growth factor pathways were overexpressed. The mRNA expression of IL-17RA, IL-17RC, MAP2K1, MAP2K2, MAP2K3 and MAPKAPK2, monitored by qRT-PCR, increased remarkably in the 100 mg/L NaF group and coincided with the result of RNA-Sequencing. Fluoride exposure could disrupt spermatogenesis and testicles in male mice by influencing many signaling pathways and genes, which act on immune signal transduction and cellular metabolism. The high expression of the IL-17 signal pathway was a response to the invasion of the testicular immune system due to extracellular fluoride. The PI3-kinase/AKT, MAPKs and the cytokines in the TGF-β family contributed to controlling IL-17 pathway activation and maintaining immune privilege and spermatogenesis. All the findings provide new ideas for further molecular research on the effects of fluorosis on reproduction and immune response mechanisms. PMID:27572304

  19. Detection of Inter-lineage Natural Recombination in Avian Paramyxovirus Serotype 1 using Simplified Deep Sequencing Platform

    Directory of Open Access Journals (Sweden)

    Dilan Amila Satharasinghe

    2016-11-01

    Newcastle disease virus (NDV) is a prototype member of avian paramyxovirus serotype 1 (APMV-1), which causes severe and contagious disease in commercial poultry and wild birds. Despite extensive vaccination programs and other control measures, the disease remains endemic around the globe, especially in Asia, Africa, and the Middle East. Because NDV is a single serotype, genotype II-based vaccines have remained the most acceptable means of immunization. However, evidence is emerging of vaccine failures, mainly due to the evolving nature of the virus and wider genetic gaps between vaccine and field strains of APMV-1. Most of the epidemiological and genetic characterizations of APMVs are based on conventional methods, which are prone to mask the diverse population of viruses in complex samples. In this study, we report the application of a simple, robust, and less resource-demanding methodology for the whole genome sequencing of NDV, using next-generation sequencing on the Illumina MiSeq platform. Using this platform, we sequenced full genomes of five virulent Malaysian NDV strains collected during 2004-2013. All isolates clustered within the highly prevalent lineage 5 (specifically in lineage 5a); however, a significantly greater genetic divergence was observed in isolates collected from 2004 to 2011. Interestingly, genetic characterization of one isolate collected in 2013 (IBS025/13) showed natural recombination between lineage 2 and lineage 5. In the recombination event, the isolate (IBS025/13) carried the nucleocapsid protein (nucleotides (nts) 55-1801) and near-complete phosphoprotein (nts 1804-3254) genes of lineage 2, whereas the surface glycoprotein (fusion, hemagglutinin-neuraminidase) and large polymerase genes were of lineage 5. Additionally, the recombinant virus has a genome size of 15,186 nts, which is characteristic of the old genotypes I to IV isolated from 1930 to 1960. Taken together, we report the occurrence of a natural recombination in circulating strains

  20. Deep sequencing analysis of tick-borne encephalitis virus from questing ticks at natural foci reveals similarities between quasispecies pools of the virus.

    Science.gov (United States)

    Asghar, Naveed; Pettersson, John H-O; Dinnetz, Patrik; Andreassen, Åshild; Johansson, Magnus

    2017-01-10

    Every year, tick-borne encephalitis virus (TBEV) causes severe central nervous system infection in 10,000 to 15,000 people in Europe and Asia. TBEV is maintained in the environment by an enzootic cycle that requires a tick vector and a vertebrate host, and the adaptation of TBEV to vertebrate and invertebrate environments is essential for TBEV persistence in nature. This adaptation is facilitated by the error-prone nature of the virus' RNA-dependent RNA polymerase, which generates genetically distinct virus variants called quasispecies. TBEV shows a focal geographical distribution pattern where each focus represents a TBEV hotspot. Here we sequenced and characterized two TBEV genomes, JP-296 and JP-554, from questing Ixodes ricinus ticks at a TBEV focus in central Sweden. Phylogenetic analysis showed geographical clustering among the newly sequenced strains and three previously sequenced Scandinavian strains, Toro-2003, Saringe-2009, and Mandal-2009, which originated from the same ancestor. Among these five Scandinavian TBEV strains, only Mandal-2009 showed a large deletion within the 3´ non-coding region (NCR) similar to the highly virulent TBEV strain Hypr. Deep sequencing of JP-296, JP-554, and Mandal-2009 revealed significantly high quasispecies diversity for JP-296 and JP-554, with intact 3´NCRs, compared to the low diversity in Mandal-2009, with a truncated 3´NCR. SNP analysis showed that 40% of the SNPs were common between quasispecies populations of JP-296 and JP-554, indicating a putative mechanism for how TBEV persists and is maintained within its natural foci.
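
    A figure such as "40% of the SNPs were common" between two quasispecies populations is a set-overlap statistic. The abstract does not state the denominator, so the sketch below assumes shared SNPs divided by the union of SNPs seen in either population; the tuple layout (position, reference base, variant base) and the example values are made up for illustration.

```python
Snp = tuple[int, str, str]  # (genome position, reference base, variant base)


def shared_snp_fraction(snps_a: set[Snp], snps_b: set[Snp]) -> float:
    """Fraction of all observed SNPs that occur in both populations.

    Assumption: the denominator is the union of both SNP sets.
    """
    union = snps_a | snps_b
    if not union:
        return 0.0
    return len(snps_a & snps_b) / len(union)


# Hypothetical SNP calls for two samples (not data from the study).
sample_1 = {(100, "A", "G"), (250, "C", "T"), (400, "G", "A")}
sample_2 = {(100, "A", "G"), (250, "C", "T"), (900, "T", "C"), (950, "G", "T")}
print(f"{shared_snp_fraction(sample_1, sample_2):.0%} of SNPs shared")
```

    Dividing by the smaller set, or by one sample's SNP count, would give a different percentage, so the definition should be checked against the full paper before comparing numbers.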

  1. Analytical and Clinical Validation of a Digital Sequencing Panel for Quantitative, Highly Accurate Evaluation of Cell-Free Circulating Tumor DNA.

    Directory of Open Access Journals (Sweden)

    Richard B Lanman

    Next-generation sequencing of cell-free circulating solid tumor DNA addresses two challenges in contemporary cancer care. First this method of massively parallel and deep sequencing enables assessment of a comprehensive panel of genomic targets from a single sample, and second, it obviates the need for repeat invasive tissue biopsies. Digital Sequencing™ is a novel method for high-quality sequencing of circulating tumor DNA simultaneously across a comprehensive panel of over 50 cancer-related genes with a simple blood test. Here we report the analytic and clinical validation of the gene panel. Analytic sensitivity down to 0.1% mutant allele fraction is demonstrated via serial dilution studies of known samples. Near-perfect analytic specificity (> 99.9999%) enables complete coverage of many genes without the false positives typically seen with traditional sequencing assays at mutant allele frequencies or fractions below 5%. We compared digital sequencing of plasma-derived cell-free DNA to tissue-based sequencing on 165 consecutive matched samples from five outside centers in patients with stage III-IV solid tumor cancers. Clinical sensitivity of plasma-derived NGS was 85.0%, comparable to 80.7% sensitivity for tissue. The assay success rate on 1,000 consecutive samples in clinical practice was 99.8%. Digital sequencing of plasma-derived DNA is indicated in advanced cancer patients to prevent repeated invasive biopsies when the initial biopsy is inadequate, unobtainable for genomic testing, or uninformative, or when the patient's cancer has progressed despite treatment. Its clinical utility is derived from reduction in the costs, complications and delays associated with invasive tissue biopsies for genomic testing.

  2. Analytical and Clinical Validation of a Digital Sequencing Panel for Quantitative, Highly Accurate Evaluation of Cell-Free Circulating Tumor DNA.

    Science.gov (United States)

    Lanman, Richard B; Mortimer, Stefanie A; Zill, Oliver A; Sebisanovic, Dragan; Lopez, Rene; Blau, Sibel; Collisson, Eric A; Divers, Stephen G; Hoon, Dave S B; Kopetz, E Scott; Lee, Jeeyun; Nikolinakos, Petros G; Baca, Arthur M; Kermani, Bahram G; Eltoukhy, Helmy; Talasaz, AmirAli

    2015-01-01

    Next-generation sequencing of cell-free circulating solid tumor DNA addresses two challenges in contemporary cancer care. First this method of massively parallel and deep sequencing enables assessment of a comprehensive panel of genomic targets from a single sample, and second, it obviates the need for repeat invasive tissue biopsies. Digital Sequencing™ is a novel method for high-quality sequencing of circulating tumor DNA simultaneously across a comprehensive panel of over 50 cancer-related genes with a simple blood test. Here we report the analytic and clinical validation of the gene panel. Analytic sensitivity down to 0.1% mutant allele fraction is demonstrated via serial dilution studies of known samples. Near-perfect analytic specificity (> 99.9999%) enables complete coverage of many genes without the false positives typically seen with traditional sequencing assays at mutant allele frequencies or fractions below 5%. We compared digital sequencing of plasma-derived cell-free DNA to tissue-based sequencing on 165 consecutive matched samples from five outside centers in patients with stage III-IV solid tumor cancers. Clinical sensitivity of plasma-derived NGS was 85.0%, comparable to 80.7% sensitivity for tissue. The assay success rate on 1,000 consecutive samples in clinical practice was 99.8%. Digital sequencing of plasma-derived DNA is indicated in advanced cancer patients to prevent repeated invasive biopsies when the initial biopsy is inadequate, unobtainable for genomic testing, or uninformative, or when the patient's cancer has progressed despite treatment. Its clinical utility is derived from reduction in the costs, complications and delays associated with invasive tissue biopsies for genomic testing.
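
    The validation metrics quoted above (mutant allele fraction, sensitivity, specificity) follow standard definitions, sketched here for orientation. The function names and the counts in the usage lines are illustrative assumptions; the abstract reports only the resulting percentages, not the underlying counts.

```python
def mutant_allele_fraction(mutant_reads: int, total_reads: int) -> float:
    """Share of reads at a locus that carry the mutant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mutant_reads / total_reads


def sensitivity(true_pos: int, false_neg: int) -> float:
    """TP / (TP + FN): fraction of real variants that the assay detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """TN / (TN + FP): fraction of variant-free positions called negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts: 5 mutant reads out of 5,000 is a 0.1% allele fraction.
print(f"{mutant_allele_fraction(5, 5000):.1%}")
# Hypothetical counts: 1 false positive per 1,000,000 negative positions.
print(f"{specificity(999_999, 1):.4%}")
```

    The usage lines show why deep coverage matters at this detection limit: at a 0.1% allele fraction, a locus covered by 5,000 reads is expected to yield only about five mutant reads.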

  3. NGS-QC Generator: A Quality Control System for ChIP-Seq and Related Deep Sequencing-Generated Datasets.

    Science.gov (United States)

    Mendoza-Parra, Marco Antonio; Saleem, Mohamed-Ashick M; Blum, Matthias; Cholley, Pierre-Etienne; Gronemeyer, Hinrich

    2016-01-01

    The combination of massively parallel sequencing with a variety of modern DNA/RNA enrichment technologies provides means for interrogating functional protein-genome interactions (ChIP-seq), genome-wide transcriptional activity (RNA-seq; GRO-seq), chromatin accessibility (DNase-seq, FAIRE-seq, MNase-seq), and more recently the three-dimensional organization of chromatin (Hi-C, ChIA-PET). In systems biology-based approaches several of these readouts are generally cumulated with the aim of describing living systems through a reconstitution of the genome-regulatory functions. However, an issue that is often underestimated is that conclusions drawn from such multidimensional analyses of NGS-derived datasets critically depend on the quality of the compared datasets. To address this problem, we have developed the NGS-QC Generator, a quality control system that infers quality descriptors for any kind of ChIP-sequencing and related datasets. In this chapter we provide a detailed protocol for (1) assessing quality descriptors with the NGS-QC Generator; (2) interpreting the generated reports; and (3) exploring the database of QC indicators (www.ngs-qc.org) for >21,000 publicly available datasets.

  4. Deep sequencing of Lotus corniculatus L. reveals key enzymes and potential transcription factors related to the flavonoid biosynthesis pathway.

    Science.gov (United States)

    Wang, Ying; Hua, Wenping; Wang, Jian; Hannoufa, Abdelali; Xu, Ziqin; Wang, Zhezhi

    2013-04-01

    Lotus corniculatus L. is used worldwide as a forage crop due to its abundance of secondary metabolites and its ability to grow in severe environments. Although the entire genome of L. corniculatus var. japonicus R. is being sequenced, the differences in morphology and production of secondary metabolites between these two related species have led us to investigate this variability at the genetic level, in particular the differences in flavonoid biosynthesis. Our goal is to use the resulting information to develop more valuable forage crops and medicinal materials. Here, we conducted Illumina/Solexa sequencing to profile the transcriptome of L. corniculatus. We produced 26,492,952 short reads that corresponded to 2.38 gigabytes of total nucleotides. These reads were then assembled into 45,698 unigenes, of which a large number associated with secondary metabolism were annotated. In addition, we identified 2,998 unigenes based on homology with L. japonicus transcription factors (TFs) and grouped them into 55 families. Meanwhile, a comparison of four tag-based digital gene expression libraries, built from the flowers, pods, leaves, and roots, revealed distinct patterns of spatial expression of candidate unigenes in flavonoid biosynthesis. Based on these results, we identified many key enzymes from L. corniculatus which were different from reference genes of L. japonicus, and five TFs that are potential enhancers in flavonoid biosynthesis. Our results provide initial genetics resources that will be valuable in efforts to manipulate the flavonoid metabolic pathway in plants.

  5. DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences.

    Science.gov (United States)

    Quang, Daniel; Xie, Xiaohui

    2016-06-20

    Modeling the properties and functions of DNA sequences is an important, but challenging task in the broad field of genomics. This task is particularly difficult for non-coding DNA, the vast majority of which is still poorly understood in terms of function. A powerful predictive model for the function of non-coding DNA can have enormous benefit for both basic science and translational research because over 98% of the human genome is non-coding and 93% of disease-associated variants lie in these regions. To address this need, we propose DanQ, a novel hybrid convolutional and bi-directional long short-term memory recurrent neural network framework for predicting non-coding function de novo from sequence. In the DanQ model, the convolution layer captures regulatory motifs, while the recurrent layer captures long-term dependencies between the motifs in order to learn a regulatory 'grammar' to improve predictions. DanQ improves considerably upon other models across several metrics. For some regulatory markers, DanQ can achieve over a 50% relative improvement in the area under the precision-recall curve metric compared to related models. We have made the source code available at the github repository http://github.com/uci-cbcl/DanQ.

  6. HIV-1 transmission patterns in antiretroviral therapy-naive, HIV-infected North Americans based on phylogenetic analysis by population level and ultra-deep DNA sequencing.

    Directory of Open Access Journals (Sweden)

    Lisa L Ross

    Factors that contribute to the transmission of human immunodeficiency virus type 1 (HIV-1), especially drug-resistant HIV-1 variants, remain a significant public health concern. In-depth phylogenetic analyses of viral sequences obtained in the screening phase from antiretroviral-naïve HIV-infected patients seeking enrollment in EPZ108859, a large open-label study in the USA, Canada and Puerto Rico (ClinicalTrials.gov NCT00440947), were examined for insights into the roles of drug resistance and epidemiological factors that could impact disease dissemination. Viral transmission clusters (VTCs) were initially predicted from a phylogenetic analysis of population level HIV-1 pol sequences obtained from 690 antiretroviral-naïve subjects in 2007. Subsequently, the predicted VTCs were tested for robustness by ultra deep sequencing (UDS) using pyrosequencing technology and further phylogenetic analyses. The demographic characteristics of clustered and non-clustered subjects were then compared. From 690 subjects, 69 were assigned to 1 of 30 VTCs, each containing 2 to 5 subjects. Subjects in VTCs were significantly more likely to be white (72% vs. 60%; p = 0.04). VTCs had fewer reverse transcriptase and major PI resistance mutations (9% vs. 24%; p = 0.002) than non-clustered sequences. Both men who have sex with men (MSM) (68% vs. 48%; p = 0.001) and Canadians (29% vs. 14%; p = 0.03) were significantly more frequent in VTCs than in non-clustered sequences. Of the 515 subjects who initiated antiretroviral therapy, 33 experienced confirmed virologic failure through 144 weeks, while only 3/33 were from VTCs. Fewer VTC subjects (as compared to those with non-clustering virus) had HIV-1 with resistance-associated mutations or experienced virologic failure during the course of the study. Our analysis shows specific geographical and drug resistance trends that correlate well with transmission clusters defined by HIV sequences of similarity

  7. Quantitation of Bt-176 maize genomic sequences by surface plasmon resonance-based biospecific interaction analysis of multiplex polymerase chain reaction (PCR).

    Science.gov (United States)

    Feriotto, Giordana; Gardenghi, Sara; Bianchi, Nicoletta; Gambari, Roberto

    2003-07-30

    Surface plasmon resonance (SPR) based biosensors have been described for the identification of genetically modified organisms (GMO) by biospecific interaction analysis (BIA). This paper describes the design and testing of an SPR-based BIA protocol for quantitative determinations of GMOs. Biotinylated multiplex Polymerase Chain Reaction (PCR) products from nontransgenic maize as well as maize powders containing 0.5 and 2% genetically modified Bt-176 sequences were immobilized on different flow cells of a sensor chip. After immobilization, different oligonucleotide probes recognizing maize zein and Bt-176 sequences were injected. The results obtained were compared with Southern blot analysis and with quantitative real-time PCR assays. It was demonstrated that sequential injections of Bt-176 and zein probes to sensor chip flow cells containing multiplex PCR products allow discrimination between PCR performed using maize genomic DNA containing 0.5% Bt-176 sequences and that performed using maize genomic DNA containing 2% Bt-176 sequences. The efficiency of SPR-based BIA in discriminating material containing different amounts of Bt-176 maize is comparable to real-time quantitative PCR and much more reliable than Southern blotting, which in the past has been used for semiquantitative purposes. Furthermore, the approach allows the BIA assay to be repeated several times on the same multiplex PCR product immobilized on the sensor chip, after washing and regeneration of the flow cell. Finally, it is emphasized that the presented strategy to quantify GMOs could be proposed for all of the SPR-based, commercially available biosensors. Some of these optical SPR-based biosensors use, instead of flow-based sensor chips, stirred microcuvettes, reducing the costs of the experimentation.

  8. Global Analysis of Non-coding Small RNAs in Arabidopsis in Response to Jasmonate Treatment by Deep Sequencing Technology

    Institute of Scientific and Technical Information of China (English)

    Bosen Zhang; Zhiping Jin; Daoxin Xie

    2012-01-01

    In plants, non-coding small RNAs play a vital role in plant development and stress responses. To explore the possible role of non-coding small RNAs in the regulation of the jasmonate (JA) pathway, we compared the non-coding small RNAs between the JA-deficient aos mutant and the JA-treated wild type Arabidopsis via high-throughput sequencing. Thirty new miRNAs and 27 new miRNA candidates were identified through a bioinformatics approach. Forty-nine known miRNAs (belonging to 24 families), 15 new miRNAs and new miRNA candidates (belonging to 11 families) and 3 tasiRNA families were induced by JA, whereas 1 new miRNA, 1 tasiRNA family and 22 known miRNAs (belonging to 9 families) were repressed by JA.

  9. Bacterial communities associated with host-adapted populations of pea aphids revealed by deep sequencing of 16S ribosomal DNA.

    Directory of Open Access Journals (Sweden)

    Jean-Pierre Gauthier

    Associations between microbes and animals are ubiquitous and hosts may benefit from harbouring microbial communities through improved resource exploitation or resistance to environmental stress. The pea aphid, Acyrthosiphon pisum, is the host of heritable bacterial symbionts, including the obligate endosymbiont Buchnera aphidicola and several facultative symbionts. While obligate symbionts supply aphids with key nutrients, facultative symbionts influence their hosts in many ways such as protection against natural enemies, heat tolerance, color change and reproduction alteration. The pea aphid also encompasses multiple plant-specialized biotypes, each adapted to one or a few legume species. Facultative symbiont communities differ strongly between biotypes, although bacterial involvement in plant specialization is uncertain. Here, we analyse the diversity of bacterial communities associated with nine biotypes of the pea aphid complex using amplicon pyrosequencing of 16S rRNA genes. Combined clustering and phylogenetic analyses of 16S sequences allowed the identification of 21 bacterial OTUs (Operational Taxonomic Units). More than 98% of the sequencing reads were assigned to known pea aphid symbionts. The presence of Wolbachia was confirmed in A. pisum while Erwinia and Pantoea, two gut associates, were detected in multiple samples. The diversity of bacterial communities harboured by pea aphid biotypes was very low, ranging from 3 to 11 OTUs across samples. Bacterial communities differed more between than within biotypes but this difference did not correlate with the genetic divergence between biotypes. Altogether, these results confirm that the aphid microbiota is dominated by a few heritable symbionts and that plant specialization is an important structuring factor of bacterial communities associated with the pea aphid complex. However, since we examined the microbiota of aphid samples kept a few generations in controlled conditions, it may be that

  10. Revisiting bovine pyometra--new insights into the disease using a culture-independent deep sequencing approach.

    Science.gov (United States)

    Knudsen, Lif Rødtness Vesterby; Karstrup, Cecilia Christensen; Pedersen, Hanne Gervi; Agerholm, Jørgen Steen; Jensen, Tim Kåre; Klitgaard, Kirstine

    2015-02-25

    The bacteria present in the uterus during pyometra have previously been studied using bacteriological culturing. These studies identified Fusobacterium necrophorum and Trueperella pyogenes as the major contributors to the pathogenesis of pyometra. However, an increasing number of culture-independent studies have demonstrated that the bacterial diversity in most environments is underestimated in culture-based studies. Consequently, fastidious pyometra-associated pathogens may have been overlooked. Therefore, the primary purpose of this study was to investigate the diversity of bacteria in the uterus of cows with pyometra by using culture-independent 16S rRNA PCR combined with next generation sequencing. We investigated the microbial composition in the uterus of 21 cows with pyometra, which were obtained from a Danish slaughterhouse. Similar to the observations from the culture studies, Fusobacteriaceae, the family that F. necrophorum belongs to, was the operational taxonomic unit (OTU) observed in the largest quantities. By contrast, the Actinomycetaceae family, which includes T. pyogenes, constituted only 1% of the total number of reads. Thus we cannot confirm the previously reported role of species from this family in the pathogenesis of pyometra. Finally, we identified a large number of sequences representing three families of Gram-negative bacteria in the pyometra samples: Porphyromonadaceae, Mycoplasmataceae, and Pasteurellaceae. It is likely that these families comprise potential pathogenic species of a fastidious nature, which have been overlooked in previous studies. Our results increase the knowledge of the complexity of the pyometra microbiota and suggest that pathogens in addition to F. necrophorum may be involved in the pathogenesis of pyometra.

  11. An ultra-deep sequencing strategy to detect sub-clonal TP53 mutations in presentation chronic lymphocytic leukaemia cases using multiple polymerases.

    Science.gov (United States)

    Worrillow, L; Baskaran, P; Care, M A; Varghese, A; Munir, T; Evans, P A; O'Connor, S J; Rawstron, A; Hazelwood, L; Tooze, R M; Hillmen, P; Newton, D J

    2016-10-06

    Chronic lymphocytic leukaemia (CLL) is the most common clonal B-cell disorder characterized by clonal diversity, a relapsing and remitting course, and in its aggressive forms remains largely incurable. Current front-line regimes include agents such as fludarabine, which act primarily via the DNA damage response pathway. Key to this is the transcription factor p53. Mutations in the TP53 gene, altering p53 functionality, are associated with genetic instability, and are present in aggressive CLL. Furthermore, the emergence of clonal TP53 mutations in relapsed CLL, refractory to DNA-damaging therapy, suggests that accurate detection of sub-clonal TP53 mutations prior to and during treatment may be indicative of early relapse. In this study, we describe a novel deep sequencing workflow using multiple polymerases to generate sequencing libraries (MuPol-Seq), facilitating accurate detection of TP53 mutations at a frequency as low as 0.3%, in presentation CLL cases tested. As these mutations were mostly clustered within the regions of TP53 encoding DNA-binding domains, essential for DNA contact and structural architecture, they are likely to be of prognostic relevance in disease progression. The workflow described here has the potential to be implemented routinely to identify rare mutations across a range of diseases.

  12. Characterization of Intra-Type Variants of Oncogenic Human Papillomaviruses by Next-Generation Deep Sequencing of the E6/E7 Region

    Directory of Open Access Journals (Sweden)

    Enrico Lavezzo

    2016-03-01

    Different human papillomavirus (HPV) types are characterized by differences in tissue tropism and ability to promote cell proliferation and transformation. In addition, clinical and experimental studies have shown that some genetic variants/lineages of high-risk HPV (HR-HPV) types are characterized by increased oncogenic activity and probability to induce cancer. In this study, we designed and validated a new method based on multiplex PCR-deep sequencing of the E6/E7 region of HR-HPV types to characterize HPV intra-type variants in clinical specimens. Validation experiments demonstrated that this method allowed reliable identification of the different lineages of oncogenic HPV types. Advantages of this method over other published methods were represented by its ability to detect variants of all HR-HPV types in a single reaction, to detect variants of HR-HPV types in clinical specimens with multiple infections, and, being based on sequencing of the full E6/E7 region, to detect amino acid changes in these oncogenes potentially associated with increased transforming activity.

  13. Characterization of Intra-Type Variants of Oncogenic Human Papillomaviruses by Next-Generation Deep Sequencing of the E6/E7 Region.

    Science.gov (United States)

    Lavezzo, Enrico; Masi, Giulia; Toppo, Stefano; Franchin, Elisa; Gazzola, Valentina; Sinigaglia, Alessandro; Masiero, Serena; Trevisan, Marta; Pagni, Silvana; Palù, Giorgio; Barzon, Luisa

    2016-03-14

    Different human papillomavirus (HPV) types are characterized by differences in tissue tropism and ability to promote cell proliferation and transformation. In addition, clinical and experimental studies have shown that some genetic variants/lineages of high-risk HPV (HR-HPV) types are characterized by increased oncogenic activity and probability to induce cancer. In this study, we designed and validated a new method based on multiplex PCR-deep sequencing of the E6/E7 region of HR-HPV types to characterize HPV intra-type variants in clinical specimens. Validation experiments demonstrated that this method allowed reliable identification of the different lineages of oncogenic HPV types. Advantages of this method over other published methods were represented by its ability to detect variants of all HR-HPV types in a single reaction, to detect variants of HR-HPV types in clinical specimens with multiple infections, and, being based on sequencing of the full E6/E7 region, to detect amino acid changes in these oncogenes potentially associated with increased transforming activity.

  14. Analysis of ultra-deep pyrosequencing and cloning based sequencing of the basic core promoter/precore/core region of hepatitis B virus using newly developed bioinformatics tools.

    Directory of Open Access Journals (Sweden)

    Mukhlid Yousif

    AIMS: The aims of this study were to develop bioinformatics tools to explore ultra-deep pyrosequencing (UDPS) data, to test these tools, to use them to determine the optimum error threshold, and to compare results from UDPS and cloning based sequencing (CBS). METHODS: Four serum samples, infected with either genotype D or E, from HBeAg-positive and HBeAg-negative patients were randomly selected. UDPS and CBS were used to sequence the basic core promoter/precore region of HBV. Two online bioinformatics tools, the "Deep Threshold Tool" and the "Rosetta Tool" (http://hvdr.bioinf.wits.ac.za/tools/), were built to test and analyze the generated data. RESULTS: A total of 10952 reads were generated by UDPS on the 454 GS Junior platform. In the four samples, substitutions, detected at 0.5% threshold or above, were identified at 39 unique positions, 25 of which were non-synonymous mutations. Sample #2 (HBeAg-negative, genotype D) had substitutions in 26 positions, followed by sample #1 (HBeAg-negative, genotype E) in 12 positions, sample #3 (HBeAg-positive, genotype D) in 7 positions and sample #4 (HBeAg-positive, genotype E) in only four positions. The ratio of nucleotide substitutions between isolates from HBeAg-negative and HBeAg-positive patients was 3.5:1. Compared to genotype E isolates, genotype D isolates showed greater variation in the X, basic core promoter/precore and core regions. Only 18 of the 39 positions identified by UDPS were detected by CBS, which detected 14 of the 25 non-synonymous mutations detected by UDPS. CONCLUSION: UDPS data should be approached with caution. Appropriate curation of read data is required prior to analysis, in order to clean the data and eliminate artefacts. CBS detected fewer than 50% of the substitutions detected by UDPS. Furthermore, it is important that the appropriate consensus (reference) sequence is used in order to identify variants correctly.
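
    The idea of calling substitutions "at 0.5% threshold or above" can be illustrated with a minimal sketch. This is not the "Deep Threshold Tool" itself, whose implementation is not described in the abstract; the function name, the input format (equal-length, pre-aligned reads as strings) and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def substitutions_above_threshold(reads, reference, threshold=0.005):
    """Return {position: {base: frequency}} for non-reference bases whose
    frequency among the reads covering that position is >= threshold.

    Hypothetical helper: assumes reads are already aligned to `reference`
    and contain no gaps. The default threshold of 0.005 mirrors the 0.5%
    cut-off quoted in the abstract.
    """
    calls = {}
    for pos, ref_base in enumerate(reference):
        # Collect the base each read shows at this position.
        column = [read[pos] for read in reads if pos < len(read)]
        if not column:
            continue
        depth = len(column)
        counts = Counter(column)
        hits = {
            base: n / depth
            for base, n in counts.items()
            if base != ref_base and n / depth >= threshold
        }
        if hits:
            calls[pos] = hits
    return calls


# Toy data (invented, not from the study): 1000 reads over a 4-base reference.
reference = "ACGT"
reads = ["ACGT"] * 990 + ["ACTT"] * 6 + ["GCGT"] * 4
calls = substitutions_above_threshold(reads, reference)
print(calls)  # only position 2 passes: 6/1000 >= 0.005, while 4/1000 does not
```

    In this toy input the G-to-T change at position 2 (0.6% of reads) is reported, while the A-to-G change at position 0 (0.4%) falls below the cut-off. For the figures quoted in the abstract, 18 of 39 positions is about 46%, which is consistent with the statement that CBS detected fewer than 50% of the substitutions found by UDPS.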

  15. Development of a genus-specific next generation sequencing approach for sensitive and quantitative determination of the Legionella microbiome in freshwater systems.

    Science.gov (United States)

    Pereira, Rui P A; Peplies, Jörg; Brettar, Ingrid; Höfle, Manfred G

    2017-03-31

    Next Generation Sequencing (NGS) has revolutionized the analysis of natural and man-made microbial communities by using universal primers for bacteria in a PCR based approach targeting the 16S rRNA gene. In our study we narrowed primer specificity to a single, monophyletic genus because for many questions in microbiology only a specific part of the whole microbiome is of interest. We have chosen the genus Legionella, comprising more than 20 pathogenic species, due to its high relevance for water-based respiratory infections. A new NGS-based approach was designed by sequencing 16S rRNA gene amplicons specific for the genus Legionella using the Illumina MiSeq technology. This approach was validated and applied to a set of representative freshwater samples. Our results revealed that the generated libraries presented a low average raw error rate per base (95%) and very good repeatability. Only in samples in which the gammabacterial clade SAR86 was present more than 1% non-Legionella sequences were observed. Next-generation sequencing read counts did not reveal considerable amplification/sequencing biases and showed a sensitive as well as precise quantification of L. pneumophila along a dilution range using a spiked-in, certified genome standard. The genome standard and a mock community consisting of six different Legionella species demonstrated that the developed NGS approach was quantitative and specific at the level of individual species, including L. pneumophila. The sensitivity of our genus-specific approach was at least one order of magnitude higher compared to the universal NGS approach. Comparison of quantification by real-time PCR showed consistency with the NGS data. Overall, our NGS approach can determine the quantitative abundances of Legionella species, i.e. the complete Legionella microbiome, without the need for species-specific primers. The developed NGS approach provides a new molecular surveillance tool to monitor all Legionella species in qualitative

  16. Quantitative assessment of hepatic function: modified look-locker inversion recovery (MOLLI) sequence for T1 mapping on Gd-EOB-DTPA-enhanced liver MR imaging

    Energy Technology Data Exchange (ETDEWEB)

    Yoon, Jeong Hee [Seoul National University Hospital, Department of Radiology, Seoul (Korea, Republic of); Lee, Jeong Min; Han, Joon Koo; Choi, Byung Ihn [Seoul National University Hospital, Department of Radiology, Seoul (Korea, Republic of); Seoul National University College of Medicine, Institute of Radiation Medicine, Jongno-gu, Seoul (Korea, Republic of); Paek, Munyoung [Siemens Healthcare, Seoul (Korea, Republic of)

    2016-06-15

    To determine whether multislice T1 mapping of the liver using a modified look-locker inversion recovery (MOLLI) sequence on gadoxetic acid-enhanced magnetic resonance imaging (MRI) can be used as a quantitative tool to estimate liver function and predict the presence of oesophageal or gastric varices. Phantoms filled with gadoxetic acid were scanned three times using the MOLLI sequence to test repeatability. Patients with chronic liver disease or liver cirrhosis who underwent gadoxetic acid-enhanced liver MRI including the MOLLI sequence at 3 T were included (n = 343). Pre- and postcontrast T1 relaxation times of the liver (T1liver), changes between pre- and postcontrast T1liver (ΔT1liver), and adjusted postcontrast T1liver ((postcontrast T1liver - T1spleen)/T1spleen) were compared among Child-Pugh classes. In 62 patients who underwent endoscopy, all T1 parameters and spleen sizes were correlated with varices. The phantom study showed excellent repeatability of the MOLLI sequence. As Child-Pugh scores increased, pre- and postcontrast T1liver were significantly prolonged (P < 0.001), and ΔT1liver and adjusted postcontrast T1liver decreased (P < 0.001). Adjusted postcontrast T1liver and spleen size were independently associated with varices (R² = 0.29, P < 0.001). T1 mapping of the liver using the MOLLI sequence on gadoxetic acid-enhanced MRI demonstrated potential in quantitatively estimating liver function, and adjusted postcontrast T1liver was significantly associated with varices. (orig.)
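
    The two derived T1 parameters named in the abstract can be written out as a minimal sketch. The abstract does not give explicit formulas, so two assumptions are made here: ΔT1liver is taken as precontrast minus postcontrast T1liver, and the adjusted value is read as the liver-to-spleen difference divided by the spleen T1 (the only dimensionally consistent reading of the flattened expression). Function names and the example values are hypothetical and not taken from the study.

```python
def delta_t1_liver(pre_t1_liver_ms, post_t1_liver_ms):
    """Change in liver T1 after contrast, in ms.

    Assumption: defined as precontrast minus postcontrast T1; the
    abstract only says "changes between pre- and postcontrast T1liver".
    """
    return pre_t1_liver_ms - post_t1_liver_ms


def adjusted_post_t1_liver(post_t1_liver_ms, post_t1_spleen_ms):
    """Postcontrast liver T1 normalised to spleen T1 (dimensionless).

    Assumption: (T1liver - T1spleen) / T1spleen, with both values
    measured postcontrast.
    """
    if post_t1_spleen_ms <= 0:
        raise ValueError("spleen T1 must be positive")
    return (post_t1_liver_ms - post_t1_spleen_ms) / post_t1_spleen_ms


# Illustrative values only (ms); not measurements from the study.
print(delta_t1_liver(800.0, 300.0))          # 500.0
print(adjusted_post_t1_liver(300.0, 600.0))  # -0.5
```

    Normalising to the spleen gives a unitless ratio, which is presumably why the authors use it to compare patients, but that motivation is an inference rather than something the abstract states.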

  17. Three-dimensional fluid-attenuated inversion recovery sequence for visualisation of subthalamic nucleus for deep brain stimulation in Parkinson's disease

    Energy Technology Data Exchange (ETDEWEB)

    Heo, Young Jin [University of Ulsan College of Medicine, Asan Medical Center, Department of Radiology, Research Institute of Radiology, Seoul (Korea, Republic of); Inje University, Department of Radiology, Busan Paik Hospital, Busan (Korea, Republic of); Kim, Sang Joon; Kim, Ho Sung; Choi, Choong Gon; Jung, Seung Chai [University of Ulsan College of Medicine, Asan Medical Center, Department of Radiology, Research Institute of Radiology, Seoul (Korea, Republic of); Lee, Jung Kyo [University of Ulsan College of Medicine, Asan Medical Center, Department of Neurosurgery, Seoul (Korea, Republic of); Lee, Chong Sik; Chung, Sun J. [University of Ulsan College of Medicine, Asan Medical Center, Department of Neurology, Seoul (Korea, Republic of); Cho, So Hyun [Department of Radiology, Busan (Korea, Republic of); Lee, Gyoung Ro [Philips HealthCare Korea, Seoul (Korea, Republic of)

    2015-09-15

    Deep brain stimulation (DBS) of the subthalamic nucleus (STN) is an accepted treatment for advanced Parkinson's disease (PD). However, targeting the STN is difficult due to its relatively small size and variable location. The purpose of this study was to assess which of the following sequences obtained with the 3.0 T MR system can accurately delineate the STN: coronal 3D fluid-attenuated inversion recovery (FLAIR), 2D T2*-weighted fast-field echo (T2*-FFE) and 2D T2-weighted turbo spin-echo (TSE) sequences. We included 20 consecutive patients with PD who underwent 3.0 T MR for DBS targeting. 3D FLAIR, 2D T2*-FFE and T2-TSE images were obtained for all study patients. Image quality and demarcation of the STN were analysed using 4-point scales, and contrast ratio (CR) of the STN and normal white matter was calculated. The Friedman test was used to compare the three sequences. In qualitative analysis, the 2D T2*-FFE image showed more artefacts than 3D FLAIR or 2D T2-TSE, but the difference did not reach statistical significance. 3D FLAIR images showed significantly superior demarcation of the STN compared with 2D T2*-FFE and T2-TSE images (P < 0.001, respectively). The CR of 3D FLAIR was significantly higher than that of 2D T2*-FFE or T2-TSE images in multiple comparison correction (P < 0.001), but there was no significant difference in the CR between 2D T2*-FFE and T2-TSE images. Coronal 3D FLAIR images showed the most accurate demarcation of the STN for DBS targeting among coronal 3D FLAIR, 2D T2*-FFE and T2-TSE images. (orig.)

  18. Integrated analysis of gene expression, CpG island methylation, and gene copy number in breast cancer cells by deep sequencing.

    Directory of Open Access Journals (Sweden)

    Zhifu Sun

    We used deep sequencing technology to profile the transcriptome, gene copy number, and CpG island methylation status simultaneously in eight commonly used breast cell lines to develop a model for how these genomic features are integrated in estrogen receptor positive (ER+) and negative breast cancer. Total mRNA sequence, gene copy number, and genomic CpG island methylation were carried out using the Illumina Genome Analyzer. Sequences were mapped to the human genome to obtain digitized gene expression data, DNA copy number in reference to the non-tumor cell line (MCF10A), and methylation status of 21,570 CpG islands to identify differentially expressed genes that were correlated with methylation or copy number changes. These were evaluated in a dataset from 129 primary breast tumors. Gene expression in cell lines was dominated by ER-associated genes. ER+ and ER- cell lines formed two distinct, stable clusters, and 1,873 genes were differentially expressed in the two groups. Part of chromosome 8 was deleted in all ER- cells and part of chromosome 17 amplified in all ER+ cells. These loci encoded 30 genes that were overexpressed in ER+ cells; 9 of these genes were overexpressed in ER+ tumors. We identified 149 differentially expressed genes that exhibited differential methylation of one or more CpG islands within 5 kb of the 5' end of the gene and for which mRNA abundance was inversely correlated with CpG island methylation status. In primary tumors we identified 84 genes that appear to be robust components of the methylation signature that we identified in ER+ cell lines. Our analyses reveal a global pattern of differential CpG island methylation that contributes to the transcriptome landscape of ER+ and ER- breast cancer cells and tumors. The role of gene amplification/deletion appears to be more modest, although several potentially significant genes appear to be regulated by copy number aberrations.

  19. Integrative analyses of RNA editing, alternative splicing, and expression of young genes in human brain transcriptome by deep RNA sequencing.

    Science.gov (United States)

    Wu, Dong-Dong; Ye, Ling-Qun; Li, Yan; Sun, Yan-Bo; Shao, Yi; Chen, Chunyan; Zhu, Zhu; Zhong, Li; Wang, Lu; Irwin, David M; Zhang, Yong E; Zhang, Ya-Ping

    2015-08-01

    Next-generation RNA sequencing has been successfully used for identification of transcript assembly, evaluation of gene expression levels, and detection of post-transcriptional modifications. Despite these large-scale studies, additional comprehensive RNA-seq data from different subregions of the human brain are required to fully evaluate the evolutionary patterns experienced by the human brain transcriptome. Here, we provide a total of 6.5 billion RNA-seq reads from different subregions of the human brain. A significant correlation was observed between the levels of alternative splicing and RNA editing, which might be explained by a competition between the molecular machineries responsible for the splicing and editing of RNA. Young human protein-coding genes demonstrate biased expression to the neocortical and non-neocortical regions during evolution on the lineage leading to humans. We also found that a significantly greater number of young human protein-coding genes are expressed in the putamen, a tissue that was also observed to have the highest level of RNA-editing activity. The putamen, which previously received little attention, plays an important role in cognitive ability, and our data suggest a potential contribution of the putamen to human evolution. © The Author (2015). Published by Oxford University Press on behalf of Journal of Molecular Cell Biology, IBCB, SIBS, CAS. All rights reserved.

  20. Deep RNA sequencing reveals hidden features and dynamics of early gene transcription in Paramecium bursaria chlorella virus 1.

    Directory of Open Access Journals (Sweden)

    Guillaume Blanc

    Paramecium bursaria chlorella virus 1 (PBCV-1) is the prototype of the genus Chlorovirus (family Phycodnaviridae) that infects the unicellular, eukaryotic green alga Chlorella variabilis NC64A. The 331-kb PBCV-1 genome contains 416 major open reading frames. An mRNA-seq approach was used to analyze PBCV-1 transcriptomes at 6 progressive times during the first hour of infection. The alignment of 17 million reads to the PBCV-1 genome allowed the construction of single-base transcriptome maps. Significant transcription was detected for a