WorldWideScience

Sample records for nontranslated rna sequences

  1. Sequences in the 5′ Nontranslated Region of Hepatitis C Virus Required for RNA Replication

    Science.gov (United States)

    Friebe, Peter; Lohmann, Volker; Krieger, Nicole; Bartenschlager, Ralf

    2001-01-01

    Sequences in the 5′ and 3′ termini of plus-strand RNA viruses harbor cis-acting elements important for efficient translation and replication. In the case of hepatitis C virus (HCV), a plus-strand RNA virus of the family Flaviviridae, a 341-nucleotide-long nontranslated region (NTR) is located at the 5′ end of the genome. This sequence contains an internal ribosome entry site (IRES) that is located downstream of an approximately 40-nucleotide-long sequence of unknown function. By using our recently developed HCV replicon system, we mapped and characterized the sequences in the 5′ NTR required for RNA replication. We show that deletions introduced into the 5′ terminal 40 nucleotides abolished RNA replication but only moderately affected translation. By generating a series of replicons with HCV-poliovirus (PV) chimeric 5′ NTRs, we could show that the first 125 nucleotides of the HCV genome are essential and sufficient for RNA replication. However, the efficiency could be tremendously increased upon the addition of the complete HCV 5′ NTR. These data show that (i) sequences upstream of the HCV IRES are essential for RNA replication, (ii) the first 125 nucleotides of the HCV 5′ NTR are sufficient for RNA replication, but such replicon molecules are severely impaired for multiplication, and (iii) high-level HCV replication requires sequences located within the IRES. These data provide the first identification of signals in the 5′ NTR of HCV RNA essential for replication of this virus. PMID:11711595

  2. A cooperative interaction between nontranslated RNA sequences and NS5A protein promotes in vivo fitness of a chimeric hepatitis C/GB virus B.

    Directory of Open Access Journals (Sweden)

    Lucile Warter

    GB virus B (GBV-B) is closely related to hepatitis C virus (HCV), infects small non-human primates, and is thus a valuable surrogate for studying HCV. Despite significant differences, the 5' nontranslated RNAs (NTRs) of these viruses fold into four similar structured domains (I-IV), with domains II-III-IV comprising the viral internal ribosomal entry site (IRES). We previously reported the in vivo rescue of a chimeric GBV-B (vGB/III(HC)) containing HCV sequence in domain III, an essential segment of the IRES. We show here that three mutations identified within the vGB/III(HC) genome (within the 3'NTR, upstream of the poly(U) tract, and NS5A coding sequence) are necessary and sufficient for production of this chimeric virus following intrahepatic inoculation of synthetic RNA in tamarins, and thus apparently compensate for the presence of HCV sequence in domain III. To assess the mechanism(s) underlying these compensatory mutations, and to determine whether 5'NTR subdomains participating in genome replication do so in a virus-specific fashion, we constructed and evaluated a series of chimeric subgenomic GBV-B replicons in which various 5'NTR subdomains were substituted with their HCV homologs. Domains I and II of the GBV-B 5'NTR could not be replaced with HCV sequence, indicating that they contain essential, virus-specific RNA replication elements. In contrast, domain III could be swapped with minimal loss of genome replication capacity in cell culture. The 3'NTR and NS5A mutations required for rescue of the related chimeric virus in vivo had no effect on replication of the subgenomic GBneoD/III(HC) RNA in vitro. The data suggest that in vivo fitness of the domain III chimeric virus is dependent on a cooperative interaction between the 5'NTR, 3'NTR and NS5A at a step in the viral life cycle subsequent to genome replication, most likely during particle assembly. Such a mechanism may be common to all hepaciviruses.

  3. Essential nontranslational functions of tRNA synthetases.

    Science.gov (United States)

    Guo, Min; Schimmel, Paul

    2013-03-01

    Nontranslational functions of vertebrate aminoacyl tRNA synthetases (aaRSs), which catalyze the production of aminoacyl-tRNAs for protein synthesis, have recently been discovered. Although these new functions were thought to be 'moonlighting activities', many are as critical for cellular homeostasis as their activity in translation. New roles have been associated with their cytoplasmic forms as well as with nuclear and secreted extracellular forms that affect pathways for cardiovascular development and the immune response and mTOR, IFN-γ and p53 signaling. The associations of aaRSs with autoimmune disorders, cancers and neurological disorders further highlight nontranslational functions of these proteins. New architecture elaborations of the aaRSs accompany their functional expansion in higher organisms and have been associated with the nontranslational functions for several aaRSs. Although a general understanding of how these functions developed is limited, the expropriation of aaRSs for essential nontranslational functions may have been initiated by co-opting the amino acid-binding site for another purpose.

  4. Essential Non-Translational Functions of tRNA Synthetases

    Science.gov (United States)

    Guo, Min; Schimmel, Paul

    2013-01-01

    Nontranslational functions of vertebrate aminoacyl tRNA synthetases (aaRSs), which catalyze the production of aminoacyl-tRNAs for protein synthesis, have recently been discovered. While these new functions were thought to be ‘moonlighting activities’, many are as critical for cellular homeostasis as the activity in translation. New roles have been associated with cytoplasmic forms as well as with nuclear and secreted extracellular forms that impact pathways for cardiovascular development, the immune response, and mTOR, IFN-γ and p53 signaling. The associations of aaRSs with autoimmune disorders, cancers and neurological disorders further highlight nontranslational functions of these proteins. Novel architecture elaborations of the aaRSs accompany their functional expansion in higher organisms and have been associated with the nontranslational functions for several aaRSs. While a general understanding of how these functions developed is limited, the expropriation of aaRSs for essential nontranslational functions may have been initiated by co-opting the amino acid binding site for another purpose. PMID:23416400

  5. Transfer of the 3' non-translated region of grapevine chrome mosaic virus RNA-1 by recombination to tomato black ring virus RNA-2 in pseudorecombinant isolates.

    Science.gov (United States)

    Le Gall, O; Candresse, T; Dunez, J

    1995-05-01

    In grapevine chrome mosaic and tomato black ring viruses (GCMV and TBRV), as in many other nepoviruses, the 3' non-translated regions (3'NTR) are identical between the two genomic RNAs. We have investigated the structure of the 3'NTR of two recombinant isolates which contain GCMV RNA-1 and TBRV RNA-2. In these isolates, the 3'NTR of RNA-1 was transferred to RNA-2, thus restoring the 3' identity. The transfer occurred within three passages, and probably contributes to the spread of randomly appearing mutations from one genomic RNA to the other. The site of recombination is near the 3' end of the open reading frame.

  6. RNA Sequencing Analysis of Salivary Extracellular RNA.

    Science.gov (United States)

    Majem, Blanca; Li, Feng; Sun, Jie; Wong, David T W

    2017-01-01

    Salivary biomarkers for disease detection and for diagnostic and prognostic assessment have become increasingly well established in recent years. In this chapter we explain the current leading technology that has been used to characterize salivary non-coding RNAs (ncRNAs) from the extracellular RNA (exRNA) fraction: the Illumina® HiSeq platform for RNA sequencing. The chapter is divided into two main sections according to the type of library constructed (small and long ncRNA libraries), covering saliva collection, RNA extraction and quantification, cDNA library generation, and the corresponding quality controls (QCs). Using these technical tools, one can identify thousands of ncRNA species in saliva. These methods indicate that salivary exRNA provides an efficient medium for biomarker discovery of oral and systemic diseases.

  7. Deciphering the RNA landscape by RNAome sequencing.

    Science.gov (United States)

    Derks, Kasper W J; Misovic, Branislav; van den Hout, Mirjam C G N; Kockx, Christel E M; Gomez, Cesar Payan; Brouwer, Rutger W W; Vrieling, Harry; Hoeijmakers, Jan H J; van IJcken, Wilfred F J; Pothof, Joris

    2015-01-01

    Current RNA expression profiling methods rely on enrichment steps for specific RNA classes, thereby not detecting all RNA species in an unperturbed manner. We report strand-specific RNAome sequencing that determines expression of small and large RNAs from rRNA-depleted total RNA in a single sequence run. Since current analysis pipelines cannot reliably analyze small and large RNAs simultaneously, we developed TRAP, Total Rna Analysis Pipeline, a robust interface that is also compatible with existing RNA sequencing protocols. RNAome sequencing quantitatively preserved all RNA classes, allowing cross-class comparisons that facilitate the identification of relationships between different RNA classes. We demonstrate the strength of RNAome sequencing in mouse embryonic stem cells treated with cisplatin. MicroRNA and mRNA expression in RNAome sequencing significantly correlated between replicates and was in concordance with both existing RNA sequencing methods and gene expression arrays generated from the same samples. Moreover, RNAome sequencing also detected additional RNA classes such as enhancer RNAs, anti-sense RNAs, novel RNA species and numerous differentially expressed RNAs undetectable by other methods. At the level of complete RNA classes, RNAome sequencing also identified a specific global repression of the microRNA and microRNA isoform classes after cisplatin treatment whereas all other classes such as mRNAs were unchanged. These characteristics of RNAome sequencing will significantly improve expression analysis as well as studies on RNA biology not covered by existing methods.

  8. RNAome sequencing delineates the complete RNA landscape

    Directory of Open Access Journals (Sweden)

    Kasper W.J. Derks

    2015-09-01

    Standard RNA expression profiling methods rely on enrichment steps for specific RNA classes, thereby not detecting all RNA species. For example, small and large RNAs from the same sample cannot be sequenced in a single sequence run. We designed RNAome sequencing, which is a strand-specific method to determine the expression of small and large RNAs from ribosomal RNA-depleted total RNA in a single sequence run. RNAome sequencing quantitatively preserves all RNA classes. This characteristic allows comparisons between RNA classes, thereby facilitating the identification of relationships between different RNA classes. Here, we describe in detail the experimental procedure associated with RNAome sequencing published by Derks and colleagues in RNA Biology (2015) [1]. We also provide the R code for the developed Total Rna Analysis Pipeline (TRAP), an algorithm to analyze RNAome sequencing datasets (deposited at the Gene Expression Omnibus data repository, accession number GSE48084).

  9. RNAome sequencing delineates the complete RNA landscape

    NARCIS (Netherlands)

    K.W.J. Derks (Kasper); J. Pothof (Joris)

    2015-01-01

    Standard RNA expression profiling methods rely on enrichment steps for specific RNA classes, thereby not detecting all RNA species. For example, small and large RNAs from the same sample cannot be sequenced in a single sequence run. We designed RNAome sequencing, which is a

  10. Deciphering the RNA landscape by RNAome sequencing

    NARCIS (Netherlands)

    K.W.J. Derks (Kasper); B. Misovic (Branislav); M.C.G.N. van den hout (Mirjam); C. Kockx (Christel); C.P. Gomez (Cesar Payan); R.W.W. Brouwer (Rutger); H. Vrieling (Harry); J.H.J. Hoeijmakers (Jan); W.F.J. van IJcken (Wilfred); J. Pothof (Joris)

    2015-01-01

    Current RNA expression profiling methods rely on enrichment steps for specific RNA classes, thereby not detecting all RNA species in an unperturbed manner. We report strand-specific RNAome sequencing that determines expression of small and large RNAs from rRNA-depleted total RNA in a

  11. Biases in small RNA deep sequencing data

    OpenAIRE

    Raabe, Carsten A.; Tang, Thean-Hock; Brosius, Juergen; Rozhdestvensky, Timofey S.

    2013-01-01

    High-throughput RNA sequencing (RNA-seq) is considered a powerful tool for novel gene discovery and fine-tuned transcriptional profiling. The digital nature of RNA-seq is also believed to simplify meta-analysis and to reduce background noise associated with hybridization-based approaches. The development of multiplex sequencing enables efficient and economic parallel analysis of gene expression. In addition, RNA-seq is of particular value when low RNA expression or modest changes between samp...

  12. Deletions within the 3' Non-Translated Region of Alfalfa mosaic virus RNA4 Do Not Affect Replication but Significantly Reduce Long-Distance Movement of Chimeric Tobacco mosaic virus

    Directory of Open Access Journals (Sweden)

    Vidadi Yusibov

    2013-07-01

    Alfalfa mosaic virus (AlMV) RNAs 1 and 2 with deletions in their 3' non‑translated regions (NTRs) have been previously shown to be encapsidated into virions by coat protein (CP) expressed from RNA3, indicating that the 3' NTRs of RNAs 1 and 2 are not required for virion assembly. Here, we constructed various mutants by deleting sequences within the 3' NTR of AlMV subgenomic (sg) RNA4 (the same as that of RNA3) and examined the effect of these deletions on replication and translation of chimeric Tobacco mosaic virus (TMV) expressing AlMV sgRNA4 from the TMV CP sg promoter (Av/A4) in tobacco protoplasts and Nicotiana benthamiana plants. While the Av/A4 mutants were as competent as the wild-type Av/A4 in RNA replication in protoplasts, their encapsidation, long-distance movement and virus accumulation varied significantly in N. benthamiana. These data suggest that the 3' NTR of AlMV sgRNA4 contains potential elements necessary for virus encapsidation.

  13. Biases in small RNA deep sequencing data.

    Science.gov (United States)

    Raabe, Carsten A; Tang, Thean-Hock; Brosius, Juergen; Rozhdestvensky, Timofey S

    2014-02-01

    High-throughput RNA sequencing (RNA-seq) is considered a powerful tool for novel gene discovery and fine-tuned transcriptional profiling. The digital nature of RNA-seq is also believed to simplify meta-analysis and to reduce background noise associated with hybridization-based approaches. The development of multiplex sequencing enables efficient and economic parallel analysis of gene expression. In addition, RNA-seq is of particular value when low RNA expression or modest changes between samples are monitored. However, recent data uncovered severe bias in the sequencing of small non-protein coding RNA (small RNA-seq or sRNA-seq), such that the expression levels of some RNAs appeared to be artificially enhanced and others diminished or even undetectable. The use of different adapters and barcodes during ligation as well as complex RNA structures and modifications drastically influence cDNA synthesis efficacies and exemplify sources of bias in deep sequencing. In addition, variable specific RNA G/C-content is associated with unequal polymerase chain reaction amplification efficiencies. Given the central importance of RNA-seq to molecular biology and personalized medicine, we review recent findings that challenge small non-protein coding RNA-seq data and suggest approaches and precautions to overcome or minimize bias.

  14. Nucleotide sequence of medium-chain acyl-CoA dehydrogenase mRNA and its expression in enzyme-deficient human tissue

    Energy Technology Data Exchange (ETDEWEB)

    Kelly, D.P.; Kim, J.J.; Billadello, J.J.; Hainline, B.E.; Chu, T.W.; Strauss, A.W.

    1987-06-01

    Medium-chain acyl-CoA dehydrogenase (MCAD) is one of three similar enzymes that catalyze the initial step of fatty acid β-oxidation. Definition of the primary structure of MCAD and the tissue distribution of its mRNA is of biochemical and clinical importance because of the recent recognition of inherited MCAD deficiency in humans. The MCAD mRNA nucleotide sequence was determined from two overlapping cDNA clones isolated from human liver and placental cDNA libraries, respectively. The MCAD mRNA includes a 1263-base-pair coding region and a 738-base-pair 3'-nontranslated region. A partial amino acid sequence (137 residues) determined on peptides derived from MCAD purified from porcine liver confirmed the identity of the cDNA clone. Comparison of the amino acid sequence predicted from the human MCAD cDNA with the partial protein sequence of the porcine MCAD revealed a high degree (88%) of interspecies sequence identity. RNA blot analysis shows that MCAD mRNA is expressed in a variety of rat (2.2 kilobases) and human (2.4 kilobases) tissues. Blot hybridization of RNA prepared from cultured skin fibroblasts from a patient with MCAD deficiency disclosed that mRNA was present and similar in size to MCAD mRNA derived from control fibroblasts. The isolation and characterization of MCAD cDNA is an important step in the definition of the defect underlying its metabolic consequences.

  15. Compilation of tRNA sequences.

    Science.gov (United States)

    Sprinzl, M; Grueter, F; Spelzhaus, A; Gauss, D H

    1980-01-11

    This compilation presents in a small space the tRNA sequences so far published. The numbering of tRNAPhe from yeast is used following the rules proposed by the participants of the Cold Spring Harbor Meeting on tRNA 1978 (1,2; Fig. 1). This numbering allows comparisons with the three dimensional structure of tRNAPhe. The secondary structure of tRNAs is indicated by specific underlining. In the primary structure a nucleoside followed by a nucleoside in brackets or a modification in brackets denotes that both types of nucleosides can occupy this position. Part of a sequence in brackets designates a piece of sequence not unambiguously analyzed. Rare nucleosides are named according to the IUPAC-IUB rules (for complicated rare nucleosides and their identification see Table 1); those with lengthy names are given with the prefix x and specified in the footnotes. Footnotes are numbered according to the coordinates of the corresponding nucleoside and are indicated in the sequence by an asterisk. The references are restricted to the citation of the latest publication in those cases where several papers deal with one sequence. For additional information the reader is referred either to the original literature or to other tRNA sequence compilations (3-7). Mutant tRNAs are dealt with in a compilation by J. Celis (8). The compilers would welcome any information by the readers regarding missing material or erroneous presentation. On the basis of this numbering system computer printed compilations of tRNA sequences in a linear form and in cloverleaf form are in preparation.

  16. The RNA world, automatic sequences and oncogenetics

    International Nuclear Information System (INIS)

    Tahir Shah, K.

    1993-04-01

    We construct a model of the RNA world in terms of naturally evolving nucleotide sequences assuming only Crick-Watson base pairing and self-cleaving/splicing capability. These sequences have the following properties. 1) They are recognizable by an automaton (or automata). That is, for each k-sequence, there exists a k-automaton which accepts, recognizes or generates the k-sequence. These are known as automatic sequences. Fibonacci and Morse-Thue sequences are the most natural outcome of pre-biotic chemical conditions. 2) Infinite (resp. large) sequences are self-similar (resp. nearly self-similar) under certain rewrite rules and consequently give rise to fractal (resp. fractal-like) structures. Computationally, such sequences can also be generated by their corresponding deterministic parallel rewrite system, known as a DOL system. The self-similar sequences are fixed points of their respective rewrite rules. Some of these automatic sequences have the capability that they can read or 'accept' other sequences while others can detect errors and trigger error-correcting mechanisms. They can be enlarged and have block and/or palindrome structure. Linear recurring sequences such as the Fibonacci sequence are simply Feed-back Shift Registers, a well-known model of information processing machines. We show that a mutation of any rewrite rule can cause a combinatorial explosion of error and relate this to oncogenetical behavior. On the other hand, a mutation of sequences that are not rewrite rules leads to normal evolutionary change. Known experimental results support our hypothesis. (author)
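
    The parallel rewrite (DOL) description in this abstract can be illustrated with a short Python sketch that generates prefixes of the Fibonacci and Morse-Thue words by rewriting every symbol at once. The report itself gives no code: the function name `d0l_iterate`, the axioms, the two-letter alphabets and the specific rule tables are assumptions chosen for illustration.

```python
def d0l_iterate(axiom, rules, steps):
    """Apply a deterministic parallel rewrite system `steps` times.

    Every symbol of the current word is replaced simultaneously by its
    image under `rules`; symbols without a rule are copied unchanged.
    """
    word = axiom
    for _ in range(steps):
        word = "".join(rules.get(symbol, symbol) for symbol in word)
    return word


# Illustrative rule tables (assumed, not taken from the report).
FIBONACCI_RULES = {"a": "ab", "b": "a"}      # Fibonacci word
THUE_MORSE_RULES = {"0": "01", "1": "10"}    # Morse-Thue word


def extends_previous(axiom, rules, steps):
    """Check that iterate `steps + 1` begins with iterate `steps`."""
    current = d0l_iterate(axiom, rules, steps)
    following = d0l_iterate(axiom, rules, steps + 1)
    return following.startswith(current)


if __name__ == "__main__":
    for n in range(6):
        print(n, d0l_iterate("a", FIBONACCI_RULES, n))
    for n in range(4):
        print(n, d0l_iterate("0", THUE_MORSE_RULES, n))
```

    With these rules each iterate is a prefix of the next one, which is the sense in which the limiting infinite word is a fixed point of its rewrite rule; the lengths of the Fibonacci iterates follow the Fibonacci numbers.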

  17. Short RNA indicator sequences are not completely degraded by autoclaving

    Science.gov (United States)

    Unnithan, Veena V.; Unc, Adrian; Joe, Valerisa; Smith, Geoffrey B.

    2014-01-01

    Short indicator RNA sequences are not completely degraded by autoclaving and are recovered intact by molecular amplification. Primers targeting longer sequences are most likely to produce false positives due to amplification errors easily verified by melting curve analyses. If short indicator RNA sequences are used for virus identification and quantification, then post-autoclave RNA degradation methodology should be employed, which may include further autoclaving. PMID:24518856

  18. Meta-analysis of small RNA-sequencing errors reveals ubiquitous post-transcriptional RNA modifications

    OpenAIRE

    Ebhardt, H. Alexander; Tsang, Herbert H.; Dai, Denny C.; Liu, Yifeng; Bostan, Babak; Fahlman, Richard P.

    2009-01-01

    Recent advances in DNA-sequencing technology have made it possible to obtain large datasets of small RNA sequences. Here we demonstrate that not all non-perfectly matched small RNA sequences are simple technological sequencing errors, but many hold valuable biological information. Analysis of three small RNA datasets originating from Oryza sativa and Arabidopsis thaliana small RNA-sequencing projects demonstrates that many single nucleotide substitution errors overlap when aligning homologous...

  19. Studies of RNA Sequence and Structure Using Nanopores

    Science.gov (United States)

    Henley, Robert Y.; Carson, Spencer; Wanunu, Meni

    2016-01-01

    Nanopores are powerful single-molecule sensors with nanometer scale dimensions suitable for detection, quantification, and characterization of nucleic acids and proteins. Beyond sequencing applications, both biological and solid-state nanopores hold great promise as tools for studying the biophysical properties of RNA. In this review, we highlight selected landmark nanopore studies with regards to RNA sequencing, microRNA detection, RNA/ligand interactions, and RNA structural/conformational analysis. PMID:26970191

  20. A telescope for the RNA universe : novel bioinformatic approaches to analyze RNA sequencing data

    NARCIS (Netherlands)

    Pulyakhina, Irina

    2016-01-01

    In this thesis I focus on the application of bioinformatics to analyze RNA. The type of experimental data of interest is sequencing data generated with various Next Generation Sequencing techniques: nuclear RNA, cytoplasmic RNA, captured polyadenylated RNA fragments, etc. I highlight the necessity in

  1. Sequencing small RNA: introduction and data analysis fundamentals.

    Science.gov (United States)

    Mehta, Jai Prakash

    2014-01-01

    Small RNAs are important transcriptional regulators within cells. With the advent of powerful Next Generation Sequencing platforms, sequencing small RNAs seems to be an obvious choice to understand their expression and its downstream effect. Additionally, sequencing provides an opportunity to identify novel and polymorphic miRNA. However, the biggest challenge is the appropriate data analysis pipeline, which is still in a phase of active development by various academic groups. This chapter describes basic and advanced steps for small RNA sequencing analysis including quality control, small RNA alignment and quantification, differential expression analysis, novel small RNA identification, target prediction, and downstream analysis. We also provide a list of various resources for small RNA analysis.

  2. Sample preparation for small RNA massive parallel sequencing

    NARCIS (Netherlands)

    Gommans, W.M.; Berezikov, E.

    2012-01-01

    High-throughput sequencing has allowed for a comprehensive small RNA (sRNA) expression analysis of numerous tissues in a diverse set of organisms. The computational analysis of the millions of generated sequencing reads has led to the discovery of novel miRNAs and other sRNA species, and resulted in

  3. Involvement of the 5'-leader sequence in coupling the stability of a human H3 histone mRNA with DNA replication

    International Nuclear Information System (INIS)

    Morris, T.; Marashi, F.; Weber, L.; Hickey, E.; Greenspan, D.; Bonner, J.; Stein, J.; Stein, G.

    1986-01-01

    Two lines of evidence derived from fusion gene constructs indicate that sequences residing in the 5'-nontranslated region of a cell cycle-dependent human H3 histone mRNA are involved in the selective destabilization that occurs when DNA synthesis is terminated. The experimental approach was to construct chimeric genes in which fragments of the mRNA coding regions of the H3 histone gene were fused with fragments of genes not expressed in a cell cycle-dependent manner. After transfection in HeLa S3 cells with the recombinant plasmids, levels of fusion mRNAs were determined by S1 nuclease analysis prior to and following DNA synthesis inhibition. When the first 20 nucleotides of an H3 histone mRNA leader were replaced with 89 nucleotides of the leader from a Drosophila heat-shock (hsp70) mRNA, the fusion transcript remained stable during inhibition of DNA synthesis, in contrast to the rapid destabilization of the endogenous histone mRNA in these cells. In a reciprocal experiment, a histone-globin fusion gene was constructed that produced a transcript with the initial 20 nucleotides of the H3 histone mRNA substituted for the human β-globin mRNA leader. In HeLa cells treated with inhibitors of DNA synthesis and/or protein synthesis, cellular levels of this histone-globin fusion mRNA appeared to be regulated in a manner similar to endogenous histone mRNA levels. These results suggest that the first 20 nucleotides of the leader are sufficient to couple histone mRNA stability with DNA replication

  4. Empirical insights into the stochasticity of small RNA sequencing

    Science.gov (United States)

    Qin, Li-Xuan; Tuschl, Thomas; Singer, Samuel

    2016-04-01

    The choice of stochasticity distribution for modeling the noise distribution is a fundamental assumption for the analysis of sequencing data and consequently is critical for the accurate assessment of biological heterogeneity and differential expression. The stochasticity of RNA sequencing has been assumed to follow Poisson distributions. We collected microRNA sequencing data and observed that its stochasticity is better approximated by gamma distributions, likely because of the stochastic nature of exponential PCR amplification. We validated our findings with two independent datasets, one for microRNA sequencing and another for RNA sequencing. Motivated by the gamma distributed stochasticity, we provided a simple method for the analysis of RNA sequencing data and showed its superiority to three existing methods for differential expression analysis using three data examples of technical replicate data and biological replicate data.
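
    The contrast between Poisson and gamma noise drawn in this abstract can be made concrete with a small Python sketch. The abstract does not describe the authors' method, so what follows is only a generic moment-based illustration: a variance-to-mean ratio near 1 is what Poisson noise would predict, and a gamma distribution can be matched to the sample mean and variance. The function names and the replicate counts are hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean.

    Poisson-distributed counts give a value near 1; values well above 1
    indicate over-dispersion relative to Poisson.
    """
    return variance(counts) / mean(counts)


def gamma_moment_fit(counts):
    """Method-of-moments gamma fit, returned as (shape, scale).

    Matches mean = shape * scale and variance = shape * scale ** 2.
    """
    m = mean(counts)
    v = variance(counts)
    return m * m / v, v / m


if __name__ == "__main__":
    # Made-up technical-replicate read counts for one microRNA.
    replicates = [120, 95, 160, 88, 143, 210, 101, 134]
    shape, scale = gamma_moment_fit(replicates)
    print("dispersion index:", dispersion_index(replicates))
    print("gamma shape:", shape, "gamma scale:", scale)
```

    Moment matching is used here only because it needs nothing beyond the standard library; a maximum-likelihood fit would be a reasonable alternative.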

  5. Size, Shape, and Sequence-Dependent Immunogenicity of RNA Nanoparticles

    Directory of Open Access Journals (Sweden)

    Sijin Guo

    2017-12-01

    RNA molecules have emerged as promising therapeutics. Like all other drugs, the safety profile and immune response are important criteria for drug evaluation. However, the literature on RNA immunogenicity has been controversial. Here, we used the approach of RNA nanotechnology to demonstrate that the immune response of RNA nanoparticles is size, shape, and sequence dependent. RNA triangle, square, pentagon, and tetrahedron with the same shape but different sizes, or the same size but different shapes, were used as models to investigate the immune response. The levels of pro-inflammatory cytokines induced by these RNA nanoarchitectures were assessed in macrophage-like cells and animals. It was found that RNA polygons without extension at the vertexes were immune inert. However, when single-stranded RNA with a specific sequence was extended from the vertexes of RNA polygons, strong immune responses were detected. These immunostimulations are sequence specific, because some other extended sequences induced little or no immune response. Additionally, larger-size RNA square induced stronger cytokine secretion. 3D RNA tetrahedron showed stronger immunostimulation than planar RNA triangle. These results suggest that the immunogenicity of RNA nanoparticles is tunable to produce either a minimal immune response that can serve as safe therapeutic vectors, or a strong immune response for cancer immunotherapy or vaccine adjuvants.

  6. Meta-analysis of small RNA-sequencing errors reveals ubiquitous post-transcriptional RNA modifications.

    Science.gov (United States)

    Ebhardt, H Alexander; Tsang, Herbert H; Dai, Denny C; Liu, Yifeng; Bostan, Babak; Fahlman, Richard P

    2009-05-01

    Recent advances in DNA-sequencing technology have made it possible to obtain large datasets of small RNA sequences. Here we demonstrate that not all non-perfectly matched small RNA sequences are simple technological sequencing errors, but many hold valuable biological information. Analysis of three small RNA datasets originating from Oryza sativa and Arabidopsis thaliana small RNA-sequencing projects demonstrates that many single nucleotide substitution errors overlap when aligning homologous non-identical small RNA sequences. Investigating the sites and identities of substitution errors reveals that many potentially originate as a result of post-transcriptional modifications or RNA editing. Modifications include N1-methyl modified purine nucleotides in tRNA, potential deamination or base substitutions in micro RNAs, 3' micro RNA uridine extensions and 5' micro RNA deletions. Additionally, further analysis of large sequencing datasets reveals that the combined effects of 5' deletions and 3' uridine extensions can alter the specificity by which micro RNAs associate with different Argonaute proteins. Hence, we demonstrate that not all sequencing errors in small RNA datasets are technical artifacts, but that these actually often reveal valuable biological insights into the sites of post-transcriptional RNA modifications.

  7. Phylogenetic relationships of Salmonella based on rRNA sequences

    DEFF Research Database (Denmark)

    Christensen, H.; Nordentoft, Steen; Olsen, J.E.

    1998-01-01

    separated by 16S rRNA analysis and found to be closely related to the Escherichia coli and Shigella complex by both 16S and 23S rRNA analyses. The diphasic serotypes S. enterica subspp. I and VI were separated from the monophasic serotypes subspp. IIIa and IV, including S. bongori, by 23S rRNA sequence...

  8. Reconstruction of ancestral RNA sequences under multiple structural constraints

    OpenAIRE

    Tremblay-Savard, Olivier; Reinharz, Vladimir; Waldispühl, Jérôme

    2016-01-01

    Background: Secondary structures form the scaffold of multiple sequence alignment of non-coding RNA (ncRNA) families. An accurate reconstruction of ancestral ncRNAs must use this structural signal. However, the inference of ancestors of a single ncRNA family with a single consensus structure may bias the results towards sequences with high affinity to this structure, which are far from the true ancestors. Methods: In this paper, we introduce achARNement, a maximum parsimony approach that, given...

  9. Finding Common Sequence and Structure Motifs in a set of RNA sequences

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Heyer, Laurie J.; Stormo, Gary D.

    1997-01-01

    We present a computational scheme to search for the most common motif, composed of a combination of sequence and structure constraints, among a collection of RNA sequences. The method uses a simplified version of the Sankoff algorithm for simultaneous folding and alignment of RNA sequences......, and comparisons with other approaches, are provided. The solutions include finding consensus structure identical to published ones....

  10. Empirical insights into the stochasticity of small RNA sequencing

    OpenAIRE

    Qin, Li-Xuan; Tuschl, Thomas; Singer, Samuel

    2016-01-01

    The choice of stochasticity distribution for modeling the noise distribution is a fundamental assumption for the analysis of sequencing data and consequently is critical for the accurate assessment of biological heterogeneity and differential expression. The stochasticity of RNA sequencing has been assumed to follow Poisson distributions. We collected microRNA sequencing data and observed that its stochasticity is better approximated by gamma distributions, likely because of the stochastic na...

  11. Sequence analysis of RNase MRP RNA reveals its origination from eukaryotic RNase P RNA

    Science.gov (United States)

    Zhu, Yanglong; Stribinskis, Vilius; Ramos, Kenneth S.; Li, Yong

    2006-01-01

    RNase MRP is a eukaryote-specific endoribonuclease that generates RNA primers for mitochondrial DNA replication and processes precursor rRNA. RNase P is a ubiquitous endoribonuclease that cleaves precursor tRNA transcripts to produce their mature 5′ termini. We found extensive sequence homology of catalytic domains and specificity domains between their RNA subunits in many organisms. In Candida glabrata, the internal loop of helix P3 is 100% conserved between MRP and P RNAs. The helix P8 of MRP RNA from microsporidia Encephalitozoon cuniculi is identical to that of P RNA. Sequence homology can be widely spread over the whole molecule of MRP RNA and P RNA, such as those from Dictyostelium discoideum. These conserved nucleotides between the MRP and P RNAs strongly support the hypothesis that the MRP RNA is derived from the P RNA molecule in early eukaryote evolution. PMID:16540690

  12. Simulations Using Random-Generated DNA and RNA Sequences

    Science.gov (United States)

    Bryce, C. F. A.

    1977-01-01

    Using a very simple computer program written in BASIC, a very large number of random-generated DNA or RNA sequences are obtained. Students use these sequences to predict complementary sequences and translational products, evaluate base compositions, determine frequencies of particular triplet codons, and suggest possible secondary structures.…
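
    The exercises described above (random sequences, complementary strands, base composition, triplet codon frequencies) can be sketched in a few lines. This is a minimal illustration in Python rather than the original BASIC program, which is not reproduced in the record; the function names, the uniform base probabilities, and the choice to read codons in frame 0 are all assumptions made here.

```python
import random
from collections import Counter

RNA_BASES = "ACGU"
# Watson-Crick pairing for RNA (assumption: no wobble pairs considered).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def random_rna(length, seed=None):
    """Return a random RNA sequence with each base drawn uniformly."""
    rng = random.Random(seed)
    return "".join(rng.choice(RNA_BASES) for _ in range(length))


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def base_composition(seq):
    """Return the fraction of each base in the sequence."""
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in RNA_BASES}


def codon_frequencies(seq):
    """Count complete triplets read in frame 0; a trailing partial codon is ignored."""
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    return Counter(codons)
```

    Passing a fixed `seed` makes a generated sequence reproducible, which is convenient when the same sequence has to be handed to several students.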

  13. Annotating RNA motifs in sequences and alignments.

    Science.gov (United States)

    Gardner, Paul P; Eldai, Hisham

    2015-01-01

    RNA performs a diverse array of important functions across all cellular life. These functions include important roles in translation, building translational machinery and maturing messenger RNA. More recent discoveries include the miRNAs and bacterial sRNAs that regulate gene expression, the thermosensors, riboswitches and other cis-regulatory elements that help prokaryotes sense their environment and eukaryotic piRNAs that suppress transposition. However, there can be a long period between the initial discovery of a RNA and determining its function. We present a bioinformatic approach to characterize RNA motifs, which are critical components of many RNA structure-function relationships. These motifs can, in some instances, provide researchers with functional hypotheses for uncharacterized RNAs. Moreover, we introduce a new profile-based database of RNA motifs--RMfam--and illustrate some applications for investigating the evolution and functional characterization of RNA. All the data and scripts associated with this work are available from: https://github.com/ppgardne/RMfam. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. Nucleotide sequence of a human tRNA gene heterocluster

    International Nuclear Information System (INIS)

    Chang, Y.N.; Pirtle, I.L.; Pirtle, R.M.

    1986-01-01

    Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both [3'-32P]-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these λ-clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.

  15. Sequence requirements for RNA strand transfer during nidovirus discontinuous subgenomic RNA synthesis

    NARCIS (Netherlands)

    Pasternak, A. O.; van den Born, E.; Spaan, W. J.; Snijder, E. J.

    2001-01-01

    Nidovirus subgenomic mRNAs contain a leader sequence derived from the 5' end of the genome fused to different sequences ('bodies') derived from the 3' end. Their generation involves a unique mechanism of discontinuous subgenomic RNA synthesis that resembles copy-choice RNA recombination. During this

  16. Rfam: annotating families of non-coding RNA sequences.

    Science.gov (United States)

    Daub, Jennifer; Eberhardt, Ruth Y; Tate, John G; Burge, Sarah W

    2015-01-01

    The primary task of the Rfam database is to collate experimentally validated noncoding RNA (ncRNA) sequences from the published literature and facilitate the prediction and annotation of new homologues in novel nucleotide sequences. We group homologous ncRNA sequences into "families" and related families are further grouped into "clans." We collate and manually curate data cross-references for these families from other databases and external resources. Our Web site offers researchers a simple interface to Rfam and provides tools with which to annotate their own sequences using our covariance models (CMs), through our tools for searching, browsing, and downloading information on Rfam families. In this chapter, we will work through examples of annotating a query sequence, collating family information, and searching for data.

  17. Deep Sequencing Insights in Therapeutic shRNA Processing and siRNA Target Cleavage Precision

    Directory of Open Access Journals (Sweden)

    Hubert Denise

    2014-01-01

    TT-034 (PF-05095808) is a recombinant adeno-associated virus serotype 8 (AAV8) agent expressing three short hairpin RNA (shRNA) pro-drugs that target the hepatitis C virus (HCV) RNA genome. The cytosolic enzyme Dicer cleaves each shRNA into multiple, potentially active small interfering RNA (siRNA) drugs. Using next-generation sequencing (NGS) to identify and characterize active shRNA maturation products, we observed that each TT-034–encoded shRNA could be processed into as many as 95 separate siRNA strands. Few of these appeared active as determined by Sanger 5′ RNA Ligase-Mediated Rapid Amplification of cDNA Ends (5-RACE) and through synthetic shRNA and siRNA analogue studies. Moreover, NGS scrutiny applied on 5-RACE products (RACE-seq) suggested that synthetic siRNAs could direct cleavage in not one, but up to five separate positions on targeted RNA, in a sequence-dependent manner. These data support an on-target mechanism of action for TT-034 without cytotoxicity and question the accepted precision of substrate processing by the key RNA interference (RNAi) enzymes Dicer and siRNA-induced silencing complex (siRISC).

  18. FLDS: A Comprehensive dsRNA Sequencing Method for Intracellular RNA Virus Surveillance.

    Science.gov (United States)

    Urayama, Syun-Ichi; Takaki, Yoshihiro; Nunoura, Takuro

    2016-01-01

    Knowledge of the distribution and diversity of RNA viruses is still limited in spite of their possible environmental and epidemiological impacts because RNA virus-specific metagenomic methods have not yet been developed. We herein constructed an effective metagenomic method for RNA viruses by targeting long double-stranded (ds)RNA in cellular organisms, which is a hallmark of infection, or the replication of dsRNA and single-stranded (ss)RNA viruses, except for retroviruses. This novel dsRNA targeting metagenomic method is characterized by an extremely high recovery rate of viral RNA sequences, the retrieval of terminal sequences, and uniform read coverage, which has not previously been reported in other metagenomic methods targeting RNA viruses. This method revealed a previously unidentified viral RNA diversity of more than 20 complete RNA viral genomes including dsRNA and ssRNA viruses associated with an environmental diatom colony. Our approach will be a powerful tool for cataloging RNA viruses associated with organisms of interest.

  19. An enhanced RNA alignment benchmark for sequence alignment programs

    Directory of Open Access Journals (Sweden)

    Steger Gerhard

    2006-10-01

    Background: The performance of alignment programs is traditionally tested on sets of protein sequences, of which a reference alignment is known. Conclusions drawn from such protein benchmarks do not necessarily hold for the RNA alignment problem, as was demonstrated in the first RNA alignment benchmark published so far. For example, the twilight zone – the similarity range where alignment quality drops drastically – starts at 60 % for RNAs in comparison to 20 % for proteins. In this study we enhance the previous benchmark. Results: The RNA sequence sets in the benchmark database are taken from an increased number of RNA families to avoid unintended impact by using only a few families. The size of sets varies from 2 to 15 sequences to assess the influence of the number of sequences on program performance. Alignment quality is scored by two measures: one takes into account only nucleotide matches, the other measures structural conservation. The performance order of parameters – like nucleotide substitution matrices and gap-costs – as well as of programs is rated by rank tests. Conclusion: Most sequence alignment programs perform equally well on RNA sequence sets with high sequence identity, that is with an average pairwise sequence identity (APSI) above 75 %. Parameters for gap-open and gap-extension have a large influence on alignment quality at APSI ≤ 75 %; optimal parameter combinations are shown for several programs. The use of different 4 × 4 substitution matrices improved program performance only in some cases. The performance of iterative programs drastically increases with increasing sequence numbers and/or decreasing sequence identity, which makes them clearly superior to programs using a purely non-iterative, progressive approach. The best sequence alignment programs produce alignments of high quality down to APSI > 55 %; at lower APSI the use of sequence+structure alignment programs is recommended.
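
    Because the thresholds above are all stated in terms of average pairwise sequence identity (APSI), a short sketch of how such a value can be computed from an alignment may help. The record does not say how the benchmark defines identity for gapped columns, so the convention below (skip columns where both sequences have a gap, count every other column) is an assumption, as are the function names.

```python
from itertools import combinations


def pairwise_identity(a, b):
    """Percent identity of two aligned sequences of equal length.

    Assumed convention: columns where both sequences carry a gap ('-')
    are skipped; all remaining columns count toward the denominator.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def apsi(alignment):
    """Mean of pairwise_identity over all unordered pairs of sequences."""
    pairs = list(combinations(alignment, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)
```

    Under this convention, a set whose `apsi` value falls below roughly 55 would, following the abstract's recommendation, be a candidate for a sequence+structure alignment program instead of a purely sequence-based one.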

  20. Mutation of miRNA target sequences during human evolution

    DEFF Research Database (Denmark)

    Gardner, Paul P; Vinther, Jeppe

    2008-01-01

    It has long been hypothesized that changes in non-protein-coding genes and the regulatory sequences controlling expression could undergo positive selection. Here we identify 402 putative microRNA (miRNA) target sequences that have been mutated specifically in the human lineage and show that genes...... containing such deletions are more highly expressed than their mouse orthologs. Our findings indicate that some miRNA target mutations are fixed by positive selection and might have been involved in the evolution of human-specific traits....

  1. The chemical structure of DNA sequence signals for RNA transcription

    Science.gov (United States)

    George, D. G.; Dayhoff, M. O.

    1982-01-01

    The proposed recognition sites for RNA transcription for E. coli RNA polymerase, bacteriophage T7 RNA polymerase, and eukaryotic RNA polymerase Pol II are evaluated in the light of the requirements for efficient recognition. It is shown that although there is good experimental evidence that specific nucleic acid sequence patterns are involved in transcriptional regulation in bacteria and bacterial viruses, among the sequences now available, only in the case of the promoters recognized by bacteriophage T7 polymerase does it seem likely that the pattern is sufficient. It is concluded that the eukaryotic pattern that is investigated is not restrictive enough to serve as a recognition site.

  2. Novel microRNA discovery using small RNA sequencing in post-mortem human brain.

    Science.gov (United States)

    Wake, Christian; Labadorf, Adam; Dumitriu, Alexandra; Hoss, Andrew G; Bregu, Joli; Albrecht, Kenneth H; DeStefano, Anita L; Myers, Richard H

    2016-10-04

    MicroRNAs (miRNAs) are short, non-coding RNAs that regulate gene expression mainly through translational repression of target mRNA molecules. More than 2700 human miRNAs have been identified and some are known to be associated with disease phenotypes and to display tissue-specific patterns of expression. We used high-throughput small RNA sequencing to discover novel miRNAs in 93 human post-mortem prefrontal cortex samples from individuals with Huntington's disease (n = 28) or Parkinson's disease (n = 29) and controls without neurological impairment (n = 36). A custom miRNA identification analysis pipeline was built, which utilizes miRDeep* miRNA identification and result filtering based on false positive rate estimates. Ninety-nine novel miRNA candidates with a false positive rate of less than 5 % were identified. Thirty-four of the candidate miRNAs show sequence similarity with known mature miRNA sequences and may be novel members of known miRNA families, while the remaining 65 may constitute previously undiscovered families of miRNAs. Nineteen of the 99 candidate miRNAs were replicated using independent, publicly-available human brain RNA-sequencing samples, and seven were experimentally validated using qPCR. We have used small RNA sequencing to identify 99 putative novel miRNAs that are present in human brain samples.

  3. TARDIS, a targeted RNA directional sequencing method for rare RNA discovery.

    Science.gov (United States)

    Portal, Maximiliano M; Pavet, Valeria; Erb, Cathie; Gronemeyer, Hinrich

    2015-12-01

    High-throughput transcriptional analysis has unveiled a myriad of novel RNAs. However, technical constraints in RNA sequencing library preparation and platform performance hamper the identification of rare transcripts contained within the RNA repertoire. Herein we present targeted-RNA directional sequencing (TARDIS), a hybridization-based method that allows subsets of RNAs contained within the transcriptome to be interrogated independently of transcript length, function, the presence or absence of poly-A tracts, or the mechanism of biogenesis. TARDIS is a modular protocol that is subdivided into four main phases, including the generation of random DNA traps covering the region of interest, purification of input RNA material, DNA trap-based RNA capture, and finally RNA-sequencing library construction. Importantly, coupling RNA capture to strand-specific RNA sequencing enables robust identification and reconstruction of novel transcripts, the definition of sense and antisense RNA pairs and, by the concomitant analysis of long and natural small RNA pools, it allows the user to infer potential precursor-product relations. TARDIS takes ∼10 d to implement.

  4. Reproducibility of high-throughput mRNA and small RNA sequencing across laboratories.

    Science.gov (United States)

    't Hoen, Peter A C; Friedländer, Marc R; Almlöf, Jonas; Sammeth, Michael; Pulyakhina, Irina; Anvar, Seyed Yahya; Laros, Jeroen F J; Buermans, Henk P J; Karlberg, Olof; Brännvall, Mathias; den Dunnen, Johan T; van Ommen, Gert-Jan B; Gut, Ivo G; Guigó, Roderic; Estivill, Xavier; Syvänen, Ann-Christine; Dermitzakis, Emmanouil T; Lappalainen, Tuuli

    2013-11-01

    RNA sequencing is an increasingly popular technology for genome-wide analysis of transcript sequence and abundance. However, understanding of the sources of technical and interlaboratory variation is still limited. To address this, the GEUVADIS consortium sequenced mRNAs and small RNAs of lymphoblastoid cell lines of 465 individuals in seven sequencing centers, with a large number of replicates. The variation between laboratories appeared to be considerably smaller than the already limited biological variation. Laboratory effects were mainly seen in differences in insert size and GC content and could be adequately corrected for. In small-RNA sequencing, the microRNA (miRNA) content differed widely between samples owing to competitive sequencing of rRNA fragments. This did not affect relative quantification of miRNAs. We conclude that distributing RNA sequencing among different laboratories is feasible, given proper standardization and randomization procedures. We provide a set of quality measures and guidelines for assessing technical biases in RNA-seq data.

  5. Nuclear RNA sequencing of the mouse erythroid cell transcriptome.

    Directory of Open Access Journals (Sweden)

    Jennifer A Mitchell

    In addition to protein coding genes a substantial proportion of mammalian genomes are transcribed. However, most transcriptome studies investigate steady-state mRNA levels, ignoring a considerable fraction of the transcribed genome. In addition, steady-state mRNA levels are influenced by both transcriptional and posttranscriptional mechanisms, and thus do not provide a clear picture of transcriptional output. Here, using deep sequencing of nuclear RNAs (nucRNA-Seq) in parallel with chromatin immunoprecipitation sequencing (ChIP-Seq) of active RNA polymerase II, we compared the nuclear transcriptome of mouse anemic spleen erythroid cells with polymerase occupancy on a genome-wide scale. We demonstrate that unspliced transcripts quantified by nucRNA-seq correlate with primary transcript frequencies measured by RNA FISH, but differ from steady-state mRNA levels measured by poly(A)-enriched RNA-seq. Highly expressed protein coding genes showed good correlation between RNAPII occupancy and transcriptional output; however, genome-wide we observed a poor correlation between transcriptional output and RNAPII association. This poor correlation is due to intergenic regions associated with RNAPII which correspond with transcription factor bound regulatory regions and a group of stable, nuclear-retained long non-coding transcripts. In conclusion, sequencing the nuclear transcriptome provides an opportunity to investigate the transcriptional landscape in a given cell type through quantification of unspliced primary transcripts and the identification of nuclear-retained long non-coding RNAs.

  6. Nuclear RNA Sequencing of the Mouse Erythroid Cell Transcriptome

    Science.gov (United States)

    Umlauf, David; Chen, Chih-yu; Moir, Catherine A.; Eskiw, Christopher H.; Schoenfelder, Stefan; Chakalova, Lyubomira; Nagano, Takashi; Fraser, Peter

    2012-01-01

    In addition to protein coding genes a substantial proportion of mammalian genomes are transcribed. However, most transcriptome studies investigate steady-state mRNA levels, ignoring a considerable fraction of the transcribed genome. In addition, steady-state mRNA levels are influenced by both transcriptional and posttranscriptional mechanisms, and thus do not provide a clear picture of transcriptional output. Here, using deep sequencing of nuclear RNAs (nucRNA-Seq) in parallel with chromatin immunoprecipitation sequencing (ChIP-Seq) of active RNA polymerase II, we compared the nuclear transcriptome of mouse anemic spleen erythroid cells with polymerase occupancy on a genome-wide scale. We demonstrate that unspliced transcripts quantified by nucRNA-seq correlate with primary transcript frequencies measured by RNA FISH, but differ from steady-state mRNA levels measured by poly(A)-enriched RNA-seq. Highly expressed protein coding genes showed good correlation between RNAPII occupancy and transcriptional output; however, genome-wide we observed a poor correlation between transcriptional output and RNAPII association. This poor correlation is due to intergenic regions associated with RNAPII which correspond with transcription factor bound regulatory regions and a group of stable, nuclear-retained long non-coding transcripts. In conclusion, sequencing the nuclear transcriptome provides an opportunity to investigate the transcriptional landscape in a given cell type through quantification of unspliced primary transcripts and the identification of nuclear-retained long non-coding RNAs. PMID:23209567

  7. Small molecule alteration of RNA sequence in cells and animals.

    Science.gov (United States)

    Guan, Lirui; Luo, Yiling; Ja, William W; Disney, Matthew D

    2017-10-18

    RNA regulation and maintenance are critical for proper cell function. Small molecules that specifically alter RNA sequence would be exceptionally useful as probes of RNA structure and function or as potential therapeutics. Here, we demonstrate a photochemical approach for altering the trinucleotide expanded repeat causative of myotonic muscular dystrophy type 1 (DM1), r(CUG)exp. The small molecule, 2H-4-Ru, binds to r(CUG)exp and converts guanosine residues to 8-oxo-7,8-dihydroguanosine upon photochemical irradiation. We demonstrate targeted modification upon irradiation in cell culture and in Drosophila larvae provided a diet containing 2H-4-Ru. Our results highlight a general chemical biology approach for altering RNA sequence in vivo by using small molecules and photochemistry. Furthermore, these studies show that addition of 8-oxo-G lesions into RNA 3' untranslated regions does not affect its steady state levels. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. RNA-Pareto: interactive analysis of Pareto-optimal RNA sequence-structure alignments.

    Science.gov (United States)

    Schnattinger, Thomas; Schöning, Uwe; Marchfelder, Anita; Kestler, Hans A

    2013-12-01

    Incorporating secondary structure information into the alignment process improves the quality of RNA sequence alignments. Instead of using fixed weighting parameters, sequence and structure components can be treated as different objectives and optimized simultaneously. The result is not a single, but a Pareto-set of equally optimal solutions, which all represent different possible weighting parameters. We now provide the interactive graphical software tool RNA-Pareto, which allows a direct inspection of all feasible results to the pairwise RNA sequence-structure alignment problem and greatly facilitates the exploration of the optimal solution set.
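
    The idea of keeping every non-dominated trade-off between a sequence objective and a structure objective, instead of fixing one weighting, can be illustrated with a simple filter. The sketch below is not the RNA-Pareto algorithm itself, whose internals are not described in this record; it only assumes that each candidate alignment has already been scored as a hypothetical `(sequence_score, structure_score)` pair where higher is better in both.

```python
def dominates(p, q):
    """True if p is at least as good as q in every objective and strictly better in one."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def pareto_front(candidates):
    """Return the candidates that no other candidate dominates, in input order."""
    return [p for p in candidates if not any(dominates(q, p) for q in candidates)]


# Hypothetical (sequence_score, structure_score) pairs for five alignments.
scores = [(10, 1), (8, 5), (7, 4), (3, 9), (3, 3)]
front = pareto_front(scores)
```

    Here `(7, 4)` and `(3, 3)` are both dominated by `(8, 5)`, so only the other three remain; each survivor is optimal for some weighting of the two objectives. The quadratic all-pairs comparison is adequate for small candidate sets like this one.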

  9. Evaluation of commercially available RNA amplification kits for RNA sequencing using very low input amounts of total RNA.

    Science.gov (United States)

    Shanker, Savita; Paulson, Ariel; Edenberg, Howard J; Peak, Allison; Perera, Anoja; Alekseyev, Yuriy O; Beckloff, Nicholas; Bivens, Nathan J; Donnelly, Robert; Gillaspy, Allison F; Grove, Deborah; Gu, Weikuan; Jafari, Nadereh; Kerley-Hamilton, Joanna S; Lyons, Robert H; Tepper, Clifford; Nicolet, Charles M

    2015-04-01

    This article includes supplemental data. Please visit http://www.fasebj.org to obtain this information. Multiple recent publications on RNA sequencing (RNA-seq) have demonstrated the power of next-generation sequencing technologies in whole-transcriptome analysis. Vendor-specific protocols used for RNA library construction often require at least 100 ng total RNA. However, under certain conditions, much less RNA is available for library construction. In these cases, effective transcriptome profiling requires amplification of subnanogram amounts of RNA. Several commercial RNA amplification kits are available for amplification prior to library construction for next-generation sequencing, but these kits have not been comprehensively field evaluated for accuracy and performance of RNA-seq for picogram amounts of RNA. To address this, 4 types of amplification kits were tested with 3 different concentrations, from 5 ng to 50 pg, of a commercially available RNA. Kits were tested at multiple sites to assess reproducibility and ease of use. The human total reference RNA used was spiked with a control pool of RNA molecules in order to further evaluate quantitative recovery of input material. Additional control data sets were generated from libraries constructed following polyA selection or ribosomal depletion using established kits and protocols. cDNA was collected from the different sites, and libraries were synthesized at a single site using established protocols. Sequencing runs were carried out on the Illumina platform. Numerous metrics were compared among the kits and dilutions used. Overall, no single kit appeared to meet all the challenges of small input material. However, it is encouraging that excellent data can be recovered with even the 50 pg input total RNA.

  10. Small RNA cloning and sequencing strategy affects host and viral microRNA expression signatures.

    Science.gov (United States)

    Stik, Grégoire; Muylkens, Benoît; Coupeau, Damien; Laurent, Sylvie; Dambrine, Ginette; Messmer, Mélanie; Chane-Woon-Ming, Béatrice; Pfeffer, Sébastien; Rasschaert, Denis

    2014-07-10

    The establishment of the microRNA (miRNA) expression signatures is the basic element to investigate the role played by these regulatory molecules in the biology of an organism. Marek's disease virus 1 (MDV-1) is an avian herpesvirus that naturally infects chickens and induces T-cell lymphomas. During latency, MDV-1, like other herpesviruses, expresses a limited subset of transcripts. These include three miRNA clusters. Several studies identified the expression of virus and host encoded miRNAs from MDV-1 infected cell cultures and chickens. But a high discrepancy was observed when miRNA cloning frequencies obtained from different cloning and sequencing protocols were compared. Thus, we analyzed the effect of small RNA library preparation and sequencing on the miRNA frequencies obtained from the same RNA samples collected during MDV-1 infection of chickens at different steps of the oncoviral pathogenesis. Qualitative and quantitative variations were found in the data, depending on the strategy used. One of the mature miRNAs derived from the latency-associated-transcript (LAT), mdv1-miR-M7-5p, showed the highest variation. Its cloning frequency was 50% of the viral miRNA counts when a small scale sequencing approach was used. It was 100 times less abundant when determined through the deep sequencing approach. Northern blot analysis showed a better correlation with the miRNA frequencies found by the small scale sequencing approach. By analyzing the cellular miRNA repertoire, we also found a gap between the two sequencing approaches. Collectively, our study indicates that next-generation sequencing data considered alone are limited for assessing the absolute copy number of transcripts. Thus, the quantification of small RNA should be addressed by compiling data obtained by using different techniques such as microarrays, qRT-PCR and NB analysis in support of high throughput sequencing data. These observations should be considered when miRNA variations are studied.

  11. High throughput 16S rRNA gene amplicon sequencing

    DEFF Research Database (Denmark)

    Nierychlo, Marta; Larsen, Poul; Jørgensen, Mads Koustrup

    S rRNA gene amplicon sequencing has been developed over the past few years and is now ready to use for more comprehensive studies related to plant operation and optimization thanks to short analysis time, low cost, high throughput, and high taxonomic resolution. In this study we show how 16S r......RNA gene amplicon sequencing can be used to reveal factors of importance for the operation of full-scale nutrient removal plants related to settling problems and floc properties. Using optimized DNA extraction protocols, indexed primers and our in-house Illumina platform, we prepared multiple samples...... be correlated to the presence of the species that are regarded as “strong” and “weak” floc formers. In conclusion, 16S rRNA gene amplicon sequencing provides a high throughput approach for a rapid and cheap community profiling of activated sludge that in combination with multivariate statistics can be used...

  12. Reconstruction of ancestral RNA sequences under multiple structural constraints

    Directory of Open Access Journals (Sweden)

    Olivier Tremblay-Savard

    2016-11-01

    Background: Secondary structures form the scaffold of multiple sequence alignment of non-coding RNA (ncRNA) families. An accurate reconstruction of ancestral ncRNAs must use this structural signal. However, the inference of ancestors of a single ncRNA family with a single consensus structure may bias the results towards sequences with high affinity to this structure, which are far from the true ancestors. Methods: In this paper, we introduce achARNement, a maximum parsimony approach that, given two alignments of homologous ncRNA families with consensus secondary structures and a phylogenetic tree, simultaneously calculates ancestral RNA sequences for these two families. Results: We test our methodology on simulated data sets, and show that achARNement outperforms classical maximum parsimony approaches in terms of accuracy, but also reduces by several orders of magnitude the number of candidate sequences. To conclude this study, we apply our algorithms on the Glm clan and the FinP-traJ clan from the Rfam database. Conclusions: Our results show that our methods reconstruct small sets of high-quality candidate ancestors with better agreement to the two target structures than with classical approaches. Our program is freely available at: http://csb.cs.mcgill.ca/acharnement .

  13. Reconstruction of ancestral RNA sequences under multiple structural constraints.

    Science.gov (United States)

    Tremblay-Savard, Olivier; Reinharz, Vladimir; Waldispühl, Jérôme

    2016-11-11

    Secondary structures form the scaffold of multiple sequence alignment of non-coding RNA (ncRNA) families. An accurate reconstruction of ancestral ncRNAs must use this structural signal. However, the inference of ancestors of a single ncRNA family with a single consensus structure may bias the results towards sequences with high affinity to this structure, which are far from the true ancestors. In this paper, we introduce achARNement, a maximum parsimony approach that, given two alignments of homologous ncRNA families with consensus secondary structures and a phylogenetic tree, simultaneously calculates ancestral RNA sequences for these two families. We test our methodology on simulated data sets, and show that achARNement outperforms classical maximum parsimony approaches in terms of accuracy, but also reduces by several orders of magnitude the number of candidate sequences. To conclude this study, we apply our algorithms on the Glm clan and the FinP-traJ clan from the Rfam database. Our results show that our methods reconstruct small sets of high-quality candidate ancestors with better agreement to the two target structures than with classical approaches. Our program is freely available at: http://csb.cs.mcgill.ca/acharnement .

  14. High-Throughput Sequencing Based Methods of RNA Structure Investigation

    DEFF Research Database (Denmark)

    Kielpinski, Lukasz Jan

    In this thesis we describe the development of four related methods for RNA structure probing that utilize massive parallel sequencing. Using them, we were able to gather structural data for multiple, long molecules simultaneously. First, we have established an easy to follow experimental and comp... with known priming sites....

  15. Identification of Bacterial Small RNAs by RNA Sequencing

    DEFF Research Database (Denmark)

    Gómez Lozano, María; Marvig, Rasmus Lykke; Molin, Søren

    2014-01-01

    Small regulatory RNAs (sRNAs) in bacteria are known to modulate gene expression and control a variety of processes including metabolic reactions, stress responses, and pathogenesis in response to environmental signals. A method to identify bacterial sRNAs on a genome-wide scale based on RNA...... sequencing (RNA-seq) is described that involves the preparation and analysis of three different sequencing libraries. As a significant number of unique sRNAs are identified in each library, the libraries can be used either alone or in combination to increase the number of sRNAs identified. The approach...

  16. Single-cell sequencing of the small-RNA transcriptome.

    Science.gov (United States)

    Faridani, Omid R; Abdullayev, Ilgar; Hagemann-Jensen, Michael; Schell, John P; Lanner, Fredrik; Sandberg, Rickard

    2016-12-01

    Little is known about the heterogeneity of small-RNA expression as small-RNA profiling has so far required large numbers of cells. Here we present a single-cell method for small-RNA sequencing and apply it to naive and primed human embryonic stem cells and cancer cells. Analysis of microRNAs and fragments of tRNAs and small nucleolar RNAs (snoRNAs) reveals the potential of microRNAs as markers for different cell types and states.

  17. Unveiling Chloroplast RNA Editing Events Using Next Generation Small RNA Sequencing Data

    Directory of Open Access Journals (Sweden)

    Nureyev F. Rodrigues

    2017-09-01

    Organellar RNA editing involves the modification of nucleotide sequences to maintain conserved protein functions, mainly by reverting non-neutral codon mutations. The loss of plastid editing events, resulting from mutations in RNA editing factors or through stress interference, leads to developmental, physiological and photosynthetic alterations. Recently, next generation sequencing technology has generated the massive discovery of sRNA sequences and expanded the number of sRNA data. Here, we present a method to screen chloroplast RNA editing using public sRNA libraries from Arabidopsis, soybean and rice. We mapped the sRNAs against the nuclear, mitochondrial and plastid genomes to confirm predicted cytosine to uracil (C-to-U) editing events and identify new editing sites in plastids. Among the predicted editing sites, 40.57, 34.78, and 25.31% were confirmed using sRNAs from Arabidopsis, soybean and rice, respectively. SNP analysis revealed 58.2, 43.9, and 37.5% new C-to-U changes in the respective species and identified known and new putative adenosine to inosine (A-to-I) RNA editing in tRNAs. The present method and data reveal the potential of sRNA as a reliable source to identify new and confirm known editing sites.
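
    The mapping-based screen described above can be sketched as a simple pileup: at every reference cytosine, count how many mapped reads show a T (sequenced cDNA reports U as T). This is a minimal illustration, not the authors' pipeline. It assumes ungapped forward-strand alignments supplied as (start, read) pairs, and the function name, the thresholds `min_depth` and `min_fraction`, and the toy sequences are all hypothetical.

```python
from collections import Counter, defaultdict


def call_c_to_u(reference, alignments, min_depth=3, min_fraction=0.1):
    """Report candidate C-to-U sites as (position, edited reads, depth).

    reference: plastid reference sequence (DNA alphabet).
    alignments: iterable of (0-based start, read sequence); ungapped,
                forward strand only (an assumption of this sketch).
    """
    pileup = defaultdict(Counter)
    for start, read in alignments:
        for offset, base in enumerate(read):
            pos = start + offset
            if pos < len(reference):
                pileup[pos][base] += 1

    sites = []
    for pos in sorted(pileup):
        if reference[pos] != "C":
            continue                      # only genomic C can be C-to-U edited
        depth = sum(pileup[pos].values())
        edited = pileup[pos]["T"]         # U is read as T in cDNA
        if depth >= min_depth and edited / depth >= min_fraction:
            sites.append((pos, edited, depth))
    return sites


# Toy data: position 3 of the reference is a C that most reads show as T.
toy_reference = "ATGCCGTA"
toy_alignments = [(0, "ATGTC"), (1, "TGTCG"), (2, "GCCGT"), (3, "TCGTA")]
candidate_sites = call_c_to_u(toy_reference, toy_alignments)
print(candidate_sites)
```

    A real screen would also need to separate editing from genomic SNPs and from reads originating in nuclear or mitochondrial copies, which is presumably why the abstract maps against all three genomes.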

  18. Transfer RNA detection by small RNA deep sequencing and disease association with myelodysplastic syndromes.

    Science.gov (United States)

    Guo, Yan; Bosompem, Amma; Mohan, Sanjay; Erdogan, Begum; Ye, Fei; Vickers, Kasey C; Sheng, Quanhu; Zhao, Shilin; Li, Chung-I; Su, Pei-Fang; Jagasia, Madan; Strickland, Stephen A; Griffiths, Elizabeth A; Kim, Annette S

    2015-09-24

    Although advances in sequencing technologies have popularized the use of microRNA (miRNA) sequencing (miRNA-seq) for the quantification of miRNA expression, questions remain concerning the optimal methodologies for analysis and utilization of the data. The construction of a miRNA sequencing library selects RNA by length rather than type. However, as we have previously described, miRNAs represent only a subset of the species obtained by size selection. Consequently, the libraries obtained for miRNA sequencing also contain a variety of additional species of small RNAs. This study looks at the prevalence of these other species obtained from bone marrow aspirate specimens and explores the predictive value of these small RNAs in the determination of response to therapy in myelodysplastic syndromes (MDS). Paired pre- and post-treatment bone marrow aspirate specimens were obtained from patients with MDS who were treated with either azacytidine or decitabine (24 pre-treatment specimens, 23 post-treatment specimens) with 22 additional non-MDS control specimens. Total RNA was extracted from these specimens and submitted for next generation sequencing after an additional size exclusion step to enrich for small RNAs. The species of small RNAs were enumerated, single nucleotide variants (SNVs) identified, and finally the differential expression of tRNA-derived species (tDRs) in the specimens correlated with disease status and response to therapy. Using miRNA sequencing data generated from bone marrow aspirate samples of patients with known MDS (N = 47) and controls (N = 23), we demonstrated that transfer RNA (tRNA) fragments (specifically tRNA halves, tRHs) are one of the most common species of small RNA isolated from size selection. Using tRNA expression values extracted from miRNA sequencing data, we identified six tRNA fragments that are differentially expressed between MDS and normal samples. Using the elastic net method, we identified four tRNAs-derived small RNAs (t

  19. High-throughput RNA interference screening using pooled shRNA libraries and next generation sequencing.

    Science.gov (United States)

    Sims, David; Mendes-Pereira, Ana M; Frankum, Jessica; Burgess, Darren; Cerone, Maria-Antonietta; Lombardelli, Cristina; Mitsopoulos, Costas; Hakas, Jarle; Murugaesu, Nirupa; Isacke, Clare M; Fenwick, Kerry; Assiotis, Ioannis; Kozarewa, Iwanka; Zvelebil, Marketa; Ashworth, Alan; Lord, Christopher J

    2011-10-21

    RNA interference (RNAi) screening is a state-of-the-art technology that enables the dissection of biological processes and disease-related phenotypes. The commercial availability of genome-wide, short hairpin RNA (shRNA) libraries has fueled interest in this area but the generation and analysis of these complex data remain a challenge. Here, we describe complete experimental protocols and novel open source computational methodologies, shALIGN and shRNAseq, that allow RNAi screens to be rapidly deconvoluted using next generation sequencing. Our computational pipeline offers efficient screen analysis and the flexibility and scalability to quickly incorporate future developments in shRNA library technology.

  20. Nicotiana small RNA sequences support a host genome origin of cucumber mosaic virus satellite RNA.

    Directory of Open Access Journals (Sweden)

    Kiran Zahid

    2015-01-01

    Satellite RNAs (satRNAs) are small noncoding subviral RNA pathogens in plants that depend on helper viruses for replication and spread. Despite many decades of research, the origin of satRNAs remains unknown. In this study we show that a β-glucuronidase (GUS) transgene fused with a Cucumber mosaic virus (CMV) Y satellite RNA (Y-Sat) sequence (35S-GUS:Sat) was transcriptionally repressed in N. tabacum in comparison to a 35S-GUS transgene that did not contain the Y-Sat sequence. This repression was not due to DNA methylation at the 35S promoter, but was associated with specific DNA methylation at the Y-Sat sequence. Both northern blot hybridization and small RNA deep sequencing detected 24-nt siRNAs in wild-type Nicotiana plants with sequence homology to Y-Sat, suggesting that the N. tabacum genome contains Y-Sat-like sequences that give rise to 24-nt sRNAs capable of guiding RNA-directed DNA methylation (RdDM) to the Y-Sat sequence in the 35S-GUS:Sat transgene. Consistent with this, Southern blot hybridization detected multiple DNA bands in Nicotiana plants that had sequence homology to Y-Sat, suggesting that Y-Sat-like sequences exist in the Nicotiana genome as repetitive DNA, a DNA feature associated with 24-nt sRNAs. Our results point to a host genome origin for CMV satRNAs, and suggest a novel approach of using small RNA sequences for finding the origin of other satRNAs.

  1. Nicotiana small RNA sequences support a host genome origin of cucumber mosaic virus satellite RNA.

    Science.gov (United States)

    Zahid, Kiran; Zhao, Jian-Hua; Smith, Neil A; Schumann, Ulrike; Fang, Yuan-Yuan; Dennis, Elizabeth S; Zhang, Ren; Guo, Hui-Shan; Wang, Ming-Bo

    2015-01-01

    Satellite RNAs (satRNAs) are small noncoding subviral RNA pathogens in plants that depend on helper viruses for replication and spread. Despite many decades of research, the origin of satRNAs remains unknown. In this study we show that a β-glucuronidase (GUS) transgene fused with a Cucumber mosaic virus (CMV) Y satellite RNA (Y-Sat) sequence (35S-GUS:Sat) was transcriptionally repressed in N. tabacum in comparison to a 35S-GUS transgene that did not contain the Y-Sat sequence. This repression was not due to DNA methylation at the 35S promoter, but was associated with specific DNA methylation at the Y-Sat sequence. Both northern blot hybridization and small RNA deep sequencing detected 24-nt siRNAs in wild-type Nicotiana plants with sequence homology to Y-Sat, suggesting that the N. tabacum genome contains Y-Sat-like sequences that give rise to 24-nt sRNAs capable of guiding RNA-directed DNA methylation (RdDM) to the Y-Sat sequence in the 35S-GUS:Sat transgene. Consistent with this, Southern blot hybridization detected multiple DNA bands in Nicotiana plants that had sequence homology to Y-Sat, suggesting that Y-Sat-like sequences exist in the Nicotiana genome as repetitive DNA, a DNA feature associated with 24-nt sRNAs. Our results point to a host genome origin for CMV satRNAs, and suggest a novel approach of using small RNA sequences for finding the origin of other satRNAs.

  2. Sequence analysis of L RNA of Lassa virus

    International Nuclear Information System (INIS)

    Vieth, Simon; Torda, Andrew E.; Asper, Marcel; Schmitz, Herbert; Guenther, Stephan

    2004-01-01

    The L RNA of three Lassa virus strains originating from Nigeria, Ghana/Ivory Coast, and Sierra Leone was sequenced and the data subjected to structure predictions and phylogenetic analyses. The L gene products had 2218-2221 residues, diverged by 18% at the amino acid level, and contained several conserved regions. Only one region of 504 residues (positions 1043-1546) could be assigned a function, namely that of an RNA polymerase. Secondary structure predictions suggest that this domain is very similar to RNA-dependent RNA polymerases of known structure encoded by plus-strand RNA viruses, permitting a model to be built. Outside the polymerase region, there is little structural data, except for regions of strong alpha-helical content and probably a coiled-coil domain at the N terminus. No evidence for reassortment or recombination during Lassa virus evolution was found. The secondary structure-assisted alignment of the RNA polymerase region permitted a reliable reconstruction of the phylogeny of all negative-strand RNA viruses, indicating that Arenaviridae are most closely related to Nairoviruses. In conclusion, the data provide a basis for structural and functional characterization of the Lassa virus L protein and reveal new insights into the phylogeny of negative-strand RNA viruses.

  3. RNA-DNA sequence differences spell genetic code ambiguities

    DEFF Research Database (Denmark)

    Bentin, Thomas; Nielsen, Michael L

    2013-01-01

    A recent paper in Science by Li et al. 2011(1) reports widespread sequence differences in the human transcriptome between RNAs and their encoding genes termed RNA-DNA differences (RDDs). The findings could add a new layer of complexity to gene expression but the study has been criticized. ...

  4. Sequence analysis of mitochondrial 16S ribosomal RNA gene ...

    Indian Academy of Sciences (India)

    Unknown

    Sequence analysis of mitochondrial 16S ribosomal RNA gene fragment from seven mosquito species. YOGESH S SHOUCHE* and MILIND S PATOLE. National Center for Cell Science, Pune University Campus, Pune 411 007, India. *Corresponding author (Fax, 91-20-5672259; Email, yogesh@nccs.res.in). Mosquitoes are ...

  5. Finding the most significant common sequence and structure motifs in a set of RNA sequences

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Heyer, L.J.; Stormo, G.D.

    1997-01-01

    We present a computational scheme to locally align a collection of RNA sequences using sequence and structure constraints. In addition, the method searches for the resulting alignments with the most significant common motifs, among all possible collections. The first part utilizes a simplified...

  6. RNA isolation for small RNA Next-Generation Sequencing from acellular biofluids.

    Science.gov (United States)

    Burgos, Kasandra L; Van Keuren-Jensen, Kendall

    2014-01-01

    There are a number of considerations when choosing protocols both upstream and downstream of Next-Generation Sequencing experiments. On the front end, purification methods, additives, and residuum can often inhibit the sensitive chemistries by which sequencing-by-synthesis is performed. On the back end, data handling, analysis software packages, and pipelines can also impact sequencing outcomes. The current chapter will describe stepwise how acellular biofluid samples are prepared for small RNA sequencing. With regard to purification methods, we found that small RNA yield can be improved considerably by following the total RNA isolation protocol included with Ambion's mirVana PARIS Kit but modifying the organic extraction step. Specifically, after transferring the upper aqueous phase to a fresh tube, water is added to the residual material (interphase and lower organic layer) and again phase-separated. In contrast, all the protocols provided with the commercially available kits at the time of this chapter publication require only one organic extraction. This simple yet, as it turns out, quite useful modification allows access to previously inaccessible material. Potential benefits from these changes are a more comprehensive sample profiling of small RNA, as well as wider access to small volume samples, such as is typically available for acellular biofluids, which now can be prepared for small RNA sequencing on the Illumina platform.

  7. Exploring Connectivity in Sequence Space of Functional RNA

    Science.gov (United States)

    Wei, Chenyu; Pohorille, Andrzej; Popovic, Milena; Ditzler, Mark

    2017-01-01

    Emergence of replicable genetic molecules was one of the marking points in the origin of life, evolution of which can be conceptualized as a walk through the space of all possible sequences. A theoretical concept of fitness landscape helps to understand evolutionary processes through assigning a value of fitness to each genotype. Then, evolution of a phenotype is viewed as a series of consecutive, single-point mutations. Natural selection biases evolution toward peaks of high fitness and away from valleys of low fitness, whereas neutral drift occurs in the sequence space without direction as mutations are introduced at random. Large networks of neutral or near-neutral mutations on a fitness landscape, especially for sufficiently long genomes, are possible or even inevitable. Their detection in experiments, however, has been elusive. Although a few near-neutral evolutionary pathways have been found, recent experimental evidence indicates landscapes consist of largely isolated islands. The generality of these results, however, is not clear, as the genome length or the fraction of functional molecules in the genotypic space might have been insufficient for the emergence of large, neutral networks. Thorough investigation on the structure of the fitness landscape is essential to understand the mechanisms of evolution of early genomes. RNA molecules are commonly assumed to play the pivotal role in the origin of genetic systems. They are widely believed to be early, if not the earliest, genetic and catalytic molecules, with abundant biochemical activities as aptamers and ribozymes, i.e. RNA molecules capable of, respectively, binding small molecules or catalyzing chemical reactions. Here, we present results of our recent studies on the structure of the sequence space of RNA ligase ribozymes selected through in vitro evolution. Several hundred thousand sequences active to a different degree were obtained by way of deep sequencing. Analysis of these sequences revealed

  8. Chimira: analysis of small RNA sequencing data and microRNA modifications.

    Science.gov (United States)

    Vitsios, Dimitrios M; Enright, Anton J

    2015-10-15

    Chimira is a web-based system for microRNA (miRNA) analysis from small RNA-Seq data. Sequences are automatically cleaned, trimmed, size selected and mapped directly to miRNA hairpin sequences. This generates count-based miRNA expression data for subsequent statistical analysis. Moreover, it is capable of identifying epi-transcriptomic modifications in the input sequences. Supported modification types include multiple types of 3'-modifications (e.g. uridylation, adenylation), 5'-modifications and also internal modifications or variation (ADAR editing or single nucleotide polymorphisms). Besides cleaning and mapping of input sequences to miRNAs, Chimira provides a simple and intuitive set of tools for the analysis and interpretation of the results (see also Supplementary Material). These allow the visual study of the differential expression between two specific samples or sets of samples, the identification of the most highly expressed miRNAs within sample pairs (or sets of samples) and also the projection of the modification profile for specific miRNAs across all samples. Other tools have already been published in the past for various types of small RNA-Seq analysis, such as UEA workbench, seqBuster, MAGI, OASIS and CAP-miRSeq, CPSS for modifications identification. A comprehensive comparison of Chimira with each of these tools is provided in the Supplementary Material. Chimira outperforms all of these tools in total execution speed and aims to facilitate simple, fast and reliable analysis of small RNA-Seq data allowing also, for the first time, identification of global microRNA modification profiles in a simple intuitive interface. Chimira has been developed as a web application and it is accessible here: http://www.ebi.ac.uk/research/enright/software/chimira. Contact: aje@ebi.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
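
    The clean, trim, size-select and map sequence named in this abstract can be sketched in a few lines. This is not Chimira's code: the function names, the 18-26 nt size window, exact substring matching against hairpins, the adapter prefix and the toy hairpin sequences are all assumptions made for illustration.

```python
from collections import Counter


def trim_adapter(read, adapter):
    """Cut the read at the first occurrence of the 3' adapter, if present."""
    idx = read.find(adapter)
    return read[:idx] if idx != -1 else read


def count_mirnas(reads, adapter, hairpins, min_len=18, max_len=26):
    """Trim, size-select and count reads that match a hairpin exactly.

    hairpins: dict mapping hairpin name to hairpin sequence (DNA alphabet).
    Exact substring matching stands in for a real aligner here.
    """
    counts = Counter()
    for read in reads:
        insert = trim_adapter(read, adapter)
        if not (min_len <= len(insert) <= max_len):
            continue                       # outside the assumed miRNA size window
        for name, hairpin in hairpins.items():
            if insert in hairpin:
                counts[name] += 1
                break                      # assign each read to one hairpin only
    return counts


# Toy inputs; sequences and names are invented for this example.
toy_adapter = "TGGAATTCTCGG"
toy_hairpins = {
    "toy-hairpin-1": "GGCTGAGGTAGTAGGTTGTATAGTTTTAGGGTC",
    "toy-hairpin-2": "CCATTCGATCGGATTACGCTAGCTAGGATCCAAG",
}
toy_reads = [
    "TGAGGTAGTAGGTTGTATAGTT" + toy_adapter,
    "TGAGGTAGTAGGTTGTATAGTT" + toy_adapter,
    "TCGATCGGATTACGCTAGCTAG" + toy_adapter,
    "ACGT" + toy_adapter,                  # too short after trimming
]
mirna_counts = count_mirnas(toy_reads, toy_adapter, toy_hairpins)
print(mirna_counts)
```

    Detecting the 3' and 5' modifications listed in the abstract would require tolerating non-templated bases at the read ends, which exact matching as used here cannot do.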

  9. Re-inspection of small RNA sequence datasets reveals several novel human miRNA genes.

    Directory of Open Access Journals (Sweden)

    Thomas Birkballe Hansen

    BACKGROUND: miRNAs are key players in gene expression regulation. To fully understand the complex nature of cellular differentiation or initiation and progression of disease, it is important to assess the expression patterns of as many miRNAs as possible. Thereby, identifying novel miRNAs is an essential prerequisite to make possible a comprehensive and coherent understanding of cellular biology. METHODOLOGY/PRINCIPAL FINDINGS: Based on two extensive, but previously published, small RNA sequence datasets from human embryonic stem cells and human embryoid bodies, respectively [1], we identified 112 novel miRNA-like structures and were able to validate miRNA processing in 12 out of 17 investigated cases. Several miRNA candidates were furthermore substantiated by including additional available small RNA datasets, thereby demonstrating the power of combining datasets to identify miRNAs that otherwise may be assigned as experimental noise. CONCLUSIONS/SIGNIFICANCE: Our analysis highlights that existing datasets are not yet exhaustively studied and continuous re-analysis of the available data is important to uncover all features of small RNA sequencing.

  10. Small RNA sequences are readily stabilized by inclusion in a carrier rRNA.

    Science.gov (United States)

    D'Souza, Lisa M; Larios-Sanz, Maia; Setterquist, Robert A; Willson, Richard C; Fox, George E

    2003-01-01

    This laboratory previously showed that an RNA derived from 5S ribosomal RNA could be used as a carrier to harbor a nucleic acid "tag" for monitoring genetically engineered or naturally occurring bacteria. The prototype system expressed a specific tagged RNA that was stable and accumulated to high levels. For such a system to be useful there should, however, be little limitation on the sequence composition and length of the insert. To test these limitations, a collection of insertion sequences were created and introduced into the artificial 5S rRNA cassette. This library consisted of random 13- and 50-base oligonucleotides that were inserted into the carrier RNA. We report here that essentially all of the insert-containing RNAs are stable and accumulate to detectable levels. Tagged RNAs were produced by both plasmid-borne and chromosomally integrated expression systems in E. coli and several Pseudomonas strains without obvious effect on the host cell. It is anticipated that in addition to its intended use in environmental monitoring, this system can be used for in vivo selection of useful artificial RNAs. Because the carrier lends stability to the RNAs, the system may also be useful in RNA production.

  11. Deciphering mRNA Sequence Determinants of Protein Production Rate

    Science.gov (United States)

    Szavits-Nossan, Juraj; Ciandrini, Luca; Romano, M. Carmen

    2018-03-01

    One of the greatest challenges in biophysical models of translation is to identify coding sequence features that affect the rate of translation and therefore the overall protein production in the cell. We propose an analytic method to solve a translation model based on the inhomogeneous totally asymmetric simple exclusion process, which allows us to unveil simple design principles of nucleotide sequences determining protein production rates. Our solution shows an excellent agreement when compared to numerical genome-wide simulations of S. cerevisiae transcript sequences and predicts that the first 10 codons, which is the ribosome footprint length on the mRNA, together with the value of the initiation rate, are the main determinants of protein production rate under physiological conditions. Finally, we interpret the obtained analytic results based on the evolutionary role of the codons' choice for regulating translation rates and ribosome densities.
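
    The model behind this abstract, an inhomogeneous TASEP with extended particles, can be explored numerically with a small stochastic simulation. The sketch below is a Gillespie-style simulation, not the authors' analytic solution. The function name, the rate values, the transcript length and the simulated time span are hypothetical; only the 10-codon footprint is taken from the abstract.

```python
import random


def simulate_tasep(init_rate, hop_rates, footprint=10, t_max=2000.0, seed=0):
    """Estimate the protein production rate of one mRNA by Gillespie simulation.

    init_rate: initiation rate; hop_rates[i]: rate of leaving codon i
    (the last entry acts as the termination rate). A ribosome covers
    `footprint` codons, so neighbours must stay at least that far apart.
    Returns finished proteins per unit time.
    """
    rng = random.Random(seed)
    n = len(hop_rates)
    ribosomes = []                 # decoded codon of each ribosome, 3'-most first
    t, finished = 0.0, 0
    while True:
        # Collect every move allowed by exclusion; index -1 means initiation.
        moves = []
        if not ribosomes or ribosomes[-1] >= footprint:
            moves.append((init_rate, -1))
        for k, x in enumerate(ribosomes):
            if k == 0 or ribosomes[k - 1] - x > footprint:
                moves.append((hop_rates[x], k))
        total = sum(rate for rate, _ in moves)
        t += rng.expovariate(total)            # waiting time to the next event
        if t > t_max:
            return finished / t_max
        pick = rng.uniform(0.0, total)         # choose a move proportional to rate
        for rate, k in moves:
            pick -= rate
            if pick <= 0.0:
                break
        if k == -1:
            ribosomes.append(0)
        elif ribosomes[k] == n - 1:
            ribosomes.pop(k)                   # termination releases a protein
            finished += 1
        else:
            ribosomes[k] += 1


# Two hypothetical 50-codon transcripts that differ only in their first 10 codons.
uniform_rates = [1.0] * 50
slow_start_rates = [0.1] * 10 + [1.0] * 40
fast = simulate_tasep(init_rate=0.05, hop_rates=uniform_rates)
slow = simulate_tasep(init_rate=0.05, hop_rates=slow_start_rates)
print(fast, slow)
```

    Slowing only the first ten codons keeps the initiation site blocked for longer, so the second transcript should produce protein at a clearly lower rate, in line with the abstract's claim about the first 10 codons and the initiation rate.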

  12. Adenylylation of small RNA sequencing adapters using the TS2126 RNA ligase I.

    Science.gov (United States)

    Lama, Lodoe; Ryan, Kevin

    2016-01-01

    Many high-throughput small RNA next-generation sequencing protocols use 5' preadenylylated DNA oligonucleotide adapters during cDNA library preparation. Preadenylylation of the DNA adapter's 5' end frees from ATP-dependence the ligation of the adapter to RNA collections, thereby avoiding ATP-dependent side reactions. However, preadenylylation of the DNA adapters can be costly and difficult. The currently available method for chemical adenylylation of DNA adapters is inefficient and uses techniques not typically practiced in laboratories profiling cellular RNA expression. An alternative enzymatic method using a commercial RNA ligase was recently introduced, but this enzyme works best as a stoichiometric adenylylating reagent rather than a catalyst and can therefore prove costly when several variant adapters are needed or during scale-up or high-throughput adenylylation procedures. Here, we describe a simple, scalable, and highly efficient method for the 5' adenylylation of DNA oligonucleotides using the thermostable RNA ligase 1 from bacteriophage TS2126. Adapters with 3' blocking groups are adenylylated at >95% yield at catalytic enzyme-to-adapter ratios and need not be gel purified before ligation to RNA acceptors. Experimental conditions are also reported that enable DNA adapters with free 3' ends to be 5' adenylylated at >90% efficiency. © 2015 Lama and Ryan; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  13. The nucleotide sequence of 5S rRNA from a red alga, Porphyra yezoensis.

    OpenAIRE

    Takaiwa, F; Kusuda, M; Saga, N; Sugiura, M

    1982-01-01

    The nucleotide sequence of 5S rRNA from Porphyra yezoensis has been determined to be: pACGUACGGCCAUAUCCGAGACACGCGUACCGGAACCCAUUCCGAAUUCCGAAGUCAAGCGUCCGCGAGUUGGGUUAGU - AAUCUGGUGAAAGAUCACAGGCGAACCCCCAAUGCUGUACGUC. This 5S rRNA sequence is most similar to that of Euglena gracilis (63% homology).

  14. Finding the most significant common sequence and structure motifs in a set of RNA sequences

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Heyer, L.J.; Stormo, G.D.

    1997-01-01

    We present a computational scheme to locally align a collection of RNA sequences using sequence and structure constraints. In addition, the method searches for the resulting alignments with the most significant common motifs, among all possible collections. The first part utilizes a simplified......, but the core algorithm assures that the pairwise alignments are optimized for both sequence and structure conservation. The choice of scoring system and the method of progressively constructing the final solution are important considerations that are discussed. Example solutions, and comparisons with other...

  15. Intramolecular circularization increases efficiency of RNA sequencing and enables CLIP-Seq of nuclear RNA from human cells

    Science.gov (United States)

    Chu, Yongjun; Wang, Tao; Dodd, David; Xie, Yang; Janowski, Bethany A.; Corey, David R.

    2015-01-01

    RNA sequencing (RNA-Seq) is a powerful tool for analyzing the identity of cellular RNAs but is often limited by the amount of material available for analysis. In spite of extensive efforts employing existing protocols, we observed that it was not possible to obtain useful sequencing libraries from nuclear RNA derived from cultured human cells after crosslinking and immunoprecipitation (CLIP). Here, we report a method for obtaining strand-specific small RNA libraries for RNA sequencing that requires picograms of RNA. We employ an intramolecular circularization step that increases the efficiency of library preparation and avoids the need for intermolecular ligations of adaptor sequences. Other key features include random priming for full-length cDNA synthesis and gel-free library purification. Using our method, we generated CLIP-Seq libraries from nuclear RNA that had been UV-crosslinked and immunoprecipitated with anti-Argonaute 2 (Ago2) antibody. Computational protocols were developed to enable analysis of raw sequencing data and we observe substantial differences between recognition by Ago2 of RNA species in the nucleus relative to the cytoplasm. This RNA self-circularization approach to RNA sequencing (RC-Seq) allows data to be obtained using small amounts of input RNA that cannot be sequenced by standard methods. PMID:25813040

  16. Integrated mRNA and microRNA transcriptome sequencing characterizes sequence variants and mRNA–microRNA regulatory network in nasopharyngeal carcinoma model systems

    Directory of Open Access Journals (Sweden)

    Carol Ying-Ying Szeto

    2014-01-01

    Nasopharyngeal carcinoma (NPC) is a prevalent malignancy in Southeast Asia among the Chinese population. Aberrant regulation of transcripts has been implicated in many types of cancers including NPC. Herein, we characterized mRNA and miRNA transcriptomes by RNA sequencing (RNASeq) of NPC model systems. Matched total mRNA and small RNA of undifferentiated Epstein–Barr virus (EBV)-positive NPC xenograft X666 and its derived cell line C666, well-differentiated NPC cell line HK1, and the immortalized nasopharyngeal epithelial cell line NP460 were sequenced by Solexa technology. We found 2812 genes and 149 miRNAs (human and EBV) to be differentially expressed in NP460, HK1, C666 and X666 with RNASeq; 533 miRNA–mRNA target pairs were inversely regulated in the three NPC cell lines compared to NP460. Integrated mRNA/miRNA expression profiling and pathway analysis show extracellular matrix organization, Beta-1 integrin cell surface interactions, and the PI3K/AKT, EGFR, ErbB, and Wnt pathways were potentially deregulated in NPC. Real-time quantitative PCR was performed on selected mRNA/miRNAs in order to validate their expression. Transcript sequence variants such as short insertions and deletions (INDEL), single nucleotide variants (SNV), and isomiRs were characterized in the NPC model systems. A novel TP53 transcript variant was identified in NP460, HK1, and C666. Three previously reported novel EBV-encoded BART miRNAs and their isomiRs were also detected. Meta-analysis of a model system to a clinical system aids the choice of different cell lines in NPC studies. This comprehensive characterization of mRNA and miRNA transcriptomes in NPC cell lines and the xenograft provides insights on miRNA regulation of mRNA and valuable resources on transcript variation and regulation in NPC, which are potentially useful for mechanistic and preclinical studies.

  17. Identification of active miRNA promoters from nuclear run-on RNA sequencing.

    Science.gov (United States)

    Liu, Qi; Wang, Jing; Zhao, Yue; Li, Chung-I; Stengel, Kristy R; Acharya, Pankaj; Johnston, Gretchen; Hiebert, Scott W; Shyr, Yu

    2017-07-27

    The genome-wide identification of microRNA transcription start sites (miRNA TSSs) is essential for understanding how miRNAs are regulated in development and disease. In this study, we developed mirSTP (mirna transcription Start sites Tracking Program), a probabilistic model for identifying active miRNA TSSs from nascent transcriptomes generated by global run-on sequencing (GRO-seq) and precision run-on sequencing (PRO-seq). MirSTP takes advantage of characteristic bidirectional transcription signatures at active TSSs in GRO/PRO-seq data, and provides accurate TSS prediction for human intergenic miRNAs at a high resolution. MirSTP performed better than existing generalized and experiment specific methods, in terms of the enrichment of various promoter-associated marks. MirSTP analysis of 27 human cell lines in 183 GRO-seq and 28 PRO-seq experiments identified TSSs for 480 intergenic miRNAs, indicating a wide usage of alternative TSSs. By integrating predicted miRNA TSSs with matched ENCODE transcription factor (TF) ChIP-seq data, we connected miRNAs into the transcriptional circuitry, which provides a valuable source for understanding the complex interplay between TF and miRNA. With mirSTP, we not only predicted TSSs for 72 miRNAs, but also identified 12 primary miRNAs with significant RNA polymerase pausing alterations after JQ1 treatment; each miRNA was further validated through BRD4 binding to its predicted promoter. MirSTP is available at http://bioinfo.vanderbilt.edu/mirSTP/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. Preparation of highly multiplexed small RNA sequencing libraries.

    Science.gov (United States)

    Persson, Helena; Søkilde, Rolf; Pirona, Anna Chiara; Rovira, Carlos

    2017-08-01

    MicroRNAs (miRNAs) are ~22-nucleotide-long small non-coding RNAs that regulate the expression of protein-coding genes by base pairing to partially complementary target sites, preferentially located in the 3′ untranslated region (UTR) of target mRNAs. The expression and function of miRNAs have been extensively studied in human disease, as well as the possibility of using these molecules as biomarkers for prognostication and treatment guidance. To identify and validate miRNAs as biomarkers, their expression must be screened in large collections of patient samples. Here, we develop a scalable protocol for the rapid and economical preparation of a large number of small RNA sequencing libraries using dual indexing for multiplexing. Combined with the use of off-the-shelf reagents, more samples can be sequenced simultaneously on large-scale sequencing platforms at a considerably lower cost per sample. Sample preparation is simplified by pooling libraries prior to gel purification, which allows for the selection of a narrow size range while minimizing sample variation. A comparison with publicly available data from benchmarking of miRNA analysis platforms showed that this method captures absolute and differential expression as effectively as commercially available alternatives.
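    The dual-indexing idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each read carries two index sequences (called i7 and i5 here) and that each sample is identified by one pair of them; the function name, the sample sheet layout, and the index sequences are hypothetical and are not taken from the published protocol.

```python
def demultiplex(reads, sample_sheet):
    """Assign reads to samples by their (i7, i5) index pair.

    reads: iterable of (i7_index, i5_index, insert_sequence) tuples.
    sample_sheet: dict mapping (i7_index, i5_index) -> sample name.
    Reads whose index pair is not in the sheet go to "undetermined".
    """
    bins = {}
    for i7, i5, insert in reads:
        sample = sample_sheet.get((i7, i5), "undetermined")
        bins.setdefault(sample, []).append(insert)
    return bins


# Hypothetical indices: 2 i7 indices x 2 i5 indices address 4 samples.
sample_sheet = {
    ("ATCACG", "TAGATC"): "patient_01",
    ("ATCACG", "CTCTCT"): "patient_02",
    ("CGATGT", "TAGATC"): "patient_03",
    ("CGATGT", "CTCTCT"): "patient_04",
}
reads = [
    ("ATCACG", "TAGATC", "TGAGGTAGTAGGTTGTATAGTT"),
    ("CGATGT", "CTCTCT", "TGAGGTAGTAGGTTGTATAGTT"),
    ("GGGGGG", "TAGATC", "ACGT"),  # unknown i7 index
]
bins = demultiplex(reads, sample_sheet)
```

    With N i7 indices and M i5 indices, up to N x M libraries can share one sequencing run, which is the reason a pairwise lookup is used in this sketch rather than a single barcode.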

  19. Prader-Willi Critical Region, a Non-Translated, Imprinted Central Regulator of Bone Mass: Possible Role in Skeletal Abnormalities in Prader-Willi Syndrome.

    Directory of Open Access Journals (Sweden)

    Ee-Cheng Khor

    Prader-Willi Syndrome (PWS), a maternally imprinted disorder and leading cause of obesity, is characterised by insatiable appetite, poor muscle development, cognitive impairment, endocrine disturbance, short stature and osteoporosis. A number of causative loci have been located within the imprinted Prader-Willi Critical Region (PWCR), including a set of small non-translated nucleolar RNAs (snoRNAs). Recently, micro-deletions in humans identified the snoRNA Snord116 as a critical contributor to the development of PWS exhibiting many of the classical symptoms of PWS. Here we show that loss of the PWCR, which includes Snord116, in mice leads to a reduced bone mass phenotype, similar to that observed in humans. Consistent with reduced stature in PWS, PWCR KO mice showed delayed skeletal development, with shorter femurs and vertebrae, and reduced bone size and mass in both sexes. The reduction in bone mass in PWCR KO mice was associated with deficiencies in cortical bone volume and cortical mineral apposition rate, with no change in cancellous bone. Importantly, while the length difference was corrected in aged mice, consistent with continued growth in rodents, reduced cortical bone formation was still evident, indicating continued osteoblastic suppression by loss of PWCR expression in skeletally mature mice. Interestingly, deletion of this region included deletion of the exclusively brain expressed Snord116 cluster and resulted in an upregulation in expression of both NPY and POMC mRNA in the arcuate nucleus. Importantly, the selective deletion of the PWCR only in NPY expressing neurons replicated the bone phenotype of PWCR KO mice. Taken together, PWCR deletion in mice, and specifically in NPY neurons, recapitulates the short stature and low BMD and aspects of the hormonal imbalance of PWS individuals. Moreover, it demonstrates for the first time that a region encoding non-translated RNAs, expressed solely within the brain, can regulate bone mass in health

  20. Systems genetics of complex diseases using RNA-sequencing methods

    DEFF Research Database (Denmark)

    Mazzoni, Gianluca; Kogelman, Lisette; Suravajhala, Prashanth

    2015-01-01

    Next generation sequencing technologies have enabled the generation of huge quantities of biological data, and nowadays extensive datasets at different ‘omics levels have been generated. Systems genetics is a powerful approach that allows the integration of different ‘omics levels and an understanding of the biological mechanisms behind complex diseases or traits. In the recent past, transcriptomic studies with microarrays have been replaced with the new powerful RNA-seq technologies. This has led to detection of novel gene transcripts, novel regulatory mechanisms, allele specific gene expression and numerous non-coding RNAs (ncRNAs). The integration of transcriptomics data with genomic data in a systems genetics context represents a valuable possibility to go deep into the causal and regulatory mechanisms that generate complex traits and diseases. However, RNA-Seq data have to be treated carefully...

  1. MicroRNA Expression Analysis Using Small RNA Sequencing Discovery and RT-qPCR-Based Validation.

    Science.gov (United States)

    Van Goethem, Alan; Mestdagh, Pieter; Van Maerken, Tom; Vandesompele, Jo

    2017-01-01

    miRNAs are small noncoding RNA molecules that function as regulators of gene expression. Deregulated miRNA expression has been reported in various diseases including cancer. Due to their small size and high degree of homology, accurate quantification of miRNA expression is technically challenging. In this chapter, we present two different technologies for miRNA quantification: small RNA sequencing and RT-qPCR.

  2. Small RNA Library Preparation and Illumina Sequencing in Plants.

    Science.gov (United States)

    Bilichak, Andriy; Golubov, Andrey; Kovalchuk, Igor

    2017-01-01

    The discovery of small RNAs in plants and animals almost two decades ago attracted significant interest in the epigenetic regulation of gene expression and in the practical implementation of the gained knowledge in applied studies. New and sometimes unexpected functions have been ascribed to sRNAs almost every couple of years since their discovery, indicating that the complete role of sRNAs in plant and animal physiology is still barely understood. Next-generation sequencing technologies make it possible to generate high-resolution profiles of sRNAs for subsequent analysis and possibly to discover novel functions of sRNAs. In this chapter, we provide brief guidelines for sRNA library preparation in plants and a practical approach that can be implemented to overcome possible difficulties with sequencing library generation.

  3. Experimental design, preprocessing, normalization and differential expression analysis of small RNA sequencing experiments

    OpenAIRE

    McCormick, Kevin P; Willmann, Matthew R; Meyers, Blake C

    2011-01-01

    Prior to the advent of new, deep sequencing methods, small RNA (sRNA) discovery was dependent on Sanger sequencing, which was time-consuming and limited knowledge to only the most abundant sRNA. The innovation of large-scale, next-generation sequencing has exponentially increased knowledge of the biology, diversity and abundance of sRNA populations. In this review, we discuss issues involved in the design of sRNA sequencing experiments, including choosing a sequencing platform, inherent biase...

  4. Long Non-Coding RNA and Alternative Splicing Modulations in Parkinson's Leukocytes Identified by RNA Sequencing

    Science.gov (United States)

    Soreq, Lilach; Guffanti, Alessandro; Salomonis, Nathan; Simchovitz, Alon; Israel, Zvi; Bergman, Hagai; Soreq, Hermona

    2014-01-01

    The continuously prolonged human lifespan is accompanied by increase in neurodegenerative diseases incidence, calling for the development of inexpensive blood-based diagnostics. Analyzing blood cell transcripts by RNA-Seq is a robust means to identify novel biomarkers that rapidly becomes a commonplace. However, there is lack of tools to discover novel exons, junctions and splicing events and to precisely and sensitively assess differential splicing through RNA-Seq data analysis and across RNA-Seq platforms. Here, we present a new and comprehensive computational workflow for whole-transcriptome RNA-Seq analysis, using an updated version of the software AltAnalyze, to identify both known and novel high-confidence alternative splicing events, and to integrate them with both protein-domains and microRNA binding annotations. We applied the novel workflow on RNA-Seq data from Parkinson's disease (PD) patients' leukocytes pre- and post- Deep Brain Stimulation (DBS) treatment and compared to healthy controls. Disease-mediated changes included decreased usage of alternative promoters and N-termini, 5′-end variations and mutually-exclusive exons. The PD regulated FUS and HNRNP A/B included prion-like domains regulated regions. We also present here a workflow to identify and analyze long non-coding RNAs (lncRNAs) via RNA-Seq data. We identified reduced lncRNA expression and selective PD-induced changes in 13 of over 6,000 detected leukocyte lncRNAs, four of which were inversely altered post-DBS. These included the U1 spliceosomal lncRNA and RP11-462G22.1, each entailing sequence complementarity to numerous microRNAs. Analysis of RNA-Seq from PD and unaffected controls brains revealed over 7,000 brain-expressed lncRNAs, of which 3,495 were co-expressed in the leukocytes including U1, which showed both leukocyte and brain increases. Furthermore, qRT-PCR validations confirmed these co-increases in PD leukocytes and two brain regions, the amygdala and substantia

  5. Small RNA Deep Sequencing Reveals Role for Arabidopsis thaliana RNA-Dependent RNA Polymerases in Viral siRNA Biogenesis

    OpenAIRE

    Qi, Xiaopeng; Bao, Forrest Sheng; Xie, Zhixin

    2009-01-01

    RNA silencing functions as an important antiviral defense mechanism in a broad range of eukaryotes. In plants, biogenesis of several classes of endogenous small interfering RNAs (siRNAs) requires RNA-dependent RNA Polymerase (RDR) activities. Members of the RDR family proteins, including RDR1 and RDR6, have also been implicated in antiviral defense, although a direct role for RDRs in viral siRNA biogenesis has yet to be demonstrated. Using a crucifer-infecting strain of Tobacco Mosaic Virus (T...

  6. SEXCMD: Development and validation of sex marker sequences for whole-exome/genome and RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Seongmun Jeong

    Over the last decade, a large number of nucleotide sequences have been generated by next-generation sequencing technologies and deposited to public databases. However, most of these datasets do not specify the sex of individuals sampled because researchers typically ignore or hide this information. Male and female genomes in many species have distinctive sex chromosomes, XX/XY and ZW/ZZ, and expression levels of many sex-related genes differ between the sexes. Herein, we describe how to develop sex marker sequences from syntenic regions of sex chromosomes and use them to quickly identify the sex of individuals being analyzed. Array-based technologies routinely use either known sex markers or the B-allele frequency of X or Z chromosomes to deduce the sex of an individual. The same strategy has been used with whole-exome/genome sequence data; however, all reads must be aligned onto a reference genome to determine the B-allele frequency of the X or Z chromosomes. SEXCMD is a pipeline that can extract sex marker sequences from reference sex chromosomes and rapidly identify the sex of individuals from whole-exome/genome and RNA sequencing after training with a known dataset through a simple machine learning approach. The pipeline counts total numbers of hits from sex-specific marker sequences and identifies the sex of the individuals sampled based on the fact that XX/ZZ samples do not have Y or W chromosome hits. We have successfully validated our pipeline with mammalian (Homo sapiens; XY) and avian (Gallus gallus; ZW) genomes. Typical calculation time when applying SEXCMD to human whole-exome or RNA sequencing datasets is a few minutes, and analyzing human whole-genome datasets takes about 10 minutes. Another important application of SEXCMD is as a quality control measure to avoid mixing samples before bioinformatics analysis. SEXCMD comprises simple Python and R scripts and is freely available at https://github.com/lovemun/SEXCMD.

  7. SEXCMD: Development and validation of sex marker sequences for whole-exome/genome and RNA sequencing.

    Science.gov (United States)

    Jeong, Seongmun; Kim, Jiwoong; Park, Won; Jeon, Hongmin; Kim, Namshin

    2017-01-01

    Over the last decade, a large number of nucleotide sequences have been generated by next-generation sequencing technologies and deposited to public databases. However, most of these datasets do not specify the sex of individuals sampled because researchers typically ignore or hide this information. Male and female genomes in many species have distinctive sex chromosomes, XX/XY and ZW/ZZ, and expression levels of many sex-related genes differ between the sexes. Herein, we describe how to develop sex marker sequences from syntenic regions of sex chromosomes and use them to quickly identify the sex of individuals being analyzed. Array-based technologies routinely use either known sex markers or the B-allele frequency of X or Z chromosomes to deduce the sex of an individual. The same strategy has been used with whole-exome/genome sequence data; however, all reads must be aligned onto a reference genome to determine the B-allele frequency of the X or Z chromosomes. SEXCMD is a pipeline that can extract sex marker sequences from reference sex chromosomes and rapidly identify the sex of individuals from whole-exome/genome and RNA sequencing after training with a known dataset through a simple machine learning approach. The pipeline counts total numbers of hits from sex-specific marker sequences and identifies the sex of the individuals sampled based on the fact that XX/ZZ samples do not have Y or W chromosome hits. We have successfully validated our pipeline with mammalian (Homo sapiens; XY) and avian (Gallus gallus; ZW) genomes. Typical calculation time when applying SEXCMD to human whole-exome or RNA sequencing datasets is a few minutes, and analyzing human whole-genome datasets takes about 10 minutes. Another important application of SEXCMD is as a quality control measure to avoid mixing samples before bioinformatics analysis. SEXCMD comprises simple Python and R scripts and is freely available at https://github.com/lovemun/SEXCMD.
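    The hit-counting principle described in this abstract (XX samples yield no Y-marker hits) can be sketched as follows. This is not the SEXCMD code: the function names, the exact-substring matching, the marker sequences, and the fixed `min_y_fraction` threshold are all assumptions made for illustration, whereas the published pipeline is trained on a known dataset.

```python
from collections import Counter


def count_marker_hits(reads, markers):
    """Count, per marker, the reads that contain the marker sequence.

    Exact substring matching is a simplification chosen for this sketch.
    """
    hits = Counter()
    for read in reads:
        for name, marker_seq in markers.items():
            if marker_seq in read:
                hits[name] += 1
    return hits


def call_sex(reads, x_markers, y_markers, min_y_fraction=0.05):
    """Return "XY", "XX", or "undetermined" from marker hit counts.

    min_y_fraction is a hypothetical cutoff, not a trained value.
    """
    x_hits = sum(count_marker_hits(reads, x_markers).values())
    y_hits = sum(count_marker_hits(reads, y_markers).values())
    total = x_hits + y_hits
    if total == 0:
        return "undetermined"
    return "XY" if y_hits / total >= min_y_fraction else "XX"


# Hypothetical marker sequences and reads.
x_markers = {"x_marker_1": "ACGTACGTAA"}
y_markers = {"y_marker_1": "TTGGCCAATT"}
female_reads = ["GGACGTACGTAACC", "ACGTACGTAAGG"]
male_reads = female_reads + ["CCTTGGCCAATTGG"]
```

    For a ZW/ZZ species the same logic would apply with W-specific markers taking the role of the Y markers and the two labels swapped accordingly.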

  8. Nascent RNA sequencing reveals distinct features in plant transcription.

    Science.gov (United States)

    Hetzel, Jonathan; Duttke, Sascha H; Benner, Christopher; Chory, Joanne

    2016-10-25

    Transcriptional regulation of gene expression is a major mechanism used by plants to confer phenotypic plasticity, and yet compared with other eukaryotes or bacteria, little is known about the design principles. We generated an extensive catalog of nascent and steady-state transcripts in Arabidopsis thaliana seedlings using global nuclear run-on sequencing (GRO-seq), 5'GRO-seq, and RNA-seq and reanalyzed published maize data to capture characteristics of plant transcription. De novo annotation of nascent transcripts accurately mapped start sites and unstable transcripts. Examining the promoters of coding and noncoding transcripts identified comparable chromatin signatures, a conserved "TGT" core promoter motif and unreported transcription factor-binding sites. Mapping of engaged RNA polymerases showed a lack of enhancer RNAs, promoter-proximal pausing, and divergent transcription in Arabidopsis seedlings and maize, which are commonly present in yeast and humans. In contrast, Arabidopsis and maize genes accumulate RNA polymerases in proximity of the polyadenylation site, a trend that coincided with longer genes and CpG hypomethylation. Lack of promoter-proximal pausing and a higher correlation of nascent and steady-state transcripts indicate Arabidopsis may regulate transcription predominantly at the level of initiation. Our findings provide insight into plant transcription and eukaryotic gene expression as a whole.

  9. Mechanisms controlling mRNA processing and translation : decoding the regulatory layers defining gene expression through RNA sequencing

    NARCIS (Netherlands)

    Klerk, Eleonora de

    2015-01-01

    The work described in this thesis focuses on the mechanisms that give rise to alternative mRNAs and their alternative translation into proteins. Each of the described studies has been based on a specific set of high-throughput RNA sequencing technologies. An overview of the available RNA sequencing

  10. Analysis of sequencing data for probing RNA secondary structures and protein-RNA binding in studying posttranscriptional regulations.

    Science.gov (United States)

    Hu, Xihao; Wu, Yang; Lu, Zhi John; Yip, Kevin Y

    2016-11-01

    High-throughput sequencing has been used to study posttranscriptional regulations, where the identification of protein-RNA binding is a major and fast-developing sub-area, which is in turn benefited by the sequencing methods for whole-transcriptome probing of RNA secondary structures. In the study of RNA secondary structures using high-throughput sequencing, bases are modified or cleaved according to their structural features, which alter the resulting composition of sequencing reads. In the study of protein-RNA binding, methods have been proposed to immuno-precipitate (IP) protein-bound RNA transcripts in vitro or in vivo. By sequencing these transcripts, the protein-RNA interactions and the binding locations can be identified. For both types of data, read counts are affected by a combination of confounding factors, including expression levels of transcripts, sequence biases, mapping errors and the probing or IP efficiency of the experimental protocols. Careful processing of the sequencing data and proper extraction of important features are fundamentally important to a successful analysis. Here we review and compare different experimental methods for probing RNA secondary structures and binding sites of RNA-binding proteins (RBPs), and the computational methods proposed for analyzing the corresponding sequencing data. We suggest how these two types of data should be integrated to study the structural properties of RBP binding sites as a systematic way to better understand posttranscriptional regulations. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  11. New methods for next generation sequencing based microRNA expression profiling

    OpenAIRE

    den Dunnen Johan T; van Ommen Gertjan; Ariyurek Yavuz; Buermans Henk PJ; 't Hoen Peter AC

    2010-01-01

    Background: MicroRNAs are small non-coding RNA transcripts that regulate post-transcriptional gene expression. The millions of short sequence reads generated by next generation sequencing technologies make this technique explicitly suitable for profiling of known and novel microRNAs. A modification to the small-RNA expression kit (SREK, Ambion) library preparation method for the SOLiD sequencing platform is described to generate microRNA sequencing libraries that are compatible with t...

  12. Transcriptional profiling of bovine milk using RNA sequencing

    Directory of Open Access Journals (Sweden)

    Wickramasinghe Saumya

    2012-01-01

    Background: Cow milk is a complex bioactive fluid consumed by humans beyond infancy. Even though the chemical and physical properties of cow milk are well characterized, very limited research has been done on characterizing the milk transcriptome. This study performs a comprehensive expression profiling of genes expressed in milk somatic cells of transition (day 15), peak (day 90) and late (day 250) lactation Holstein cows by RNA sequencing. Milk samples were collected from Holstein cows at 15, 90 and 250 days of lactation, and RNA was extracted from the pelleted milk cells. Gene expression analysis was conducted by Illumina RNA sequencing. Sequence reads were assembled and analyzed in CLC Genomics Workbench. Gene Ontology (GO) and pathway analysis were performed using the Blast2GO program and GeneGo application of MetaCore program. Results: A total of 16,892 genes were expressed in transition lactation, 19,094 genes were expressed in peak lactation and 18,070 genes were expressed in late lactation. Regardless of the lactation stage approximately 9,000 genes showed ubiquitous expression. Genes encoding caseins, whey proteins and enzymes in lactose synthesis pathway showed higher expression in early lactation. The majority of genes in the fat metabolism pathway had high expression in transition and peak lactation milk. Most of the genes encoding for endogenous proteases and enzymes in ubiquitin-proteasome pathway showed higher expression along the course of lactation. Conclusions: This is the first study to describe the comprehensive bovine milk transcriptome in Holstein cows. The results revealed that 69% of NCBI Btau 4.0 annotated genes are expressed in bovine milk somatic cells. Most of the genes were ubiquitously expressed in all three stages of lactation. However, a fraction of the milk transcriptome has genes devoted to specific functions unique to the lactation stage. This indicates the ability of milk somatic cells to adapt to different

  13. Bayesian analysis of RNA sequencing data by estimating multiple shrinkage priors

    NARCIS (Netherlands)

    van de Wiel, M.A.; Leday, G.G.R.; Pardo, L.M.; Rue, H; van der Vaart, A.W.; van Wieringen, W.N.

    2013-01-01

    Next generation sequencing is quickly replacing microarrays as a technique to probe different molecular levels of the cell, such as DNA or RNA. The technology provides higher resolution, while reducing bias. RNA sequencing results in counts of RNA strands. This type of data imposes new statistical

  14. Modulations of RNA sequences by cytokinin in pumpkin cotyledons

    International Nuclear Information System (INIS)

    Chang, C.; Ertl, J.; Chen, C.

    1987-01-01

    Polyadenylated mRNAs from excised pumpkin cotyledons treated with or without 10⁻⁴ M benzyladenine (BA) for various time periods in suspension culture were assayed by in vitro translation in the presence of [³⁵S]methionine. The radioactive polypeptides were analyzed by one- and two-dimensional polyacrylamide gel electrophoresis. Specific sequences of mRNAs were enhanced, reduced, induced, or suppressed by the hormone within 60 min of the application of BA to the cotyledons. Four independent cDNA clones of cytokinin-modulated mRNAs have been selected and characterized. RNA blot hybridization using the four cDNA probes also indicates that the levels of specific mRNAs are modulated upward or downward by the hormone.

  15. RNA-ligase-dependent biases in miRNA representation in deep-sequenced small RNA cDNA libraries

    Science.gov (United States)

    Hafner, Markus; Renwick, Neil; Brown, Miguel; Mihailović, Aleksandra; Holoch, Daniel; Lin, Carolina; Pena, John T.G.; Nusbaum, Jeffrey D.; Morozov, Pavel; Ludwig, Janos; Ojo, Tolulope; Luo, Shujun; Schroth, Gary; Tuschl, Thomas

    2011-01-01

    Sequencing of small RNA cDNA libraries is an important tool for the discovery of new RNAs and the analysis of their mutational status as well as expression changes across samples. It requires multiple enzyme-catalyzed steps, including sequential oligonucleotide adapter ligations to the 3′ and 5′ ends of the small RNAs, reverse transcription (RT), and PCR. We assessed biases in representation of miRNAs relative to their input concentration, using a pool of 770 synthetic miRNAs and 45 calibrator oligoribonucleotides, and tested the influence of Rnl1 and two variants of Rnl2, Rnl2(1–249) and Rnl2(1–249)K227Q, for 3′-adapter ligation. The use of the Rnl2 variants for adapter ligations yielded substantially fewer side products compared with Rnl1; however, the benefits of using Rnl2 remained largely obscured by additional biases in the 5′-adapter ligation step; RT and PCR steps did not have a significant impact on read frequencies. Intramolecular secondary structures of miRNA and/or miRNA/3′-adapter products contributed to these biases, which were highly reproducible under defined experimental conditions. We used the synthetic miRNA cocktail to derive correction factors for approximation of the absolute levels of individual miRNAs in biological samples. Finally, we evaluated the influence of 5′-terminal 5-nt barcode extensions for a set of 20 barcoded 3′ adapters and observed similar biases in miRNA read distribution, thereby enabling cost-saving multiplex analysis for large-scale miRNA profiling. PMID:21775473
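    The correction-factor idea in this abstract can be illustrated with a small sketch: for a synthetic pool of known composition, the factor for each miRNA is its expected read fraction divided by its observed read fraction, and the factor is then applied to counts from another sample. The function names and the numbers are hypothetical, and the exact derivation used by the authors may differ.

```python
def correction_factors(input_amounts, read_counts):
    """Per-miRNA factor = expected fraction / observed fraction.

    input_amounts: dict name -> known input amount in the synthetic pool.
    read_counts: dict name -> reads observed for that pool.
    miRNAs with zero observed reads get NaN (no factor can be derived).
    """
    total_input = sum(input_amounts.values())
    total_reads = sum(read_counts.values())
    if total_input == 0 or total_reads == 0:
        raise ValueError("input amounts and read counts must not sum to zero")
    factors = {}
    for name, amount in input_amounts.items():
        expected = amount / total_input
        observed = read_counts.get(name, 0) / total_reads
        factors[name] = expected / observed if observed > 0 else float("nan")
    return factors


def apply_correction(read_counts, factors):
    """Scale raw counts by their factors; miRNAs without a factor are skipped."""
    return {
        name: count * factors[name]
        for name, count in read_counts.items()
        if name in factors
    }


# Hypothetical equimolar pool in which miR-a is over-represented threefold.
pool_input = {"miR-a": 1.0, "miR-b": 1.0}
pool_reads = {"miR-a": 300, "miR-b": 100}
factors = correction_factors(pool_input, pool_reads)
corrected = apply_correction(pool_reads, factors)
```

    Applying the factors back to the calibration pool itself, as done here, should recover equal corrected values for the two equimolar miRNAs, which is a convenient sanity check before using the factors on biological samples.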

  16. Small RNA sequencing reveals metastasis-related microRNAs in lung adenocarcinoma

    DEFF Research Database (Denmark)

    Daugaard, Iben; Venø, Morten T.; Yan, Yan

    2017-01-01

    The majority of lung cancer deaths are caused by metastatic disease. MicroRNAs (miRNAs) are posttranscriptional regulators of gene expression and miRNA dysregulation can contribute to metastatic progression. Here, small RNA sequencing was used to profile the miRNA and piwi-interacting RNA (pi...

  17. Interference by clustered regularly interspaced short palindromic repeat (CRISPR) RNA is governed by a seed sequence

    NARCIS (Netherlands)

    Semenova, E.V.; Jore, M.M.; Westra, E.R.; Oost, van der J.; Brouns, S.J.J.

    2011-01-01

    Prokaryotic clustered regularly interspaced short palindromic repeat (CRISPR)/Cas (CRISPR-associated sequences) systems provide adaptive immunity against viruses when a spacer sequence of small CRISPR RNA (crRNA) matches a protospacer sequence in the viral genome. Viruses that escape CRISPR/Cas

  18. FASTR: A novel data format for concomitant representation of RNA sequence and secondary structure information.

    Science.gov (United States)

    Bose, Tungadri; Dutta, Anirban; Mh, Mohammed; Gandhi, Hemang; Mande, Sharmila S

    2015-09-01

    Given the importance of RNA secondary structures in defining their biological role, it would be convenient for researchers seeking RNA data if both sequence and structural information pertaining to RNA molecules are made available together. Current nucleotide data repositories archive only RNA sequence data. Furthermore, storage formats which can frugally represent RNA sequence as well as structure data in a single file, are currently unavailable. This article proposes a novel storage format, 'FASTR', for concomitant representation of RNA sequence and structure. The storage efficiency of the proposed FASTR format has been evaluated using RNA data from various microorganisms. Results indicate that the size of FASTR formatted files (containing both RNA sequence as well as structure information) are equivalent to that of FASTA-format files, which contain only RNA sequence information. RNA secondary structure is typically represented using a combination of a string of nucleotide characters along with the corresponding dot-bracket notation indicating structural attributes. 'FASTR' - the novel storage format proposed in the present study enables a frugal representation of both RNA sequence and structural information in the form of a single string. In spite of having a relatively smaller storage footprint, the resultant 'fastr' string(s) retain all sequence as well as secondary structural information that could be stored using a dot-bracket notation. An implementation of the 'FASTR' methodology is available for download at http://metagenomics.atc.tcs.com/compression/fastr.
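    The idea of holding sequence and dot-bracket structure in a single string of the same length can be illustrated as below. This is not the FASTR specification, whose actual encoding is not given in the abstract: the 12-symbol alphabet and the function names are invented for this sketch, which only shows that one character per position is enough to carry both a nucleotide (A, C, G, U) and a structure mark (".", "(", ")").

```python
NUCLEOTIDES = "ACGU"
STRUCTURE_SYMBOLS = ".()"
# Hypothetical alphabet: 4 nucleotides x 3 structure marks = 12 symbols.
# Unpaired -> "ACGU", opening pair -> "acgu", closing pair -> "BDHV".
COMBINED_ALPHABET = "ACGUacguBDHV"


def encode(sequence, structure):
    """Merge an RNA sequence and its dot-bracket string into one string."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    symbols = []
    for base, mark in zip(sequence.upper(), structure):
        index = STRUCTURE_SYMBOLS.index(mark) * 4 + NUCLEOTIDES.index(base)
        symbols.append(COMBINED_ALPHABET[index])
    return "".join(symbols)


def decode(combined):
    """Recover (sequence, dot-bracket structure) from a combined string."""
    bases, marks = [], []
    for symbol in combined:
        mark_index, base_index = divmod(COMBINED_ALPHABET.index(symbol), 4)
        bases.append(NUCLEOTIDES[base_index])
        marks.append(STRUCTURE_SYMBOLS[mark_index])
    return "".join(bases), "".join(marks)


hairpin_sequence = "GGGAAACCC"
hairpin_structure = "(((...)))"
combined = encode(hairpin_sequence, hairpin_structure)
```

    Because the combined string has exactly one character per nucleotide, its size matches that of a sequence-only record, which is consistent with the storage claim made in the abstract.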

  19. Transcriptome sequencing of the Microarray Quality Control (MAQC) RNA reference samples using next generation sequencing

    Directory of Open Access Journals (Sweden)

    Thierry-Mieg Danielle

    2009-06-01

    Background: Transcriptome sequencing using next-generation sequencing platforms will soon be competing with DNA microarray technologies for global gene expression analysis. As a preliminary evaluation of these promising technologies, we performed deep sequencing of cDNA synthesized from the Microarray Quality Control (MAQC) reference RNA samples using Roche's 454 Genome Sequencer FLX. Results: We generated more than 3.6 million sequence reads of average length 250 bp for the MAQC A and B samples and introduced a data analysis pipeline for translating cDNA read counts into gene expression levels. Using BLAST, 90% of the reads mapped to the human genome and 64% of the reads mapped to the RefSeq database of well annotated genes with e-values ≤ 10⁻²⁰. We measured gene expression levels in the A and B samples by counting the numbers of reads that mapped to individual RefSeq genes in multiple sequencing runs to evaluate the MAQC quality metrics for reproducibility, sensitivity, specificity, and accuracy and compared the results with DNA microarrays and Quantitative RT-PCR (QRTPCR) from the MAQC studies. In addition, 88% of the reads were successfully aligned directly to the human genome using the AceView alignment programs with an average 90% sequence similarity to identify 137,899 unique exon junctions, including 22,193 new exon junctions not yet contained in the RefSeq database. Conclusion: Using the MAQC metrics for evaluating the performance of gene expression platforms, the ExpressSeq results for gene expression levels showed excellent reproducibility, sensitivity, and specificity that improved systematically with increasing shotgun sequencing depth, and quantitative accuracy that was comparable to DNA microarrays and QRTPCR. In addition, a careful mapping of the reads to the genome using the AceView alignment programs shed new light on the complexity of the human transcriptome including the discovery of thousands of new splice variants.
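    Translating read counts into expression levels, as mentioned in this abstract, can be sketched by counting the reads assigned to each gene and scaling by the total number of mapped reads. The reads-per-million scaling, the function name, and the gene labels are assumptions chosen for illustration; the abstract does not state which normalization the authors' pipeline applies.

```python
from collections import Counter


def reads_per_million(read_assignments):
    """Convert per-read gene assignments into reads-per-million values.

    read_assignments: iterable with one gene ID per read, or None for a
    read that did not map to any gene (such reads are ignored).
    """
    counts = Counter(gene for gene in read_assignments if gene is not None)
    total_mapped = sum(counts.values())
    if total_mapped == 0:
        return {}
    return {gene: n * 1_000_000 / total_mapped for gene, n in counts.items()}


# Hypothetical assignments for four reads, one of them unmapped.
assignments = ["GENE_A", "GENE_A", "GENE_B", None]
expression = reads_per_million(assignments)
```

    Scaling by the total makes values from runs of different sequencing depth comparable, which matters when several runs are pooled to assess reproducibility.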

  20. Sequence composition similarities with the 7SL RNA are highly predictive of functional genomic features

    OpenAIRE

    Paquet, Yanick; Anderson, Alan

    2010-01-01

    Transposable elements derived from the 7SL RNA gene, such as Alu elements in primates, have had remarkable success in several mammalian lineages. The results presented here show a broad spectrum of functions for genomic segments that display sequence composition similarities with the 7SL RNA gene. Using thoroughly documented loci, we report that DNaseI-hypersensitive sites can be singled out in large genomic sequences by an assessment of sequence composition similarities with the 7SL RNA gene...

  1. sRNAnalyzer—a flexible and customizable small RNA sequencing data analysis pipeline

    Science.gov (United States)

    Kim, Taek-Kyun; Baxter, David; Scherler, Kelsey; Gordon, Aaron; Fong, Olivia; Etheridge, Alton; Galas, David J.

    2017-01-01

    Although many tools have been developed to analyze small RNA sequencing (sRNA-Seq) data, it remains challenging to accurately analyze the small RNA population, mainly due to multiple sequence ID assignment caused by short read length. Additional issues in small RNA analysis include low consistency of microRNA (miRNA) measurement results across different platforms, miRNA mapping associated with miRNA sequence variation (isomiR) and RNA editing, and the origin of those unmapped reads after screening against all endogenous reference sequence databases. To address these issues, we built a comprehensive and customizable sRNA-Seq data analysis pipeline, sRNAnalyzer, which enables: (i) comprehensive miRNA profiling strategies to better handle isomiRs and summarization based on each nucleotide position to detect potential SNPs in miRNAs, (ii) different sequence mapping result assignment approaches to simulate results from microarray/qRT-PCR platforms and a local probabilistic model to assign mapping results to the most-likely IDs, (iii) comprehensive ribosomal RNA filtering for accurate mapping of exogenous RNAs and summarization based on taxonomy annotation. We evaluated our pipeline on both artificial samples (including synthetic miRNA and Escherichia coli cultures) and biological samples (human tissue and plasma). sRNAnalyzer is implemented in Perl and available at: http://srnanalyzer.systemsbiology.net/. PMID:29069500

  2. sRNAnalyzer-a flexible and customizable small RNA sequencing data analysis pipeline.

    Science.gov (United States)

    Wu, Xiaogang; Kim, Taek-Kyun; Baxter, David; Scherler, Kelsey; Gordon, Aaron; Fong, Olivia; Etheridge, Alton; Galas, David J; Wang, Kai

    2017-12-01

    Although many tools have been developed to analyze small RNA sequencing (sRNA-Seq) data, it remains challenging to accurately analyze the small RNA population, mainly due to multiple sequence ID assignment caused by short read length. Additional issues in small RNA analysis include low consistency of microRNA (miRNA) measurement results across different platforms, miRNA mapping associated with miRNA sequence variation (isomiR) and RNA editing, and the origin of those unmapped reads after screening against all endogenous reference sequence databases. To address these issues, we built a comprehensive and customizable sRNA-Seq data analysis pipeline-sRNAnalyzer, which enables: (i) comprehensive miRNA profiling strategies to better handle isomiRs and summarization based on each nucleotide position to detect potential SNPs in miRNAs, (ii) different sequence mapping result assignment approaches to simulate results from microarray/qRT-PCR platforms and a local probabilistic model to assign mapping results to the most-likely IDs, (iii) comprehensive ribosomal RNA filtering for accurate mapping of exogenous RNAs and summarization based on taxonomy annotation. We evaluated our pipeline on both artificial samples (including synthetic miRNA and Escherichia coli cultures) and biological samples (human tissue and plasma). sRNAnalyzer is implemented in Perl and available at: http://srnanalyzer.systemsbiology.net/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. Annealing to sequences within the primer binding site loop promotes an HIV-1 RNA conformation favoring RNA dimerization and packaging

    Science.gov (United States)

    Seif, Elias; Niu, Meijuan; Kleiman, Lawrence

    2013-01-01

    The 5′ untranslated region (5′ UTR) of HIV-1 genomic RNA (gRNA) includes structural elements that regulate reverse transcription, transcription, translation, tRNALys3 annealing to the gRNA, and gRNA dimerization and packaging into viruses. It has been reported that gRNA dimerization and packaging are regulated by changes in the conformation of the 5′-UTR RNA. In this study, we show that annealing of tRNALys3 or a DNA oligomer complementary to sequences within the primer binding site (PBS) loop of the 5′ UTR enhances its dimerization in vitro. Structural analysis of the 5′-UTR RNA using selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) shows that the annealing promotes a conformational change of the 5′ UTR that has been previously reported to favor gRNA dimerization and packaging into virus. The model predicted by SHAPE analysis is supported by antisense experiments designed to test which annealed sequences will promote or inhibit gRNA dimerization. Based on reports showing that the gRNA dimerization favors its incorporation into viruses, we tested the ability of a mutant gRNA unable to anneal to tRNALys3 to be incorporated into virions. We found a ∼60% decrease in mutant gRNA packaging compared with wild-type gRNA. Together, these data further support a model for viral assembly in which the initial annealing of tRNALys3 to gRNA is cytoplasmic, which in turn aids in the promotion of gRNA dimerization and its incorporation into virions. PMID:23960173

  4. Evaluating Methods for Isolating Total RNA and Predicting the Success of Sequencing Phylogenetically Diverse Plant Transcriptomes

    Science.gov (United States)

    Bruskiewich, Richard; Burris, Jason N.; Carrigan, Charlotte T.; Chase, Mark W.; Clarke, Neil D.; Covshoff, Sarah; dePamphilis, Claude W.; Edger, Patrick P.; Goh, Falicia; Graham, Sean; Greiner, Stephan; Hibberd, Julian M.; Jordon-Thaden, Ingrid; Kutchan, Toni M.; Leebens-Mack, James; Melkonian, Michael; Miles, Nicholas; Myburg, Henrietta; Patterson, Jordan; Pires, J. Chris; Ralph, Paula; Rolf, Megan; Sage, Rowan F.; Soltis, Douglas; Soltis, Pamela; Stevenson, Dennis; Stewart, C. Neal; Surek, Barbara; Thomsen, Christina J. M.; Villarreal, Juan Carlos; Wu, Xiaolei; Zhang, Yong; Deyholos, Michael K.; Wong, Gane Ka-Shu

    2012-01-01

    Next-generation sequencing plays a central role in the characterization and quantification of transcriptomes. Although numerous metrics are purported to quantify the quality of RNA, there have been no large-scale empirical evaluations of the major determinants of sequencing success. We used a combination of existing and newly developed methods to isolate total RNA from 1115 samples from 695 plant species in 324 families, which represents >900 million years of phylogenetic diversity from green algae through flowering plants, including many plants of economic importance. We then sequenced 629 of these samples on Illumina GAIIx and HiSeq platforms and performed a large comparative analysis to identify predictors of RNA quality and the diversity of putative genes (scaffolds) expressed within samples. Tissue types (e.g., leaf vs. flower) varied in RNA quality, sequencing depth and the number of scaffolds. Tissue age also influenced RNA quality but not the number of scaffolds ≥1000 bp. Overall, 36% of the variation in the number of scaffolds was explained by metrics of RNA integrity (RIN score), RNA purity (OD 260/230), sequencing platform (GAIIx vs HiSeq) and the amount of total RNA used for sequencing. However, our results show that the most commonly used measures of RNA quality (e.g., RIN) are weak predictors of the number of scaffolds because Illumina sequencing is robust to variation in RNA quality. These results provide novel insight into the methods that are most important in isolating high quality RNA for sequencing and assembling plant transcriptomes. The methods and recommendations provided here could increase the efficiency and decrease the cost of RNA sequencing for individual labs and genome centers. PMID:23185583

  5. Transcription profile of boar spermatozoa as revealed by RNA-sequencing

    Science.gov (United States)

    High-throughput RNA sequencing (RNA-Seq) overcomes the limitations of the current hybridization-based techniques to detect the actual pool of RNA transcripts in spermatozoa. The application of this technology in livestock can speed the discovery of potential predictors of male fertility. As a first ...

  6. Evaluating Quality of Aged Archival Formalin-Fixed Paraffin-Embedded Samples for RNA-Sequencing

    Science.gov (United States)

    Archival formalin-fixed paraffin-embedded (FFPE) samples offer a vast, untapped source of genomic data for biomarker discovery. However, the quality of FFPE samples is often highly variable, and conventional methods to assess RNA quality for RNA-sequencing (RNA-seq) are not infor...

  7. Total RNA Sequencing Analysis of DCIS Progressing to Invasive Breast Cancer

    Science.gov (United States)

    2015-09-01

    Assay Kits respectively on the Qubit 2.0 Fluorometer (Life Technologies). The BioRad Experion Automated Electrophoresis System RNA kit was used to...AWARD NUMBER: W81XWH-14-1-0080 TITLE: Total RNA Sequencing Analysis of DCIS Progressing to Invasive Breast Cancer. PRINCIPAL INVESTIGATOR...Aug 2015 4. TITLE AND SUBTITLE Total RNA Sequencing Analysis of DCIS Progressing to Invasive Breast Cancer. 5a. CONTRACT NUMBER 5b. GRANT

  8. High-throughput sequencing of human plasma RNA by using thermostable group II intron reverse transcriptases.

    Science.gov (United States)

    Qin, Yidan; Yao, Jun; Wu, Douglas C; Nottingham, Ryan M; Mohr, Sabine; Hunicke-Smith, Scott; Lambowitz, Alan M

    2016-01-01

    Next-generation RNA-sequencing (RNA-seq) has revolutionized transcriptome profiling, gene expression analysis, and RNA-based diagnostics. Here, we developed a new RNA-seq method that exploits thermostable group II intron reverse transcriptases (TGIRTs) and used it to profile human plasma RNAs. TGIRTs have higher thermostability, processivity, and fidelity than conventional reverse transcriptases, plus a novel template-switching activity that can efficiently attach RNA-seq adapters to target RNA sequences without RNA ligation. The new TGIRT-seq method enabled construction of RNA-seq libraries from RNA in 1-mL plasma samples from a healthy individual and revealed RNA fragments mapping to a diverse population of protein-coding genes and long ncRNAs, which are enriched in intron and antisense sequences, as well as nearly all known classes of small ncRNAs, some of which have never before been seen in plasma. Surprisingly, many of the small ncRNA species were present as full-length transcripts, suggesting that they are protected from plasma RNases in ribonucleoprotein (RNP) complexes and/or exosomes. This TGIRT-seq method is readily adaptable for profiling of whole-cell, exosomal, and miRNAs, and for related procedures, such as HITS-CLIP and ribosome profiling. © 2015 Qin et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  9. Maternal Plasma DNA and RNA Sequencing for Prenatal Testing.

    Science.gov (United States)

    Tamminga, Saskia; van Maarle, Merel; Henneman, Lidewij; Oudejans, Cees B M; Cornel, Martina C; Sistermans, Erik A

    2016-01-01

    Cell-free DNA (cfDNA) testing has recently become indispensable in diagnostic testing and screening. In the prenatal setting, this type of testing is often called noninvasive prenatal testing (NIPT). With a number of techniques, using either next-generation sequencing or single nucleotide polymorphism-based approaches, fetal cfDNA in maternal plasma can be analyzed to screen for rhesus D genotype, common chromosomal aneuploidies, and increasingly for testing other conditions, including monogenic disorders. With regard to screening for common aneuploidies, challenges arise when implementing NIPT in current prenatal settings. Depending on the method used (targeted or nontargeted), chromosomal anomalies other than trisomy 21, 18, or 13 can be detected, either of fetal or maternal origin, also referred to as unsolicited or incidental findings. For various biological reasons, there is a small chance of having either a false-positive or false-negative NIPT result, or no result, also referred to as a "no-call." Both pre- and posttest counseling for NIPT should include discussing potential discrepancies. Since NIPT remains a screening test, a positive NIPT result should be confirmed by invasive diagnostic testing (either by chorionic villus biopsy or by amniocentesis). As the scope of NIPT is widening, professional guidelines need to discuss the ethics of what to offer and how to offer. In this review, we discuss the current biochemical, clinical, and ethical challenges of cfDNA testing in the prenatal setting and its future perspectives including novel applications that target RNA instead of DNA. © 2016 Elsevier Inc. All rights reserved.

  10. Defining reference sequences for Nocardia species by similarity and clustering analyses of 16S rRNA gene sequence data.

    Directory of Open Access Journals (Sweden)

    Manal Helal

    BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance, of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm or the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for the dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as 'centroids' in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra
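    The two ideas in this abstract, picking a cluster 'centroid' as the sequence with the smallest total distance to the other members and labelling an unknown sequence by its nearest neighbours, can be illustrated with a minimal Python sketch. The distance matrix, cluster membership, and labels below are toy values invented for illustration; the function names are hypothetical and this is not the authors' LM algorithm.

```python
from collections import Counter


def medoid(dist, members):
    """Return the member index whose summed distance to all members is smallest."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))


def knn_label(dist_to_refs, ref_labels, k=3):
    """Return the majority label among the k references closest to the query."""
    nearest = sorted(range(len(ref_labels)), key=lambda i: dist_to_refs[i])[:k]
    return Counter(ref_labels[i] for i in nearest).most_common(1)[0][0]


# Toy symmetric distance matrix for four hypothetical sequences.
dist = [
    [0.0, 0.1, 0.2, 0.9],
    [0.1, 0.0, 0.1, 0.8],
    [0.2, 0.1, 0.0, 0.7],
    [0.9, 0.8, 0.7, 0.0],
]
cluster = [0, 1, 2]                 # assumed cluster membership
centroid = medoid(dist, cluster)    # sequence 1 has the smallest summed distance

# Distances from one unlabelled query to four labelled references (toy values).
query_dists = [0.9, 0.8, 0.7, 0.05]
labels = ["A", "A", "A", "B"]
nearest_one = knn_label(query_dists, labels, k=1)
nearest_three = knn_label(query_dists, labels, k=3)
```

    With k=1 the query takes the label of its single closest reference, while k=3 lets the majority of the three closest references decide, which is why the choice of k matters for small clusters.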

  11. A DNA sequence obtained by replacement of the dopamine RNA aptamer bases is not an aptamer

    DEFF Research Database (Denmark)

    Álvarez-Martos, Isabel; Ferapontova, Elena

    2017-01-01

    A unique specificity of the aptamer-ligand biorecognition and binding facilitates bioanalysis and biosensor development, contributing to discrimination of structurally related molecules, such as dopamine and other catecholamine neurotransmitters. The aptamer sequence capable of specific binding of dopamine is a 57-nucleotide-long RNA sequence reported in 1997 (Biochemistry, 1997, 36, 9726). Later, it was suggested that the DNA homologue of the RNA aptamer retains the specificity of dopamine binding (Biochem. Biophys. Res. Commun., 2009, 388, 732). Here, we show that the DNA sequence obtained by the replacement of the RNA aptamer bases for their DNA analogues is not capable of specific biorecognition of dopamine, in contrast to the original RNA aptamer sequence. This DNA sequence binds dopamine and structurally related catecholamine neurotransmitters non-specifically, as any DNA sequence, and, thus...

  12. Full-length mRNA sequencing uncovers a widespread coupling between transcription initiation and mRNA processing.

    Science.gov (United States)

    Anvar, Seyed Yahya; Allard, Guy; Tseng, Elizabeth; Sheynkman, Gloria M; de Klerk, Eleonora; Vermaat, Martijn; Yin, Raymund H; Johansson, Hans E; Ariyurek, Yavuz; den Dunnen, Johan T; Turner, Stephen W; 't Hoen, Peter A C

    2018-03-29

    The multifaceted control of gene expression requires tight coordination of regulatory mechanisms at transcriptional and post-transcriptional level. Here, we studied the interdependence of transcription initiation, splicing and polyadenylation events on single mRNA molecules by full-length mRNA sequencing. In MCF-7 breast cancer cells, we find 2700 genes with interdependent alternative transcription initiation, splicing and polyadenylation events, both in proximal and distant parts of mRNA molecules, including examples of coupling between transcription start sites and polyadenylation sites. The analysis of three human primary tissues (brain, heart and liver) reveals similar patterns of interdependency between transcription initiation and mRNA processing events. We predict thousands of novel open reading frames from full-length mRNA sequences and obtained evidence for their translation by shotgun proteomics. The mapping database rescues 358 previously unassigned peptides and improves the assignment of others. By recognizing sample-specific amino-acid changes and novel splicing patterns, full-length mRNA sequencing improves proteogenomics analysis of MCF-7 cells. Our findings demonstrate that our understanding of transcriptome complexity is far from complete and provides a basis to reveal largely unresolved mechanisms that coordinate transcription initiation and mRNA processing.

  13. Deep sequencing of RNA from ancient maize kernels

    DEFF Research Database (Denmark)

    Fordyce, Sarah Louise; Avila Arcos, Maria del Carmen; Rasmussen, Morten

    2013-01-01

    The characterization of biomolecules from ancient samples can shed otherwise unobtainable insights into the past. Despite the fundamental role of transcriptomal change in evolution, the potential of ancient RNA remains unexploited - perhaps due to dogma associated with the fragility of RNA. We hy...

  14. RStrucFam: a web server to associate structure and cognate RNA for RNA-binding proteins from sequence information.

    Science.gov (United States)

    Ghosh, Pritha; Mathew, Oommen K; Sowdhamini, Ramanathan

    2016-10-07

    RNA-binding proteins (RBPs) interact with their cognate RNA(s) to form large biomolecular assemblies. They are versatile in their functionality and are involved in a myriad of processes inside the cell. RBPs with similar structural features and common biological functions are grouped together into families and superfamilies. It will be useful to obtain an early understanding and association of RNA-binding property of sequences of gene products. Here, we report a web server, RStrucFam, to predict the structure, type of cognate RNA(s) and function(s) of proteins, where possible, from mere sequence information. The web server employs Hidden Markov Model scan (hmmscan) to enable association to a back-end database of structural and sequence families. The database (HMMRBP) comprises 437 HMMs of RBP families of known structure that have been generated using structure-based sequence alignments and 746 sequence-centric RBP family HMMs. The input protein sequence is associated with structural or sequence domain families, if structure or sequence signatures exist. In case of association of the protein with a family of known structures, output features such as a multiple structure-based sequence alignment (MSSA) of the query with all other members of that family are provided. Further, cognate RNA partner(s) for that protein, Gene Ontology (GO) annotations, if any, and a homology model of the protein can be obtained. The users can also browse through the database for details pertaining to each family, protein or RNA and their related information based on keyword search or RNA motif search. RStrucFam is a web server that exploits structurally conserved features of RBPs, derived from known family members and imprinted in mathematical profiles, to predict putative RBPs from sequence information. Proteins that fail to associate with such structure-centric families are further queried against the sequence-centric RBP family HMMs in the HMMRBP database. Further, all other essential

  15. Deep Sequencing of RNA from Ancient Maize Kernels

    Science.gov (United States)

    Rasmussen, Morten; Cappellini, Enrico; Romero-Navarro, J. Alberto; Wales, Nathan; Alquezar-Planas, David E.; Penfield, Steven; Brown, Terence A.; Vielle-Calzada, Jean-Philippe; Montiel, Rafael; Jørgensen, Tina; Odegaard, Nancy; Jacobs, Michael; Arriaza, Bernardo; Higham, Thomas F. G.; Ramsey, Christopher Bronk; Willerslev, Eske; Gilbert, M. Thomas P.

    2013-01-01

    The characterization of biomolecules from ancient samples can shed otherwise unobtainable insights into the past. Despite the fundamental role of transcriptomal change in evolution, the potential of ancient RNA remains unexploited – perhaps due to dogma associated with the fragility of RNA. We hypothesize that seeds offer a plausible refuge for long-term RNA survival, due to the fundamental role of RNA during seed germination. Using RNA-Seq on cDNA synthesized from nucleic acid extracts, we validate this hypothesis through demonstration of partial transcriptomal recovery from two sources of ancient maize kernels. The results suggest that ancient seed transcriptomics may offer a powerful new tool with which to study plant domestication. PMID:23326310

  16. Organism-specific rRNA capture system for application in next-generation sequencing.

    Directory of Open Access Journals (Sweden)

    Sai-Kam Li

    RNA-sequencing is a powerful tool in studying RNomics. However, highly abundant ribosomal RNAs (rRNA) and transfer RNAs (tRNA) predominate in the sequencing reads, thereby hindering the study of lowly expressed genes. Therefore, rRNA depletion prior to sequencing is often performed in order to preserve subtle alterations in gene expression, especially those at relatively low expression levels. One of the commercially available methods is to use DNA or RNA probes to hybridize to the target RNAs. However, there is always a concern with the non-specific binding and unintended removal of messenger RNA (mRNA) when the same set of probes is applied to different organisms. The degree of such unintended mRNA removal varies among organisms due to organism-specific genomic variation. We developed a computer-based method to design probes to deplete rRNA in an organism-specific manner. Based on the computation results, biotinylated RNA probes were produced by in vitro transcription and were used to perform rRNA depletion with subtractive hybridization. We demonstrated that the designed probes of 16S rRNAs and 23S rRNAs can efficiently remove rRNAs from Mycobacterium smegmatis. In comparison with a commercial subtractive hybridization-based rRNA removal kit, using organism-specific probes is better in preserving the RNA integrity and abundance. We believe the computer-based design approach can be used as a generic method in preparing RNA of any organism for next-generation sequencing, particularly for the transcriptome analysis of microbes.
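    A hybridization probe has to be antisense to its target, so the simplest starting point for probe design is the reverse complement of windows tiled along the rRNA. The Python sketch below shows only that step, under stated assumptions: the window length, step size, function names, and the toy target sequence are hypothetical, and the specificity screening against mRNA described in the abstract is not modelled.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_probe(target_rna):
    """Return the reverse complement of an RNA segment (5'->3')."""
    return target_rna.translate(COMPLEMENT)[::-1]


def tile_probes(target_rna, length=10, step=5):
    """Tile antisense probes of a fixed length along the target."""
    last_start = len(target_rna) - length
    return [
        antisense_probe(target_rna[start:start + length])
        for start in range(0, last_start + 1, step)
    ]


toy_rrna = "AUGCAUGCAUGCAUG"   # made-up 15-nt stand-in for an rRNA region
probes = tile_probes(toy_rrna, length=10, step=5)
```

    In a fuller design, each candidate probe would additionally be checked against the organism's transcriptome to avoid the unintended mRNA removal that the abstract highlights.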

  17. Sequence-specific RNA Photocleavage by Single-stranded DNA in Presence of Riboflavin

    Science.gov (United States)

    Zhao, Yongyun; Chen, Gangyi; Yuan, Yi; Li, Na; Dong, Juan; Huang, Xin; Cui, Xin; Tang, Zhuo

    2015-10-01

    Constant efforts have been made to develop new methods to realize sequence-specific RNA degradation, which could cause inhibition of the expression of the targeted gene. Herein, by using an unmodified short DNA oligonucleotide for sequence recognition and an endogenic small molecule, vitamin B2 (riboflavin), as photosensitizer, we report a simple strategy to realize the sequence-specific photocleavage of targeted RNA. The DNA strand is complementary to the target sequence to form a DNA/RNA duplex containing a G•U wobble in the middle. The cleavage reaction goes through an oxidative elimination mechanism at the nucleoside downstream of U of the G•U wobble in the duplex to obtain an unnatural RNA terminal, and the whole process is under tight control by using light as a switch, which means the cleavage could be carried out according to specific spatial and temporal requirements. The biocompatibility of this method makes the DNA strand in combination with riboflavin a promising molecular tool for RNA manipulation.

  18. Sequence-specific RNA Photocleavage by Single-stranded DNA in Presence of Riboflavin.

    Science.gov (United States)

    Zhao, Yongyun; Chen, Gangyi; Yuan, Yi; Li, Na; Dong, Juan; Huang, Xin; Cui, Xin; Tang, Zhuo

    2015-10-13

    Constant efforts have been made to develop new methods to realize sequence-specific RNA degradation, which could cause inhibition of the expression of the targeted gene. Herein, by using an unmodified short DNA oligonucleotide for sequence recognition and an endogenic small molecule, vitamin B2 (riboflavin), as photosensitizer, we report a simple strategy to realize the sequence-specific photocleavage of targeted RNA. The DNA strand is complementary to the target sequence to form a DNA/RNA duplex containing a G•U wobble in the middle. The cleavage reaction goes through an oxidative elimination mechanism at the nucleoside downstream of U of the G•U wobble in the duplex to obtain an unnatural RNA terminal, and the whole process is under tight control by using light as a switch, which means the cleavage could be carried out according to specific spatial and temporal requirements. The biocompatibility of this method makes the DNA strand in combination with riboflavin a promising molecular tool for RNA manipulation.

  19. Hybridization-based reconstruction of small non-coding RNA transcripts from deep sequencing data.

    Science.gov (United States)

    Ragan, Chikako; Mowry, Bryan J; Bauer, Denis C

    2012-09-01

    Recent advances in RNA sequencing technology (RNA-Seq) enable comprehensive profiling of RNAs by producing millions of short sequence reads from size-fractionated RNA libraries. Although conventional tools for detecting and distinguishing non-coding RNAs (ncRNAs) from reference-genome data can be applied to sequence data, ncRNA detection can be improved by harnessing the full information content provided by this new technology. Here we present NorahDesk, the first unbiased and universally applicable method for small ncRNA detection from RNA-Seq data. NorahDesk utilizes the coverage distribution of small RNA sequence data as well as thermodynamic assessments of secondary structure to reliably predict and annotate ncRNA classes. Using publicly available mouse sequence data from brain, skeletal muscle, testis and ovary, we evaluated our method with an emphasis on the performance for microRNAs (miRNAs) and piwi-interacting small RNA (piRNA). We compared our method with Dario and mirDeep2 and found that NorahDesk produces longer transcripts with higher read coverage. This feature makes it the first method particularly suitable for the prediction of both known and novel piRNAs.

  20. DNApi: A De Novo Adapter Prediction Algorithm for Small RNA Sequencing Data.

    Directory of Open Access Journals (Sweden)

    Junko Tsuji

    With the rapid accumulation of publicly available small RNA sequencing datasets, third-party meta-analysis across many datasets is becoming increasingly powerful. Although removing the 3´ adapter is an essential step for small RNA sequencing analysis, the adapter sequence information is not always available in the metadata. The information can also be erroneous even when it is available. In this study, we developed DNApi, a lightweight Python software package that predicts the 3´ adapter sequence de novo and provides the user with cleansed small RNA sequences ready for downstream analysis. Tested on 539 publicly available small RNA libraries accompanied with 3´ adapter sequences in their metadata, DNApi shows near-perfect accuracy (98.5%) with fast runtime (~2.85 seconds per library) and efficient memory usage (~43 MB on average). In addition to 3´ adapter prediction, it is also important to classify whether the input small RNA libraries were already processed, i.e. the 3´ adapters were removed. Given another batch of datasets, DNApi correctly judged that all 192 publicly available processed libraries were "ready-to-map" small RNA sequences. DNApi is compatible with Python 2 and 3, and is available at https://github.com/jnktsj/DNApi. The 731 small RNA libraries used for DNApi evaluation were from human tissues and were carefully and manually collected. This study also provides readers with the curated datasets that can be integrated into their studies.

  1. DNApi: A De Novo Adapter Prediction Algorithm for Small RNA Sequencing Data.

    Science.gov (United States)

    Tsuji, Junko; Weng, Zhiping

    2016-01-01

    With the rapid accumulation of publicly available small RNA sequencing datasets, third-party meta-analysis across many datasets is becoming increasingly powerful. Although removing the 3´ adapter is an essential step for small RNA sequencing analysis, the adapter sequence information is not always available in the metadata. The information can also be erroneous even when it is available. In this study, we developed DNApi, a lightweight Python software package that predicts the 3´ adapter sequence de novo and provides the user with cleansed small RNA sequences ready for downstream analysis. Tested on 539 publicly available small RNA libraries accompanied with 3´ adapter sequences in their metadata, DNApi shows near-perfect accuracy (98.5%) with fast runtime (~2.85 seconds per library) and efficient memory usage (~43 MB on average). In addition to 3´ adapter prediction, it is also important to classify whether the input small RNA libraries were already processed, i.e. the 3´ adapters were removed. Given another batch of datasets, DNApi correctly judged that all 192 publicly available processed libraries were "ready-to-map" small RNA sequences. DNApi is compatible with Python 2 and 3, and is available at https://github.com/jnktsj/DNApi. The 731 small RNA libraries used for DNApi evaluation were from human tissues and were carefully and manually collected. This study also provides readers with the curated datasets that can be integrated into their studies.
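    To make the idea of de novo adapter prediction concrete, the Python sketch below uses one naive heuristic: small RNA inserts differ from read to read, whereas the 3´ adapter is shared, so the k-mer present in the most reads (and, among ties, the one starting earliest on average) is taken as the adapter's leading edge. This is an illustrative assumption and not a description of DNApi's actual algorithm or API; the function names, reads, and adapter string are toy values.

```python
from collections import Counter


def guess_adapter_seed(reads, k=8):
    """Guess the start of a shared 3' adapter as the best-supported k-mer."""
    support = Counter()   # number of reads containing each k-mer
    pos_sum = {}          # summed first-occurrence position of each k-mer
    for read in reads:
        first_seen = {}
        for i in range(len(read) - k + 1):
            first_seen.setdefault(read[i:i + k], i)
        for kmer, pos in first_seen.items():
            support[kmer] += 1
            pos_sum[kmer] = pos_sum.get(kmer, 0) + pos
    # Most reads first; ties go to the earliest mean position, then alphabetical.
    return min(
        support,
        key=lambda km: (-support[km], pos_sum[km] / support[km], km),
    )


def trim(read, seed):
    """Cut the read at the first occurrence of the seed, if present."""
    cut = read.find(seed)
    return read if cut == -1 else read[:cut]


toy_adapter = "TGGAATTCTCGGGTGCCAAGG"     # toy adapter, for illustration only
toy_inserts = [
    "TGAGGTAGTAGGTTGTATAGTT",
    "TAGCTTATCAGACTGATGTTGA",
    "AACCCGTAGATCCGAACTTGTG",
]
reads = [insert + toy_adapter for insert in toy_inserts]

seed = guess_adapter_seed(reads)
trimmed = [trim(read, seed) for read in reads]
```

    A library that has already been trimmed would show no strongly supported shared k-mer near the read ends, which hints at how a "ready-to-map" check could be built on the same counts.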

  2. Predicting RNA secondary structures from sequence and probing data.

    Science.gov (United States)

    Lorenz, Ronny; Wolfinger, Michael T; Tanzer, Andrea; Hofacker, Ivo L

    2016-07-01

    RNA secondary structures have proven essential for understanding the regulatory functions performed by RNA such as microRNAs, bacterial small RNAs, or riboswitches. This success is in part due to the availability of efficient computational methods for predicting RNA secondary structures. Recent advances focus on dealing with the inherent uncertainty of prediction by considering the ensemble of possible structures rather than the single most stable one. Moreover, the advent of high-throughput structural probing has spurred the development of computational methods that incorporate such experimental data as auxiliary information. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
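    As a point of reference for what "predicting a secondary structure from sequence" means computationally, the Python sketch below implements the classic Nussinov base-pair maximization recursion. It is deliberately simpler than the thermodynamic and ensemble-based methods this review covers: it counts pairs instead of minimizing free energy and ignores probing data. The minimum hairpin loop size of 3 and the inclusion of G-U pairs are assumptions of this sketch.

```python
# Allowed pairs: Watson-Crick plus the G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov(seq, min_loop=3):
    """Return the maximum number of nested base pairs in seq."""
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]               # case 1: j stays unpaired
            for k in range(i, j - min_loop):     # case 2: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


hairpin_pairs = nussinov("GGGAAAUCC")   # stem of three pairs closing an AAA loop
```

    The table fill is cubic in sequence length; energy-based predictors use the same dynamic-programming layout but replace the pair count with loop free energies.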

  3. Mapping the miRNA interactome by cross-linking ligation and sequencing of hybrids (CLASH).

    Science.gov (United States)

    Helwak, Aleksandra; Tollervey, David

    2014-03-01

    RNA-RNA interactions have critical roles in many cellular processes, but studying them is difficult and laborious. Here we describe an experimental procedure, termed cross-linking ligation and sequencing of hybrids (CLASH), which allows high-throughput identification of sites of RNA-RNA interaction. During CLASH, a tagged bait protein is UV-cross-linked in cell cultures to stabilize RNA interactions, and it is purified under denaturing conditions. RNAs associated with the bait protein are partially truncated, and the ends of RNA duplexes are ligated together. After linker addition, cDNA library preparation and high-throughput sequencing, the ligated duplexes give rise to chimeric cDNAs, which unambiguously identify RNA-RNA interaction sites independent of bioinformatic predictions. This protocol is optimized for studying miRNA targets bound by Argonaute (AGO) proteins, but it should be easily adapted for other RNA-binding proteins and classes of RNA. The protocol requires ∼5 d to complete, excluding the time required for high-throughput sequencing and bioinformatic analyses.

  4. Combined DECS Analysis and Next-Generation Sequencing Enable Efficient Detection of Novel Plant RNA Viruses.

    Science.gov (United States)

    Yanagisawa, Hironobu; Tomita, Reiko; Katsu, Koji; Uehara, Takuya; Atsumi, Go; Tateda, Chika; Kobayashi, Kappei; Sekine, Ken-Taro

    2016-03-07

    The presence of high molecular weight double-stranded RNA (dsRNA) within plant cells is an indicator of infection with RNA viruses as these possess genomic or replicative dsRNA. DECS (dsRNA isolation, exhaustive amplification, cloning, and sequencing) analysis has been shown to be capable of detecting unknown viruses. We postulated that a combination of DECS analysis and next-generation sequencing (NGS) would improve detection efficiency and usability of the technique. Here, we describe a model case in which we efficiently detected the presumed genome sequence of Blueberry shoestring virus (BSSV), a member of the genus Sobemovirus, which has not so far been reported. dsRNAs were isolated from BSSV-infected blueberry plants using the dsRNA-binding protein, reverse-transcribed, amplified, and sequenced using NGS. A contig of 4,020 nucleotides (nt) that shared similarities with sequences from other Sobemovirus species was obtained as a candidate of the BSSV genomic sequence. Reverse transcription (RT)-PCR primer sets based on sequences from this contig enabled the detection of BSSV in all BSSV-infected plants tested but not in healthy controls. A recombinant protein encoded by the putative coat protein gene was bound by the BSSV-antibody, indicating that the candidate sequence was that of BSSV itself. Our results suggest that a combination of DECS analysis and NGS, designated here as "DECS-C," is a powerful method for detecting novel plant viruses.

  5. Bias in ligation-based small RNA sequencing library construction is determined by adaptor and RNA structure.

    Directory of Open Access Journals (Sweden)

    Ryan T Fuchs

    High-throughput sequencing (HTS) has become a powerful tool for the detection and sequence characterization of microRNAs (miRNAs) and other small RNAs (sRNAs). Unfortunately, the use of HTS data to determine the relative quantity of different miRNAs in a sample has been shown to be inconsistent with quantitative PCR and Northern blot results. Several recent studies have concluded that the major contributor to this inconsistency is bias introduced during the construction of sRNA libraries for HTS and that the bias is primarily derived from the adaptor ligation steps, specifically where single-stranded adaptors are sequentially ligated to the 3' and 5' ends of sRNAs using T4 RNA ligases. In this study we investigated the effects of ligation bias by using a pool of randomized ligation substrates, defined mixtures of miRNA sequences and several combinations of adaptors in HTS library construction. We show that, like the 3' adaptor ligation step, the 5' adaptor ligation is also biased, not because of primary sequence, but instead due to secondary structures of the two ligation substrates. We find that multiple secondary structural factors influence final representation in HTS results. Our results provide insight into the nature of ligation bias and allowed us to design adaptors that reduce ligation bias and produce HTS results that more accurately reflect the actual concentrations of miRNAs in the defined starting material.

  6. Exploring microbial diversity and taxonomy using SSU rRNA hypervariable tag sequencing.

    Directory of Open Access Journals (Sweden)

    Susan M Huse

    2008-11-01

    Massively parallel pyrosequencing of hypervariable regions from small subunit ribosomal RNA (SSU rRNA) genes can sample a microbial community two or three orders of magnitude more deeply per dollar and per hour than capillary sequencing of full-length SSU rRNA. As with full-length rRNA surveys, each sequence read is a tag surrogate for a single microbe. However, rather than assigning taxonomy by creating gene trees de novo that include all experimental sequences and certain reference taxa, we compare the hypervariable region tags to an extensive database of rRNA sequences and assign taxonomy based on the best match in a Global Alignment for Sequence Taxonomy (GAST) process. The resulting taxonomic census provides information on both composition and diversity of the microbial community. To determine the effectiveness of using only hypervariable region tags for assessing microbial community membership, we compared the taxonomy assigned to the V3 and V6 hypervariable regions with the taxonomy assigned to full-length SSU rRNA sequences isolated from both the human gut and a deep-sea hydrothermal vent. The hypervariable region tags and full-length rRNA sequences provided equivalent taxonomy and measures of relative abundance of microbial communities, even for tags up to 15% divergent from their nearest reference match. The greater sampling depth per dollar afforded by massively parallel pyrosequencing reveals many more members of the "rare biosphere" than does capillary sequencing of the full-length gene. In addition, tag sequencing eliminates cloning bias and the sequences are short enough to be completely sequenced in a single read, maximizing the number of organisms sampled in a run while minimizing chimera formation. This technique allows the cost-effective exploration of changes in microbial community structure, including the rare biosphere, over space and time and can be applied immediately to initiatives such as the Human Microbiome Project.
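
The best-match idea in the abstract above can be illustrated with a minimal sketch. It is not the GAST implementation: plain edit distance stands in for the global alignment step, and the reference sequences, taxon labels, and function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used here as a simple proxy for a global-alignment distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def assign_taxonomy(tag: str, references: dict) -> tuple:
    """Return (taxonomy, divergence) for the reference sequence closest to the tag.

    `references` maps reference sequence -> taxonomy string (hypothetical toy database).
    Divergence is the edit distance divided by the longer of the two lengths.
    """
    best_seq = min(references, key=lambda ref: edit_distance(tag, ref))
    dist = edit_distance(tag, best_seq)
    return references[best_seq], dist / max(len(tag), len(best_seq))


# Hypothetical reference database and query tag, for illustration only.
REFERENCES = {
    "ACGTACGTAC": "Bacteria;Firmicutes",
    "TTGGCCAATT": "Bacteria;Proteobacteria",
}
taxon, divergence = assign_taxonomy("ACGTACGAAC", REFERENCES)
```

Reporting the divergence alongside the assigned taxon makes it possible to flag tags that sit far from their nearest reference, the situation the abstract addresses when it mentions tags up to 15% divergent.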

  7. 3′ terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing

    Science.gov (United States)

    2013-01-01

    Background: Post-transcriptional 3′ end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3′ RACE coupled with high-throughput sequencing to characterize the 3′ terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. Results: The 3′ terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3′ terminus of an in vitro transcribed MRP RNA control and the differing 3′ terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). Conclusions: 3′ RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3′ terminal sequences of noncoding RNAs. PMID:24053768

  8. 3' terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing.

    Science.gov (United States)

    Goldfarb, Katherine C; Cech, Thomas R

    2013-09-21

    Post-transcriptional 3' end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3' RACE coupled with high-throughput sequencing to characterize the 3' terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. The 3' terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3' terminus of an in vitro transcribed MRP RNA control and the differing 3' terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). 3' RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3' terminal sequences of noncoding RNAs.

  9. Use of S1 nuclease in deep sequencing for detection of double-stranded RNA viruses.

    Science.gov (United States)

    Shimada, Saya; Nagai, Makoto; Moriyama, Hiromitsu; Fukuhara, Toshiyuki; Koyama, Satoshi; Omatsu, Tsutomu; Furuya, Tetsuya; Shirai, Junsuke; Mizutani, Tetsuya

    2015-09-01

    Metagenomic approaches using next-generation DNA sequencing have facilitated the detection of many pathogenic viruses from fecal samples. However, in many cases, the majority of the detected sequences originate from the host genome and bacterial flora in the gut. Here, to improve the efficiency of detecting double-stranded (ds) RNA viruses in samples, we evaluated the applicability of S1 nuclease to deep sequencing. Treating total RNA with S1 nuclease resulted in 1.5-28.4- and 10.1-208.9-fold increases in sequence reads of group A rotavirus in fecal and viral culture samples, respectively. Moreover, the increased coverage of mapping to reference sequences allowed for sufficient genotyping using analytical software. These results suggest that library construction using S1 nuclease is useful for deep sequencing in the detection of dsRNA viruses.

  10. Predicting sequence and structural specificities of RNA binding regions recognized by splicing factor SRSF1

    Directory of Open Access Journals (Sweden)

    Wang Xin

    2011-12-01

    Background: RNA-binding proteins (RBPs) play diverse roles in eukaryotic RNA processing. Despite their pervasive functions in coding and noncoding RNA biogenesis and regulation, elucidating the sequence specificities that define protein-RNA interactions remains a major challenge. Recently, CLIP-seq (cross-linking immunoprecipitation followed by high-throughput sequencing) has been successfully implemented to study the transcriptome-wide binding patterns of SRSF1, PTBP1, NOVA and fox2 proteins. These studies either adopted traditional methods like Multiple EM for Motif Elicitation (MEME) to discover the sequence consensus of RBP binding sites or used Z-score statistics to search for overrepresented nucleotides of a certain size. We argue that most of these methods are not well suited for RNA motif identification, as they are unable to incorporate the RNA structural context of protein-RNA interactions, which may affect binding specificity. Here, we describe a novel model-based approach, RNAMotifModeler, to identify the consensus of protein-RNA binding regions by integrating sequence features and RNA secondary structures. Results: As an example, we implemented RNAMotifModeler on SRSF1 (SF2/ASF) CLIP-seq data. The sequence-structural consensus we identified is a purine-rich octamer 'AGAAGAAG' in a highly single-stranded RNA context. The unpaired probabilities, that is, the probabilities of not forming pairs, are significantly higher than those of negative controls and of the flanking sequence surrounding the binding site, indicating that SRSF1 proteins tend to bind single-stranded RNA. Further statistical evaluations revealed that the second and fifth bases of the SRSF1 octamer motif have much stronger sequence specificities, but weaker single-strandedness, while the third, fourth, sixth and seventh bases are far more likely to be single-stranded, but have more degenerate sequence specificities. Therefore, we hypothesize that nucleotide specificity and

  11. MicroRNA Expression Profile in Penile Cancer Revealed by Next-Generation Small RNA Sequencing.

    Directory of Open Access Journals (Sweden)

    Li Zhang

    Penile cancer (PeCa) is a relatively rare tumor entity but has higher morbidity and mortality rates, especially in developing countries. To date, the pathogenic signaling pathways and core machineries involved in tumorigenesis and progression of PeCa remain to be elucidated. Several studies have suggested that miRNAs, which modulate gene expression at the posttranscriptional level, are frequently mis-regulated and aberrantly expressed in human cancers. However, the miRNA profile in human PeCa has not been reported before. In the present study, the miRNA profile was obtained from 10 fresh penile cancerous tissues and matched adjacent non-cancerous tissues via next-generation sequencing. A total of 751 and 806 annotated miRNAs were identified in normal and cancerous penile tissues, respectively. Among these, 56 miRNAs with significantly different expression levels between paired tissues were identified. Subsequently, several annotated miRNAs were selected randomly and validated using quantitative real-time PCR. Compared with previous publications regarding altered miRNA expression in various cancers, and especially genitourinary (prostate, bladder, kidney, testis) cancers, the majority of deregulated miRNAs showed a similar expression pattern in penile cancer. Moreover, the bioinformatics analyses suggested that the putative target genes of miRNAs differentially expressed between cancerous and matched normal penile tissues were tightly associated with cell junction, proliferation and growth as well as genomic instability, by modulating the Wnt, MAPK, p53, PI3K-Akt, Notch and TGF-β signaling pathways, all of which are well established to participate in cancer initiation and progression. Our work presents a global view of the differentially expressed miRNAs and potentially regulatory networks of their target genes for clarifying the pathogenic transformation of normal penis to PeCa, which research resource also

  12. Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud.

    Directory of Open Access Journals (Sweden)

    Malachi Griffith

    2015-08-01

    Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki.

  13. [Molecular relationship of Eurytrema coelmaticum inferred from 18S rRNA sequence].

    Science.gov (United States)

    Zheng, Ya-dong; Luo, Xue-nong; Shi, Cheng-hong; Zong, Rui-qian; Jing, Zhi-zhong; Cai, Xue-peng

    2006-10-01

    The aim was to elucidate the taxonomic position of Eurytrema coelmaticum using molecular methods. An 18S rRNA fragment was amplified from E. coelmaticum genomic DNA with specific conserved primers and sequenced. Homology among the 18S rRNA sequences of E. coelmaticum and other Dicrocoeliidae trematodes was analyzed with DNAStar, and a phylogenetic tree was constructed with MEGA3, to determine their evolutionary relationships. The E. coelmaticum 18S rRNA sequence showed high homology to those of Dicrocoelium dendriticum, Lyperosomum collurionis and Brachylecithum lobatum. The divergence of E. coelmaticum from D. dendriticum was 2.42%, and that from L. collurionis was 1.75%; D. dendriticum and B. lobatum were evolutionarily closer, with only 1.09% divergence. For Dicrocoeliidae trematodes, classification based on the 18S rRNA target is valid and the sequences are highly conserved. E. coelmaticum is evolutionarily closer to L. collurionis than to D. dendriticum and B. lobatum.
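
The percentage differences quoted above are pairwise comparisons between aligned sequences. As a rough sketch of what such a figure measures, the snippet below computes the uncorrected proportion of differing sites (p-distance) between two pre-aligned sequences. This is an assumption about the kind of measure involved, not the exact calculation performed by DNAStar or MEGA3, and the function name is hypothetical.

```python
def percent_divergence(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped, so the result is the
    uncorrected p-distance over the sites that can be compared.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * mismatches / compared


# Toy aligned fragments, for illustration only (not real 18S rRNA data).
example = percent_divergence("ACGTACGTAC", "ACGTACGTAT")
```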

  14. The nucleotide sequence of threonine transfer RNA coded by bacteriophage T4

    International Nuclear Information System (INIS)

    Guthrie, C.; Scholla, C.A.; Yesian, H.; Abelson, J.

    1978-01-01

    The nucleotide sequence of a low molecular weight RNA coded by bacteriophage T4 (and previously identified as species α) has been determined. The molecule is of particular biological interest for its associated biosynthetic properties. This RNA is 76 nucleotides in length, contains eight modified bases, and can be arranged in a cloverleaf configuration common to tRNAs. The anticodon sequence is UGU, which corresponds to the threonine-specific codons AC(A/G). The nucleotide sequence was determined primarily by nearest-neighbour analysis of RNA synthesized in vitro using [α-32P] nucleoside triphosphates. Using the single-strand specific nuclease S1, two in vivo labelled half-molecules were generated and analysed. This information, together with restrictions imposed by nearest-neighbour data, provided a unique linear sequence of nucleotides with the features of secondary structure common to tRNA molecules. (author)
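
The link between the UGU anticodon and the two threonine codons follows from antiparallel base pairing plus wobble at the first anticodon position. The sketch below works this out under a simplifying assumption: the anticodon bases are treated as unmodified and standard Crick wobble rules are applied, whereas the actual T4 tRNA contains modified bases. The table and function names are illustrative.

```python
# Watson-Crick complements for RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Crick wobble rules: first (5') anticodon base -> permitted third (3') codon bases.
# Assumes unmodified anticodon bases.
WOBBLE = {"U": "AG", "G": "CU", "C": "G", "A": "U"}


def codons_for_anticodon(anticodon: str) -> list:
    """Return the codons (5'->3') readable by an anticodon written 5'->3'."""
    first, second, third = anticodon
    # Pairing is antiparallel: anticodon position 3 pairs with codon position 1.
    prefix = COMPLEMENT[third] + COMPLEMENT[second]
    return [prefix + base for base in WOBBLE[first]]


threonine_codons = codons_for_anticodon("UGU")
```

For UGU this yields ACA and ACG, both of which encode threonine in the standard genetic code.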

  15. Barcoding bias in high-throughput multiplex sequencing of miRNA.

    Science.gov (United States)

    Alon, Shahar; Vigneault, Francois; Eminaga, Seda; Christodoulou, Danos C; Seidman, Jonathan G; Church, George M; Eisenberg, Eli

    2011-09-01

    Second-generation sequencing is gradually becoming the method of choice for miRNA detection and expression profiling. Given the relatively small number of miRNAs and improvements in DNA sequencing technology, studying miRNA expression profiles of multiple samples in a single flow cell lane becomes feasible. Multiplexing strategies require marking each miRNA library with a DNA barcode. Here we report that barcodes introduced through adapter ligation confer significant bias on miRNA expression profiles. This bias is much higher than the expected Poisson noise and masks significant expression differences between miRNA libraries. This bias can be eliminated by adding barcodes during PCR amplification of libraries. The accuracy of miRNA expression measurement in multiplexed experiments becomes a function of sample number.
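
One way to see what "higher than the expected Poisson noise" means: for pure counting noise, the variance of a read count across equally deep replicate libraries is about equal to its mean, so the variance-to-mean ratio is near 1. The sketch below computes that ratio. The counts are invented for illustration, and the assumption of equal sequencing depth across libraries is mine, not the authors'.

```python
from statistics import mean, variance


def dispersion_index(counts: list) -> float:
    """Variance-to-mean ratio of read counts for one miRNA across libraries.

    A value close to 1 is what Poisson sampling noise alone would give;
    values far above 1 indicate additional variation, such as barcode bias.
    """
    avg = mean(counts)
    if avg == 0:
        raise ValueError("mean count is zero")
    return variance(counts) / avg


# Hypothetical counts of one miRNA in four libraries made from the same sample
# with four different ligated barcodes (equal depth assumed).
same_sample_counts = [100, 400, 50, 250]
index = dispersion_index(same_sample_counts)
```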

  16. Comprehensive analysis of human small RNA sequencing data provides insights into expression profiles and miRNA editing.

    Science.gov (United States)

    Gong, Jing; Wu, Yuliang; Zhang, Xiantong; Liao, Yifang; Sibanda, Vusumuzi Leroy; Liu, Wei; Guo, An-Yuan

    2014-01-01

    MicroRNAs (miRNAs) play key regulatory roles in various biological processes and diseases. A comprehensive analysis of large-scale small RNA sequencing (smRNA-seq) data will be very helpful for exploring tissue- or disease-specific miRNA markers and uncovering miRNA variants. Here, we systematically analyzed 410 human smRNA-seq datasets, whose samples come from 24 tissue/disease/cell lines. We tested the mapping strategies and found that it was necessary to make multiple-round mappings with different mismatch parameters. miRNA expression profiles revealed that on average ∼70% of known miRNAs were expressed at low level or not expressed (RPM 100). About 30% of known miRNAs were not expressed in all of our used samples. The miRNA expression profiles were compiled into an online database (HMED, http://bioinfo.life.hust.edu.cn/smallRNA/). Dozens of tissue/disease-specific miRNAs, disease/control dysregulated miRNAs and miRNAs with arm switching events were discovered. Further, we identified some highly confident editing sites, including 24 A-to-I sites and 23 C-to-U sites. About half of them were widespread miRNA editing sites in different tissues. We found that the two types of editing sites have different features with regard to location, editing level and frequency. Our analyses of expression profiles, specific miRNA markers, arm switching, and editing sites may provide valuable information for further studies of miRNA function and biomarker finding.
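
The expression levels above are quoted in RPM (reads per million mapped reads). A minimal sketch of that normalization follows; the miRNA names and counts are hypothetical, and no expression thresholds are implied.

```python
def reads_per_million(counts: dict) -> dict:
    """Scale raw read counts so that they sum to one million per library."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no mapped reads")
    return {name: count * 1_000_000 / total for name, count in counts.items()}


# Hypothetical raw counts for three miRNAs in one small library.
example_counts = {"miR-a": 750, "miR-b": 249, "miR-c": 1}
rpm = reads_per_million(example_counts)
```

Because RPM divides by library size, values from datasets of different depth can be compared on the same scale.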

  17. Genetic selection and DNA sequences of 4.5S RNA homologs

    DEFF Research Database (Denmark)

    Brown, S; Thon, G; Tolentino, E

    1989-01-01

    A general strategy for cloning the functional homologs of an Escherichia coli gene was used to clone homologs of 4.5S RNA from other bacteria. The genes encoding these homologs were selected by their ability to complement a deletion of the gene for 4.5S RNA. DNA sequences of the regions encoding...

  18. Structure and sequence motifs in the HIV-1 RNA genome

    NARCIS (Netherlands)

    van Bel, N.

    2015-01-01

    The untranslated leader of the HIV-1 RNA genome contains some 350 nucleotides and is highly conserved among virus isolates. Several characteristic hairpin structures that regulate important virus replication steps, such as dimerization and packaging in virion particles, are clustered in this leader.

  19. High-Throughput Mapping of Single-Neuron Projections by Sequencing of Barcoded RNA.

    Science.gov (United States)

    Kebschull, Justus M; Garcia da Silva, Pedro; Reid, Ashlan P; Peikon, Ian D; Albeanu, Dinu F; Zador, Anthony M

    2016-09-07

    Neurons transmit information to distant brain regions via long-range axonal projections. In the mouse, area-to-area connections have only been systematically mapped using bulk labeling techniques, which obscure the diverse projections of intermingled single neurons. Here we describe MAPseq (Multiplexed Analysis of Projections by Sequencing), a technique that can map the projections of thousands or even millions of single neurons by labeling large sets of neurons with random RNA sequences ("barcodes"). Axons are filled with barcode mRNA, each putative projection area is dissected, and the barcode mRNA is extracted and sequenced. Applying MAPseq to the locus coeruleus (LC), we find that individual LC neurons have preferred cortical targets. By recasting neuroanatomy, which is traditionally viewed as a problem of microscopy, as a problem of sequencing, MAPseq harnesses advances in sequencing technology to permit high-throughput interrogation of brain circuits. Copyright © 2016 Elsevier Inc. All rights reserved.
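
Recasting projection mapping as sequencing means that each neuron's projection pattern becomes a row of barcode counts across dissected areas. The sketch below is an illustrative simplification, not the authors' analysis pipeline: it normalizes each barcode's counts to fractions and picks the area with the largest share. Barcodes, area names, and counts are hypothetical.

```python
def projection_profiles(barcode_counts: dict) -> dict:
    """Convert per-area barcode counts into per-barcode fractions.

    `barcode_counts` maps barcode -> {area: count}. Barcodes with no counts
    in any area are dropped.
    """
    profiles = {}
    for barcode, area_counts in barcode_counts.items():
        total = sum(area_counts.values())
        if total == 0:
            continue
        profiles[barcode] = {area: n / total for area, n in area_counts.items()}
    return profiles


def preferred_target(profile: dict) -> str:
    """Return the area that receives the largest fraction of a barcode's counts."""
    return max(profile, key=profile.get)


# Hypothetical counts for two barcoded neurons across two dissected areas.
example = {
    "ACGTTGCA": {"area_1": 90, "area_2": 10},
    "TTGACCGA": {"area_1": 5, "area_2": 45},
}
profiles = projection_profiles(example)
targets = {barcode: preferred_target(p) for barcode, p in profiles.items()}
```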

  20. In Silico Identification of RNA Modifications from High-Throughput Sequencing Data Using HAMR.

    Science.gov (United States)

    Kuksa, Pavel P; Leung, Yuk Yee; Vandivier, Lee E; Anderson, Zachary; Gregory, Brian D; Wang, Li-San

    2017-01-01

    RNA molecules are often altered post-transcriptionally by the covalent modification of their nucleotides. These modifications are known to modulate the structure, function, and activity of RNAs. When reverse transcribed into cDNA during RNA sequencing library preparation, atypical (modified) ribonucleotides that affect Watson-Crick base pairing will interfere with reverse transcriptase (RT), resulting in cDNA products with mis-incorporated bases or prematurely terminated RNA products. These interactions with RT can therefore be inferred from mismatch patterns in the sequencing reads, and are distinguishable from simple base-calling errors, single-nucleotide polymorphisms (SNPs), or RNA editing sites. Here, we describe a computational protocol for the in silico identification of modified ribonucleotides from RT-based RNA-seq read-out using the High-throughput Analysis of Modified Ribonucleotides (HAMR) software. HAMR can identify these modifications transcriptome-wide with single nucleotide resolution, and also differentiate between different types of modifications to predict modification identity. Researchers can use HAMR to identify and characterize RNA modifications using RNA-seq data from a variety of common RT-based sequencing protocols such as Poly(A), total RNA-seq, and small RNA-seq.
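
To make the idea of separating modification-induced mismatches from base-calling errors concrete, the sketch below asks whether the number of mismatching reads at a site exceeds what a fixed sequencing error rate would plausibly produce, using a one-sided binomial tail. This is an illustrative test of my own choosing, not HAMR's actual statistical procedure; the error rate, the significance cutoff, and the function names are assumptions.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_candidate_site(mismatches: int, depth: int,
                      error_rate: float = 0.01, alpha: float = 1e-3) -> bool:
    """Flag a site whose mismatch count is unlikely under base-calling error alone.

    `error_rate` and `alpha` are assumed values chosen for illustration.
    """
    return binomial_tail(mismatches, depth, error_rate) < alpha


# Ten mismatching reads out of 100 are very unlikely at a 1% error rate;
# a single mismatch out of 100 is not.
strong_site = is_candidate_site(10, 100)
weak_site = is_candidate_site(1, 100)
```

A real analysis would also need to exclude SNPs and RNA editing sites, which the abstract names as separate sources of mismatches.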

  1. Nascent RNA sequencing reveals distinct features in plant transcription

    OpenAIRE

    Hetzel, Jonathan; Duttke, Sascha H.; Benner, Christopher; Chory, Joanne

    2016-01-01

    Transcription is a fundamental and dynamic step in the regulation of gene expression, but the characteristics of plant transcription are poorly understood. We adapted the global nuclear run-on sequencing (GRO-seq) and 5′GRO-seq methods for plants and provide a plant version of the next-generation sequencing software HOMER (homer.ucsd.edu/homer/plants) to facilitate data analysis. Mapping nascent transcripts in Arabidopsis thaliana seedlings enabled identification of known and novel transcript...

  2. Highly divergent 16S rRNA sequences in ribosomal operons of Scytonema hyalinum (Cyanobacteria).

    Directory of Open Access Journals (Sweden)

    Jeffrey R Johansen

    A highly divergent 16S rRNA gene was found in one of the five ribosomal operons present in a species complex currently circumscribed as Scytonema hyalinum (Nostocales, Cyanobacteria) using clone libraries. If 16S rRNA sequence macroheterogeneity among ribosomal operons due to insertions, deletions or truncation is excluded, the sequence heterogeneity observed in S. hyalinum was the highest observed in any prokaryotic species thus far (7.3-9.0%). The secondary structure of the 16S rRNA molecules encoded by the two divergent operons was nearly identical, indicating possible functionality. The 23S rRNA gene was examined for a few strains in this complex, and it was also found to be highly divergent from the gene in Type 2 operons (8.7%), and likewise had nearly identical secondary structure between the Type 1 and Type 2 operons. Furthermore, the 16S-23S ITS showed marked differences consistent between operons among numerous strains. Both operons have promoter sequences that satisfy consensus requirements for functional prokaryotic transcription initiation. Horizontal gene transfer from another unknown heterocytous cyanobacterium is considered the most likely explanation for the origin of this molecule, but does not explain the ultimate origin of this sequence, which is very divergent from all 16S rRNA sequences found thus far in cyanobacteria. The divergent sequence is highly conserved among numerous strains of S. hyalinum, suggesting adaptive advantage and selective constraint of the divergent sequence.

  3. Sequence-specific cleavage of dsRNA by Mini-III RNase.

    Science.gov (United States)

    Głów, Dawid; Pianka, Dariusz; Sulej, Agata A; Kozłowski, Łukasz P; Czarnecka, Justyna; Chojnowski, Grzegorz; Skowronek, Krzysztof J; Bujnicki, Janusz M

    2015-03-11

    Ribonucleases (RNases) play a critical role in RNA processing and degradation by hydrolyzing phosphodiester bonds (exo- or endonucleolytically). Many RNases that cut RNA internally exhibit substrate specificity, but their target sites are usually limited to one or a few specific nucleotides in single-stranded RNA and often in a context of a particular three-dimensional structure of the substrate. Thus far, no RNase counterparts of restriction enzymes have been identified which could cleave double-stranded RNA (dsRNA) in a sequence-specific manner. Here, we present evidence for a sequence-dependent cleavage of long dsRNA by RNase Mini-III from Bacillus subtilis (BsMiniIII). Analysis of the sites cleaved by this enzyme in limited digest of bacteriophage Φ6 dsRNA led to the identification of a consensus target sequence. We defined nucleotide residues within the preferred cleavage site that affected the efficiency of the cleavage and were essential for the discrimination of cleavable versus non-cleavable dsRNA sequences. We have also determined that the loop α5b-α6, a distinctive structural element in Mini-III RNases, is crucial for the specific cleavage, but not for dsRNA binding. Our results suggest that BsMiniIII may serve as a prototype of a sequence-specific dsRNase that could possibly be used for targeted cleavage of dsRNA. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. Complete Genome Sequence of a Double-Stranded RNA Virus from Avocado

    Science.gov (United States)

    Villanueva, Francisco; Sabanadzovic, Sead; Valverde, Rodrigo A.

    2012-01-01

    A number of avocado (Persea americana) cultivars are known to contain high-molecular-weight double-stranded RNA (dsRNA) molecules for which a viral nature has been suggested, although sequence data are not available. Here we report the cloning and complete sequencing of a 13.5-kbp dsRNA virus isolated from avocado and show that it corresponds to the genome of a new species of the genus Endornavirus (family Endornaviridae), tentatively named Persea americana endornavirus (PaEV). PMID:22205720

  5. Analysis of 16S rRNA amplicon sequencing options on the Roche/454 next-generation titanium sequencing platform.

    Directory of Open Access Journals (Sweden)

    Hideyuki Tamaki

    BACKGROUND: The 16S rRNA gene pyrosequencing approach has revolutionized studies in microbial ecology. While primer selection and short read length can affect the resulting microbial community profile, little is known about the influence of pyrosequencing methods on the sequencing throughput and the outcome of microbial community analyses. The aim of this study is to compare differences in output, ease, and cost among three different amplicon pyrosequencing methods for the Roche/454 Titanium platform. METHODOLOGY/PRINCIPAL FINDINGS: The following three pyrosequencing methods for 16S rRNA genes were selected in this study: Method-1 (standard method) is the recommended method for bi-directional sequencing using the LIB-A kit; Method-2 is a new option designed in this study for unidirectional sequencing with the LIB-A kit; and Method-3 uses the LIB-L kit for unidirectional sequencing. In our comparison among these three methods using 10 different environmental samples, Method-2 and Method-3 produced 1.5-1.6 times more useable reads than the standard method (Method-1) after quality-based trimming, and did not compromise the outcome of microbial community analyses. Specifically, Method-3 is the most cost-effective unidirectional amplicon sequencing method as it provided the most reads and required the least effort in consumables management. CONCLUSIONS: Our findings clearly demonstrated that alternative pyrosequencing methods for 16S rRNA genes could drastically affect sequencing output (e.g. number of reads before and after trimming) but have little effect on the outcomes of microbial community analysis. This finding is important for both researchers and sequencing facilities utilizing 16S rRNA gene pyrosequencing for microbial ecological studies.

  6. 4SALE--a tool for synchronous RNA sequence and secondary structure alignment and editing.

    Science.gov (United States)

    Seibel, Philipp N; Müller, Tobias; Dandekar, Thomas; Schultz, Jörg; Wolf, Matthias

    2006-11-13

    In sequence analysis the multiple alignment builds the fundament of all proceeding analyses. Errors in an alignment could strongly influence all succeeding analyses and therefore could lead to wrong predictions. Hand-crafted and hand-improved alignments are necessary and meanwhile good common practice. For RNA sequences often the primary sequence as well as a secondary structure consensus is well known, e.g., the cloverleaf structure of the t-RNA. Recently, some alignment editors are proposed that are able to include and model both kinds of information. However, with the advent of a large amount of reliable RNA sequences together with their solved secondary structures (available from e.g. the ITS2 Database), we are faced with the problem to handle sequences and their associated secondary structures synchronously. 4SALE fills this gap. The application allows a fast sequence and synchronous secondary structure alignment for large data sets and for the first time synchronous manual editing of aligned sequences and their secondary structures. This study describes an algorithm for the synchronous alignment of sequences and their associated secondary structures as well as the main features of 4SALE used for further analyses and editing. 4SALE builds an optimal and unique starting point for every RNA sequence and structure analysis. 4SALE, which provides an user-friendly and intuitive interface, is a comprehensive toolbox for RNA analysis based on sequence and secondary structure information. The program connects sequence and structure databases like the ITS2 Database to phylogeny programs as for example the CBCAnalyzer. 4SALE is written in JAVA and therefore platform independent. The software is freely available and distributed from the website at http://4sale.bioapps.biozentrum.uni-wuerzburg.de.

  7. 4SALE – A tool for synchronous RNA sequence and secondary structure alignment and editing

    Directory of Open Access Journals (Sweden)

    Schultz Jörg

    2006-11-01

    Background: In sequence analysis the multiple alignment builds the fundament of all proceeding analyses. Errors in an alignment could strongly influence all succeeding analyses and therefore could lead to wrong predictions. Hand-crafted and hand-improved alignments are necessary and meanwhile good common practice. For RNA sequences often the primary sequence as well as a secondary structure consensus is well known, e.g., the cloverleaf structure of the t-RNA. Recently, some alignment editors are proposed that are able to include and model both kinds of information. However, with the advent of a large amount of reliable RNA sequences together with their solved secondary structures (available from e.g. the ITS2 Database), we are faced with the problem to handle sequences and their associated secondary structures synchronously. Results: 4SALE fills this gap. The application allows a fast sequence and synchronous secondary structure alignment for large data sets and for the first time synchronous manual editing of aligned sequences and their secondary structures. This study describes an algorithm for the synchronous alignment of sequences and their associated secondary structures as well as the main features of 4SALE used for further analyses and editing. 4SALE builds an optimal and unique starting point for every RNA sequence and structure analysis. Conclusion: 4SALE, which provides an user-friendly and intuitive interface, is a comprehensive toolbox for RNA analysis based on sequence and secondary structure information. The program connects sequence and structure databases like the ITS2 Database to phylogeny programs as for example the CBCAnalyzer. 4SALE is written in JAVA and therefore platform independent. The software is freely available and distributed from the website at http://4sale.bioapps.biozentrum.uni-wuerzburg.de

  8. Secondary structure-based analysis of mouse brain small RNA sequences obtained by using next-generation sequencing.

    Science.gov (United States)

    Kiyosawa, Hidenori; Okumura, Akio; Okui, Saya; Ushida, Chisato; Kawai, Gota

    2015-08-01

    In order to find novel structured small RNAs, next-generation sequencing was applied to small RNA fractions with lengths ranging from 40 to 140 nt and secondary structure-based clustering was performed. Sequences of structured RNAs were effectively clustered and analyzed by secondary structure. Although more than 99% of the obtained sequences were known RNAs, 16 candidate mouse structured small non-coding RNAs (MsncRs) were isolated. Based on these results, the merits of secondary structure-based analysis are discussed. Copyright © 2015 Elsevier Inc. All rights reserved.

  9. miRNA Nomenclature : A View Incorporating Genetic Origins, Biosynthetic Pathways, and Sequence Variants

    NARCIS (Netherlands)

    Desvignes, T.; Batzel, P.; Berezikov, E.; Eilbeck, K.; Eppig, J. T.; McAndrews, M. S.; Singer, A.; Postlethwait, J. H.

    2015-01-01

    High-throughput sequencing of miRNAs has revealed the diversity and variability of mature and functional short noncoding RNAs, including their genomic origins, biogenesis pathways, sequence variability, and newly identified products such as miRNA-offset RNAs (moRs). Here we review known cases of

  10. Modeling RNA Secondary Structure with Sequence Comparison and Experimental Mapping Data.

    Science.gov (United States)

    Tan, Zhen; Sharma, Gaurav; Mathews, David H

    2017-07-25

    Secondary structure prediction is an important problem in RNA bioinformatics because knowledge of structure is critical to understanding the functions of RNA sequences. Significant improvements in prediction accuracy have recently been demonstrated through the incorporation of experimentally obtained structural information, for instance using selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) mapping. However, such mapping data is currently available only for a limited number of RNA sequences. In this article, we present a method for extending the benefit of experimental mapping data in secondary structure prediction to homologous sequences. Specifically, we propose a method for integrating experimental mapping data into a comparative sequence analysis algorithm for secondary structure prediction of multiple homologs, whereby the mapping data benefits not only the prediction for the specific sequence that was mapped but also other homologs. The proposed method is realized by modifying the TurboFold II algorithm for prediction of RNA secondary structures to utilize basepairing probabilities guided by SHAPE experimental data when such data are available. The SHAPE-mapping-guided basepairing probabilities are obtained using the RSample method. Results demonstrate that the SHAPE mapping data for a sequence improves structure prediction accuracy of other homologous sequences beyond the accuracy obtained by sequence comparison alone (TurboFold II). The updated version of TurboFold II is freely available as part of the RNAstructure software package. Copyright © 2017 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  11. How to design a single-cell RNA-sequencing experiment: pitfalls, challenges and perspectives.

    Science.gov (United States)

    Dal Molin, Alessandra; Di Camillo, Barbara

    2018-01-31

    The sequencing of the transcriptome of single cells, or single-cell RNA-sequencing, has now become the dominant technology for the identification of novel cell types in heterogeneous cell populations or for the study of stochastic gene expression. In recent years, various experimental methods and computational tools for analysing single-cell RNA-sequencing data have been proposed. However, most of them are tailored to different experimental designs or biological questions, and in many cases, their performance has not been benchmarked yet, thus increasing the difficulty for a researcher to choose the optimal single-cell transcriptome sequencing (scRNA-seq) experiment and analysis workflow. In this review, we aim to provide an overview of the currently available experimental and computational methods developed to handle single-cell RNA-sequencing data and, based on their peculiarities, we suggest possible analysis frameworks depending on specific experimental designs. Together, we propose an evaluation of challenges and open questions and future perspectives in the field. In particular, we go through the different steps of scRNA-seq experimental protocols such as cell isolation, messenger RNA capture, reverse transcription, amplification and use of quantitative standards such as spike-ins and Unique Molecular Identifiers (UMIs). We then analyse the current methodological challenges related to preprocessing, alignment, quantification, normalization, batch effect correction and methods to control for confounding effects. © The Author(s) 2018. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  12. ISVASE: identification of sequence variant associated with splicing event using RNA-seq data.

    Science.gov (United States)

    Aljohi, Hasan Awad; Liu, Wanfei; Lin, Qiang; Yu, Jun; Hu, Songnian

    2017-06-28

    Precise and efficient exon recognition and splicing by the spliceosome is the key to generating mature mRNAs. About one third to one half of disease-related mutations affect RNA splicing. The software PVAAS has been developed to identify variants associated with aberrant splicing by directly using RNA-seq data. However, it is based on the assumption that annotated splicing sites represent normal splicing, which is in fact not true. We developed ISVASE, a tool for specifically identifying sequence variants associated with splicing events (SVASE) by using RNA-seq data. Compared with PVAAS, our tool has several advantages, such as multi-pass stringent rule-dependent filters and statistical filters, use of split-reads only, independent sequence variant identification in each part of a splicing junction, sequence variant detection for both known and novel splicing events, additional exon-exon junction shift event detection if known splicing events are provided, splicing signal evaluation, support for known DNA mutation and/or RNA editing data, higher precision and consistency, and short running time. Using a realistic RNA-seq dataset, we performed a case study to illustrate the functionality and effectiveness of our method. Moreover, the output of SVASEs can be used for downstream analysis such as splicing regulatory element study and sequence variant functional analysis. ISVASE is useful for researchers interested in sequence variants (DNA mutation and/or RNA editing) associated with splicing events. The package is freely available at https://sourceforge.net/projects/isvase/ .

  13. Deep Sequencing Reveals a MicroRNA Expression Signature in Triple-Negative Breast Cancer.

    Science.gov (United States)

    Chang, Yao-Yin; Lai, Liang-Chuan; Tsai, Mong-Hsun; Chuang, Eric Y

    2018-01-01

    Deep sequencing is an advanced technology in genomic biology to detect the precise order of nucleotides in a strand of a DNA/RNA molecule. The analysis of deep sequencing data also requires sophisticated knowledge in both computational software and bioinformatics. In this chapter, the procedures of deep sequencing analysis of the microRNA (miRNA) transcriptome in triple-negative breast cancer and adjacent normal tissue are described in detail. As miRNAs are critical regulators of gene expression and many of them were previously reported to be associated with the malignant progression of human cancer, an analytical method that accurately identifies deregulated miRNAs in a specific type of cancer is important for the understanding of its tumor behavior. We obtained raw sequence reads of miRNA expression from 24 triple-negative breast cancers and 14 adjacent normal tissues using deep sequencing technology in this work. Expression data of miRNA reads were normalized with the quantile-quantile scaling method and were analyzed statistically. A miRNA expression signature composed of 25 differentially expressed miRNAs was shown to be an effective classifier between triple-negative breast cancers and adjacent normal tissues in a hierarchical clustering analysis.

  14. Small RNA Deep Sequencing and the Effects of microRNA408 on Root Gravitropic Bending in Arabidopsis

    Science.gov (United States)

    Li, Huasheng; Lu, Jinying; Sun, Qiao; Chen, Yu; He, Dacheng; Liu, Min

    2015-11-01

    MicroRNA (miRNA) is a non-coding small RNA composed of 20 to 24 nucleotides that influences plant root development. This study analyzed the miRNA expression in Arabidopsis root tip cells using Illumina sequencing and real-time PCR before (sample 0) and 15 min after (sample 15) a 3-D clinostat rotational treatment was administered. After stimulation was performed, the expression levels of seven miRNA genes, including Arabidopsis miR160, miR161, miR394, miR402, miR403, miR408, and miR823, were significantly upregulated. Illumina sequencing results also revealed two novel miRNAs that have not been previously reported. The target genes of these miRNAs included pentatricopeptide repeat-containing protein and diadenosine tetraphosphate hydrolase. An overexpression vector of Arabidopsis miR408 was constructed and transferred to Arabidopsis plants. The roots of plants overexpressing miR408 exhibited a slower reorientation upon gravistimulation in comparison with those of wild-type plants. This result indicates that miR408 could play a role in the root gravitropic response.

  15. Sequence organization of the Acanthamoeba rRNA intergenic spacer: identification of transcriptional enhancers.

    Science.gov (United States)

    Yang, Q; Zwick, M G; Paule, M R

    1994-01-01

    The primary sequence of the entire 2330 bp intergenic spacer of the A. castellanii ribosomal RNA gene was determined. Repeated sequence elements averaging 140 bp were identified and found to bind a protein required for optimum initiation at the core promoter. These repeated elements were shown to stimulate rRNA transcription by RNA polymerase I in vitro. The repeats inhibited transcription when placed in trans, and stimulated transcription when in cis, in either orientation, but only when upstream of the core promoter. Thus, these repeated elements have characteristics similar to polymerase I enhancers found in higher eukaryotes. The number of rRNA repeats in Acanthamoeba cells was determined to be 24 per haploid genome, the lowest number so far identified in any eukaryote. However, because Acanthamoeba is polyploid, each cell contains approximately 600 rRNA genes. PMID:7984432

  16. Identification Exon Skipping Events From High-Throughput RNA Sequencing Data.

    Science.gov (United States)

    Bai, Yang; Ji, Shufan; Jiang, Qinghua; Wang, Yadong

    2015-07-01

    The emergence of next-generation high-throughput RNA sequencing (RNA-Seq) provides tremendous opportunities for researchers to analyze alternative splicing on a genome-wide scale. However, accurate identification of alternative splicing events from RNA-Seq data has remained an unresolved challenge in next-generation sequencing (NGS) studies. Identifying exon skipping (ES) events is an essential part of genome-wide alternative splicing event identification. In this paper, we propose a novel method, ESFinder, a random forest classifier to identify ES events from RNA-Seq data. ESFinder conducts thorough studies on predictive features and selects appropriate features according to their relevance for ES event identification. Experimental results on real human skeletal muscle and brain RNA-Seq data show that ESFinder could effectively predict ES events with high predictive accuracy. The code of ESFinder is available at http://mlg.hit.edu.cn/ybai/ES/ESFinder.html.

  17. The PETfold and PETcofold web servers for intra- and intermolecular structures of multiple RNA sequences

    DEFF Research Database (Denmark)

    Seemann, Ernst Stefan; Menzel, Karl Peter; Backofen, Rolf

    2011-01-01

    to interactive usage of the predictors. Additionally, the web servers provide direct access to annotated RNA alignments, such as the Rfam 10.0 database and multiple alignments of 16 vertebrate genomes with human. The web servers are freely available at: http://rth.dk/resources/petfold/...... gene. We present web servers to analyze multiple RNA sequences for common RNA structure and for RNA interaction sites. The web servers are based on the recent PET (Probabilistic Evolutionary and Thermodynamic) models PETfold and PETcofold, but add user friendly features ranging from a graphical layer...

  18. Maternal Plasma DNA and RNA Sequencing for Prenatal Testing

    NARCIS (Netherlands)

    Tamminga, Saskia; van Maarle, Merel; Henneman, Lidewij; Oudejans, Cees B. M.; Cornel, Martina C.; Sistermans, Erik A.

    2016-01-01

    Cell-free DNA (cfDNA) testing has recently become indispensable in diagnostic testing and screening. In the prenatal setting, this type of testing is often called noninvasive prenatal testing (NIPT). With a number of techniques, using either next-generation sequencing or single nucleotide

  19. Sequence analysis of mitochondrial 16S ribosomal RNA gene ...

    Indian Academy of Sciences (India)

    Mosquitoes are vectors for the transmission of many human pathogens that include viruses, nematodes and protozoa. For the understanding of their vectorial capacity, identification of disease carrying and refractory strains is essential. Recently, molecular taxonomic techniques have been utilized for this purpose. Sequence ...

  20. Sequence analysis of mitochondrial 16S ribosomal RNA gene

    Indian Academy of Sciences (India)

    Mosquitoes are vectors for the transmission of many human pathogens that include viruses, nematodes and protozoa. For the understanding of their vectorial capacity, identification of disease carrying and refractory strains is essential. Recently, molecular taxonomic techniques have been utilized for this purpose. Sequence ...

  1. Optimization of extraction of circulating RNAs from plasma--enabling small RNA sequencing.

    Science.gov (United States)

    Spornraft, Melanie; Kirchner, Benedikt; Haase, Bettina; Benes, Vladimir; Pfaffl, Michael W; Riedmaier, Irmgard

    2014-01-01

    There are several protocols and kits for the extraction of circulating RNAs from plasma followed by quantification of specific genes via RT-qPCR. Due to the marginal amount of cell-free RNA in plasma samples, the total RNA yield is insufficient to perform Next-Generation Sequencing (NGS), the state-of-the-art technology in massive parallel sequencing that enables a comprehensive characterization of the whole transcriptome. Screening the transcriptome for biomarker signatures accelerates progress in biomarker profiling for molecular diagnostics, early disease detection or food safety. Therefore, the aim was to optimize a method that enables the extraction of sufficient amounts of total RNA from bovine plasma to generate good-quality small RNA Sequencing (small RNA-Seq) data. An increased volume of plasma (9 ml) was processed using the Qiagen miRNeasy Serum/Plasma Kit in combination with the QIAvac24 Plus system, a vacuum manifold that enables handling of high volumes during RNA isolation. 35 ng of total RNA were passed on to cDNA library preparation followed by small RNA high-throughput sequencing analysis on the Illumina HiSeq2000 platform. Raw sequencing reads were processed by a data analysis pipeline using different free software solutions. Seq-data was trimmed, quality checked, gradually selected for miRNAs/piRNAs and aligned to small RNA reference annotation indexes. Mapping to human reference indexes resulted in 4.8±2.8% of mature miRNAs and 1.4±0.8% of piRNAs, and in 5.0±2.9% of mature miRNAs for Bos taurus.

  2. Optimization of extraction of circulating RNAs from plasma--enabling small RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Melanie Spornraft

    Full Text Available There are several protocols and kits for the extraction of circulating RNAs from plasma followed by quantification of specific genes via RT-qPCR. Due to the marginal amount of cell-free RNA in plasma samples, the total RNA yield is insufficient to perform Next-Generation Sequencing (NGS), the state-of-the-art technology in massive parallel sequencing that enables a comprehensive characterization of the whole transcriptome. Screening the transcriptome for biomarker signatures accelerates progress in biomarker profiling for molecular diagnostics, early disease detection or food safety. Therefore, the aim was to optimize a method that enables the extraction of sufficient amounts of total RNA from bovine plasma to generate good-quality small RNA Sequencing (small RNA-Seq) data. An increased volume of plasma (9 ml) was processed using the Qiagen miRNeasy Serum/Plasma Kit in combination with the QIAvac24 Plus system, a vacuum manifold that enables handling of high volumes during RNA isolation. 35 ng of total RNA were passed on to cDNA library preparation followed by small RNA high-throughput sequencing analysis on the Illumina HiSeq2000 platform. Raw sequencing reads were processed by a data analysis pipeline using different free software solutions. Seq-data was trimmed, quality checked, gradually selected for miRNAs/piRNAs and aligned to small RNA reference annotation indexes. Mapping to human reference indexes resulted in 4.8±2.8% of mature miRNAs and 1.4±0.8% of piRNAs, and in 5.0±2.9% of mature miRNAs for Bos taurus.

  3. Bioinformatical approaches to RNA structure prediction & Sequencing of an ancient human genome

    DEFF Research Database (Denmark)

    Lindgreen, Stinus

    tools that exist. The second part has been focused on the mapping and genotyping of ancient genomic DNA. The development of next generation sequencing technologies combined with the use of ancient DNA material presents researchers with some special challenges in the analyses. This work resulted...... in the publication of the first genome of an ancient human individual, where close to the theoretical maximum of the genome sequence was recovered with high confidence. Part of the project was the development of the program SNPest for genotyping and SNP calling that models various sources of error and predicts...... in families of related RNA sequences. Also, the program MASTR was developed to perform simultaneous alignment of multiple RNA sequences and prediction of a common secondary structure. The webserver WAR was developed to make it easy for non-computer-savvy researchers to use the many RNA structure prediction...

  4. Cloning and sequencing of full-length cDNAs of RNA1 and RNA2 of a Tomato black ring virus isolate from Poland.

    Science.gov (United States)

    Jończyk, M; Le Gall, O; Pałucha, A; Borodynko, N; Pospieszny, H

    2004-04-01

    Full-length cDNA clones corresponding to the RNA1 and RNA2 of the Polish isolate MJ of Tomato black ring virus (TBRV, genus Nepovirus) were obtained using a direct recombination strategy in yeast, and their complete nucleotide sequences were established. RNA1 is 7358 nucleotides and RNA2 is 4633 nucleotides in length, excluding the poly(A) tails. Both RNAs contain a single open reading frame encoding polyproteins of 254 kDa and 149 kDa for RNA1 and RNA2 respectively. Putative cleavage sites were identified, and the relationships between TBRV and related nepoviruses were studied by sequence comparison.

  5. Nodavirus Coat Protein Imposes Dodecahedral RNA Structure Independent of Nucleotide Sequence and Length

    Science.gov (United States)

    Tihova, Mariana; Dryden, Kelly A.; Le, Thuc-vy L.; Harvey, Stephen C.; Johnson, John E.; Yeager, Mark; Schneemann, Anette

    2004-01-01

    The nodavirus Flock house virus (FHV) has a bipartite, positive-sense RNA genome that is packaged into an icosahedral particle displaying T=3 symmetry. The high-resolution X-ray structure of FHV has shown that 10 bp of well-ordered, double-stranded RNA are located at each of the 30 twofold axes of the virion, but it is not known which portions of the genome form these duplex regions. The regular distribution of double-stranded RNA in the interior of the virus particle indicates that large regions of the encapsidated genome are engaged in secondary structure interactions. Moreover, the RNA is restricted to a topology that is unlikely to exist during translation or replication. We used electron cryomicroscopy and image reconstruction to determine the structure of four types of FHV particles that differed in RNA and protein content. RNA-capsid interactions were primarily mediated via the N and C termini, which are essential for RNA recognition and particle assembly. A substantial fraction of the packaged nucleic acid, either viral or heterologous, was organized as a dodecahedral cage of duplex RNA. The similarity in tertiary structure suggests that RNA folding is independent of sequence and length. Computational modeling indicated that RNA duplex formation involves both short-range and long-range interactions. We propose that the capsid protein is able to exploit the plasticity of the RNA secondary structures, capturing those that are compatible with the geometry of the dodecahedral cage. PMID:14990708

  6. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes.

    Science.gov (United States)

    Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

    2012-07-15

    In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1% accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license.
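
    The abstract above mentions k-mer searching as the step that finds suitable reference sequences before alignment. The sketch below shows the general idea only: rank references by the number of distinct k-mers they share with a query. It is not SINA's implementation; the function names, the value k=4 and the toy sequences are assumptions chosen for illustration.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_references(query: str, references: dict[str, str],
                    k: int = 4, top: int = 2) -> list[tuple[str, int]]:
    """Rank reference sequences by the number of k-mers shared with the query.

    Hypothetical helper: ties are broken alphabetically by reference name.
    """
    query_kmers = kmers(query, k)
    scored = [(len(query_kmers & kmers(ref, k)), name)
              for name, ref in references.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(name, score) for score, name in scored[:top]]


# Toy data, invented for this example.
references = {
    "refA": "ACGUACGUGGCA",
    "refB": "UUUUGGGGCCCC",
    "refC": "ACGUACGAGGCA",
}
best = rank_references("ACGUACGUGGCA", references)
```

    Shared k-mer counting needs no alignment, so it can screen a large reference set cheaply; the more expensive alignment step is then run only against the top-ranked references.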

  7. Comparative analysis of hepatitis B virus polymerase sequences required for viral RNA binding, RNA packaging, and protein priming.

    Science.gov (United States)

    Jones, Scott A; Clark, Daniel N; Cao, Feng; Tavis, John E; Hu, Jianming

    2014-02-01

    Hepatitis B virus replicates a DNA genome through reverse transcription of a pregenomic RNA (pgRNA) by using a multifunctional polymerase (HP). A critical function of HP is its specific association with a viral RNA signal, termed ε (Hε), located on pgRNA, which is required for specific packaging of pgRNA into viral nucleocapsids and initiation of viral reverse transcription. HP initiates reverse transcription by using itself as a protein primer (protein priming) and Hε as the obligatory template. HP is made up of four domains, including the terminal protein (TP), the spacer, the reverse transcriptase (RT), and the RNase H domains. A recently developed, Hε-dependent, in vitro protein priming assay was used in this study to demonstrate that almost the entire TP and RT domains and most of the RNase H domain were required for protein priming. Specific residues within TP, RT, and the spacer were identified as being critical for HP-Hε binding and/or protein priming. Comparison of HP sequence requirements for Hε binding, pgRNA packaging, and protein priming allowed the classification of the HP mutants into five groups, each with distinct effects on these complex and related processes. Detailed characterization of HP requirements for these related and essential functions of HP will further elucidate the mechanisms of its multiple functions and aid in the targeting of these functions for antiviral therapy.

  8. Genetic diagnosis of Mendelian disorders via RNA sequencing.

    Science.gov (United States)

    Kremer, Laura S; Bader, Daniel M; Mertes, Christian; Kopajtich, Robert; Pichler, Garwin; Iuso, Arcangela; Haack, Tobias B; Graf, Elisabeth; Schwarzmayr, Thomas; Terrile, Caterina; Koňaříková, Eliška; Repp, Birgit; Kastenmüller, Gabi; Adamski, Jerzy; Lichtner, Peter; Leonhardt, Christoph; Funalot, Benoit; Donati, Alice; Tiranti, Valeria; Lombes, Anne; Jardel, Claude; Gläser, Dieter; Taylor, Robert W; Ghezzi, Daniele; Mayr, Johannes A; Rötig, Agnes; Freisinger, Peter; Distelmaier, Felix; Strom, Tim M; Meitinger, Thomas; Gagneur, Julien; Prokisch, Holger

    2017-06-12

    Across a variety of Mendelian disorders, ∼50-75% of patients do not receive a genetic diagnosis by exome sequencing indicating disease-causing variants in non-coding regions. Although genome sequencing in principle reveals all genetic variants, their sizeable number and poorer annotation make prioritization challenging. Here, we demonstrate the power of transcriptome sequencing to molecularly diagnose 10% (5 of 48) of mitochondriopathy patients and identify candidate genes for the remainder. We find a median of one aberrantly expressed gene, five aberrant splicing events and six mono-allelically expressed rare variants in patient-derived fibroblasts and establish disease-causing roles for each kind. Private exons often arise from cryptic splice sites providing an important clue for variant prioritization. One such event is found in the complex I assembly factor TIMMDC1 establishing a novel disease-associated gene. In conclusion, our study expands the diagnostic tools for detecting non-exonic variants and provides examples of intronic loss-of-function variants with pathological relevance.

  9. Subcellular RNA sequencing reveals broad presence of cytoplasmic intron-sequence retaining transcripts in mouse and rat neurons.

    Directory of Open Access Journals (Sweden)

    Mugdha Khaladkar

    Full Text Available Recent findings have revealed the complexity of the transcriptional landscape in mammalian cells. One recently described class of novel transcripts are the Cytoplasmic Intron-sequence Retaining Transcripts (CIRTs), hypothesized to confer post-transcriptional regulatory function. For instance, the neuronal CIRT KCNMA1i16 contributes to the firing properties of hippocampal neurons. Intronic sub-sequence retention within IL1-β mRNA in anucleate platelets has been implicated in activity-dependent splicing and translation. In a recent study, we showed CIRTs harbor functional SINE ID elements which are hypothesized to mediate dendritic localization in neurons. Based on these studies and others, we hypothesized that CIRTs may be present in a broad set of transcripts and comprise novel signals for post-transcriptional regulation. We carried out a transcriptome-wide survey of CIRTs by sequencing micro-dissected subcellular RNA fractions. We sequenced two batches of 150-300 individually dissected dendrites from primary cultures of hippocampal neurons in rat and three batches from mouse hippocampal neurons. After statistical processing to minimize artifacts, we found a broad prevalence of CIRTs in the neurons in both species (44-60% of the expressed transcripts). The sequence patterns, including stereotypical length, biased inclusion of specific introns, and intron-intron junctions, suggested CIRT-specific nuclear processing. Our analysis also suggested that these cytoplasmic intron-sequence retaining transcripts may serve as a primary transcript for ncRNAs. Our results show that retaining intronic sequences is not isolated to a few loci but may be a genome-wide phenomenon for embedding functional signals within certain mRNA. The results hypothesize a novel source of cis-sequences for post-transcriptional regulation. Our results hypothesize two potentially novel splicing pathways: one, within the nucleus for CIRT biogenesis; and another, within the

  10. Deep Sequencing Analysis of Nucleolar Small RNAs: RNA Isolation and Library Preparation.

    Science.gov (United States)

    Bai, Baoyan; Laiho, Marikki

    2016-01-01

    The nucleolus is a subcellular compartment with an essential function in ribosome biogenesis. The nucleolus is rich in noncoding RNAs, mostly the ribosomal RNAs and small nucleolar RNAs. Surprisingly, several miRNAs have also been detected in the nucleolus, raising the question as to whether other small RNA species are present and functional in the nucleolus. We have developed a strategy for stepwise enrichment of nucleolar small RNAs from the total nucleolar RNA extracts and subsequent construction of nucleolar small RNA libraries which are suitable for deep sequencing. Our method successfully isolates the small RNA population from total RNAs and monitors the RNA quality in each step to ensure that small RNAs recovered represent the actual small RNA population in the nucleolus and not degradation products from larger RNAs. We have further applied this approach to characterize the distribution of small RNAs in different cellular compartments.

  11. An Optimized Transient Dual Luciferase Assay for Quantifying MicroRNA Directed Repression of Targeted Sequences

    Directory of Open Access Journals (Sweden)

    Richard L. Moyle

    2017-09-01

    Full Text Available Studies investigating the action of small RNAs on computationally predicted target genes require some form of experimental validation. Classical molecular methods of validating microRNA action on target genes are laborious, while approaches that tag predicted target sequences to qualitative reporter genes encounter technical limitations. The aim of this study was to address the challenge of experimentally validating large numbers of computationally predicted microRNA-target transcript interactions using an optimized, quantitative, cost-effective, and scalable approach. The presented method combines transient expression via agroinfiltration of Nicotiana benthamiana leaves with a quantitative dual luciferase reporter system, where firefly luciferase is used to report the microRNA-target sequence interaction and Renilla luciferase is used as an internal standard to normalize expression between replicates. We report the appropriate concentration of N. benthamiana leaf extracts and dilution factor to apply in order to avoid inhibition of firefly LUC activity. Furthermore, the optimal ratio of microRNA precursor expression construct to reporter construct and duration of the incubation period post-agroinfiltration were determined. The optimized dual luciferase assay provides an efficient, repeatable and scalable method to validate and quantify microRNA action on predicted target sequences. The optimized assay was used to validate five predicted targets of rice microRNA miR529b, with as few as six technical replicates. The assay can be extended to assess other small RNA-target sequence interactions, including assessing the functionality of an artificial miRNA or an RNAi construct on a targeted sequence.
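
    The normalization described in the abstract above, with firefly luciferase as the reporter and Renilla luciferase as the internal standard, can be sketched as a simple ratio calculation. This is a minimal illustration under stated assumptions and not the authors' analysis: the function names are hypothetical, the readings are invented, and expressing the result relative to a no-miRNA control is an assumption about how such data are commonly summarized.

```python
from statistics import mean


def mean_ratio(readings: list[tuple[float, float]]) -> float:
    """Mean firefly/Renilla ratio over replicate (firefly, renilla) readings."""
    for _, renilla in readings:
        if renilla <= 0:
            raise ValueError("Renilla reading must be positive")
    return mean(firefly / renilla for firefly, renilla in readings)


def relative_activity(with_mirna: list[tuple[float, float]],
                      without_mirna: list[tuple[float, float]]) -> float:
    """Normalized reporter activity with the miRNA, relative to the control.

    Values below 1.0 would be consistent with repression of the target sequence.
    """
    return mean_ratio(with_mirna) / mean_ratio(without_mirna)


# Invented luminescence readings: (firefly, renilla) per replicate.
with_mirna = [(200.0, 1000.0), (300.0, 1000.0)]
without_mirna = [(500.0, 1000.0), (500.0, 1000.0)]
activity = relative_activity(with_mirna, without_mirna)
```

    Dividing each firefly reading by the Renilla reading from the same replicate removes differences in overall expression between replicates, so the comparison reflects the reporter signal rather than how well a given leaf patch was transformed.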

  12. CPSS: a computational platform for the analysis of small RNA deep sequencing data.

    Science.gov (United States)

    Zhang, Yuanwei; Xu, Bo; Yang, Yifan; Ban, Rongjun; Zhang, Huan; Jiang, Xiaohua; Cooke, Howard J; Xue, Yu; Shi, Qinghua

    2012-07-15

    Next generation sequencing (NGS) techniques have been widely used to document the small ribonucleic acids (RNAs) implicated in a variety of biological, physiological and pathological processes. An integrated computational tool is needed for handling and analysing the enormous datasets from small RNA deep sequencing approach. Herein, we present a novel web server, CPSS (a computational platform for the analysis of small RNA deep sequencing data), designed to completely annotate and functionally analyse microRNAs (miRNAs) from NGS data on one platform with a single data submission. Small RNA NGS data can be submitted to this server with analysis results being returned in two parts: (i) annotation analysis, which provides the most comprehensive analysis for small RNA transcriptome, including length distribution and genome mapping of sequencing reads, small RNA quantification, prediction of novel miRNAs, identification of differentially expressed miRNAs, piwi-interacting RNAs and other non-coding small RNAs between paired samples and detection of miRNA editing and modifications and (ii) functional analysis, including prediction of miRNA targeted genes by multiple tools, enrichment of gene ontology terms, signalling pathway involvement and protein-protein interaction analysis for the predicted genes. CPSS, a ready-to-use web server that integrates most functions of currently available bioinformatics tools, provides all the information wanted by the majority of users from small RNA deep sequencing datasets. CPSS is implemented in PHP/PERL+MySQL+R and can be freely accessed at http://mcg.ustc.edu.cn/db/cpss/index.html or http://mcg.ustc.edu.cn/sdap1/cpss/index.html.
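
    One of the annotation outputs listed in this abstract, the length distribution of sequencing reads, can be sketched in a few lines. This is an illustration only, not CPSS code; the function name and the reads are hypothetical.

```python
from collections import Counter


def length_distribution(reads):
    """Return {read_length: count}, sorted by read length."""
    counts = Counter(len(read) for read in reads)
    return dict(sorted(counts.items()))


# Hypothetical adapter-trimmed reads (synthetic sequences, illustration only).
reads = ["A" * 22, "C" * 22, "G" * 21, "T" * 30]
distribution = length_distribution(reads)
print(distribution)
```

    In small RNA data a peak around 21 to 23 nt is usually attributed to miRNAs, so this histogram is a common first quality check.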

  13. Purification and sequence analysis of the mRNA coding for an immunoglobulin heavy chain

    International Nuclear Information System (INIS)

    Cowan, N.J.; Secher, D.S.; Milstein, C.

    1976-01-01

    A mutant cell line (IF2) derived from the mouse myeloma MOPC 21 has been used for the isolation and sequence analysis of H-chain mRNA. The IF2 cells synthesise an H-chain of reduced size in which the CH1 homology region is missing. Sizing of the IF2 H-chain mRNA and wild-type H-chain mRNA revealed that the deletion is expressed at the mRNA level. The mutant H-chain mRNA sedimented at 16-S, enabling effective resolution from 18-S ribosomal RNA. In experiments using IF2 cells labelled with [32P]phosphate, the 16-S mRNA was purified by oligo(T)-cellulose chromatography. Polyacrylamide gel analysis of the poly(A)-containing fraction showed the presence of a single radioactive band. Comparison of the mobility of this band relative to markers of known molecular weight revealed that the molecule contained about 1,600 nucleotides. Digestion of the 32P-labelled mRNA with T1 ribonuclease and two-dimensional fractionation of the resulting oligonucleotides yielded a 'fingerprint' suitable for a preliminary sequence analysis. By using the established amino acid sequence of the IF2 H-chain and a knowledge of the genetic code, 14 oligonucleotides were assigned within the constant region and four within the variable region of the IF2 H-chain. These sequence data account for 19.5% of the coding region. Several other oligonucleotides, which could not be assigned within the coding region but which occurred in approximately molar yield, have also been partially characterised. These oligonucleotides are presumably derived from the untranslated regions of the mRNA.

  14. High-quality RNA extraction from copepods for Next Generation Sequencing: A comparative study.

    Science.gov (United States)

    Asai, Sneha; Ianora, Adrianna; Lauritano, Chiara; Lindeque, Penelope K; Carotenuto, Ylenia

    2015-12-01

    Despite the ecological importance of copepods, few Next Generation Sequencing (NGS) studies have been performed on small crustaceans, and a standard method for RNA extraction is lacking. In this study, we compared three commonly-used methods: TRIzol®, Aurum Total RNA Mini Kit and Qiagen RNeasy Micro Kit, in combination with preservation reagents TRIzol® or RNAlater®, to obtain high-quality and quantity of RNA from copepods for NGS. Total RNA was extracted from the copepods Calanus helgolandicus, Centropages typicus and Temora stylifera and its quantity and quality were evaluated using NanoDrop, agarose gel electrophoresis and Agilent Bioanalyzer. Our results demonstrate that preservation of copepods in RNAlater® and extraction with Qiagen RNeasy Micro Kit was the optimal isolation method for high-quality and quantity of RNA for NGS studies of C. helgolandicus. Intriguingly, C. helgolandicus 28S rRNA is formed by two subunits that separate after heat-denaturation and migrate along with 18S rRNA. This unique property of protostome RNA has never been reported in copepods. Overall, our comparative study on RNA extraction protocols will help increase gene expression studies on copepods using high-throughput applications, such as RNA-Seq and microarrays. Copyright © 2014 Elsevier B.V. All rights reserved.

  15. A DNA sequence obtained by replacement of the dopamine RNA aptamer bases is not an aptamer.

    Science.gov (United States)

    Álvarez-Martos, Isabel; Ferapontova, Elena E

    2017-08-05

    A unique specificity of the aptamer-ligand biorecognition and binding facilitates bioanalysis and biosensor development, contributing to discrimination of structurally related molecules, such as dopamine and other catecholamine neurotransmitters. The aptamer sequence capable of specific binding of dopamine is a 57-nucleotide-long RNA sequence reported in 1997 (Biochemistry, 1997, 36, 9726). Later, it was suggested that the DNA homologue of the RNA aptamer retains the specificity of dopamine binding (Biochem. Biophys. Res. Commun., 2009, 388, 732). Here, we show that the DNA sequence obtained by the replacement of the RNA aptamer bases with their DNA analogues is not capable of specific biorecognition of dopamine, in contrast to the original RNA aptamer sequence. This DNA sequence binds dopamine and structurally related catecholamine neurotransmitters non-specifically, as any DNA sequence does, and thus is not an aptamer and can be used neither for in vivo nor for in situ analysis of dopamine in the presence of structurally related neurotransmitters. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters.

    Science.gov (United States)

    Core, Leighton J; Waterfall, Joshua J; Lis, John T

    2008-12-19

    RNA polymerases are highly regulated molecular machines. We present a method (global run-on sequencing, GRO-seq) that maps the position, amount, and orientation of transcriptionally engaged RNA polymerases genome-wide. In this method, nuclear run-on RNA molecules are subjected to large-scale parallel sequencing and mapped to the genome. We show that peaks of promoter-proximal polymerase reside on approximately 30% of human genes, transcription extends beyond pre-messenger RNA 3' cleavage, and antisense transcription is prevalent. Additionally, most promoters have an engaged polymerase upstream and in an orientation opposite to the annotated gene. This divergent polymerase is associated with active genes but does not elongate effectively beyond the promoter. These results imply that the interplay between polymerases and regulators over broad promoter regions dictates the orientation and efficiency of productive transcription.

  17. Deep sequencing of RNA from immune cell-derived vesicles uncovers the selective incorporation of small non-coding RNA biotypes with potential regulatory functions.

    Science.gov (United States)

    Nolte-'t Hoen, Esther N M; Buermans, Henk P J; Waasdorp, Maaike; Stoorvogel, Willem; Wauben, Marca H M; 't Hoen, Peter A C

    2012-10-01

    Cells release RNA-carrying vesicles and membrane-free RNA/protein complexes into the extracellular milieu. Horizontal vesicle-mediated transfer of such shuttle RNA between cells allows dissemination of genetically encoded messages, which may modify the function of target cells. Other studies used array analysis to establish the presence of microRNAs and mRNA in cell-derived vesicles from many sources. Here, we used an unbiased approach by deep sequencing of small RNA released by immune cells. We found a large variety of small non-coding RNA species representing pervasive transcripts or RNA cleavage products overlapping with protein coding regions, repeat sequences or structural RNAs. Many of these RNAs were enriched relative to cellular RNA, indicating that cells destine specific RNAs for extracellular release. Among the most abundant small RNAs in shuttle RNA were sequences derived from vault RNA, Y-RNA and specific tRNAs. Many of the highly abundant small non-coding transcripts in shuttle RNA are evolutionarily well conserved and have previously been associated with gene regulatory functions. These findings allude to a wider range of biological effects that could be mediated by shuttle RNA than previously expected. Moreover, the data present leads for unraveling how cells modify the function of other cells via transfer of specific non-coding RNA species.

  18. Thousands of corresponding human and mouse genomic regions unalignable in primary sequence contain common RNA structure

    DEFF Research Database (Denmark)

    Torarinsson, Elfar; Sawera, Milena; Havgaard, Jakob Hull

    2006-01-01

    Human and mouse genome sequences contain roughly 100,000 regions that are unalignable in primary sequence and neighbor corresponding alignable regions between both organisms. These pairs are generally assumed to be nonconserved, although the level of structural conservation between these has never been investigated. Owing to the limitations in computational methods, comparative genomics has been lacking the ability to compare such nonconserved sequence regions for conserved structural RNA elements. We have investigated the presence of structural RNA elements by conducting a local structural alignment, using FOLDALIGN, on a subset of these 100,000 corresponding regions and estimate that 1800 contain common RNA structures. Comparing our results with the recent mapping of transcribed fragments (transfrags) in human, we find that high-scoring candidates are twice as likely to be found in regions...

  19. Hierarchical folding of multiple sequence alignments for the prediction of structures and RNA-RNA interactions

    Directory of Open Access Journals (Sweden)

    Gorodkin Jan

    2010-05-01

    Background: Many regulatory non-coding RNAs (ncRNAs) function through complementary binding with mRNAs or other ncRNAs, e.g., microRNAs, snoRNAs and bacterial sRNAs. Predicting these RNA interactions is essential for functional studies of putative ncRNAs or for the design of artificial RNAs. Many ncRNAs show clear signs of undergoing compensating base changes over evolutionary time. Here, we postulate that a non-negligible part of the existing RNA-RNA interactions contain preserved but covarying patterns of interactions. Methods: We present a novel method that takes compensating base changes across the binding sites into account. The algorithm works in two steps on two pre-generated multiple alignments. In the first step, individual base pairs with high reliability are found using the PETfold algorithm, which includes evolutionary and thermodynamic properties. In step two (where high reliability base pairs from step one are constrained as unpaired), the principle of cofolding is combined with hierarchical folding. The final prediction of intra- and inter-molecular base pairs consists of the reliabilities computed from the constrained expected accuracy scoring, which is an extended version of that used for individual multiple alignments. Results: We derived a rather extensive algorithm. One of the advantages of our approach (in contrast to other RNA-RNA interaction prediction methods) is the application of covariance detection and prediction of pseudoknots between intra- and inter-molecular base pairs. As a proof of concept, we show an example and discuss the strengths and weaknesses of the approach.

  20. Determining mutant spectra of three RNA viral samples using ultra-deep sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Chen, H

    2012-06-06

    RNA viruses have extremely high mutation rates that enable the virus to adapt to new host environments and even jump from one species to another. As part of a viral transmission study, three viral samples collected from naturally infected animals were sequenced using Illumina paired-end technology at ultra-deep coverage. In order to determine the mutant spectra within the viral quasispecies, it is critical to understand the sequencing error rates and control for false positive calls of viral variants (point mutations). I will estimate the sequencing error rate from two control sequences and characterize the mutant spectra in the natural samples with this error rate.
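
    The idea of estimating an error rate from control sequences and then using it to screen out false-positive variant calls can be sketched as follows. The abstract does not say which statistical test was used; the binomial tail test, the function names and all counts below are assumptions chosen for illustration.

```python
import math


def error_rate(mismatches, total_bases):
    """Per-base error rate estimated from reads of a known control sequence."""
    return mismatches / total_bases


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    if k <= 0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def is_variant(alt_reads, depth, rate, alpha=1e-3):
    """Call a variant when the alt-read count is unlikely under error alone."""
    return binom_tail(alt_reads, depth, rate) < alpha


# Hypothetical control: 50 mismatches in 100,000 sequenced control bases.
rate = error_rate(50, 100_000)
# Hypothetical sites at 10,000x depth (about 5 erroneous reads expected).
strong_site = is_variant(30, 10_000, rate)
weak_site = is_variant(6, 10_000, rate)
print(rate, strong_site, weak_site)
```

    A real analysis would also need to correct for testing many sites and for error rates that differ by base, position and sequence context.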

  1. Efficient RNA extraction protocol for the wood mangrove species Laguncularia racemosa suited for next-generation RNA sequencing

    International Nuclear Information System (INIS)

    Wilwerth, M. W.; Rossetto, P.

    2016-01-01

    Mangrove flora and habitat have immeasurable importance in marine and coastal ecology as well as in the economy. Despite their importance, they are constantly threatened by oil spill accidents and environmental contamination; therefore, it is crucial to understand the changes in gene expression to better predict toxicity in these plants. Among the species of Atlantic coast mangrove (Americas and Africa), Laguncularia racemosa, or white mangrove, is a conspicuous species. The wide distribution of L. racemosa in areas where marine oil exploration is rapidly increasing makes it a candidate mangrove species model to uncover the impact of oil spills at the molecular level with the use of massive transcriptome sequencing. However, for this purpose, the RNA extraction protocol should ensure low levels of contaminants and structure integrity. In this study, eight RNA extraction methods were tested and analysed using downstream applications. The InviTrap Spin Plant RNA Mini Kit performed best with regard to purity and integrity. Moreover, the obtained RNA was submitted to cDNA synthesis and RT-PCR, successfully generating amplification products of the expected size. These results show the applicability of the RNA obtained here for downstream methodologies, such as the construction of cDNA libraries for the Illumina HiSeq platform. (author)

  2. MicroRNA identity and abundance in porcine skeletal muscles determined by deep sequencing

    DEFF Research Database (Denmark)

    Nielsen, M; Hansen, J H; Hedegaard, J

    2010-01-01

    MicroRNAs (miRNA) are short single-stranded RNA molecules that regulate gene expression post-transcriptionally by binding to complementary sequences in the 3' untranslated region (3' UTR) of target mRNAs. MiRNAs participate in the regulation of myogenesis, and identification of the complete set o...... that highly expressed miRNAs are involved in skeletal muscle development and regeneration, signal transduction, cell-cell and cell-extracellular matrix communication and neural development and function....

  3. Sequencing and characterisation of an extensive Atlantic salmon (Salmo salar L.) microRNA repertoire.

    Directory of Open Access Journals (Sweden)

    Michaël Bekaert

    Atlantic salmon (Salmo salar L.), a member of the family Salmonidae, is a totemic species of ecological and cultural significance that is also economically important in terms of both sports fisheries and aquaculture. These factors have promoted the continuous development of genomic resources for this species, furthering both fundamental and applied research. MicroRNAs (miRNAs) are small endogenous non-coding RNA molecules that control spatial and temporal expression of targeted genes through post-transcriptional regulation. While miRNAs have been characterised in detail for many other species, this is not yet the case for Atlantic salmon. To identify miRNAs from Atlantic salmon, we constructed whole fish miRNA libraries for 18 individual juveniles (fry, four months post hatch) and characterised them by Illumina high-throughput sequencing (total of 354,505,167 paired-end reads). We report an extensive and partly novel repertoire of miRNA sequences, comprising 888 miRNA genes (547 unique mature miRNA sequences), quantify their expression levels in basal conditions, examine their homology to miRNAs from other species and identify their predicted target genes. We also identify the location and putative copy number of the miRNA genes in the draft Atlantic salmon reference genome sequence. The Atlantic salmon miRNAs experimentally identified in this study provide a robust large-scale resource for functional genome research in salmonids. There is an opportunity to explore the evolution of salmonid miRNAs following the relatively recent whole genome duplication event in salmonid species and to investigate the role of miRNAs in the regulation of gene expression, in particular their contribution to variation in economically and ecologically important traits.

  4. Single-Cell RNA Sequencing of the Bronchial Epithelium in Smokers with Lung Cancer

    Science.gov (United States)

    2017-07-01

    changes primarily involve incorporation of the 3’ sequencing adaptor via random hexamer-based reverse transcription (rather than RNA ligation...libraries using an adapted version of the CEL-Seq RNA library preparation protocol that includes plate-, well-, and transcript-specific barcodes...gene level counts for each cell as well as a new algorithm, Celda, to define and characterize transcriptionally distinct cell populations. We have

  5. Molecular indexing enables quantitative targeted RNA sequencing and reveals poor efficiencies in standard library preparations.

    Science.gov (United States)

    Fu, Glenn K; Xu, Weihong; Wilhelmy, Julie; Mindrinos, Michael N; Davis, Ronald W; Xiao, Wenzhong; Fodor, Stephen P A

    2014-02-04

    We present a simple molecular indexing method for quantitative targeted RNA sequencing, in which mRNAs of interest are selectively captured from complex cDNA libraries and sequenced to determine their absolute concentrations. cDNA fragments are individually labeled so that each molecule can be tracked from the original sample through the library preparation and sequencing process. Multiple copies of cDNA fragments of identical sequence become distinct through labeling, and replicate clones created during PCR amplification steps can be identified and assigned to their distinct parent molecules. Selective capture enables efficient use of sequencing for deep sampling and for the absolute quantitation of rare or transient transcripts that would otherwise escape detection by standard sequencing methods. We have also constructed a set of synthetic barcoded RNA molecules, which can be introduced as controls into the sample preparation mix and used to monitor the efficiency of library construction. The quantitative targeted sequencing revealed extremely low efficiency in standard library preparations, which were further confirmed by using synthetic barcoded RNA molecules. This finding shows that standard library preparation methods result in the loss of rare transcripts and highlights the need for monitoring library efficiency and for developing more efficient sample preparation methods.
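
    The counting principle behind molecular indexing (PCR duplicates share a label with their parent molecule, so distinct labels per transcript approximate the number of original molecules) can be sketched as below. This is not the authors' pipeline: the function name, gene names and label sequences are hypothetical, and the sketch ignores sequencing errors in the labels and label collisions.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct molecular labels per gene.

    reads: iterable of (gene, molecular_label) pairs, one per sequenced read.
    """
    labels_by_gene = defaultdict(set)
    for gene, label in reads:
        labels_by_gene[gene].add(label)
    return {gene: len(labels) for gene, labels in labels_by_gene.items()}


# Hypothetical reads: repeated (gene, label) pairs represent PCR duplicates.
reads = [
    ("GAPDH", "AAC"), ("GAPDH", "AAC"), ("GAPDH", "GTT"),
    ("IL2", "CCA"), ("IL2", "CCA"), ("IL2", "CCA"),
]
molecules = count_molecules(reads)
print(molecules)
```

    Here both genes have three reads, yet the label counts suggest two original GAPDH molecules and a single IL2 molecule, which is the distinction that raw read counts cannot make.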

  6. Development of Transcriptomic Markers for Population Analysis Using Restriction Site Associated RNA Sequencing (RARseq).

    Directory of Open Access Journals (Sweden)

    Magdy S Alabady

    We describe restriction site associated RNA sequencing (RARseq), an RNAseq-based genotype by sequencing (GBS) method. It includes the construction of RNAseq libraries from double stranded cDNA digested with selected restriction enzymes. To test this, we constructed six single- and six dual-digested RARseq libraries from six F2 pitcher plant individuals and sequenced them on half of a MiSeq run. On average, the de novo approach of population genome analysis detected 544 and 570 RNA SNPs, whereas the reference transcriptome-based approach revealed an average of 1907 and 1876 RNA SNPs per individual, from single- and dual-digested RARseq data, respectively. The average numbers of RNA SNPs and alleles per locus are 1.89 and 2.17, respectively. Our results suggest that the RARseq protocol allows good depth of coverage per locus for detecting RNA SNPs and polymorphic loci for population genomics and mapping analyses. In non-model systems where complete genome sequences are not always available, RARseq data can be analyzed in reference to the transcriptome. In addition to enriching for functional markers, this method may prove particularly useful in organisms where the genomes are not favorable for DNA GBS.

  7. Phenotype classification of single cells using SRS microscopy, RNA sequencing, and microfluidics (Conference Presentation)

    Science.gov (United States)

    Streets, Aaron M.; Cao, Chen; Zhang, Xiannian; Huang, Yanyi

    2016-03-01

    Phenotype classification of single cells reveals biological variation that is masked in ensemble measurement. This heterogeneity is found in gene and protein expression as well as in cell morphology. Many techniques are available to probe phenotypic heterogeneity at the single cell level, for example quantitative imaging and single-cell RNA sequencing, but it is difficult to perform multiple assays on the same single cell. In order to directly track correlation between morphology and gene expression at the single cell level, we developed a microfluidic platform for quantitative coherent Raman imaging and immediate RNA sequencing (RNA-Seq) of single cells. With this device we actively sort and trap cells for analysis with stimulated Raman scattering microscopy (SRS). The cells are then processed in parallel pipelines for lysis, and preparation of cDNA for high-throughput transcriptome sequencing. SRS microscopy offers three-dimensional imaging with chemical specificity for quantitative analysis of protein and lipid distribution in single cells. Meanwhile, the microfluidic platform facilitates single-cell manipulation, minimizes contamination, and furthermore, provides improved RNA-Seq detection sensitivity and measurement precision, which is necessary for differentiating biological variability from technical noise. By combining coherent Raman microscopy with RNA sequencing, we can better understand the relationship between cellular morphology and gene expression at the single-cell level.

  8. Structure of mouse rRNA precursors. Complete sequence and potential folding of the spacer regions between 18S and 28S rRNA.

    OpenAIRE

    Michot, B; Bachellerie, J P; Raynal, F

    1983-01-01

    We have determined the complete nucleotide sequence of the regions of mouse ribosomal RNA transcription unit which separate mature rRNA genes. These internal transcribed spacers (ITS) are excised from rRNA precursor during ribosome biosynthesis. ITS 1, between 18S and 5.8S rRNA genes, is 999 nucleotides long. ITS 2, between 5.8S and 28S rRNA genes, is 1089 nucleotides long. Both spacers are very rich in G + C, 70 and 74% respectively. Mouse sequences have been compared with the other availabl...

  9. Mining small RNA sequencing data: a new approach to identify small nucleolar RNAs in Arabidopsis

    OpenAIRE

    Chen, Ho-Ming; Wu, Shu-Hsing

    2009-01-01

    Small nucleolar RNAs (snoRNAs) are noncoding RNAs that direct 2′-O-methylation or pseudouridylation on ribosomal RNAs or spliceosomal small nuclear RNAs. These modifications are needed to modulate the activity of ribosomes and spliceosomes. A comprehensive repertoire of snoRNAs is needed to expand the knowledge of these modifications. The sequences corresponding to snoRNAs in 18–26-nt small RNA sequencing data have been rarely explored and remain as a hidden treasure for snoRNA annotation. He...

  10. Construction of small RNA cDNA libraries for high-throughput sequencing.

    Science.gov (United States)

    Lu, Cheng; Shedge, Vikas

    2011-01-01

    Small RNAs (smRNAs) play an essential role in virtually every aspect of growth and development, by regulating gene expression at the post-transcriptional and/or transcriptional level. New high-throughput sequencing technology allows for a comprehensive coverage of smRNAs in any given biological sample, and has been widely used for profiling smRNA populations in various developmental stages, tissue and cell types, or normal and disease states. In this article, we describe the method used in our laboratory to construct smRNA cDNA libraries for high-throughput sequencing.

  11. Deep sequencing analysis of the developing mouse brain reveals a novel microRNA

    OpenAIRE

    Ling, King-Hwa; Brautigan, Peter J; Hahn, Christopher N; Daish, Tasman; Rayner, John R; Cheah, Pike-See; Raison, Joy M; Piltz, Sandra; Mann, Jeffrey R; Mattiske, Deidre M; Thomas, Paul Q; Adelson, David L; Scott, Hamish S

    2011-01-01

    Background: MicroRNAs (miRNAs) are small non-coding RNAs that can exert multilevel inhibition/repression at a post-transcriptional or protein synthesis level during disease or development. Characterisation of miRNAs in adult mammalian brains by deep sequencing has been reported previously. However, to date, no small RNA profiling of the developing brain has been undertaken using this method. We have performed deep sequencing and small RNA analysis of a developing (E15.5) mouse brain. ...

  12. YM500v3: a database for small RNA sequencing in human cancer research.

    Science.gov (United States)

    Chung, I-Fang; Chang, Shing-Jyh; Chen, Chen-Yang; Liu, Shu-Hsuan; Li, Chia-Yang; Chan, Chia-Hao; Shih, Chuan-Chi; Cheng, Wei-Chung

    2017-01-04

    We previously presented the YM500 database, which contains >8000 small RNA sequencing (smRNA-seq) data sets and integrated analysis results for various cancer miRNome studies. In the updated YM500v3 database (http://ngs.ym.edu.tw/ym500/) presented herein, we not only focus on miRNAs but also on other functional small non-coding RNAs (sncRNAs), such as PIWI-interacting RNAs (piRNAs), tRNA-derived fragments (tRFs), small nuclear RNAs (snRNAs) and small nucleolar RNAs (snoRNAs). There is growing knowledge of the role of sncRNAs in gene regulation and tumorigenesis. We have also incorporated >10 000 cancer-related RNA-seq and >3000 more smRNA-seq data sets into the YM500v3 database. Furthermore, there are two main new sections, 'Survival' and 'Cancer', in this updated version. The 'Survival' section provides the survival analysis results in all cancer types or in a user-defined group of samples for a specific sncRNA. The 'Cancer' section provides the results of differential expression analyses, miRNA-gene interactions and cancer miRNA-related pathways. In the 'Expression' section, sncRNA expression profiles across cancer and sample types are newly provided. Cancer-related sncRNAs hold potential for both biotech applications and basic research. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. The impact of RNA sequence library construction protocols on transcriptomic profiling of leukemia.

    Science.gov (United States)

    Kumar, Ashwini; Kankainen, Matti; Parsons, Alun; Kallioniemi, Olli; Mattila, Pirkko; Heckman, Caroline A

    2017-08-17

    RNA sequencing (RNA-seq) has become an indispensable tool to identify disease associated transcriptional profiles and determine the molecular underpinnings of diseases. However, the broad adaptation of the methodology into the clinic is still hampered by inconsistent results from different RNA-seq protocols and involves further evaluation of its analytical reliability using patient samples. Here, we applied two commonly used RNA-seq library preparation protocols to samples from acute leukemia patients to understand how poly-A-tailed mRNA selection (PA) and ribo-depletion (RD) based RNA-seq library preparation protocols affect gene fusion detection, variant calling, and gene expression profiling. Overall, the protocols produced similar results with consistent outcomes. Nevertheless, the PA protocol was more efficient in quantifying expression of leukemia marker genes and showed better performance in the expression-based classification of leukemia. Independent qRT-PCR experiments verified that the PA protocol better represented total RNA compared to the RD protocol. In contrast, the RD protocol detected a higher number of non-coding RNA features and had better alignment efficiency. The RD protocol also recovered more known fusion-gene events, although variability was seen in fusion gene predictions. The overall findings provide a framework for the use of RNA-seq in a precision medicine setting with limited number of samples and suggest that selection of the library preparation protocol should be based on the objectives of the analysis.

  14. Ribosomal RNA gene sequences confirm that protistan endoparasite of larval cod Gadus morhua is Ichthyodinium sp

    DEFF Research Database (Denmark)

    Skovgaard, Alf; Meyer, Stefan; Overton, Julia Lynne

    2010-01-01

    An enigmatic protistan endoparasite found in eggs and larvae of cod Gadus morhua and turbot Psetta maxima was isolated from Baltic cod larvae, and DNA was extracted for sequencing of the parasite's small subunit ribosomal RNA (SSU rRNA) gene. The endoparasite has previously been suggested to be related to Ichthyodinium chabelardi, a dinoflagellate-like protist that parasitizes yolk sacs of embryos and larvae of a variety of fish species. Comparison of a 1535 bp long fragment of the SSU rRNA gene of the cod endoparasite showed absolute identity with I. chabelardi, demonstrating that the 2...

  15. Identification of extracellular miRNA in archived serum samples by next-generation sequencing from RNA extracted using multiple methods.

    Science.gov (United States)

    Gautam, Aarti; Kumar, Raina; Dimitrov, George; Hoke, Allison; Hammamieh, Rasha; Jett, Marti

    2016-10-01

    miRNAs act as important regulators of gene expression by promoting mRNA degradation or by attenuating protein translation. Since miRNAs are stably expressed in bodily fluids, there is growing interest in profiling these miRNAs, as it is minimally invasive and cost-effective as a diagnostic matrix. A technical hurdle in studying miRNA dynamics is the ability to reliably extract miRNA, as small sample volumes and low RNA abundance create challenges for extraction and downstream applications. The purpose of this study was to develop a pipeline for the recovery of miRNA using small volumes of archived serum samples. The RNA was extracted employing several widely utilized RNA isolation kits/methods with and without addition of a carrier. The small RNA library preparation was carried out using the Illumina TruSeq small RNA kit and sequencing was carried out using the Illumina platform. A fraction of five microliters of total RNA was used for library preparation as quantification is below the detection limit. We were able to profile miRNA levels in serum from all the methods tested. We found that addition of nucleic acid-based carrier molecules yielded higher numbers of processed reads but did not enhance the mapping of any miRBase annotated sequences. However, some of the extraction procedures offer certain advantages: RNA extracted by TRIzol seemed to align to the miRBase best; extractions using TRIzol with carrier yielded higher miRNA-to-small RNA ratios. Nuclease-free glycogen can be the carrier of choice for miRNA sequencing. Our findings illustrate that miRNA extraction and quantification is influenced by the choice of methodologies. Addition of nucleic acid-based carrier molecules during the extraction procedure is not a good choice when assaying miRNA using sequencing. The careful selection of an extraction method permits the archived serum samples to become valuable resources for high-throughput applications.

  16. Modeling bias and variation in the stochastic processes of small RNA sequencing.

    Science.gov (United States)

    Argyropoulos, Christos; Etheridge, Alton; Sakhanenko, Nikita; Galas, David

    2017-06-20

    The use of RNA-seq as the preferred method for the discovery and validation of small RNA biomarkers has been hindered by high quantitative variability and biased sequence counts. In this paper we develop a statistical model for sequence counts that accounts for ligase bias and stochastic variation in sequence counts. This model implies a linear quadratic relation between the mean and variance of sequence counts. Using a large number of sequencing datasets, we demonstrate how one can use the generalized additive models for location, scale and shape (GAMLSS) distributional regression framework to calculate and apply empirical correction factors for ligase bias. Bias correction could remove more than 40% of the bias for miRNAs. Empirical bias correction factors appear to be nearly constant over at least one and up to four orders of magnitude of total RNA input and independent of sample composition. Using synthetic mixes of known composition, we show that the GAMLSS approach can analyze differential expression with greater accuracy, higher sensitivity and specificity than six existing algorithms (DESeq2, edgeR, EBSeq, limma, DSS, voom) for the analysis of small RNA-seq data. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
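
    The "linear quadratic" mean-variance relation mentioned above can be illustrated with a small sketch. The abstract does not give the exact parameterization, so the form Var = mu + phi * mu^2 (the familiar negative-binomial shape) and the names quadratic_variance and estimate_dispersion are assumptions made for illustration only, not the authors' GAMLSS model.

```python
from statistics import mean, variance


def quadratic_variance(mu: float, phi: float) -> float:
    """Variance implied by an assumed linear-quadratic relation: mu + phi * mu^2."""
    return mu + phi * mu ** 2


def estimate_dispersion(counts: list[int]) -> float:
    """Method-of-moments estimate of phi from replicate counts of one sequence.

    Solves  sample_variance = mean + phi * mean^2  for phi, clipped at zero.
    """
    m = mean(counts)
    if m == 0:
        return 0.0
    v = variance(counts)  # unbiased sample variance
    return max((v - m) / m ** 2, 0.0)


# Hypothetical replicate counts for a single miRNA (not data from the paper).
replicates = [90, 110, 130, 70, 100]
phi_hat = estimate_dispersion(replicates)
print(phi_hat, quadratic_variance(mean(replicates), phi_hat))
```

    With phi = 0 the relation reduces to the Poisson case (variance equal to the mean); a positive phi adds the quadratic term that dominates at high counts.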

  17. Transcriptomic Analysis of C. elegans RNA Sequencing Data Through the Tuxedo Suite on the Galaxy Project.

    Science.gov (United States)

    Amrit, Francis R G; Ghazi, Arjumand

    2017-04-08

    Next generation sequencing (NGS) technologies have revolutionized the nature of biological investigation. Of these, RNA Sequencing (RNA-Seq) has emerged as a powerful tool for gene-expression analysis and transcriptome mapping. However, handling RNA-Seq datasets requires sophisticated computational expertise and poses inherent challenges for biology researchers. This bottleneck has been mitigated by the open access Galaxy project that allows users without bioinformatics skills to analyze RNA-Seq data, and the Database for Annotation, Visualization, and Integrated Discovery (DAVID), a Gene Ontology (GO) term analysis suite that helps derive biological meaning from large data sets. However, for first-time users and bioinformatics amateurs, self-learning and familiarization with these platforms can be time-consuming and daunting. We describe a straightforward workflow that will help C. elegans researchers to isolate worm RNA, conduct an RNA-Seq experiment and analyze the data using Galaxy and DAVID platforms. This protocol provides stepwise instructions for using the various Galaxy modules for accessing raw NGS data, quality-control checks, alignment, and differential gene expression analysis, guiding the user with parameters at every step to generate a gene list that can be screened for enrichment of gene classes or biological processes using DAVID. Overall, we anticipate that this article will provide information to C. elegans researchers undertaking RNA-Seq experiments for the first time as well as frequent users running a small number of samples.

  18. TurboFold: Iterative probabilistic estimation of secondary structures for multiple RNA sequences

    Directory of Open Access Journals (Sweden)

    Sharma Gaurav

    2011-04-01

    Background: The prediction of secondary structure, i.e. the set of canonical base pairs between nucleotides, is a first step in developing an understanding of the function of an RNA sequence. The most accurate computational methods predict conserved structures for a set of homologous RNA sequences. These methods usually suffer from high computational complexity. In this paper, TurboFold, a novel and efficient method for secondary structure prediction for multiple RNA sequences, is presented. Results: TurboFold takes, as input, a set of homologous RNA sequences and outputs estimates of the base pairing probabilities for each sequence. The base pairing probabilities for a sequence are estimated by combining intrinsic information, derived from the sequence itself via the nearest neighbor thermodynamic model, with extrinsic information, derived from the other sequences in the input set. For a given sequence, the extrinsic information is computed by using pairwise-sequence-alignment-based probabilities for co-incidence with each of the other sequences, along with estimated base pairing probabilities, from the previous iteration, for the other sequences. The extrinsic information is introduced as free energy modifications for base pairing in a partition function computation based on the nearest neighbor thermodynamic model. This process yields updated estimates of base pairing probability. The updated base pairing probabilities in turn are used to recompute extrinsic information, resulting in the overall iterative estimation procedure that defines TurboFold. TurboFold is benchmarked on a number of ncRNA datasets and compared against alternative secondary structure prediction methods. The iterative procedure in TurboFold is shown to improve estimates of base pairing probability with each iteration, though only small gains are obtained beyond three iterations. Secondary structures composed of base pairs with estimated probabilities higher than a

  19. High-throughput sequencing of RNA silencing-associated small RNAs in olive (Olea europaea L.).

    Directory of Open Access Journals (Sweden)

    Livia Donaire

    Small RNAs (sRNAs) of 20 to 25 nucleotides (nt) in length maintain genome integrity and control gene expression in a multitude of developmental and physiological processes. Although RNA silencing has been primarily studied in model plants, the advent of high-throughput sequencing technologies has enabled profiling of the sRNA component of more than 40 plant species. Here, we used deep sequencing and molecular methods to report the first inventory of sRNAs in olive (Olea europaea L.). sRNA libraries prepared from juvenile and adult shoots revealed that the 24-nt class dominates the sRNA transcriptome and atypically accumulates to levels never seen in other plant species, suggesting an active role of heterochromatin silencing in the maintenance and integrity of its large genome. A total of 18 known miRNA families were identified in the libraries. Also, 5 other sRNAs derived from potential hairpin-like precursors remain as plausible miRNA candidates. RNA blots confirmed miRNA expression and suggested tissue- and/or developmental-specific expression patterns. Target mRNAs of conserved miRNAs were computationally predicted among the olive cDNA collection and experimentally validated through endonucleolytic cleavage assays. Finally, we use expression data to uncover genetic components of the miR156, miR172 and miR390/TAS3-derived trans-acting small interfering RNA (tasiRNA) regulatory nodes, suggesting that these interactive networks controlling developmental transitions are fully operational in olive.

  20. High-Throughput Sequencing of RNA Silencing-Associated Small RNAs in Olive (Olea europaea L.)

    Science.gov (United States)

    Donaire, Livia; Pedrola, Laia; de la Rosa, Raúl; Llave, César

    2011-01-01

    Small RNAs (sRNAs) of 20 to 25 nucleotides (nt) in length maintain genome integrity and control gene expression in a multitude of developmental and physiological processes. Although RNA silencing has been primarily studied in model plants, the advent of high-throughput sequencing technologies has enabled profiling of the sRNA component of more than 40 plant species. Here, we used deep sequencing and molecular methods to report the first inventory of sRNAs in olive (Olea europaea L.). sRNA libraries prepared from juvenile and adult shoots revealed that the 24-nt class dominates the sRNA transcriptome and atypically accumulates to levels never seen in other plant species, suggesting an active role of heterochromatin silencing in the maintenance and integrity of its large genome. A total of 18 known miRNA families were identified in the libraries. Also, 5 other sRNAs derived from potential hairpin-like precursors remain as plausible miRNA candidates. RNA blots confirmed miRNA expression and suggested tissue- and/or developmental-specific expression patterns. Target mRNAs of conserved miRNAs were computationally predicted among the olive cDNA collection and experimentally validated through endonucleolytic cleavage assays. Finally, we use expression data to uncover genetic components of the miR156, miR172 and miR390/TAS3-derived trans-acting small interfering RNA (tasiRNA) regulatory nodes, suggesting that these interactive networks controlling developmental transitions are fully operational in olive. PMID:22140484

  1. Combined sequencing of mRNA and DNA from human embryonic stem cells.

    Science.gov (United States)

    Mertes, Florian; Kuhl, Heiner; Wruck, Wasco; Lehrach, Hans; Adjaye, James

    2016-06-01

    Combined transcriptome and whole genome sequencing of the same ultra-low input sample down to single cells is a rapidly evolving approach for the analysis of rare cells. Besides stem cells, rare cells originating from tissues like tumor or biopsies, circulating tumor cells and cells from early embryonic development are under investigation. Herein we describe a universal method applicable for the analysis of minute amounts of sample material (150 to 200 cells) derived from sub-colony structures from human embryonic stem cells. The protocol comprises the combined isolation and separate amplification of poly(A) mRNA and whole genome DNA followed by next generation sequencing. Here we present a detailed description of the method developed and an overview of the results obtained for RNA and whole genome sequencing of human embryonic stem cells. Sequencing data are available in the Gene Expression Omnibus (GEO) database under accession number GSE69471.

  2. Fast online and index-based algorithms for approximate search of RNA sequence-structure patterns

    Science.gov (United States)

    2013-01-01

    Background: It is well known that the search for homologous RNAs is more effective if both sequence and structure information is incorporated into the search. However, current tools for searching with RNA sequence-structure patterns cannot fully handle mutations occurring on both these levels or are simply not fast enough for searching large sequence databases because of the high computational costs of the underlying sequence-structure alignment problem. Results: We present new fast index-based and online algorithms for approximate matching of RNA sequence-structure patterns supporting a full set of edit operations on single bases and base pairs. Our methods efficiently compute semi-global alignments of structural RNA patterns and substrings of the target sequence whose costs satisfy a user-defined sequence-structure edit distance threshold. For this purpose, we introduce a new computing scheme to optimally reuse the entries of the required dynamic programming matrices for all substrings and combine it with a technique for avoiding the alignment computation of non-matching substrings. Our new index-based methods exploit suffix arrays preprocessed from the target database and achieve running times that are sublinear in the size of the searched sequences. To support the description of RNA molecules that fold into complex secondary structures with multiple ordered sequence-structure patterns, we use fast algorithms for the local or global chaining of approximate sequence-structure pattern matches. The chaining step removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our improved online algorithm is faster than the best previous method by a factor of up to 45. Our best new index-based algorithm achieves a speedup by a factor of 560. Conclusions: The presented methods achieve considerable speedups compared to the best previous method. This, together with the expected
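
    The idea of a semi-global alignment under an edit distance threshold can be sketched in a few lines. This is a deliberately simplified, sequence-only illustration (the classic dynamic programming scheme in which a match may start anywhere in the target): it has no base-pair operations, no suffix-array index and no chaining, so it is not the algorithm of the paper. The function name approximate_matches and the example sequences are hypothetical.

```python
def approximate_matches(pattern: str, text: str, max_edits: int) -> list[tuple[int, int]]:
    """Return (end_position, cost) for each 1-based text position where some
    substring ending there matches the pattern within max_edits single-base edits."""
    m = len(pattern)
    # prev[i]: cheapest alignment of pattern[:i] to a substring ending at the
    # previous text position. For the empty text, pattern[:i] costs i edits.
    prev = list(range(m + 1))
    hits = []
    for j, ch in enumerate(text, start=1):
        curr = [0] * (m + 1)  # curr[0] = 0: a match may start at any text position
        for i in range(1, m + 1):
            substitute = prev[i - 1] + (pattern[i - 1] != ch)  # match or mismatch
            skip_text = prev[i] + 1        # text character ch left unaligned
            skip_pattern = curr[i - 1] + 1  # pattern character left unaligned
            curr[i] = min(substitute, skip_text, skip_pattern)
        if curr[m] <= max_edits:
            hits.append((j, curr[m]))
        prev = curr
    return hits


print(approximate_matches("GAUC", "AAGAUCAA", 0))
```

    Each text position costs O(m) work, so a scan is O(mn) overall; the index-based methods described above aim to avoid visiting most of the target in the first place.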

  3. Improved identification of Gordonia, Rhodococcus and Tsukamurella species by 5'-end 16S rRNA gene sequencing.

    Science.gov (United States)

    Wang, Tao; Kong, Fanrong; Chen, Sharon; Xiao, Meng; Sorrell, Tania; Wang, Xiaoyan; Wang, Shuo; Sintchenko, Vitali

    2011-01-01

    The identification of fastidious aerobic Actinomycetes such as Gordonia, Rhodococcus, and Tsukamurella has remained a challenge leading to clinically significant misclassifications. This study examined the feasibility of partial 5'-end 16S rRNA gene sequencing for the identification of Gordonia, Rhodococcus, and Tsukamurella, and defined potential reference sequences for species from each of these genera. The 16S rRNA gene sequence based identification algorithm for species identification was used and enhanced by aligning test sequences with reference sequences from the List of Prokaryotic Names with Standing in Nomenclature. Conventional PCR based 16S rRNA gene sequencing and the alignment of the isolate 16S rRNA gene sequence with reference sequences accurately identified 100% of clinical strains of aerobic Actinomycetes. While partial 16S rRNA gene sequences of reference type strains matched with the 16S rRNA gene sequences of 19 isolates in our data set, another 13 strains demonstrated a degree of polymorphism with a 1-4 bp difference in the regions of difference. 5'-end 606 bp 16S rRNA gene sequencing, coupled with the assignment of well defined reference sequences to clinically relevant species of bacteria, can be a useful strategy for improving the identification of clinically relevant aerobic Actinomycetes.

  4. RPiRLS: Quantitative Predictions of RNA Interacting with Any Protein of Known Sequence

    Directory of Open Access Journals (Sweden)

    Wen-Jun Shen

    2018-02-01

    RNA-protein interactions (RPIs) have critical roles in numerous fundamental biological processes, such as post-transcriptional gene regulation, viral assembly, cellular defence and protein synthesis. As the number of available RNA-protein binding experimental data has increased rapidly due to high-throughput sequencing methods, it is now possible to measure and understand RNA-protein interactions by computational methods. In this study, we integrate a sequence-based derived kernel with regularized least squares to perform prediction. The derived kernel exploits the contextual information around an amino acid or a nucleic acid as well as the repetitive conserved motif information. We propose a novel machine learning method, called RPiRLS, to predict the interaction between any RNA and protein of known sequences. For the RPiRLS classifier, each protein sequence comprises up to 20 diverse amino acids but for the RPiRLS-7G classifier, each protein sequence is represented by using 7-letter reduced alphabets based on their physicochemical properties. We evaluated both methods on a number of benchmark data sets and compared their performances with two newly developed and state-of-the-art methods, RPI-Pred and IPMiner. On the non-redundant benchmark test sets extracted from the PRIDB, the RPiRLS method outperformed RPI-Pred and IPMiner in terms of accuracy, specificity and sensitivity. Further, RPiRLS achieved an accuracy of 92% on the prediction of lncRNA-protein interactions. The proposed method can also be extended to construct RNA-protein interaction networks. The RPiRLS web server is freely available at http://bmc.med.stu.edu.cn/RPiRLS.
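
    A 7-letter reduced protein alphabet of the kind mentioned above can be sketched as follows. The abstract does not list the groups, so the grouping below (a commonly used seven-class partition by dipole and side-chain volume) is an assumption, as are the names GROUPS, reduce_sequence and kmer_counts; the sketch covers only the encoding step, not the kernel or the regularized least squares fit.

```python
from collections import Counter

# Assumed seven-class grouping of the 20 standard amino acids; the source
# abstract does not specify which grouping RPiRLS-7G actually uses.
GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: str(idx) for idx, group in enumerate(GROUPS, start=1) for aa in group}


def reduce_sequence(protein: str) -> str:
    """Rewrite a protein sequence over the 7-letter alphabet '1'..'7'.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    return "".join(AA_TO_CLASS[aa] for aa in protein.upper())


def kmer_counts(reduced: str, k: int = 3) -> Counter:
    """Count overlapping k-mers of the reduced sequence (a simple feature vector)."""
    return Counter(reduced[i:i + k] for i in range(len(reduced) - k + 1))


reduced = reduce_sequence("MKVLC")
print(reduced, dict(kmer_counts(reduced)))
```

    Reducing 20 letters to 7 shrinks the 3-mer feature space from 8000 to 343 dimensions, which is the usual motivation for this kind of encoding.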

  5. The UEA sRNA workbench: a suite of tools for analysing and visualizing next generation sequencing microRNA and small RNA datasets.

    Science.gov (United States)

    Stocks, Matthew B; Moxon, Simon; Mapleson, Daniel; Woolfenden, Hugh C; Mohorianu, Irina; Folkes, Leighton; Schwach, Frank; Dalmay, Tamas; Moulton, Vincent

    2012-08-01

    RNA silencing is a complex, highly conserved mechanism mediated by small RNAs (sRNAs), such as microRNAs (miRNAs), that is known to be involved in a diverse set of biological functions including development, pathogen control, genome maintenance and response to environmental change. Advances in next generation sequencing technologies are producing increasingly large numbers of sRNA reads per sample at a fraction of the cost of previous methods. However, many bioinformatics tools do not scale accordingly, are cumbersome, or require extensive support from bioinformatics experts. Therefore, researchers need user-friendly, robust tools, capable of not only processing large sRNA datasets in a reasonable time frame but also presenting the results in an intuitive fashion and visualizing sRNA genomic features. Herein, we present the UEA sRNA workbench, a suite of tools that is a successor to the web-based UEA sRNA Toolkit, but in downloadable format and with several enhanced and additional features. The program and help pages are available at http://srna-workbench.cmp.uea.ac.uk. Contact: vincent.moulton@cmp.uea.ac.uk.

  6. RNA deep sequencing reveals differential microRNA expression during development of sea urchin and sea star.

    Directory of Open Access Journals (Sweden)

    Sabah Kadri

    microRNAs (miRNAs) are small (20-23 nt), non-coding single stranded RNA molecules that act as post-transcriptional regulators of mRNA gene expression. They have been implicated in regulation of developmental processes in diverse organisms. The echinoderms, Strongylocentrotus purpuratus (sea urchin) and Patiria miniata (sea star) are excellent model organisms for studying development with well-characterized transcriptional networks. However, to date, nothing is known about the role of miRNAs during development in these organisms, except that the genes that are involved in the miRNA biogenesis pathway are expressed during their developmental stages. In this paper, we used Illumina Genome Analyzer (Illumina, Inc.) to sequence small RNA libraries in mixed stage population of embryos from one to three days after fertilization of sea urchin and sea star (total of 22,670,000 reads). Analysis of these data revealed the miRNA populations in these two species. We found that 47 and 38 known miRNAs are expressed in sea urchin and sea star, respectively, during early development (32 in common). We also found 13 potentially novel miRNAs in the sea urchin embryonic library. miRNA expression is generally conserved between the two species during development, but 7 miRNAs are highly expressed in only one species. We expect that our two datasets will be a valuable resource for everyone working in the field of developmental biology and the regulatory networks that affect it. The computational pipeline to analyze Illumina reads is available at http://www.benoslab.pitt.edu/services.html.

  7. RNA deep sequencing reveals differential microRNA expression during development of sea urchin and sea star.

    Science.gov (United States)

    Kadri, Sabah; Hinman, Veronica F; Benos, Panayiotis V

    2011-01-01

    microRNAs (miRNAs) are small (20-23 nt), non-coding single stranded RNA molecules that act as post-transcriptional regulators of mRNA gene expression. They have been implicated in regulation of developmental processes in diverse organisms. The echinoderms, Strongylocentrotus purpuratus (sea urchin) and Patiria miniata (sea star) are excellent model organisms for studying development with well-characterized transcriptional networks. However, to date, nothing is known about the role of miRNAs during development in these organisms, except that the genes that are involved in the miRNA biogenesis pathway are expressed during their developmental stages. In this paper, we used Illumina Genome Analyzer (Illumina, Inc.) to sequence small RNA libraries in mixed stage population of embryos from one to three days after fertilization of sea urchin and sea star (total of 22,670,000 reads). Analysis of these data revealed the miRNA populations in these two species. We found that 47 and 38 known miRNAs are expressed in sea urchin and sea star, respectively, during early development (32 in common). We also found 13 potentially novel miRNAs in the sea urchin embryonic library. miRNA expression is generally conserved between the two species during development, but 7 miRNAs are highly expressed in only one species. We expect that our two datasets will be a valuable resource for everyone working in the field of developmental biology and the regulatory networks that affect it. The computational pipeline to analyze Illumina reads is available at http://www.benoslab.pitt.edu/services.html. © 2011 Kadri et al.

  8. RNA Deep Sequencing Reveals Differential MicroRNA Expression during Development of Sea Urchin and Sea Star

    Science.gov (United States)

    Kadri, Sabah; Hinman, Veronica F.; Benos, Panayiotis V.

    2011-01-01

    microRNAs (miRNAs) are small (20–23 nt), non-coding single stranded RNA molecules that act as post-transcriptional regulators of mRNA gene expression. They have been implicated in regulation of developmental processes in diverse organisms. The echinoderms, Strongylocentrotus purpuratus (sea urchin) and Patiria miniata (sea star) are excellent model organisms for studying development with well-characterized transcriptional networks. However, to date, nothing is known about the role of miRNAs during development in these organisms, except that the genes that are involved in the miRNA biogenesis pathway are expressed during their developmental stages. In this paper, we used Illumina Genome Analyzer (Illumina, Inc.) to sequence small RNA libraries in mixed stage population of embryos from one to three days after fertilization of sea urchin and sea star (total of 22,670,000 reads). Analysis of these data revealed the miRNA populations in these two species. We found that 47 and 38 known miRNAs are expressed in sea urchin and sea star, respectively, during early development (32 in common). We also found 13 potentially novel miRNAs in the sea urchin embryonic library. miRNA expression is generally conserved between the two species during development, but 7 miRNAs are highly expressed in only one species. We expect that our two datasets will be a valuable resource for everyone working in the field of developmental biology and the regulatory networks that affect it. The computational pipeline to analyze Illumina reads is available at http://www.benoslab.pitt.edu/services.html. PMID:22216218

  9. Accuracy of taxonomy prediction for 16S rRNA and fungal ITS sequences

    Directory of Open Access Journals (Sweden)

    Robert C. Edgar

    2018-04-01

    Prediction of taxonomy for marker gene sequences such as 16S ribosomal RNA (rRNA) is a fundamental task in microbiology. Most experimentally observed sequences are diverged from reference sequences of authoritatively named organisms, creating a challenge for prediction methods. I assessed the accuracy of several algorithms using cross-validation by identity, a new benchmark strategy which explicitly models the variation in distances between query sequences and the closest entry in a reference database. When the accuracy of genus predictions was averaged over a representative range of identities with the reference database (100%, 99%, 97%, 95% and 90%), all tested methods had ≤50% accuracy on the currently-popular V4 region of 16S rRNA. Accuracy was found to fall rapidly with identity; for example, better methods were found to have V4 genus prediction accuracy of ∼100% at 100% identity but ∼50% at 97% identity. The relationship between identity and taxonomy was quantified as the probability that a rank is the lowest shared by a pair of sequences with a given pair-wise identity. With the V4 region, 95% identity was found to be a twilight zone where taxonomy is highly ambiguous because the probabilities that the lowest shared rank between pairs of sequences is genus, family, order or class are approximately equal.
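
    The two quantities this abstract relates, pair-wise identity and the lowest rank shared by a pair of sequences, can each be computed with a few lines. The sketch below is illustrative only: the rank list, the identity definition (identical columns divided by columns where neither sequence has a gap) and the function names are assumptions, since identity definitions differ between tools and the abstract does not specify one.

```python
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]


def lowest_shared_rank(lineage_a: dict, lineage_b: dict):
    """Return the most specific rank at which both lineages agree, or None."""
    shared = None
    for rank in RANKS:  # walk from the broadest rank to the most specific
        a, b = lineage_a.get(rank), lineage_b.get(rank)
        if a is None or b is None or a != b:
            break
        shared = rank
    return shared


def pairwise_identity(aligned_a: str, aligned_b: str) -> float:
    """Fraction of identical columns among columns with no gap ('-') in either row."""
    compared = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    return sum(x == y for x, y in compared) / len(compared)


ecoli = {"domain": "Bacteria", "phylum": "Proteobacteria", "class": "Gammaproteobacteria",
         "order": "Enterobacterales", "family": "Enterobacteriaceae", "genus": "Escherichia"}
salmonella = dict(ecoli, genus="Salmonella")
print(lowest_shared_rank(ecoli, salmonella), pairwise_identity("ACGU-A", "ACGAUA"))
```

    Tabulating lowest_shared_rank over many sequence pairs binned by pairwise_identity gives an empirical version of the rank probabilities the abstract describes.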

  10. Identification and sequence determination of a novel double-stranded RNA mycovirus from the entomopathogenic fungus Beauveria bassiana.

    Science.gov (United States)

    Kotta-Loizou, Ioly; Sipkova, Jana; Coutts, Robert H A

    2015-03-01

    An isolate of the entomopathogenic fungus Beauveria bassiana was found to contain five double-stranded (ds) RNA elements ranging from 1.5 to more than 3 kbp. The complete sequence of the largest dsRNA element is described here. Analysis of the RdRp nucleotide sequence reveals its similarity to unclassified dsRNA elements, such as Alternaria longipes dsRNA virus 1, and its distant relationship to the RNA-dependent RNA polymerases of members of the family Partitiviridae.

  11. [Characterization of Black and Dichothrix Cyanobacteria Based on the 16S Ribosomal RNA Gene Sequence]

    Science.gov (United States)

    Ortega, Maya

    2010-01-01

    My project focuses on characterizing different cyanobacteria in thrombolitic mats found on the island of Highborn Cay, Bahamas. Thrombolites are interesting ecosystems because of the ability of bacteria in these mats to remove carbon dioxide from the atmosphere and mineralize it as calcium carbonate. In the future they may be used as models to develop carbon sequestration technologies, which could be used as part of regenerative life systems in space. These thrombolitic communities are also significant because of their similarities to early communities of life on Earth. I targeted two cyanobacteria in my research, Dichothrix spp. and an as-yet-unidentified black cyanobacterium, since they are believed to be important to carbon sequestration in these thrombolitic mats. The goal of my summer research project was to molecularly identify these two cyanobacteria. DNA was isolated from each organism through mat dissections and DNA extractions. I ran Polymerase Chain Reactions (PCR) to amplify the 16S ribosomal RNA (rRNA) gene in each cyanobacterium. This specific gene is found in almost all bacteria and is highly conserved, meaning any changes in the sequence are most likely due to evolution. As a result, the 16S rRNA gene sequence can be used to identify different bacterial species. Since the exact sequence of the Dichothrix gene was unknown, I designed different primers that flanked the gene based on the known sequences from other taxonomically similar cyanobacteria. Once the 16S rRNA gene was amplified, I cloned the gene into specialized Escherichia coli cells and sent the gene products for sequencing. Once the sequence is obtained, it will be added to a genetic database for future reference to and classification of other Dichothrix sp.

  12. The impact of CRISPR repeat sequence on structures of a Cas6 protein-RNA complex

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Ruiying; Zheng, Han; Preamplume, Gan; Shao, Yaming; Li, Hong [FSU

    2012-03-15

    The repeat-associated mysterious proteins (RAMPs) comprise the most abundant family of proteins involved in prokaryotic immunity against invading genetic elements conferred by the clustered regularly interspaced short palindromic repeat (CRISPR) system. Cas6 is one of the first characterized RAMP proteins and is a key enzyme required for CRISPR RNA maturation. Despite a strong structural homology with other RAMP proteins that bind hairpin RNA, Cas6 distinctly recognizes single-stranded RNA. Previous structural and biochemical studies show that Cas6 captures the 5' end while cleaving the 3' end of the CRISPR RNA. Here, we describe three structures and complementary biochemical analysis of a noncatalytic Cas6 homolog from Pyrococcus horikoshii bound to CRISPR repeat RNA of different sequences. Our study confirms the specificity of the Cas6 protein for single-stranded RNA and further reveals the importance of the bases at Positions 5-7 in Cas6-RNA interactions. Substitutions of these bases result in structural changes in the protein-RNA complex including its oligomerization state.

  13. CoRAL: predicting non-coding RNAs from small RNA-sequencing data.

    Science.gov (United States)

    Leung, Yuk Yee; Ryvkin, Paul; Ungar, Lyle H; Gregory, Brian D; Wang, Li-San

    2013-08-01

    The surprising observation that virtually the entire human genome is transcribed means we know little about the function of many emerging classes of RNAs, except their astounding diversities. Traditional RNA function prediction methods rely on sequence or alignment information, which are limited in their abilities to classify the various collections of non-coding RNAs (ncRNAs). To address this, we developed Classification of RNAs by Analysis of Length (CoRAL), a machine learning-based approach for classification of RNA molecules. CoRAL uses biologically interpretable features including fragment length and cleavage specificity to distinguish between different ncRNA populations. We evaluated CoRAL using genome-wide small RNA sequencing data sets from four human tissue types and were able to classify six different types of RNAs with ∼80% cross-validation accuracy. Analysis by CoRAL revealed that microRNAs, small nucleolar and transposon-derived RNAs are highly discernible and consistent across all human tissue types assessed, whereas long intergenic ncRNAs, small cytoplasmic RNAs and small nuclear RNAs show less consistent patterns. The ability to reliably annotate loci across tissue types demonstrates the potential of CoRAL to characterize ncRNAs using small RNA sequencing data in less well-characterized organisms.

  14. RNA sequencing (RNA-Seq) of lymph node, spleen, and thymus transcriptome from wild Peninsular Malaysian cynomolgus macaque (Macaca fascicularis)

    Directory of Open Access Journals (Sweden)

    Joey Ee Uli

    2017-08-01

    The cynomolgus macaque (Macaca fascicularis) is an extensively utilised nonhuman primate model for biomedical research due to its biological, behavioural, and genetic similarities to humans. Genomic information of cynomolgus macaque is vital for research in various fields; however, there is presently a shortage of genomic information on the Malaysian cynomolgus macaque. This study aimed to sequence, assemble, annotate, and profile the Peninsular Malaysian cynomolgus macaque transcriptome derived from three tissues (lymph node, spleen, and thymus) using RNA sequencing (RNA-Seq) technology. A total of 174,208,078 paired end 70 base pair sequencing reads were obtained from the Illumina Hi-Seq 2500 sequencer. The overall mapping percentage of the sequencing reads to the M. fascicularis reference genome ranged from 53–63%. Categorisation of expressed genes to Gene Ontology (GO) and KEGG pathway categories revealed that GO terms with the highest number of associated expressed genes include Cellular process, Catalytic activity, and Cell part, while for pathway categorisation, the majority of expressed genes in lymph node, spleen, and thymus fall under the Global overview and maps pathway category, while 266, 221, and 138 genes from lymph node, spleen, and thymus were respectively enriched in the Immune system category. Enriched Immune system pathways include Platelet activation pathway, Antigen processing and presentation, B cell receptor signalling pathway, and Intestinal immune network for IgA production. Differential gene expression analysis among the three tissues revealed 574 differentially expressed genes (DEGs) between lymph node and spleen, 5402 DEGs between lymph node and thymus, and 7008 DEGs between spleen and thymus. Venn diagram analysis of expressed genes revealed a total of 2,630, 253, and 279 tissue-specific genes respectively for lymph node, spleen, and thymus tissues. This is the first time the lymph node, spleen, and thymus transcriptome of the

  15. Global Perspectives on Activated Sludge Community Composition analyzed using 16S rRNA amplicon sequencing

    DEFF Research Database (Denmark)

    Nierychlo, Marta; Saunders, Aaron Marc; Albertsen, Mads

    Activated sludge is the most commonly applied bioprocess throughout the world for wastewater treatment. Microorganisms are key to the process, yet our knowledge of their identity and function is still limited. High-throughput 16S rRNA amplicon sequencing can reliably characterize microbial...

  16. Phylogenetic analysis of 23S rRNA gene sequences of some ...

    African Journals Online (AJOL)

    The phylogenetic relationships among thirteen Rhizobium leguminosarum bv. viciae isolates collected from various geographical regions were studied by analysis of the 23S rRNA sequences. The average genetic distance among the studied isolates was very narrow (ranging from 0.00 to 0.04) and the studied isolates ...

  17. cDNA sequence of the long mRNA for human glutamine synthase

    NARCIS (Netherlands)

    van den Hoff, M. J.; Geerts, W. J.; Das, A. T.; Moorman, A. F.; Lamers, W. H.

    1991-01-01

    Screening a human liver cDNA library in lambda ZAP revealed several clones for the mRNA of glutamine synthase. The longest clone was completely sequenced and consists of a 109 bp 5' untranslated region, a 1119 bp protein coding region, a 1498 bp 3' untranslated region and a poly(A) tract of 12 bp

  18. Reproducible Analysis of Sequencing-Based RNA Structure Probing Data with User-Friendly Tools.

    Science.gov (United States)

    Kielpinski, Lukasz Jan; Sidiropoulos, Nikolaos; Vinther, Jeppe

    2015-01-01

    RNA structure-probing data can improve the prediction of RNA secondary and tertiary structure and allow structural changes to be identified and investigated. In recent years, massive parallel sequencing has dramatically improved the throughput of RNA structure probing experiments, but at the same time also made analysis of the data challenging for scientists without formal training in computational biology. Here, we discuss different strategies for data analysis of massive parallel sequencing-based structure-probing data. To facilitate reproducible and standardized analysis of this type of data, we have made a collection of tools, which allow raw sequencing reads to be converted to normalized probing values using different published strategies. In addition, we also provide tools for visualization of the probing data in the UCSC Genome Browser and for converting RNA coordinates to genomic coordinates and vice versa. The collection is implemented as functions in the R statistical environment and as tools in the Galaxy platform, making them easily accessible for the scientific community. We demonstrate the usefulness of the collection by applying it to the analysis of sequencing-based hydroxyl radical probing data and comparing different normalization strategies. © 2015 Elsevier Inc. All rights reserved.
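
    The conversion between RNA (transcript) coordinates and genomic coordinates mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the collection's R or Galaxy implementation; the function names, the exon representation as 1-based inclusive (start, end) pairs, and the assumption of non-overlapping exons are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# An exon is (genomic_start, genomic_end), 1-based and inclusive (assumed convention).
Exon = Tuple[int, int]


def transcript_to_genome(pos: int, exons: List[Exon], strand: str = "+") -> int:
    """Map a 1-based transcript position to a 1-based genomic position."""
    if pos < 1:
        raise ValueError("transcript positions are 1-based")
    # Walk exons in transcription order: ascending on '+', descending on '-'.
    ordered = sorted(exons, reverse=(strand == "-"))
    remaining = pos
    for start, end in ordered:
        length = end - start + 1
        if remaining <= length:
            return start + remaining - 1 if strand == "+" else end - remaining + 1
        remaining -= length
    raise ValueError("position lies beyond the end of the transcript")


def genome_to_transcript(gpos: int, exons: List[Exon], strand: str = "+") -> int:
    """Map a 1-based genomic position inside an exon to a transcript position."""
    ordered = sorted(exons, reverse=(strand == "-"))
    offset = 0  # number of transcript bases contributed by exons already passed
    for start, end in ordered:
        if start <= gpos <= end:
            within = gpos - start + 1 if strand == "+" else end - gpos + 1
            return offset + within
        offset += end - start + 1
    raise ValueError("genomic position is intronic or outside the transcript")


if __name__ == "__main__":
    toy_exons = [(100, 109), (200, 204)]  # toy two-exon transcript
    print(transcript_to_genome(11, toy_exons, "+"))
    print(genome_to_transcript(109, toy_exons, "-"))
```

    Walking the exons in transcription order keeps both directions symmetric; a real tool would additionally need to handle chromosome names and annotation file parsing, which this sketch leaves out.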

  19. Prosthetic joint infection due to Lysobacter thermophilus diagnosed by 16S rRNA gene sequencing

    OpenAIRE

    B Dhawan; S Sebastian; R Malhotra; A Kapil; D Gautam

    2016-01-01

    We report the first case of prosthetic joint infection caused by Lysobacter thermophilus which was identified by 16S rRNA gene sequencing. Removal of prosthesis followed by antibiotic treatment resulted in good clinical outcome. This case illustrates the use of molecular diagnostics to detect uncommon organisms in suspected prosthetic infections.

  20. Prosthetic joint infection due to Lysobacter thermophilus diagnosed by 16S rRNA gene sequencing

    Directory of Open Access Journals (Sweden)

    B Dhawan

    2016-01-01

    Full Text Available We report the first case of prosthetic joint infection caused by Lysobacter thermophilus which was identified by 16S rRNA gene sequencing. Removal of prosthesis followed by antibiotic treatment resulted in good clinical outcome. This case illustrates the use of molecular diagnostics to detect uncommon organisms in suspected prosthetic infections.

  1. Phylogenetic analysis of 23S rRNA gene sequences of some ...

    African Journals Online (AJOL)

    Tuoyo Aghomotsegin

    2016-08-31

    The phylogenetic relationships among thirteen Rhizobium leguminosarum bv. viciae isolates collected from various geographical regions were studied by analysis of the 23S rRNA sequences. The average genetic distance among the studied isolates was very narrow (ranging from 0.00 to 0.04) and the ...

  2. Profiling of Ribose Methylations in RNA by High-Throughput Sequencing

    DEFF Research Database (Denmark)

    Birkedal, Ulf; Christensen-Dalsgaard, Mikkel; Krogh, Nicolai

    2015-01-01

    Ribose methylations are the most abundant chemical modifications of ribosomal RNA and are critical for ribosome assembly and fidelity of translation. Many aspects of ribose methylations have been difficult to study due to lack of efficient mapping methods. Here, we present a sequencing-based method...

  3. microPIR: an integrated database of microRNA target sites within human promoter sequences.

    Directory of Open Access Journals (Sweden)

    Jittima Piriyapongsa

    Full Text Available BACKGROUND: microRNAs are generally understood to regulate gene expression through binding to target sequences within 3'-UTRs of mRNAs. Therefore, computational prediction of target sites is usually restricted to these gene regions. Recent experimental studies, though, have suggested that microRNAs may alternatively modulate gene expression by interacting with promoters. A database of potential microRNA target sites in promoters would stimulate research in this field, leading to a better understanding of complex microRNA regulatory mechanisms. METHODOLOGY: We developed a database hosting predicted microRNA target sites located within human promoter sequences and their associated genomic features, called microPIR (microRNA-Promoter Interaction Resource). microRNA seed sequences were used to identify perfect complementary matching sequences in the human promoters and the potential target sites were predicted using the RNAhybrid program. More than 15 million target sites were identified which are located within 5000 bp upstream of all human genes, on both sense and antisense strands. The experimentally confirmed argonaute (AGO) binding sites and EST expression data, including the sequence conservation across vertebrate species of each predicted target, are presented for researchers to appraise the quality of predicted target sites. The microPIR database integrates various annotated genomic sequence databases, e.g. repetitive elements, transcription factor binding sites, CpG islands, and SNPs, offering users the facility to extensively explore relationships among target sites and other genomic features. Furthermore, functional information of target genes including gene ontologies, KEGG pathways, and OMIM associations is provided. The built-in genome browser of microPIR provides a comprehensive view of multidimensional genomic data. Finally, microPIR incorporates a PCR primer design module to facilitate experimental validation. CONCLUSIONS: The proposed micro
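
    The seed-matching step described in the methodology above (finding perfectly complementary sites in promoter DNA) can be sketched as follows. This is only an illustration, not the microPIR pipeline: the seed is assumed here to be microRNA positions 2 to 8, only one strand is scanned, the RNAhybrid prediction step is omitted, and the function names and toy sequences are hypothetical.

```python
from typing import List

# Complement of each RNA base, written as DNA, for building the target site.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def seed_target_site(mirna: str, start: int = 2, end: int = 8) -> str:
    """Return the DNA site perfectly complementary to the microRNA seed.

    The seed window (positions 2-8, 1-based inclusive) is an assumption.
    """
    seed = mirna.upper().replace("T", "U")[start - 1:end]
    # Reverse complement: the site is read 5'->3' on the DNA strand.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(seed))


def find_seed_matches(promoter: str, mirna: str) -> List[int]:
    """Return 0-based start positions of perfect seed matches in promoter DNA."""
    site = seed_target_site(mirna)
    promoter = promoter.upper()
    positions = []
    i = promoter.find(site)
    while i != -1:
        positions.append(i)
        i = promoter.find(site, i + 1)  # step by one so overlapping sites are kept
    return positions


if __name__ == "__main__":
    toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"     # toy 22-nt sequence for illustration
    toy_promoter = "GGGCTACCTCAAACTACCTC"    # toy promoter fragment
    print(seed_target_site(toy_mirna))
    print(find_seed_matches(toy_promoter, toy_mirna))
```

    Scanning the antisense strand, as the database does, would amount to repeating the search on the reverse complement of the promoter sequence.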

  4. JNSViewer-A JavaScript-based Nucleotide Sequence Viewer for DNA/RNA secondary structures.

    Directory of Open Access Journals (Sweden)

    Jieming Shi

    Full Text Available Many tools are available for visualizing RNA or DNA secondary structures, but there is scarce implementation in JavaScript that provides seamless integration with the increasingly popular web computational platforms. We have developed JNSViewer, a highly interactive web service, which is bundled with several popular tools for DNA/RNA secondary structure prediction and can provide precise and interactive correspondence among nucleotides, dot-bracket data, secondary structure graphs, and genic annotations. In JNSViewer, users can perform RNA secondary structure predictions with different programs and settings, add customized genic annotations in GFF format to structure graphs, search for specific linear motifs, and extract relevant structure graphs of sub-sequences. JNSViewer also allows users to choose a transcript or specific segment of Arabidopsis thaliana genome sequences and predict the corresponding secondary structure. Popular genome browsers (i.e., JBrowse and BrowserGenome) were integrated into JNSViewer to provide powerful visualizations of chromosomal locations, genic annotations, and secondary structures. In addition, we used StructureFold with default settings to predict some RNA structures for Arabidopsis by incorporating in vivo high-throughput RNA structure profiling data and stored the results in our web server, which might be a useful resource for RNA secondary structure studies in plants. JNSViewer is available at http://bioinfolab.miamioh.edu/jnsviewer/index.html.

  5. JNSViewer-A JavaScript-based Nucleotide Sequence Viewer for DNA/RNA secondary structures.

    Science.gov (United States)

    Shi, Jieming; Li, Xi; Dong, Min; Graham, Mitchell; Yadav, Nehul; Liang, Chun

    2017-01-01

    Many tools are available for visualizing RNA or DNA secondary structures, but there is scarce implementation in JavaScript that provides seamless integration with the increasingly popular web computational platforms. We have developed JNSViewer, a highly interactive web service, which is bundled with several popular tools for DNA/RNA secondary structure prediction and can provide precise and interactive correspondence among nucleotides, dot-bracket data, secondary structure graphs, and genic annotations. In JNSViewer, users can perform RNA secondary structure predictions with different programs and settings, add customized genic annotations in GFF format to structure graphs, search for specific linear motifs, and extract relevant structure graphs of sub-sequences. JNSViewer also allows users to choose a transcript or specific segment of Arabidopsis thaliana genome sequences and predict the corresponding secondary structure. Popular genome browsers (i.e., JBrowse and BrowserGenome) were integrated into JNSViewer to provide powerful visualizations of chromosomal locations, genic annotations, and secondary structures. In addition, we used StructureFold with default settings to predict some RNA structures for Arabidopsis by incorporating in vivo high-throughput RNA structure profiling data and stored the results in our web server, which might be a useful resource for RNA secondary structure studies in plants. JNSViewer is available at http://bioinfolab.miamioh.edu/jnsviewer/index.html.

  6. JNSViewer—A JavaScript-based Nucleotide Sequence Viewer for DNA/RNA secondary structures

    Science.gov (United States)

    Dong, Min; Graham, Mitchell; Yadav, Nehul

    2017-01-01

    Many tools are available for visualizing RNA or DNA secondary structures, but there is scarce implementation in JavaScript that provides seamless integration with the increasingly popular web computational platforms. We have developed JNSViewer, a highly interactive web service, which is bundled with several popular tools for DNA/RNA secondary structure prediction and can provide precise and interactive correspondence among nucleotides, dot-bracket data, secondary structure graphs, and genic annotations. In JNSViewer, users can perform RNA secondary structure predictions with different programs and settings, add customized genic annotations in GFF format to structure graphs, search for specific linear motifs, and extract relevant structure graphs of sub-sequences. JNSViewer also allows users to choose a transcript or specific segment of Arabidopsis thaliana genome sequences and predict the corresponding secondary structure. Popular genome browsers (i.e., JBrowse and BrowserGenome) were integrated into JNSViewer to provide powerful visualizations of chromosomal locations, genic annotations, and secondary structures. In addition, we used StructureFold with default settings to predict some RNA structures for Arabidopsis by incorporating in vivo high-throughput RNA structure profiling data and stored the results in our web server, which might be a useful resource for RNA secondary structure studies in plants. JNSViewer is available at http://bioinfolab.miamioh.edu/jnsviewer/index.html. PMID:28582416

  7. RNA sequencing uncovers antisense RNAs and novel small RNAs in Streptococcus pyogenes.

    Science.gov (United States)

    Le Rhun, Anaïs; Beer, Yan Yan; Reimegård, Johan; Chylinski, Krzysztof; Charpentier, Emmanuelle

    2016-01-01

    Streptococcus pyogenes is a human pathogen responsible for a wide spectrum of diseases ranging from mild to life-threatening infections. During the infectious process, the temporal and spatial expression of pathogenicity factors is tightly controlled by a complex network of protein and RNA regulators acting in response to various environmental signals. Here, we focus on the class of small RNA regulators (sRNAs) and present the first complete analysis of sRNA sequencing data in S. pyogenes. In the SF370 clinical isolate (M1 serotype), we identified 197 and 428 putative regulatory RNAs by visual inspection and bioinformatics screening of the sequencing data, respectively. Only 35 of the 197 candidates identified by visual screening were assigned a predicted function (T-boxes, ribosomal protein leaders, characterized riboswitches or sRNAs), indicating how little is known about sRNA regulation in S. pyogenes. By comparing our list of predicted sRNAs with previous S. pyogenes sRNA screens using bioinformatics or microarrays, 92 novel sRNAs were revealed, including antisense RNAs that are for the first time shown to be expressed in this pathogen. We experimentally validated the expression of 30 novel sRNAs and antisense RNAs. We show that the expression profile of 9 sRNAs including 2 predicted regulatory elements is affected by the endoribonucleases RNase III and/or RNase Y, highlighting the critical role of these enzymes in sRNA regulation.

  8. SHAPE Selection (SHAPES) enrich for RNA structure signal in SHAPE sequencing-based probing data.

    Science.gov (United States)

    Poulsen, Line Dahl; Kielpinski, Lukasz Jan; Salama, Sofie R; Krogh, Anders; Vinther, Jeppe

    2015-05-01

    Selective 2' Hydroxyl Acylation analyzed by Primer Extension (SHAPE) is an accurate method for probing of RNA secondary structure. In existing SHAPE methods, the SHAPE probing signal is normalized to a no-reagent control to correct for the background caused by premature termination of the reverse transcriptase. Here, we introduce a SHAPE Selection (SHAPES) reagent, N-propanone isatoic anhydride (NPIA), which retains the ability of SHAPE reagents to accurately probe RNA structure, but also allows covalent coupling between the SHAPES reagent and a biotin molecule. We demonstrate that SHAPES-based selection of cDNA-RNA hybrids on streptavidin beads effectively removes the large majority of background signal present in SHAPE probing data and that sequencing-based SHAPES data contain the same amount of RNA structure data as regular sequencing-based SHAPE data obtained through normalization to a no-reagent control. Moreover, the selection efficiently enriches for probed RNAs, suggesting that the SHAPES strategy will be useful for applications with high-background and low-probing signal such as in vivo RNA structure probing. © 2015 Poulsen et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
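
    The conventional normalization that the abstract above contrasts with SHAPES, namely correcting the probing signal with a no-reagent control, can be sketched in Python. This is a simplified background-subtraction illustration, not the authors' method or any published normalization scheme: per-position termination rates are assumed to be stop counts divided by coverage, negative differences are clipped to zero, and all names and numbers are hypothetical.

```python
from typing import List, Sequence


def termination_rates(stops: Sequence[int], coverage: Sequence[int]) -> List[float]:
    """Fraction of reverse-transcription events terminating at each position."""
    if len(stops) != len(coverage):
        raise ValueError("stops and coverage must have the same length")
    # Positions without coverage carry no information; report 0.0 for them.
    return [s / c if c > 0 else 0.0 for s, c in zip(stops, coverage)]


def background_subtracted_signal(
    treated_stops: Sequence[int],
    treated_coverage: Sequence[int],
    control_stops: Sequence[int],
    control_coverage: Sequence[int],
) -> List[float]:
    """Subtract the no-reagent control rate from the treated rate, clipped at 0."""
    treated = termination_rates(treated_stops, treated_coverage)
    control = termination_rates(control_stops, control_coverage)
    if len(treated) != len(control):
        raise ValueError("treated and control profiles must have the same length")
    return [max(t - c, 0.0) for t, c in zip(treated, control)]


if __name__ == "__main__":
    # Toy counts for three positions (invented for illustration).
    signal = background_subtracted_signal(
        treated_stops=[10, 50, 5],
        treated_coverage=[100, 100, 100],
        control_stops=[5, 10, 10],
        control_coverage=[100, 100, 100],
    )
    print([round(x, 3) for x in signal])
```

    In the SHAPES strategy described above, physical selection on streptavidin beads removes most of this background before sequencing, which is why the subtraction step matters less there.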

  9. Structure of mouse rRNA precursors. Complete sequence and potential folding of the spacer regions between 18S and 28S rRNA.

    Science.gov (United States)

    Michot, B; Bachellerie, J P; Raynal, F

    1983-05-25

    We have determined the complete nucleotide sequence of the regions of mouse ribosomal RNA transcription unit which separate mature rRNA genes. These internal transcribed spacers (ITS) are excised from rRNA precursor during ribosome biosynthesis. ITS 1, between 18S and 5.8S rRNA genes, is 999 nucleotides long. ITS 2, between 5.8S and 28S rRNA genes, is 1089 nucleotides long. Both spacers are very rich in G + C, 70 and 74% respectively. Mouse sequences have been compared with the other available eukaryotes: while no homology is apparent with yeast or Xenopus, mouse and rat ITS sequences have been largely conserved, with homologous segments interspersed with highly divergent tracts. Homology with rat is much more extensive for ITS 1 than for ITS 2. Tentative secondary structure models are proposed for the folding of these regions within rRNA precursor; they are closely related in mouse and rat.

  10. Riboprinting and 16S rRNA Gene Sequencing for Identification of Brewery Pediococcus Isolates

    Science.gov (United States)

    Barney, Michael; Volgyi, Antonia; Navarro, Alfonso; Ryder, David

    2001-01-01

    A total of 46 brewery and 15 ATCC Pediococcus isolates were ribotyped using a Qualicon RiboPrinter. Of these, 41 isolates were identified as Pediococcus damnosus using EcoRI digestion. Three ATCC reference strains had patterns similar to each other and matched 17 of the brewery isolates. Six other brewing isolates were similar to ATCC 25249. The other 18 P. damnosus brewery isolates had unique patterns. Of the remaining brewing isolates, one was identified as P. parvulus, two were identified as P. acidilactici, and two were identified as unique Pediococcus species. The use of alternate restriction endonucleases indicated that PstI and PvuII could further differentiate some strains having identical EcoRI profiles. An acid-resistant P. damnosus isolate could be distinguished from non-acid-resistant varieties of the same species using PstI instead of EcoRI. 16S rRNA gene sequence analysis was compared to riboprinting for identifying pediococci. The complete 16S rRNA gene was PCR amplified and sequenced from seven brewery isolates and three ATCC references with distinctive riboprint patterns. The 16S rRNA gene sequences from six different brewery P. damnosus isolates were homologous with a high degree of similarity to the GenBank reference strain but were identical to each other and one ATCC strain with the exception of 1 bp in one strain. A slime-producing, beer spoilage isolate had 16S rRNA gene sequence homology to the P. acidilactici reference strain, in agreement with the riboprint data. Although 16S rRNA gene sequencing correctly identified the genus and species of the test Pediococcus isolates, riboprinting proved to be a better method for subspecies differentiation. PMID:11157216

  11. Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms.

    Science.gov (United States)

    Kangaspeska, Sara; Hultsch, Susanne; Edgren, Henrik; Nicorici, Daniel; Murumägi, Astrid; Kallioniemi, Olli

    2012-01-01

    RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.

  12. Long Noncoding RNA and mRNA Expression Profiles in the Thyroid Gland of Two Phenotypically Extreme Pig Breeds Using Ribo-Zero RNA Sequencing.

    Science.gov (United States)

    Shen, Yifei; Mao, Haiguang; Huang, Minjie; Chen, Lixing; Chen, Jiucheng; Cai, Zhaowei; Wang, Ying; Xu, Ningying

    2016-07-09

    The thyroid gland is an important endocrine organ modulating development, growth, and metabolism, mainly by controlling the synthesis and secretion of thyroid hormones (THs). However, little is known about the pig thyroid transcriptome. Long non-coding RNAs (lncRNAs) regulate gene expression and play critical roles in many cellular processes. Yorkshire pigs have a higher growth rate but lower fat deposition than Jinhua pigs, and thus these breeds are ideal models for studying growth and lipid metabolism. This study revealed higher levels of THs in the serum of Yorkshire pigs than in the serum of Jinhua pigs. By using Ribo-zero RNA sequencing, which can capture both polyA and non-polyA transcripts, the thyroid transcriptomes of both breeds were analyzed and 22,435 known mRNAs were found to be expressed in the pig thyroid. In addition, 1189 novel mRNAs and 1018 candidate lncRNA transcripts were detected. Multiple TH-synthesis-related genes were identified among the 455 differentially expressed known mRNAs, 37 novel mRNAs, and 52 lncRNA transcripts. Bioinformatics analysis revealed that differentially expressed genes were enriched in the microtubule-based process, which contributes to TH secretion. Moreover, integrated analysis predicted 13 potential lncRNA-mRNA gene pairs. These data expand the repertoire of porcine lncRNAs and mRNAs and contribute to understanding the possible molecular mechanisms involved in animal growth and lipid metabolism.

  13. A comprehensive comparative review of sequence-based predictors of DNA- and RNA-binding residues.

    Science.gov (United States)

    Yan, Jing; Friedrich, Stefanie; Kurgan, Lukasz

    2016-01-01

    Motivated by the pressing need to characterize protein-DNA and protein-RNA interactions on a large scale, we review a comprehensive set of 30 computational methods for high-throughput prediction of RNA- or DNA-binding residues from protein sequences. We summarize these predictors from several significant perspectives including their design, outputs and availability. We perform empirical assessment of methods that offer web servers using a new benchmark data set characterized by a more complete annotation that includes binding residues transferred from the same or similar proteins. We show that predictors of DNA-binding (RNA-binding) residues offer relatively strong predictive performance but they are unable to properly separate DNA- from RNA-binding residues. We design and empirically assess several types of consensuses and demonstrate that machine learning (ML)-based approaches provide improved predictive performance when compared with the individual predictors of DNA-binding residues or RNA-binding residues. We also formulate and execute a first-of-its-kind study that targets combined prediction of DNA- and RNA-binding residues. We design and test three types of consensuses for this prediction and conclude that this novel approach that relies on ML design provides better predictive quality than individual predictors when tested on prediction of DNA- and RNA-binding residues individually. It also substantially improves discrimination between these two types of nucleic acids. Our results suggest that development of a new generation of predictors would benefit from using training data sets that combine both RNA- and DNA-binding proteins, designing new inputs that specifically target either DNA- or RNA-binding residues and pursuing combined prediction of DNA- and RNA-binding residues. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  14. A method for the construction of equalized directional cDNA libraries from hydrolyzed total RNA.

    Science.gov (United States)

    Davis, Claytus; Barvish, Zeev; Gitelman, Inna

    2007-10-09

    The transcribed sequences of a cell, the transcriptome, represent the trans-acting fraction of the genetic information, yet eukaryotic cDNA libraries are typically made from only the poly-adenylated fraction. The non-coding or translated but non-polyadenylated RNAs are therefore not represented. The goal of this study was to develop a method that would more completely represent the transcriptome in a useful format, avoiding over-representation of some of the abundant, but low-complexity non-translated transcripts. We developed a combination of self-subtraction and directional cloning procedures for this purpose. Libraries were prepared from partially degraded (hydrolyzed) total RNA from three different species. A restriction endonuclease site was added to the 3' end during first-strand synthesis using a directional random-priming technique. The abundant non-polyadenylated rRNA and tRNA sequences were largely removed by using self-subtraction to equalize the representation of the various RNA species. Sequencing random clones from the libraries showed that 87% of clones were in the forward orientation with respect to known or predicted transcripts. 70% matched identified or predicted translated RNAs in the sequence databases. Abundant mRNAs were less frequent in the self-subtracted libraries compared to a non-subtracted mRNA library. 3% of the sequences were from known or hypothesized ncRNA loci, including five matches to miRNA loci. We describe a simple method for making high-quality, directional, random-primed, cDNA libraries from small amounts of degraded total RNA. This technique is advantageous in situations where a cDNA library with complete but equalized representation of transcribed sequences, whether polyadenylated or not, is desired.

  15. A method for the construction of equalized directional cDNA libraries from hydrolyzed total RNA

    Directory of Open Access Journals (Sweden)

    Gitelman Inna

    2007-10-01

    Full Text Available Background: The transcribed sequences of a cell, the transcriptome, represent the trans-acting fraction of the genetic information, yet eukaryotic cDNA libraries are typically made from only the poly-adenylated fraction. The non-coding or translated but non-polyadenylated RNAs are therefore not represented. The goal of this study was to develop a method that would more completely represent the transcriptome in a useful format, avoiding over-representation of some of the abundant, but low-complexity non-translated transcripts. Results: We developed a combination of self-subtraction and directional cloning procedures for this purpose. Libraries were prepared from partially degraded (hydrolyzed) total RNA from three different species. A restriction endonuclease site was added to the 3' end during first-strand synthesis using a directional random-priming technique. The abundant non-polyadenylated rRNA and tRNA sequences were largely removed by using self-subtraction to equalize the representation of the various RNA species. Sequencing random clones from the libraries showed that 87% of clones were in the forward orientation with respect to known or predicted transcripts. 70% matched identified or predicted translated RNAs in the sequence databases. Abundant mRNAs were less frequent in the self-subtracted libraries compared to a non-subtracted mRNA library. 3% of the sequences were from known or hypothesized ncRNA loci, including five matches to miRNA loci. Conclusion: We describe a simple method for making high-quality, directional, random-primed, cDNA libraries from small amounts of degraded total RNA. This technique is advantageous in situations where a cDNA library with complete but equalized representation of transcribed sequences, whether polyadenylated or not, is desired.

  16. Accounting for technical noise in differential expression analysis of single-cell RNA sequencing data.

    Science.gov (United States)

    Jia, Cheng; Hu, Yu; Kelly, Derek; Kim, Junhyong; Li, Mingyao; Zhang, Nancy R

    2017-11-02

    Recent technological breakthroughs have made it possible to measure RNA expression at the single-cell level, thus paving the way for exploring expression heterogeneity among individual cells. Current single-cell RNA sequencing (scRNA-seq) protocols are complex and introduce technical biases that vary across cells, which can bias downstream analysis without proper adjustment. To account for cell-to-cell technical differences, we propose a statistical framework, TASC (Toolkit for Analysis of Single Cell RNA-seq), an empirical Bayes approach to reliably model the cell-specific dropout rates and amplification bias by use of external RNA spike-ins. TASC incorporates the technical parameters, which reflect cell-to-cell batch effects, into a hierarchical mixture model to estimate the biological variance of a gene and detect differentially expressed genes. More importantly, TASC is able to adjust for covariates to further eliminate confounding that may originate from cell size and cell cycle differences. In simulation and real scRNA-seq data, TASC achieves accurate Type I error control and displays competitive sensitivity and improved robustness to batch effects in differential expression analysis, compared to existing methods. TASC is programmed to be computationally efficient, taking advantage of multi-threaded parallelization. We believe that TASC will provide a robust platform for researchers to leverage the power of scRNA-seq. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. Novel approaches for bioinformatic analysis of salivary RNA sequencing data for development.

    Science.gov (United States)

    Kaczor-Urbanowicz, Karolina Elzbieta; Kim, Yong; Li, Feng; Galeev, Timur; Kitchen, Rob R; Gerstein, Mark; Koyano, Kikuye; Jeong, Sung-Hee; Wang, Xiaoyan; Elashoff, David; Kang, So Young; Kim, Su Mi; Kim, Kyoung; Kim, Sung; Chia, David; Xiao, Xinshu; Rozowsky, Joel; Wong, David T W

    2018-01-01

    Analysis of RNA sequencing (RNA-Seq) data in human saliva is challenging. Lack of standardization and unification of the bioinformatic procedures undermines saliva's diagnostic potential, which motivated us to perform this study. We applied principal pipelines for bioinformatic analysis of small RNA-Seq data from the saliva of 98 healthy Korean volunteers, including either direct or indirect mapping of the reads to the human genome using Bowtie1. Analysis of alignments to exogenous genomes by another pipeline revealed that almost all of the reads map to bacterial genomes. Thus, salivary exRNA has fundamental properties that warrant the design of unique additional steps while performing the bioinformatic analysis. Our pipelines can serve as potential guidelines for processing of RNA-Seq data of human saliva. Processing and analysis results of the experimental data generated by the exceRpt (v4.6.3) small RNA-seq pipeline (github.gersteinlab.org/exceRpt) are available from exRNA atlas (exrna-atlas.org). Alignment to exogenous genomes and their quantification results were used in this paper for the analyses of small RNAs of exogenous origin. Contact: dtww@ucla.edu. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  18. Comparison of DNA-, PMA-, and RNA-based 16S rRNA Illumina sequencing for detection of live bacteria in water

    OpenAIRE

    Li, Ru; Tun, Hein Min; Jahan, Musarrat; Zhang, Zhengxiao; Kumar, Ayush; Fernando, Dilantha; Farenhorst, Annemieke; Khafipour, Ehsan

    2017-01-01

    The limitation of 16S rRNA gene sequencing (DNA-based) for microbial community analyses in water is the inability to differentiate live (dormant cells as well as growing or non-growing metabolically active cells) and dead cells, which can lead to false positive results in the absence of live microbes. Propidium-monoazide (PMA) has been used to selectively remove DNA from dead cells during downstream sequencing process. In comparison, 16S rRNA sequencing (RNA-based) can target live microbial c...

  19. Phylogenetic relationships between Sarcocystis species from reindeer and other Sarcocystidae deduced from ssu rRNA gene sequences

    DEFF Research Database (Denmark)

    Dahlgren, S.S.; Oliveira, Rodrigo Gouveia; Gjerde, B.

    2008-01-01

    any effect on previously inferred phylogenetic relationships within the Sarcocystidae. The complete small subunit (ssu) rRNA gene sequences of all six Sarcocystis species from reindeer were used in the phylogenetic analyses along with ssu rRNA gene sequences of 85 other members of the Coccidea. Trees...

  20. A Guide RNA Sequence Design Platform for the CRISPR/Cas9 System for Model Organism Genomes

    Directory of Open Access Journals (Sweden)

    Ming Ma

    2013-01-01

    Cas9/CRISPR has been reported to efficiently induce targeted gene disruption and homologous recombination in both prokaryotic and eukaryotic cells. Thus, we developed a Guide RNA Sequence Design Platform for the Cas9/CRISPR silencing system for model organisms. The platform is easy to use for gRNA design with input query sequences. It finds potential targets by PAM and ranks them according to factors including uniqueness, SNP, RNA secondary structure, and AT content. The platform allows users to upload and share their experimental results. In addition, most guide RNA sequences from published papers have been put into our database.
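
    The PAM-based target search described in this record can be illustrated with a minimal sketch. It is not the platform's code: it assumes the common SpCas9 convention of a 20-nt protospacer followed by an NGG PAM, scans only the strand supplied, and ranks by AT content alone. The function names and the "closest to 50% AT" ordering are hypothetical choices made for illustration; uniqueness, SNP, and secondary-structure scoring are left out.

```python
import re


def find_guides(seq, guide_len=20):
    """List candidate guides on the given strand: guide_len bases followed by an NGG PAM."""
    seq = seq.upper()
    # A lookahead keeps overlapping candidates; group 1 is the protospacer, group 2 the PAM.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    guides = []
    for match in pattern.finditer(seq):
        protospacer, pam = match.group(1), match.group(2)
        at_content = (protospacer.count("A") + protospacer.count("T")) / guide_len
        guides.append(
            {"start": match.start(), "guide": protospacer, "pam": pam, "at_content": at_content}
        )
    return guides


def rank_by_at_content(guides, target=0.5):
    """Order guides by distance of AT content from `target` (an arbitrary, illustrative criterion)."""
    return sorted(guides, key=lambda g: (abs(g["at_content"] - target), g["start"]))
```

    A real design tool would also scan the reverse complement and check each candidate against the whole genome for off-target matches.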

  1. Identification of miRNA from Porphyra yezoensis by high-throughput sequencing and bioinformatics analysis.

    Science.gov (United States)

    Liang, Chengwei; Zhang, Xiaowen; Zou, Jian; Xu, Dong; Su, Feng; Ye, Naihao

    2010-05-19

    miRNAs are a class of non-coding, small RNAs that are approximately 22 nucleotides long and play important roles in the translational level regulation of gene expression by either directly binding or cleaving target mRNAs. The red alga Porphyra yezoensis is one of the most important marine economic crops worldwide. To date, only a few miRNAs have been identified in green unicellular alga and there is no report about Porphyra miRNAs. To identify miRNAs in Porphyra yezoensis, a small RNA library was constructed. Solexa technology was used to perform high throughput sequencing of the library and subsequent bioinformatics analysis to identify novel miRNAs. Specifically, 180,557,942 reads produced 13,324 unique miRNAs representing 224 conserved miRNA families that have been identified in other plant species. In addition, seven novel putative miRNAs were predicted from a limited number of ESTs. The potential targets of these putative miRNAs were also predicted based on sequence homology search. This study provides a first large scale cloning and characterization of Porphyra miRNAs and their potential targets. These miRNAs belong to 224 conserved miRNA families and 7 miRNAs are novel in Porphyra. These miRNAs add to the growing database of new miRNA and lay the foundation for further understanding of miRNA function in the regulation of Porphyra yezoensis development.

  2. Calling genotypes from public RNA-sequencing data enables identification of genetic variants that affect gene-expression levels

    NARCIS (Netherlands)

    Deelen, Patrick; Zhernakova, Daria V.; de Haan, Mark; van der Sijde, Marijke; Bonder, Marc Jan; Karjalainen, Juha; van der Velde, K. Joeri; Abbott, Kristin M.; Fu, Jingyuan; Wijmenga, Cisca; Sinke, Richard J.; Swertz, Morris A.; Franke, Lude

    2015-01-01

    Background: RNA-sequencing (RNA-seq) is a powerful technique for the identification of genetic variants that affect gene-expression levels, either through expression quantitative trait locus (eQTL) mapping or through allele-specific expression (ASE) analysis. Given increasing numbers of RNA-seq

  3. Characterization of squid enolase mRNA: sequence analysis, tissue distribution, and axonal localization.

    Science.gov (United States)

    Chun, J T; Gioio, A E; Crispino, M; Giuditta, A; Kaplan, B B

    1995-08-01

    Enolase is a glycolytic enzyme whose amino acid sequence is highly conserved across a wide range of animal species. In mammals, enolase is known to be a dimeric protein composed of distinct but closely related subunits: alpha (non-neuronal), beta (muscle-specific), and gamma (neuron-specific). However, little information is available on the primary sequence of enolase in invertebrates. Here we report the isolation of two overlapping cDNA clones and the putative primary structure of the enzyme from the squid (Loligo pealii) nervous system. The composite sequence of those cDNA clones is 1575 bp and contains the entire coding region (1302 bp), as well as 66 and 207 bp of 5' and 3' untranslated sequence, respectively. Cross-species comparison of enolase primary structure reveals that squid enolase shares over 70% sequence identity to vertebrate forms of the enzyme. The greatest degree of sequence similarity was manifest to the alpha isoform of the human homologue. Results of Northern analysis revealed a single 1.6 kb mRNA species, the relative abundance of which differs approximately 10-fold between various tissues. Interestingly, evidence derived from in situ hybridization and polymerase chain reaction experiments indicate that the mRNA encoding enolase is present in the squid giant axon.
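
    The segment lengths quoted in this record can be checked with a few lines of arithmetic. The sketch below only restates the numbers in the abstract; the function name is hypothetical, and the protein-length figure assumes that the 1302-bp coding region includes the stop codon, which the abstract does not state.

```python
def transcript_layout(five_utr_bp, cds_bp, three_utr_bp):
    """Summarise a transcript from its 5' UTR, coding region, and 3' UTR lengths."""
    codons, remainder = divmod(cds_bp, 3)
    return {
        "total_bp": five_utr_bp + cds_bp + three_utr_bp,
        "codons": codons,
        "in_frame": remainder == 0,
        # Assumption: the coding-region length counts the stop codon.
        "residues_if_stop_included": codons - 1,
    }


layout = transcript_layout(66, 1302, 207)
print(layout)
```

    With the values from the abstract, the three segments sum to the reported 1575 bp and the coding region divides evenly into 434 codons.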

  4. On the optimal trimming of high-throughput mRNA sequence data

    Directory of Open Access Journals (Sweden)

    Matthew D MacManes

    2014-01-01

    The widespread and rapid adoption of high-throughput sequencing technologies has afforded researchers the opportunity to gain a deep understanding of genome level processes that underlie evolutionary change, and perhaps more importantly, the links between genotype and phenotype. In particular, researchers interested in functional biology and adaptation have used these technologies to sequence mRNA transcriptomes of specific tissues, which in turn are often compared to other tissues, or other individuals with different phenotypes. While these techniques are extremely powerful, careful attention to data quality is required. In particular, because high-throughput sequencing is more error-prone than traditional Sanger sequencing, quality trimming of sequence reads should be an important step in all data processing pipelines. While several software packages for quality trimming exist, no general guidelines for the specifics of trimming have been developed. Here, using empirically derived sequence data, I provide general recommendations regarding the optimal strength of trimming, specifically in mRNA-Seq studies. Although very aggressive quality trimming is common, this study suggests that a more gentle trimming, specifically of those nucleotides whose Phred score < 2 or < 5, is optimal for most studies across a wide variety of metrics.
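
    To make the "gentle trimming" recommendation concrete, the following minimal sketch removes trailing bases whose Phred score falls below a threshold. It is an illustration, not the procedure used in the study: it assumes Phred+33 quality encoding and trims only from the 3' end, whereas dedicated trimmers typically offer sliding-window and other modes. The function names are hypothetical.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq, quality, threshold=5, offset=33):
    """Drop trailing bases whose Phred score is below `threshold`."""
    scores = phred_scores(quality, offset)
    end = len(seq)
    # Walk back from the 3' end until a base at or above the threshold is found.
    while end > 0 and scores[end - 1] < threshold:
        end -= 1
    return seq[:end], quality[:end]


# "I" encodes Q40, "#" encodes Q2, "!" encodes Q0.
print(trim_3prime("ACGTACGT", "IIIIII#!", threshold=2))
print(trim_3prime("ACGTACGT", "IIIIII#!", threshold=5))
```

    With a threshold of 2 only the final Q0 base is removed, while a threshold of 5 also removes the Q2 base before it.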

  5. MicroRNA of the fifth-instar posterior silk gland of silkworm identified by Solexa sequencing

    Directory of Open Access Journals (Sweden)

    Jisheng Li

    2014-12-01

    No studies have focused specifically on the microRNAs (miRNAs) in the fifth-instar posterior silk gland of Bombyx mori. Here, using next-generation sequencing, we acquired 93.2 million processed reads from 10 small RNA libraries. In this paper, we describe in detail how our deep-sequencing dataset, recently published in BMC Genomics, was generated. Our findings substantially enrich the silkworm miRNA repository and may help to reveal miRNA functions in the process of silk production.

  6. Enhancing potency of siRNA targeting fusion genes by optimization outside of target sequence.

    Science.gov (United States)

    Gavrilov, Kseniya; Seo, Young-Eun; Tietjen, Gregory T; Cui, Jiajia; Cheng, Christopher J; Saltzman, W Mark

    2015-12-01

    Canonical siRNA design algorithms have become remarkably effective at predicting favorable binding regions within a target mRNA, but in some cases (e.g., a fusion junction site) region choice is restricted. In these instances, alternative approaches are necessary to obtain a highly potent silencing molecule. Here we focus on strategies for rational optimization of two siRNAs that target the junction sites of fusion oncogenes BCR-ABL and TMPRSS2-ERG. We demonstrate that modifying the termini of these siRNAs with a terminal G-U wobble pair or a carefully selected pair of terminal asymmetry-enhancing mismatches can result in an increase in potency at low doses. Importantly, we observed that improvements in silencing at the mRNA level do not necessarily translate to reductions in protein level and/or cell death. Decline in protein level is also heavily influenced by targeted protein half-life, and delivery vehicle toxicity can confound measures of cell death due to silencing. Therefore, for BCR-ABL, which has a long protein half-life that is difficult to overcome using siRNA, we also developed a nontoxic transfection vector: poly(lactic-coglycolic acid) nanoparticles that release siRNA over many days. We show that this system can achieve effective killing of leukemic cells. These findings provide insights into the implications of siRNA sequence for potency and suggest strategies for the design of more effective therapeutic siRNA molecules. Furthermore, this work points to the importance of integrating studies of siRNA design and delivery, while heeding and addressing potential limitations such as restricted targetable mRNA regions, long protein half-lives, and nonspecific toxicities.

  7. Analysis of unannotated equine transcripts identified by mRNA sequencing.

    Directory of Open Access Journals (Sweden)

    Stephen J Coleman

    Sequencing of equine mRNA (RNA-seq) identified 428 putative transcripts which do not map to any previously annotated or predicted horse genes. Most of these encode the equine homologs of known protein-coding genes described in other species, yet the potential exists to identify novel and perhaps equine-specific gene structures. A set of 36 transcripts were prioritized for further study by filtering for levels of expression (depth of RNA-seq read coverage), distance from annotated features in the equine genome, the number of putative exons, and patterns of gene expression between tissues. From these, four were selected for further investigation based on predicted open reading frames of greater than or equal to 50 amino acids and lack of detectable homology to known genes across species. Sanger sequencing of RT-PCR amplicons from additional equine samples confirmed expression and structural annotation of each transcript. Functional predictions were made by conserved domain searches. A single transcript, expressed in the cerebellum, contains a putative Krüppel-associated box (KRAB) domain, suggesting a potential function associated with zinc finger proteins and transcriptional regulation. Overall levels of conserved synteny and sequence conservation across a 1 Mb region surrounding each transcript were approximately 73% compared to the human, canine, and bovine genomes; however, the four loci display some areas of low conservation and sequence inversion in regions that immediately flank these previously unannotated equine transcripts. Taken together, the evidence suggests that these four transcripts are likely to be equine-specific.

  8. Intraspecific sequence variation in 16S rRNA gene of Ureaplasma diversum isolates.

    Science.gov (United States)

    Marques, L M; Buzinhani, M; Guimaraes, A M S; Marques, R C P; Farias, S T; Neto, R L; Yamaguti, M; Oliveira, R C; Timenetsky, J

    2011-08-26

    Ureaplasma diversum infection in bulls may result in seminal vesiculitis, balanoposthitis and alterations in spermatozoa. In cows, it can cause placentitis, fetal alveolitis, abortion and the birth of weak calves. U. diversum ATCC 49782 (serogroup A), ATCC 49783 (serogroup C) and 34 field isolates were used for this study. These microorganisms were submitted to Polymerase Chain Reaction for 16S gene sequence determination using Taq High Fidelity and the products were purified and bi-directionally sequenced. Using the sequence obtained, a fragment containing four hypervariable regions was selected and nucleotide polymorphisms were identified based on their position within the 16S rRNA gene. Forty-four single nucleotide polymorphisms (SNP) were detected. The genotypic variability of the 16S rRNA gene of U. diversum isolates shows that the taxonomic classification of these organisms is likely much more complex than previously described and that 16S rRNA gene sequencing may be used to suggest an epidemiologic pattern of strains of different origin. Copyright © 2011 Elsevier B.V. All rights reserved.

  9. Variability of persisting MHV RNA sequences constituting immune and replication-relevant domains.

    Science.gov (United States)

    Bergmann, C; Dimacali, E; Stohl, S; Wei, W; Lai, M M; Tahara, S; Marten, N

    1998-05-10

    Survivors of acute infection with the neurotropic JHM strain of mouse hepatitis virus develop a persistent infection of the central nervous system associated with chronic ongoing demyelination. Persistence is characterized by viral RNA in the absence of infectious virus. To associate persistence with possible immune evasion and/or replication defects, viral RNA from brains of acutely and persistently infected mice was examined for mutations by reverse transcriptase-PCR. Sequences analyzed included the encapsidation sequence (ECS), the transmembrane domains of the matrix (M) protein, and a cytotoxic T cell (CTL) epitope within the nucleocapsid (N) protein. The ECS, present only on genomic RNA, revealed minimal variability and was detected out to 120 days postinfection, suggesting low levels of replication. The M gene sequence also remained stable during persistence despite random mutations during the acute phase. Although the N gene sequence exhibited the greatest diversity, mutations were random and not selected for during persistence. A single exception was detected comprising a prominent Pro to Ser substitution in a region of N not associated with any known regulatory or immune function. Of the N gene mutations found within the CTL epitope in responder mice (H-2d), one resulted in reduced CTL recognition with no evidence of antagonist activity. However, this mutation was also detected in nonresponder mice (H-2b), suggesting that escape variants arising from CTL pressure play no role in establishing persistence in immunocompetent hosts infected as adults.

  10. Inference of high resolution HLA types using genome-wide RNA or DNA sequencing reads.

    Science.gov (United States)

    Bai, Yu; Ni, Min; Cooper, Blerta; Wei, Yi; Fury, Wen

    2014-05-01

    Accurate HLA typing at amino acid level (four-digit resolution) is critical in hematopoietic and organ transplantations, pathogenesis studies of autoimmune and infectious diseases, as well as the development of immuno-oncology therapies. With the rapid adoption of genome-wide sequencing in biomedical research, HLA typing based on transcriptome and whole exome/genome sequencing data becomes increasingly attractive due to its high throughput and convenience. However, unlike targeted amplicon sequencing, genome-wide sequencing often employs a reduced read length and coverage that impose great challenges in resolving the highly homologous HLA alleles. Though several algorithms exist and have been applied to four-digit typing, some deliver low to moderate accuracies and others output ambiguous predictions. Moreover, few methods suit diverse read lengths and depths, and both RNA and DNA sequencing inputs. New algorithms are therefore needed to leverage the accuracy and flexibility of HLA typing at high resolution using genome-wide sequencing data. We have developed a new algorithm named PHLAT to discover the most probable pair of HLA alleles at four-digit resolution or higher, via a unique integration of a candidate allele selection and a likelihood scoring. Over a comprehensive set of benchmarking data (a total of 768 HLA alleles) from both RNA and DNA sequencing and with a broad range of read lengths and coverage, PHLAT consistently achieves a high accuracy at four-digit (92%-95%) and two-digit resolutions (96%-99%), outcompeting most of the existing methods. It also supports targeted amplicon sequencing data from Illumina MiSeq. PHLAT significantly leverages the accuracy and flexibility of high resolution HLA typing based on genome-wide sequencing data. It may benefit both basic and applied research in immunology and related fields as well as numerous clinical applications.

  11. Integrative microRNA and mRNA deep-sequencing expression profiling in endemic Burkitt lymphoma.

    Science.gov (United States)

    Oduor, Cliff I; Kaymaz, Yasin; Chelimo, Kiprotich; Otieno, Juliana A; Ong'echa, John Michael; Moormann, Ann M; Bailey, Jeffrey A

    2017-11-13

    Burkitt lymphoma (BL) is characterized by overexpression of the c-myc oncogene, which in the vast majority of cases is a consequence of an IGH/MYC translocation. While myc is the seminal event, BL is a complex amalgam of genetic and epigenetic changes causing dysregulation of both coding and non-coding transcripts. Emerging evidence suggests that abnormal modulation of mRNA transcription via miRNAs might be a significant factor in lymphomagenesis. However, the alterations in these miRNAs and their correlations to their putative mRNA targets have not been extensively studied relative to normal germinal center (GC) B cells. Using more sensitive and specific transcriptome deep sequencing, we compared previously published small miRNA and long mRNA of a set of GC B cells and eBL tumors. MiRWalk2.0 was used to identify the validated target genes for the deregulated miRNAs, which would be important for understanding the regulatory networks associated with eBL development. We found 211 differentially expressed (DE) genes (79 upregulated and 132 downregulated) and 49 DE miRNAs (22 up-regulated and 27 down-regulated). Gene set enrichment analysis identified the enrichment of a set of MYC regulated genes. Network propagation-based method and correlated miRNA-mRNA expression analysis identified dysregulated miRNAs, including miR-17~95 cluster members and their target genes, which have diverse oncogenic properties to be critical to eBL lymphomagenesis. Central to all these findings, we observed the downregulation of ATM and NLK genes, which represent important regulators in response to DNA damage in eBL tumor cells. These tumor suppressors were targeted by multiple upregulated miRNAs (miR-19b-3p, miR-26a-5p, miR-30b-5p, miR-92a-5p and miR-27b-3p) which could account for their aberrant expression in eBL. Combined loss of p53 induction and function due to miRNA-mediated regulation of ATM and NLK, together with the upregulation of TFAP4, may be a central role for human miRNAs in e

  12. The brome mosaic virus 3' untranslated sequence regulates RNA replication, recombination, and virion assembly.

    Science.gov (United States)

    Rao, A L N; Cheng Kao, C

    2015-08-03

    The 3' untranslated region in each of the three genomic RNAs of Brome mosaic virus (BMV) is highly homologous and contains a sequence that folds into a tRNA-like structure (TLS). Experiments performed over the past four decades revealed that the BMV 3' TLS regulates many important steps in BMV infection. This review summarizes in vitro and in vivo studies of the roles of the BMV 3' TLS functioning as a minus-strand promoter, in RNA recombination, and to nucleate virion assembly. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. Effect of chronic uremia on the transcriptional profile of the calcified aorta analyzed by RNA sequencing

    DEFF Research Database (Denmark)

    Rukov, Jakob Lewin; Gravesen, Eva; Mace, Maria L.

    2016-01-01

    The development of vascular calcification (VC) in chronic uremia (CU) is a tightly regulated process controlled by factors promoting and inhibiting mineralization. Next-generation high-throughput RNA sequencing (RNA-seq) is a powerful and sensitive tool for quantitative gene expression profiling...... with an expression level of >1 reads/kilobase transcript/million mapped reads, 2,663 genes were differentially expressed with 47% upregulated genes and 53% downregulated genes in uremic rats. Significantly deregulated genes were enriched for ontologies related to the extracellular matrix, response to wounding...
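
    The expression cutoff quoted in this record, more than 1 read per kilobase of transcript per million mapped reads (RPKM), follows a standard formula that a short sketch can make explicit. The gene names and counts used in the usage lines are made-up placeholders, not data from the study, and the helper names are hypothetical.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def expressed_genes(counts, lengths_bp, total_mapped_reads, threshold=1.0):
    """Keep genes whose RPKM exceeds `threshold`; returns {gene: rpkm}."""
    kept = {}
    for gene, count in counts.items():
        value = rpkm(count, lengths_bp[gene], total_mapped_reads)
        if value > threshold:
            kept[gene] = value
    return kept


# Placeholder numbers for illustration only.
counts = {"geneA": 500, "geneB": 1}
lengths_bp = {"geneA": 2000, "geneB": 5000}
print(expressed_genes(counts, lengths_bp, total_mapped_reads=10_000_000))
```

    Only genes passing such a filter would normally be carried forward into differential-expression testing.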

  14. RNA sequencing reveals a depletion of collagen targeting microRNAs in Dupuytren's disease.

    Science.gov (United States)

    Riester, Scott M; Arsoy, Diren; Camilleri, Emily T; Dudakovic, Amel; Paradise, Christopher R; Evans, Jared M; Torres-Mora, Jorge; Rizzo, Marco; Kloen, Peter; Julio, Marianna Kruithof-de; van Wijnen, Andre J; Kakar, Sanjeev

    2015-10-07

    Dupuytren's disease is an inherited disorder in which patients develop fibrotic contractures of the hand. Current treatment strategies include surgical excision or enzymatic digestion of fibrotic tissue. MicroRNAs, which are key posttranscriptional regulators of gene expression, have been shown to play an important regulatory role in disorders of fibrosis. Therefore in this investigation, we apply high throughput next generation RNA sequencing strategies to characterize microRNA expression in diseased and healthy palmar fascia to elucidate molecular mechanisms responsible for pathogenic fibrosis. We applied high throughput RNA sequencing techniques to quantify the expression of all known human microRNAs in Dupuytren's and control palmar fascia. MicroRNAs that were differentially expressed between diseased and healthy tissue samples were used for computational target prediction using the bioinformatics tool ComiR. Molecular pathways that were predicted to be differentially expressed based on computational analysis were validated by performing RT-qPCR on RNA extracted from diseased and non-diseased palmar fascia biopsies. A comparison of microRNAs expressed in Dupuytren's fascia and control fascia identified 74 microRNAs with a 2-fold enrichment in Dupuytren's tissue, and 32 microRNAs with enrichment in control fascia. Computational target prediction for differentially expressed microRNAs indicated preferential targeting of collagens and extracellular matrix related proteins in control palmar fascia. RT-qPCR confirmed the decreased expression of microRNA targeted collagens in control palmar fascia tissues. Control palmar fascia show decreased expression of mRNAs encoding collagens that are preferentially targeted by microRNAs enriched in non-diseased fascia. Thus alterations in microRNA regulatory networks may play an important role in driving the pathogenic fibrosis seen in Dupuytren's disease via direct regulatory effects on extracellular matrix protein synthesis

  15. Enhanced methods for unbiased deep sequencing of Lassa and Ebola RNA viruses from clinical and biological samples.

    Science.gov (United States)

    Matranga, Christian B; Andersen, Kristian G; Winnicki, Sarah; Busby, Michele; Gladden, Adrianne D; Tewhey, Ryan; Stremlau, Matthew; Berlin, Aaron; Gire, Stephen K; England, Eleina; Moses, Lina M; Mikkelsen, Tarjei S; Odia, Ikponmwonsa; Ehiane, Philomena E; Folarin, Onikepe; Goba, Augustine; Kahn, S Humarr; Grant, Donald S; Honko, Anna; Hensley, Lisa; Happi, Christian; Garry, Robert F; Malboeuf, Christine M; Birren, Bruce W; Gnirke, Andreas; Levin, Joshua Z; Sabeti, Pardis C

    2014-01-01

    We have developed a robust RNA sequencing method for generating complete de novo assemblies with intra-host variant calls of Lassa and Ebola virus genomes in clinical and biological samples. Our method uses targeted RNase H-based digestion to remove contaminating poly(rA) carrier and ribosomal RNA. This depletion step improves both the quality of data and quantity of informative reads in unbiased total RNA sequencing libraries. We have also developed a hybrid-selection protocol to further enrich the viral content of sequencing libraries. These protocols have enabled rapid deep sequencing of both Lassa and Ebola virus and are broadly applicable to other viral genomics studies.

  16. [Characterization of 5S rRNA gene sequence and secondary structure in gymnosperms].

    Science.gov (United States)

    Liu, Zhan-Lin; Zhang, Da-Ming; Wang, Xiao-Ru

    2003-01-01

    In higher plants the primary and the secondary structures of 5S ribosomal RNA gene are considered highly conservative. Little is known about the 5S rRNA gene structure, organization and variation in gymnosperms. In this study we analyzed sequence and structure variation of 5S rRNA gene in Pinus through cloning and sequencing multiple copies of 5S rDNA repeats from individual trees of five pines, P. bungeana, P. tabulaeformis, P. yunnanensis, P. massoniana and P. densata. Pinus bungeana is from the subgenus Strobus while the other four are from the subgenus Pinus (diploxylon pines). Our results revealed variations in both primary and secondary structure among copies of 5S rDNA within individual genomes and between species. 5S rRNA gene in Pinus is 120 bp long in most of the 122 clones we sequenced except for one or two deletions in three clones. Among these clones 50 unique sequences were identified and they were shared by different pine species. Our sequences were compared to 13 sequences each representing a different gymnosperm species, and to six sequences representing both angiosperm monocots and dicots. Average sequence similarity was 97.1% among Pinus species and 94.3% between Pinus and other gymnosperms. Between gymnosperms and angiosperms the sequence similarity decreased to 88.1%. Similar to other molecular data, significant sequence divergence was found between the two Pinus subgenera. The 5S gene tree (neighbor-joining tree) grouped the four diploxylon pines together and separated them distinctly from P. bungeana. Comparison of sequence divergence within individuals and between species suggested that concerted evolution has been very weak especially after the divergence of the four diploxylon pines. The phylogenetic information contained in the 5S rRNA gene is limited due to its shorter length and the difficulties in identifying orthologous and paralogous copies of rDNA multigene family further complicate its phylogenetic application. Pinus densata is a

  17. visnormsc: A Graphical User Interface to Normalize Single-cell RNA Sequencing Data.

    Science.gov (United States)

    Tang, Lijun; Zhou, Nan

    2017-12-26

    Single-cell RNA sequencing (RNA-seq) allows the analysis of gene expression with high resolution. The intrinsic defects of this promising technology introduce technical noise into single-cell RNA-seq data, increasing the difficulty of accurate downstream inference. Normalization is a crucial step in single-cell RNA-seq data pre-processing. SCnorm is an accurate and efficient method that can be used for this purpose. An R implementation of this method is currently available. On one hand, the R package possesses many excellent features from R. On the other hand, R programming ability is required, which prevents biologists who lack this skill from learning to use it quickly. To make this method more user-friendly, we developed a graphical user interface, visnormsc, for normalization of single-cell RNA-seq data. It is implemented in Python and is freely available at https://github.com/solo7773/visnormsc. Although visnormsc is based on the existing method, it contributes to this field by offering a user-friendly alternative. The out-of-the-box and cross-platform features make visnormsc easy to learn and to use. It is expected to serve biologists by simplifying single-cell RNA-seq normalization.

  18. Structure and sequence motifs of siRNA linked with in vitro down-regulation of morbillivirus gene expression.

    Science.gov (United States)

    de Almeida, Renata Servan; Keita, Djénéba; Libeau, Geneviève; Albina, Emmanuel

    2008-07-01

    The most challenging task in RNA interference is the design of active small interfering RNA (siRNA) sequences. Numerous strategies have been published to select siRNA. They have proved effective in some applications but have failed in many others. Nonetheless, all existing guidelines have been devised to select effective siRNAs targeting human or murine genes. They may not be appropriate to select functional sequences that target genes from other organisms like viruses. In this study, we have analyzed 62 siRNA duplexes of 19 bases targeting three genes of three morbilliviruses. In those duplexes, we have checked which features are associated with siRNA functionality. Our results suggest that the intramolecular secondary structure of the targeted mRNA contributes to siRNA efficiency. We also confirm that the presence of at least the sequence motifs U13, A or U19, as well as the absence of G13, cooperate to increase siRNA knockdown rates. Additionally, we observe that G11 is linked with siRNA efficacy. We believe that an algorithm based on these findings may help in the selection of functional siRNA sequences directed against viral genes.
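
    The positional motifs named in this record (U13, A or U at position 19, absence of G13, and G11) lend themselves to a simple check. The sketch below is an assumption-laden illustration rather than the authors' algorithm: it treats positions as 1-based along the 19-base sequence supplied, does not decide which strand that should be, and sums the features as an unweighted count. The function names are hypothetical.

```python
def motif_features(sirna):
    """Report which of the positional motifs are present in a 19-base siRNA sequence."""
    s = sirna.upper().replace("T", "U")
    if len(s) != 19 or set(s) - set("ACGU"):
        raise ValueError("expected a 19-base sequence over A, C, G, U/T")
    return {
        "U13": s[12] == "U",          # position 13 (1-based)
        "A_or_U19": s[18] in "AU",    # position 19
        "no_G13": s[12] != "G",       # absence of G at position 13
        "G11": s[10] == "G",          # position 11
    }


def motif_score(sirna):
    """Unweighted count of favourable features (an illustrative scoring choice)."""
    return sum(motif_features(sirna).values())


candidate = "A" * 10 + "G" + "A" + "U" + "A" * 5 + "U"
print(motif_features(candidate), motif_score(candidate))
```

    The abstract also points to the secondary structure of the target mRNA as a contributing factor, which a selection algorithm would need to model separately.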

  19. ssHMM: extracting intuitive sequence-structure motifs from high-throughput RNA-binding protein data.

    Science.gov (United States)

    Heller, David; Krestel, Ralf; Ohler, Uwe; Vingron, Martin; Marsico, Annalisa

    2017-11-02

    RNA-binding proteins (RBPs) play an important role in RNA post-transcriptional regulation and recognize target RNAs via sequence-structure motifs. The extent to which RNA structure influences protein binding in the presence or absence of a sequence motif is still poorly understood. Existing RNA motif finders either take the structure of the RNA only partially into account, or employ models which are not directly interpretable as sequence-structure motifs. We developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and Gibbs sampling which fully captures the relationship between RNA sequence and secondary structure preference of a given RBP. Compared to previous methods which output separate logos for sequence and structure, it directly produces a combined sequence-structure motif when trained on a large set of sequences. ssHMM's model is visualized intuitively as a graph and facilitates biological interpretation. ssHMM can be used to find novel bona fide sequence-structure motifs of uncharacterized RBPs, such as the one presented here for the YY1 protein. ssHMM reaches a high motif recovery rate on synthetic data, it recovers known RBP motifs from CLIP-Seq data, and scales linearly on the input size, being considerably faster than MEMERIS and RNAcontext on large datasets while being on par with GraphProt. It is freely available on Github and as a Docker image. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Inhibition of hepatitis B virus replication with linear DNA sequences expressing antiviral micro-RNA shuttles

    International Nuclear Information System (INIS)

    Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie; Weinberg, Marc S.; Arbuthnot, Patrick

    2009-01-01

    RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR) shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.

  1. Inhibition of hepatitis B virus replication with linear DNA sequences expressing antiviral micro-RNA shuttles

    Energy Technology Data Exchange (ETDEWEB)

    Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie; Weinberg, Marc S. [Antiviral Gene Therapy Research Unit, University of the Witwatersrand (South Africa)]; Arbuthnot, Patrick, E-mail: Patrick.Arbuthnot@wits.ac.za [Antiviral Gene Therapy Research Unit, University of the Witwatersrand (South Africa)]

    2009-11-20

    RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR) shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.

  2. Construction of small RNA cDNA libraries for deep sequencing.

    Science.gov (United States)

    Lu, Cheng; Meyers, Blake C; Green, Pamela J

    2007-10-01

    Small RNAs (21-24 nucleotides) including microRNAs (miRNAs) and small interfering RNAs (siRNAs) are potent regulators of gene expression in both plants and animals. Several hundred genes encoding miRNAs and thousands of siRNAs have been experimentally identified by cloning approaches. New sequencing technologies facilitate the identification of these molecules and provide global quantitative expression data in a given biological sample. Here, we describe the methods used in our laboratory to construct small RNA cDNA libraries for high-throughput sequencing using technologies such as MPSS, 454 or SBS.

  3. MicroRNA Profiling in Aqueous Humor of Individual Human Eyes by Next-Generation Sequencing.

    Science.gov (United States)

    Wecker, Thomas; Hoffmeier, Klaus; Plötner, Anne; Grüning, Björn Andreas; Horres, Ralf; Backofen, Rolf; Reinhard, Thomas; Schlunck, Günther

    2016-04-01

    Extracellular microRNAs (miRNAs) in aqueous humor were suggested to have a role in transcellular signaling and may serve as disease biomarkers. The authors adopted next-generation sequencing (NGS) techniques to further characterize the miRNA profile in single samples of 60 to 80 μL human aqueous humor. Samples were obtained at the outset of cataract surgery in nine independent, otherwise healthy eyes. Four samples were used to extract RNA and generate sequencing libraries, followed by an adapter-driven amplification step, electrophoretic size selection, sequencing, and data analysis. Five samples were used for quantitative PCR (qPCR) validation of NGS results. Published NGS data on circulating miRNAs in blood were analyzed in comparison. One hundred fifty-eight miRNAs were consistently detected by NGS in all four samples; an additional 59 miRNAs were present in at least three samples. The aqueous humor miRNA profile shows some overlap with published NGS-derived inventories of circulating miRNAs in blood plasma with high prevalence of human miR-451a, -21, and -16. In contrast to blood, miR-184, -4448, -30a, -29a, -29c, -19a, -30d, -205, -24, -22, and -3074 were detected among the 20 most prevalent miRNAs in aqueous humor. Relative expression patterns of miR-451a, -202, and -144 suggested by NGS were confirmed by qPCR. Our data illustrate the feasibility of miRNA analysis by NGS in small individual aqueous humor samples. Intraocular cells as well as blood plasma contribute to the extracellular aqueous humor miRNome. The data suggest possible roles of miRNA in intraocular cell adhesion and signaling by TGF-β and Wnt, which are important in intraocular pressure regulation and glaucoma.

  4. Exploration of sequence space as the basis of viral RNA genome segmentation.

    Science.gov (United States)

    Moreno, Elena; Ojosnegros, Samuel; García-Arriaza, Juan; Escarmís, Cristina; Domingo, Esteban; Perales, Celia

    2014-05-06

    The mechanisms of viral RNA genome segmentation are unknown. On extensive passage of foot-and-mouth disease virus in baby hamster kidney-21 cells, the virus accumulated multiple point mutations and underwent a transition akin to genome segmentation. The standard single RNA genome molecule was replaced by genomes harboring internal in-frame deletions affecting the L- or capsid-coding region. These genomes were infectious and killed cells by complementation. Here we show that the point mutations in the nonstructural protein-coding region (P2, P3) that accumulated in the standard genome before segmentation increased the fitness of the segmented version relative to the standard genome. Fitness increase was documented by intracellular expression of virus-coded proteins and infectious progeny production by RNAs with the internal deletions placed in the sequence context of the parental and evolved genome. The complementation activity involved several viral proteins, one of them being the leader proteinase L. Thus, a history of genetic drift with accumulation of point mutations was needed to allow a major variation in the structure of a viral genome. Hence, exploration of sequence space by a viral genome (in this case an unsegmented RNA) can reach a point of the space in which a totally different genome structure (in this case, a segmented RNA) is favored over the form that performed the exploration.

  5. A Joint Bayesian Model for Integrating Microarray and RNA Sequencing Transcriptomic Data.

    Science.gov (United States)

    Ma, Tianzhou; Liang, Faming; Oesterreich, Steffi; Tseng, George C

    2017-07-01

    As sequencing costs have continued to drop over the past decade, RNA sequencing (RNA-seq) has replaced microarray to become the standard high-throughput experimental tool for analyzing transcriptomic profiles. As more and more datasets are generated and accumulated in the public domain, meta-analysis that combines multiple transcriptomic studies to increase statistical power has become increasingly popular. In this article, we propose a Bayesian hierarchical model to jointly integrate microarray and RNA-seq studies. Since systematic fold change differences across RNA-seq and microarray for detecting differentially expressed genes have been previously reported, we replicated this finding in several real datasets and showed that incorporation of a normalization procedure to account for the bias improves the detection accuracy and power. We compared our method with the popular two-stage Fisher's method using simulations and two real applications in a histological subtype (invasive lobular carcinoma) of breast cancer comparing PR+ versus PR- and early-stage versus late-stage patients. The results showed improved detection power and more significant and interpretable pathways enriched in the detected biomarkers from the proposed Bayesian model.

  6. Systematic Analysis of Small RNAs Associated with Human Mitochondria by Deep Sequencing: Detailed Analysis of Mitochondrial Associated miRNA

    Science.gov (United States)

    Sripada, Lakshmi; Tomar, Dhanendra; Prajapati, Paresh; Singh, Rochika; Singh, Arun Kumar; Singh, Rajesh

    2012-01-01

    Mitochondria are one of the central regulators of many cellular processes beyond their well-established role in energy metabolism. The inter-organellar crosstalk is critical for the optimal function of mitochondria. Many nuclear encoded proteins and RNAs are imported to mitochondria. The translocation of small RNA (sRNA), including miRNA, to mitochondria and other sub-cellular organelles is still not clear. We characterized here sRNA, including miRNA, associated with human mitochondria by a cellular fractionation and deep sequencing approach. Mitochondria were purified from HEK293 and HeLa cells for RNA isolation. The sRNA library was generated and sequenced using the Illumina system. The analysis showed the presence of a unique population of sRNA associated with mitochondria, including miRNA. Putative novel miRNAs were characterized from unannotated sRNA sequences. The study showed the association of 428 known and 196 putative novel miRNAs with mitochondria of HEK293 cells, and of 327 known and 13 putative novel miRNAs with mitochondria of HeLa cells. The alignment of sRNA to the mitochondrial genome was also studied. The targets were analyzed using DAVID to classify them in unique networks using GO and KEGG tools. Analysis of identified targets showed that miRNA associated with mitochondria regulates critical cellular processes like RNA turnover, apoptosis, cell cycle and nucleotide metabolism. The six miRNAs (counts >1000) associated with mitochondria of both HEK293 and HeLa were validated by RT-qPCR. To our knowledge, this is the first systematic study demonstrating the association of sRNA, including miRNA, with mitochondria that may regulate site-specific turnover of target mRNA important for mitochondria-related functions. PMID:22984580

  7. CoverageAnalyzer (CAn): A Tool for Inspection of Modification Signatures in RNA Sequencing Profiles

    Directory of Open Access Journals (Sweden)

    Ralf Hauenschild

    2016-11-01

    Combination of reverse transcription (RT) and deep sequencing has emerged as a powerful instrument for the detection of RNA modifications, a field that has seen a recent surge in activity because of its importance in gene regulation. Recent studies yielded high-resolution RT signatures of modified ribonucleotides relying on both sequence-dependent mismatch patterns and reverse transcription arrests. Common alignment viewers lack specialized functionality, such as filtering, tailored visualization, image export and differential analysis. Consequently, the community will profit from a platform seamlessly connecting detailed visual inspection of RT signatures and automated screening for modification candidates. CoverageAnalyzer (CAn) was developed in response to the demand for a powerful inspection tool. It is freely available for all three main operating systems. With SAM file format as standard input, CAn is an intuitive and user-friendly tool that is generally applicable to the large community of biomedical users, starting from simple visualization of RNA sequencing (RNA-Seq) data, up to sophisticated modification analysis with significance-based modification candidate calling.

  8. Preparation of Single-Cell RNA-Seq Libraries for Next Generation Sequencing.

    Science.gov (United States)

    Trombetta, John J; Gennert, David; Lu, Diana; Satija, Rahul; Shalek, Alex K; Regev, Aviv

    2014-07-01

    For the past several decades, due to technical limitations, the field of transcriptomics has focused on population-level measurements that can mask significant differences between individual cells. With the advent of single-cell RNA-Seq, it is now possible to profile the responses of individual cells at unprecedented depth and thereby uncover, transcriptome-wide, the heterogeneity that exists within these populations. This unit describes a method that merges several important technologies to produce, in high-throughput, single-cell RNA-Seq libraries. Complementary DNA (cDNA) is made from full-length mRNA transcripts using a reverse transcriptase that has terminal transferase activity. This, when combined with a second "template-switch" primer, allows for cDNAs to be constructed that have two universal priming sequences. Following preamplification from these common sequences, Nextera XT is used to prepare a pool of 96 uniquely indexed samples ready for Illumina sequencing. Copyright © 2014 John Wiley & Sons, Inc.

  9. Comparison of DNA-, PMA-, and RNA-based 16S rRNA Illumina sequencing for detection of live bacteria in water.

    Science.gov (United States)

    Li, Ru; Tun, Hein Min; Jahan, Musarrat; Zhang, Zhengxiao; Kumar, Ayush; Fernando, Dilantha; Farenhorst, Annemieke; Khafipour, Ehsan

    2017-07-18

    The limitation of 16S rRNA gene sequencing (DNA-based) for microbial community analyses in water is the inability to differentiate live (dormant cells as well as growing or non-growing metabolically active cells) and dead cells, which can lead to false positive results in the absence of live microbes. Propidium-monoazide (PMA) has been used to selectively remove DNA from dead cells during downstream sequencing process. In comparison, 16S rRNA sequencing (RNA-based) can target live microbial cells in water as both dormant and metabolically active cells produce rRNA. The objective of this study was to compare the efficiency and sensitivity of DNA-based, PMA-based and RNA-based 16S rRNA Illumina sequencing methodologies for live bacteria detection in water samples experimentally spiked with different combinations of bacteria (2 gram-negative and 2 gram-positive/acid fast species either all live, all dead, or combinations of live and dead species) or obtained from different sources (First Nation community drinking water; city of Winnipeg tap water; water from Red River, Manitoba, Canada). The RNA-based method, while superior for detection of live bacterial cells, still identified a number of 16S rRNA targets in samples spiked with dead cells. In environmental water samples, the DNA- and PMA-based approaches perhaps overestimated the richness of the microbial community compared to the RNA-based method. Our results suggest that the RNA-based sequencing was superior to DNA- and PMA-based methods in detecting live bacterial cells in water.

  10. Comprehensive evaluation of extracellular small RNA isolation methods from serum in high throughput sequencing.

    Science.gov (United States)

    Guo, Yan; Vickers, Kasey; Xiong, Yanhua; Zhao, Shilin; Sheng, Quanhu; Zhang, Pan; Zhou, Wanding; Flynn, Charles R

    2017-01-07

    DNA and RNA fractions from whole blood, serum and plasma are increasingly popular analytes that are currently under investigation for their utility in the diagnosis and staging of disease. Small non-coding ribonucleic acids (sRNAs), specifically microRNAs (miRNAs) and their variant isoforms (isomiRs), and transfer RNA (tRNA)-derived small RNAs (tDRs) comprise a repertoire of molecules particularly promising in this regard. In this designed study, we compared the performance of various methods and kits for isolating circulating extracellular sRNAs (ex-sRNAs). ex-sRNAs from one healthy individual were isolated using five different isolation kits: Qiagen Circulating Nucleic Acid Kit, ThermoFisher Scientific Ambion TRIzol LS Reagent, Qiagen miRNEasy, QiaSymphony RNA extraction kit and the Exiqon MiRCURY RNA Isolation Kit. Each isolation method was repeated four times. A total of 20 small RNA sequencing (sRNAseq) libraries were constructed, sequenced and compared using a rigorous bioinformatics approach. The Circulating Nucleic Acid Kit had the greatest miRNA isolation variability, but had the lowest isolation variability for other RNA classes (isomiRs, tDRs, and other miscellaneous sRNAs (osRNA)). However, the Circulating Nucleic Acid Kit consistently generated the fewest reads mapped to the genome, as compared to the best-performing method, Ambion TRIzol, which mapped 10% of the miRNAs, 7.2% of the tDRs and 23.1% of the osRNAs. The other methods showed intermediate performance, with QiaSymphony mapping 14% of the osRNAs, and miRNEasy mapping 4.6% of the tDRs and 2.9% of the miRNAs, achieving the second best kit performance rating overall. In summary, each isolation kit displayed different performance characteristics that could be construed as biased or advantageous, depending upon the downstream application and number of samples that require processing.

  11. The 5' non-translated region of Varroa destructor virus 1 (genus Iflavirus): structure prediction and IRES activity in Lymantria dispar cells

    NARCIS (Netherlands)

    Ongus, J.R.; Roode, E.C.; Pleij, C.W.A.; Vlak, J.M.; Oers, van M.M.

    2006-01-01

    Structure prediction of the 5' non-translated region (NTR) of four iflavirus RNAs revealed two types of potential internal ribosome entry site (IRES), which are discriminated by size and level of complexity, in this group of viruses. In contrast to the intergenic IRES of dicistroviruses, the

  12. Extensive 16S rRNA gene sequence diversity in Campylobacter hyointestinalis strains: taxonomic and applied implications

    DEFF Research Database (Denmark)

    Harrington, C.S.; On, Stephen L.W.

    1999-01-01

    Phylogenetic relationships of Campylobacter hyointestinalis subspecies were examined by means of 16S rRNA gene sequencing. Sequence similarities among C. hyointestinalis subsp. lawsonii strains exceeded 99.0 %, but values among C. hyointestinalis subsp. hyointestinalis strains ranged from 96.4 to 100 %. Sequence similarities between strains representing the two different subspecies ranged from 95.7 to 99.0 %. An intervening sequence was identified in certain of the C. hyointestinalis subsp. lawsonii strains. C. hyointestinalis strains occupied two distinct branches in a phylogenetic analysis ... of the genus Campylobacter, emphasizing the need for multiple strain analysis when using 16S rRNA gene sequence comparisons for taxonomic investigations.

  13. Genetic divergence of Asiatic Bdellocephala (Turbellaria, Tricladida, Paludicola) as revealed by partial 18S rRNA gene sequence comparisons.

    Science.gov (United States)

    Kuznedelov, K D; Timoshkin, O A; Goldman, E

    1997-01-01

    Polymerase chain reaction (PCR) and direct sequencing of small ribosomal RNA genes were used for analysis of genetic differences among Asiatic species of the freshwater triclad genus Bdellocephala. Representatives of four species and four subspecies of this genus were used to establish homology between nucleotides in the 5'-end portion of small ribosomal RNA gene sequences. Within 552 nucleotide sites of aligned sequences compared, six variable base positions were discovered, dividing Bdellocephala into five different genotypes. The sequence data allow two groups of these genotypes to be distinguished. One of them unites species from Kamchatka and Japan; the other unites Baikalian taxa. Agreement between available morphological, cytological and sequence data is discussed.

  14. Dataset of the transcribed 45S ribosomal RNA sequence of the tree crop “yerba mate”

    Directory of Open Access Journals (Sweden)

    Patricia M. Aguilera

    2017-06-01

    This contribution contains data related to the research article entitled “The 18S-25S ribosomal RNA unit of yerba mate (Ilex paraguariensis A. St.-Hil.)” (Aguilera et al., 2016) [1]. Through a bioinformatic approach involving NGS data, we provide information on the transcribed 45S ribosomal RNA (rRNA) sequence of yerba mate, the first reference for the Ilex L. genus. This dataset (Supplementary file 1) comprises information regarding the assembly and annotation of this rRNA unit. The generated data is applicable for comparative analysis and evolutionary studies among Ilex and related taxa. The raw sequencing data used here is available at DDBJ/EMBL/GenBank (NCBI Resource Coordinators, 2016) [2] Sequence Read Archive (SRA) under the accession SRP043293 and the consensus 45S ribosomal RNA sequence has been deposited there under the accession GFHV00000000.

  15. Identification and analysis of pig chimeric mRNAs using RNA sequencing data

    Directory of Open Access Journals (Sweden)

    Ma Lei

    2012-08-01

    Background: Gene fusion is ubiquitous over the course of evolution. It is expected to increase the diversity and complexity of transcriptomes and proteomes through chimeric sequence segments or altered regulation. However, chimeric mRNAs in pigs remain unclear. Here we identified some chimeric mRNAs in pigs and analyzed the expression of them across individuals and breeds using RNA-sequencing data. Results: The present study identified 669 putative chimeric mRNAs in pigs, of which 251 chimeric candidates were detected in a set of RNA-sequencing data. The 618 candidates had clear trans-splicing sites, 537 of which obeyed the canonical GU-AG splice rule. Only two putative pig chimera variants whose fusion junction was overlapped with that of a known human chimeric mRNA were found. A set of unique chimeric events were considered middle variances in the expression across individuals and breeds, and revealed non-significant variance between sexes. Furthermore, the genomic region of the 5′ partner gene shares a similar DNA sequence with that of the 3′ partner gene for 458 putative chimeric mRNAs. The 81 of those shared DNA sequences significantly matched the known DNA-binding motifs in the JASPAR CORE database. Four DNA motifs shared in parental genomic regions had significant similarity with known human CTCF binding sites. Conclusions: The present study provided detailed information on some pig chimeric mRNAs. We proposed a model that trans-acting factors, such as CTCF, induced the spatial organisation of parental genes to the same transcriptional factory so that parental genes were coordinatively transcribed to give birth to chimeric mRNAs.

  16. Enteroviral RNA sequences detected by polymerase chain reaction in muscle of patients with postviral fatigue syndrome.

    Science.gov (United States)

    Gow, J W; Behan, W M; Clements, G B; Woodall, C; Riding, M; Behan, P O

    1991-03-23

    To determine the presence of enteroviral sequences in muscle of patients with the postviral fatigue syndrome. Detection of sequences with the polymerase chain reaction in a well defined group of patients with the syndrome and controls over the same period. Institute of Neurological Sciences, Glasgow. 60 consecutive patients admitted to the institute with the postviral fatigue syndrome who had undergone extensive investigation to exclude other conditions. 41 controls from the same catchment area without evidence of fatigue, all undergoing routine surgery. Routine investigations, serological screen for antibodies to a range of viruses, and presence of enteroviral RNA sequences in muscle biopsy specimens. 15 (25%) patients and 10 (24.4%) controls had important serological findings. 12 patients had neutralising antibody titres of greater than or equal to 256 to coxsackieviruses B1-5 (six positive for enteroviral RNA sequences, six negative); three were positive for Epstein-Barr virus specific IgM (two positive, one negative). Six controls had similar neutralising antibody titres to coxsackieviruses (all negative); one was positive for Epstein-Barr virus specific IgM (negative); and three had titres of complement fixing antibody greater than or equal to 256 to cytomegalovirus (all negative). Overall, significantly more patients than controls had enteroviral RNA sequences in muscle (32/60, 53% v 6/41, 15%; odds ratio 6.7, 95% confidence interval 2.4 to 18.2). This was not correlated with duration of disease, patient and age, or to raised titres of antibodies to coxsackieviruses B1-5. Persistent enteroviral infection of muscle may occur in some patients with postviral fatigue syndrome and may have an aetiological role.
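
    The odds ratio reported above can be checked from the counts given in the abstract (32 of 60 patients and 6 of 41 controls positive for enteroviral RNA). The sketch below is only an illustration: the abstract does not say how the confidence interval was obtained, so the use of the log-odds (Woolf) approximation and the function name `odds_ratio_with_ci` are assumptions. Under that assumption the figures come out close to those quoted.

```python
import math


def odds_ratio_with_ci(pos_cases, neg_cases, pos_controls, neg_controls, z=1.96):
    """Odds ratio with an approximate 95% CI from a 2x2 table.

    Assumption: the interval uses the log-odds (Woolf) approximation;
    the abstract does not state which method the authors used.
    """
    odds_ratio = (pos_cases * neg_controls) / (neg_cases * pos_controls)
    # Standard error of ln(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(
        1 / pos_cases + 1 / neg_cases + 1 / pos_controls + 1 / neg_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts from the abstract: 32/60 patients and 6/41 controls were positive.
or_, lower, upper = odds_ratio_with_ci(32, 60 - 32, 6, 41 - 6)
print(f"odds ratio {or_:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
```

    With cell counts this small an exact method could give a somewhat different interval, so agreement with the published figures does not confirm the authors' procedure.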

  17. A phylogenetic framework for the kingdom Fungi based on 18S rRNA gene sequences.

    Science.gov (United States)

    Yarza, Pablo; Yilmaz, Pelin; Panzer, Katrin; Glöckner, Frank Oliver; Reich, Marlis

    2017-12-01

    The usage of molecular phylogenetic approaches is critical to advance the understanding of systematics and community processes in the kingdom Fungi. Among the possible phylogenetic markers (or combinations of them), the 18S rRNA gene appears currently as the most prominent candidate due to its large availability in public databases and informative content. The purpose of this work was the creation of a reference phylogenetic framework that can serve as a ready-to-use package for its application in fungal classification and community analysis. The current database contains 9329 representative 18S rRNA gene sequences covering the whole fungal kingdom, a manually curated alignment, an annotated and revised phylogenetic tree with all the sequence entries, updated information on current taxonomy, and recommendations of use. Out of 201 total fungal taxa with more than two sequences in the dataset, 179 were monophyletic. From another perspective, 66% of the entries had a tree-derived classification identical to that obtained from the NCBI taxonomy, whereas 34% differed in one or the other rank. Most of the differences were associated with missing taxonomic assignments in NCBI taxonomy, or with the unexpected position of sequences that fell outside their theoretically corresponding clades. The strong correlation observed with current fungal taxonomy evidences that 18S rRNA gene sequence-based phylogenies are adequate to reflect genealogy of Fungi at the levels of order and above, and justifies their further usage and exploration. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  18. RNA Sequencing Reveals that Kaposi Sarcoma-Associated Herpesvirus Infection Mimics Hypoxia Gene Expression Signature

    Science.gov (United States)

    Viollet, Coralie; Davis, David A.; Tekeste, Shewit S.; Reczko, Martin; Pezzella, Francesco; Ragoussis, Jiannis

    2017-01-01

    Kaposi sarcoma-associated herpesvirus (KSHV) causes several tumors and hyperproliferative disorders. Hypoxia and hypoxia-inducible factors (HIFs) activate latent and lytic KSHV genes, and several KSHV proteins increase the cellular levels of HIF. Here, we used RNA sequencing, qRT-PCR, Taqman assays, and pathway analysis to explore the miRNA and mRNA response of uninfected and KSHV-infected cells to hypoxia, to compare this with the genetic changes seen in chronic latent KSHV infection, and to explore the degree to which hypoxia and KSHV infection interact in modulating mRNA and miRNA expression. We found that the gene expression signatures for KSHV infection and hypoxia have a 34% overlap. Moreover, there were considerable similarities between the genes up-regulated by hypoxia in uninfected (SLK) and in KSHV-infected (SLKK) cells. hsa-miR-210, a HIF-target known to have pro-angiogenic and anti-apoptotic properties, was significantly up-regulated by both KSHV infection and hypoxia using Taqman assays. Interestingly, expression of KSHV-encoded miRNAs was not affected by hypoxia. These results demonstrate that KSHV harnesses a part of the hypoxic cellular response and that a substantial portion of hypoxia-induced changes in cellular gene expression are induced by KSHV infection. Therefore, targeting hypoxic pathways may be a useful way to develop therapeutic strategies for KSHV-related diseases. PMID:28046107

  19. RNA-sequencing study of peripheral blood monocytes in chronic periodontitis.

    Science.gov (United States)

    Liu, Yao-Zhong; Maney, Pooja; Puri, Jyoti; Zhou, Yu; Baddoo, Melody; Strong, Michael; Wang, Yu-Ping; Flemington, Erik; Deng, Hong-Wen

    2016-05-01

    Monocytes are an important cell type in chronic periodontitis (CP) by interacting with oral bacteria and mediating host immune response. The aim of this study was to reveal new functional genes and pathways for CP at monocyte transcriptomic level. We performed an RNA-sequencing (RNA-seq) study of peripheral blood monocytes (PBMs) in 5 non-smoking moderate to severe CP (case) individuals vs. 5 controls. We took advantage of a microarray study of periodontitis to support our findings. We also performed pathway-based analysis on the identified differentially expressed (DEx) transcripts/isoforms using DAVID (Database for Annotation, Visualization and Integrated Discovery). Through differential expression analyses at both whole gene (or whole non-coding RNA) and isoform levels, we identified 380 DEx transcripts and 5955 DEx isoforms with a PPEE (posterior probability of equal expression) of ... FCAR and CUX1) that have functions to interact with invading microorganisms or enhance TNF production on lipopolysaccharide stimulation. DAVID analysis of both the RNA-seq and the microarray datasets leads to converging evidence supporting "endocytosis", "cytokine production" and "apoptosis" as significant biological processes in CP. As the first RNA-seq study of PBMs for CP, this study provided novel findings at both gene (e.g., FCAR and CUX1) and biological process level. The findings will contribute to better understanding of CP disease mechanisms. Copyright © 2016 Elsevier B.V. All rights reserved.

  20. ncRNAclassifier: a tool for detection and classification of transposable element sequences in RNA hairpins

    Directory of Open Access Journals (Sweden)

    Tempel Sébastien

    2012-09-01

    Background: Inverted repeat genes encode precursor RNAs characterized by hairpin structures. These RNA hairpins are then metabolized by biosynthetic pathways to produce functional small RNAs. In eukaryotic genomes, short non-autonomous transposable elements can have similar size and hairpin structures as non-coding precursor RNAs. This resemblance leads to problems annotating small RNAs. Results: We mapped all microRNA precursors from miRBase to several genomes and studied the repetition and dispersion of the corresponding loci. We then searched for repetitive elements overlapping these loci. We developed an automatic method called ncRNAclassifier to classify pre-ncRNAs according to their relationship with transposable elements (TEs). We showed that there is a correlation between the number of scattered occurrences of ncRNA precursor candidates and the presence of TEs. We applied ncRNAclassifier on six chordate genomes and report our findings. Among the 1,426 human and 721 mouse pre-miRNAs of miRBase, we identified 235 and 68 mis-annotated pre-miRNAs respectively corresponding completely to TEs. Conclusions: We provide a tool enabling the identification of repetitive elements in precursor ncRNA sequences. ncRNAclassifier is available at http://EvryRNA.ibisc.univ-evry.fr.

  1. Gene profiling of bone around orthodontic mini-implants by RNA-sequencing analysis.

    Science.gov (United States)

    Nahm, Kyung-Yen; Heo, Jung Sun; Lee, Jae-Hyung; Lee, Dong-Yeol; Chung, Kyu-Rhim; Ahn, Hyo-Won; Kim, Seong-Hun

    2015-01-01

    This study aimed to evaluate the genes that were expressed in the healing bones around SLA-treated titanium orthodontic mini-implants in a beagle at early (1-week) and late (4-week) stages with RNA-sequencing (RNA-Seq). Samples from sites of surgical defects were used as controls. Total RNA was extracted from the tissue around the implants, and an RNA-Seq analysis was performed with Illumina TruSeq. In the 1-week group, genes in the gene ontology (GO) categories of cell growth and the extracellular matrix (ECM) were upregulated, while genes in the categories of the oxidation-reduction process, intermediate filaments, and structural molecule activity were downregulated. In the 4-week group, the genes upregulated included ECM binding, stem cell fate specification, and intramembranous ossification, while genes in the oxidation-reduction process category were downregulated. GO analysis revealed an upregulation of genes that were related to significant mechanisms, including those with roles in cell proliferation, the ECM, growth factors, and osteogenic-related pathways, which are associated with bone formation. From these results, implant-induced bone formation progressed considerably during the times examined in this study. The upregulation or downregulation of selected genes was confirmed with real-time reverse transcription polymerase chain reaction. The RNA-Seq strategy was useful for defining the biological responses to orthodontic mini-implants and identifying the specific genetic networks for targeted evaluations of successful peri-implant bone remodeling.

  2. Deep sequencing analysis of the developing mouse brain reveals a novel microRNA

    Directory of Open Access Journals (Sweden)

    Piltz Sandra

    2011-04-01

    Background: MicroRNAs (miRNAs) are small non-coding RNAs that can exert multilevel inhibition/repression at a post-transcriptional or protein synthesis level during disease or development. Characterisation of miRNAs in adult mammalian brains by deep sequencing has been reported previously. However, to date, no small RNA profiling of the developing brain has been undertaken using this method. We have performed deep sequencing and small RNA analysis of a developing (E15.5) mouse brain. Results: We identified the expression of 294 known miRNAs in the E15.5 developing mouse brain, which were mostly represented by the let-7 family and other brain-specific miRNAs such as miR-9 and miR-124. We also discovered 4 putative 22-23 nt miRNAs: mm_br_e15_1181, mm_br_e15_279920, mm_br_e15_96719 and mm_br_e15_294354, each with a 70-76 nt predicted pre-miRNA. We validated the 4 putative miRNAs and further characterised one of them, mm_br_e15_1181, throughout embryogenesis. Mm_br_e15_1181 biogenesis was Dicer1-dependent, and the miRNA was expressed in E3.5 blastocysts and E7 whole embryos. Embryo-wide expression patterns were observed at E9.5 and E11.5, followed by a near complete loss of expression by E13.5, with expression restricted to a specialised layer of cells within the developing and early postnatal brain. Mm_br_e15_1181 was upregulated during neurodifferentiation of P19 teratocarcinoma cells. This novel miRNA has been identified as miR-3099. Conclusions: We have generated and analysed the first deep sequencing dataset of small RNA sequences of the developing mouse brain. The analysis revealed a novel miRNA, miR-3099, with potential regulatory effects on early embryogenesis, and involvement in neuronal cell differentiation/function in the brain during late embryonic and early neonatal development.

  3. DNA and RNA sequencing by nanoscale reading through programmable electrophoresis and nanoelectrode-gated tunneling and dielectric detection

    Science.gov (United States)

    Lee, James W.; Thundat, Thomas G.

    2005-06-14

    An apparatus and method for performing nucleic acid (DNA and/or RNA) sequencing on a single molecule. The genetic sequence information is obtained by probing through a DNA or RNA molecule base by base at nanometer scale as though looking through a strip of movie film. This DNA sequencing nanotechnology has the theoretical capability of performing DNA sequencing at a maximal rate of about 1,000,000 bases per second. This enhanced performance is made possible by a series of innovations including: novel applications of a fine-tuned nanometer gap for passage of a single DNA or RNA molecule; thin layer microfluidics for sample loading and delivery; and programmable electric fields for precise control of DNA or RNA movement. Detection methods include nanoelectrode-gated tunneling current measurements, dielectric molecular characterization, and atomic force microscopy/electrostatic force microscopy (AFM/EFM) probing for nanoscale reading of the nucleic acid sequences.

  4. Diversity, Distribution, and Evolution of Tomato Viruses in China Uncovered by Small RNA Sequencing.

    Science.gov (United States)

    Xu, Chenxi; Sun, Xuepeng; Taylor, Angela; Jiao, Chen; Xu, Yimin; Cai, Xiaofeng; Wang, Xiaoli; Ge, Chenhui; Pan, Guanghui; Wang, Quanxi; Fei, Zhangjun; Wang, Quanhua

    2017-06-01

    Tomato is a major vegetable crop that has tremendous popularity. However, viral disease is still a major factor limiting tomato production. Here, we report the tomato virome identified through sequencing small RNAs of 170 field-grown samples collected in China. A total of 22 viruses were identified, including both well-documented and newly detected viruses. The tomato viral community is dominated by a few species, and they exhibit polymorphisms and recombination in the genomes with cold spots and hot spots. Most samples were coinfected by multiple viruses, and the majority of identified viruses are positive-sense single-stranded RNA viruses. Evolutionary analysis of one of the most dominant tomato viruses, Tomato yellow leaf curl virus (TYLCV), predicts its origin and the time back to its most recent common ancestor. The broadly sampled data have enabled us to identify several unreported viruses in tomato, including a completely new virus, which has a genome of ∼13.4 kb and groups with aphid-transmitted viruses in the genus Cytorhabdovirus. Although both DNA and RNA viruses can trigger the biogenesis of virus-derived small interfering RNAs (vsiRNAs), we show that features such as length distribution, paired distance, and base selection bias of vsiRNA sequences reflect different plant Dicer-like proteins and Argonautes involved in vsiRNA biogenesis. Collectively, this study offers insights into host-virus interaction in tomato and provides valuable information to facilitate the management of viral diseases. IMPORTANCE Tomato is an important source of micronutrients in the human diet and is extensively consumed around the world. Viruses are among the major constraints on tomato production. Categorizing virus species that are capable of infecting tomato and understanding their diversity and evolution are challenging due to difficulties in detecting such fast-evolving biological entities. Here, we report the landscape of the tomato virome in China, the leading country in

  5. Comparison of two approaches for the classification of 16S rRNA gene sequences.

    Science.gov (United States)

    Chatellier, Sonia; Mugnier, Nathalie; Allard, Françoise; Bonnaud, Bertrand; Collin, Valérie; van Belkum, Alex; Veyrieras, Jean-Baptiste; Emler, Stefan

    2014-10-01

    The use of 16S rRNA gene sequences for microbial identification in clinical microbiology is accepted widely, and requires databases and algorithms. We compared a new research database containing curated 16S rRNA gene sequences in combination with the lca (lowest common ancestor) algorithm (RDB-LCA) to a commercially available 16S rDNA Centroid approach. We used 1025 bacterial isolates characterized by biochemistry, matrix-assisted laser desorption/ionization time-of-flight MS and 16S rDNA sequencing. Nearly 80 % of isolates were identified unambiguously at the species level by both classification platforms used. The remaining isolates were mostly identified correctly at the genus level due to the limited resolution of 16S rDNA sequencing. Discrepancies between both 16S rDNA platforms were due to differences in database content and the algorithm used, and could amount to up to 10.5 %. Up to 1.4 % of the analyses were found to be inconclusive. It is important to realize that despite the overall good performance of the pipelines for analysis, some inconclusive results remain that require additional in-depth analysis performed using supplementary methods. © 2014 The Authors.
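    The lowest common ancestor idea named in this abstract can be illustrated with a short sketch. This is not the RDB-LCA implementation, which the abstract does not describe; the function name `lowest_common_ancestor` and the example lineages are hypothetical. The sketch assumes each database hit comes with a ranked lineage and reports the deepest rank on which all hits agree.

```python
from typing import List, Sequence


def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest lineage prefix shared by all hits.

    Each lineage is ordered from the broadest rank to the most specific one.
    """
    if not lineages:
        return []
    shared: List[str] = []
    # zip(*lineages) walks the ranks in parallel and stops at the shortest lineage.
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


# Hypothetical hits for one query sequence (illustrative lineages only).
hits = [
    ["Bacteria", "Firmicutes", "Bacilli", "Staphylococcus", "Staphylococcus aureus"],
    ["Bacteria", "Firmicutes", "Bacilli", "Staphylococcus", "Staphylococcus epidermidis"],
]

print(lowest_common_ancestor(hits)[-1])  # Staphylococcus
```

    When the best hits disagree at the species level, as in this example, the assignment falls back to the genus, which matches the abstract's remark that many remaining isolates were identified only at the genus level.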

  6. Intervening sequences in 23S rRNA genes and 23S rRNA fragmentation in Taylorella asinigenitalis UCD-1(T) strain.

    Science.gov (United States)

    Tazumi, Akihiro; Sekizuka, Tsuyoshi; Moore, John E; Millar, Cherie B; Taneike, Ikue; Matsuda, Motoo

    2008-08-01

    PCR was performed with Taylorella asinigenitalis UCD-1(T) using two primer pairs constructed in silico for the amplification of the intervening sequences (IVSs) in the first quarter and central regions of the 23S rRNA gene. Following TA cloning and sequencing, the strain was found to carry heterogeneous and multiple IVSs. Two similar tandem repeat units of 25 and 24 base pairs (bp) with unknown function(s) were identified within the two IVSs in the central region. Secondary structure models of IVSs, containing stem and loop structures, were demonstrated. Although 16S rRNA and 4-5S RNA species were identified in the purified RNA fraction, no 23S rRNAs were evident, resulting in the occurrence of some smaller RNA fragments from approximately 500 to 1,600 bp in length. Thus, the 23S rRNA primary transcripts may be cleaved into some smaller fragments and IVSs. No IVS transcript was detected by northern blot hybridization analysis. The present and previous results strongly demonstrate the occurrence of heterogeneous and multiple IVSs in 23S rRNA gene sequences and 23S rRNA fragmentation in T. asinigenitalis. (c) 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Direct RNA sequencing mediated identification of mRNA localized in protrusions of human MDA-MB-231 metastatic breast cancer cells

    DEFF Research Database (Denmark)

    Jakobsen, Kristine Raaby; Sørensen, Emilie; Brøndum, Karin Kathrine

    2013-01-01

    To describe genome wide RNA localized in protrusions of the metastatic human breast cancer cell line MDA-MB-231 we used Boyden chamber based methodology followed by direct mRNA sequencing. Results In the hereby identified group of protrusion localized mRNA some previously were described to be localized...... localized transcripts represents novel candidates to mediate cancer cell subcellular region specific functions through mRNA direction to protrusions. We included a further characterization of p0071, an armadillo repeat protein of adherence junctions and desmosomes, in MDA-MB-231 and non-metastatic MCF7...... in protrusions of MDA-MB-231 metastatic cancer cells...

  8. Preliminary study on mitochondrial 16S rRNA gene sequences and phylogeny of flatfishes (Pleuronectiformes)

    Science.gov (United States)

    You, Feng; Liu, Jing; Zhang, Peijun; Xiang, Jianhai

    2005-09-01

    A 605 bp section of the mitochondrial 16S rRNA gene from Paralichthys olivaceus, Pseudorhombus cinnamomeus, Psetta maxima and Kareius bicoloratus, which represent 3 families of the order Pleuronectiformes, was amplified by PCR and sequenced to examine the molecular systematics of Pleuronectiformes, in comparison with related gene sequences of 6 other flatfish downloaded from GenBank. Phylogenetic analysis based on genetic distance from related gene sequences of 10 flatfish showed that this method was ideal for exploring the relationships between species, genera and families. Phylogenetic trees constructed with the neighbor-joining, maximum parsimony and maximum likelihood methods accorded with the general pattern of Pleuronectiformes evolution, but they also produced some confusion. In contrast to results from morphological characters, P. olivaceus clustered with K. bicoloratus, but P. cinnamomeus did not cluster with P. olivaceus, which is worth further study.

  9. Mapping a nucleolar targeting sequence of an RNA binding nucleolar protein, Nop25

    International Nuclear Information System (INIS)

    Fujiwara, Takashi; Suzuki, Shunji; Kanno, Motoko; Sugiyama, Hironobu; Takahashi, Hisaaki; Tanaka, Junya

    2006-01-01

    Nop25 is a putative RNA binding nucleolar protein associated with rRNA transcription. The present study was undertaken to determine the mechanism of Nop25 localization in the nucleolus. Deletion experiments of Nop25 amino acid sequence showed Nop25 to contain a nuclear targeting sequence in the N-terminal and a nucleolar targeting sequence in the C-terminal. By expressing derivative peptides from the C-terminal as GFP-fusion proteins in the cells, a lysine and arginine residue-enriched peptide (KRKHPRRAQDSTKKPPSATRTSKTQRRRR) allowed a GFP-fusion protein to be transported and fully retained in the nucleolus. When the peptide was fused with cMyc epitope and expressed in the cells, a cMyc epitope was then detected in the nucleolus. Nop25 did not localize in the nucleolus by deletion of the peptide from Nop25. Furthermore, deletion of a subdomain (KRKHPRRAQ) in the peptide or amino acid substitution of lysine and arginine residues in the subdomain resulted in the loss of Nop25 nucleolar localization. These results suggest that the lysine and arginine residue-enriched peptide is the most prominent nucleolar targeting sequence of Nop25 and that the long stretch of basic residues might play an important role in the nucleolar localization of Nop25. Although Nop25 contained putative SUMOylation, phosphorylation and glycosylation sites, the amino acid substitution in these sites had no effect on the nucleolar localization, thus suggesting that these post-translational modifications did not contribute to the localization of Nop25 in the nucleolus. The treatment of the cells, which expressed a GFP-fusion protein with a nucleolar targeting sequence of Nop25, with RNase A resulted in a complete dislocation of the protein from the nucleolus. These data suggested that the nucleolar targeting sequence might therefore play an important role in the binding of Nop25 to RNA molecules and that the RNA binding of Nop25 might be essential for the nucleolar localization of Nop25

  10. Identification of fusion genes in breast cancer by paired-end RNA-sequencing.

    Science.gov (United States)

    Edgren, Henrik; Murumagi, Astrid; Kangaspeska, Sara; Nicorici, Daniel; Hongisto, Vesa; Kleivi, Kristine; Rye, Inga H; Nyberg, Sandra; Wolf, Maija; Borresen-Dale, Anne-Lise; Kallioniemi, Olli

    2011-01-01

    Until recently, chromosomal translocations and fusion genes have been an underappreciated class of mutations in solid tumors. Next-generation sequencing technologies provide an opportunity for systematic characterization of cancer cell transcriptomes, including the discovery of expressed fusion genes resulting from underlying genomic rearrangements. We applied paired-end RNA-seq to identify 24 novel and 3 previously known fusion genes in breast cancer cells. Supported by an improved bioinformatic approach, we had a 95% success rate of validating gene fusions initially detected by RNA-seq. Fusion partner genes were found to contribute promoters (5' UTR), coding sequences and 3' UTRs. Most fusion genes were associated with copy number transitions and were particularly common in high-level DNA amplifications. This suggests that fusion events may contribute to the selective advantage provided by DNA amplifications and deletions. Some of the fusion partner genes, such as GSDMB in the TATDN1-GSDMB fusion and IKZF3 in the VAPB-IKZF3 fusion, were only detected as a fusion transcript, indicating activation of a dormant gene by the fusion event. A number of fusion gene partners have either been previously observed in oncogenic gene fusions, mostly in leukemias, or otherwise reported to be oncogenic. RNA interference-mediated knock-down of the VAPB-IKZF3 fusion gene indicated that it may be necessary for cancer cell growth and survival. In summary, using RNA-sequencing and improved bioinformatic stratification, we have discovered a number of novel fusion genes in breast cancer, and identified VAPB-IKZF3 as a potential fusion gene with importance for the growth and survival of breast cancer cells.

  11. Sequencing and expression analysis of hepcidin mRNA in donkey (Equus asinus) liver

    Directory of Open Access Journals (Sweden)

    José P. Oliveira-Filho

    2012-10-01

    The hypoferremia that is observed during systemic inflammatory processes is mediated by hepcidin, a peptide that is mainly synthesized in the liver of several mammalian species. Hepcidin plays a key role in iron metabolism and in the innate immune system. Its up-regulation is particularly useful during acute inflammation, and it restricts the iron availability that is necessary for the growth of pathogenic microorganisms. In this study, the hepcidin mRNA of Equus asinus has been characterized, and the expression of donkey hepcidin in the liver has been determined. The donkey hepcidin sequence has an open reading frame (ORF) of 261 nucleotides, and the deduced corresponding protein sequence has 86 amino acids. The amino acid sequence of donkey hepcidin was most homologous to that of Equus caballus (98%). The mature donkey hepcidin sequence (25 amino acids) was 100% homologous to the equine mature hepcidin and has eight conserved cysteine residues that are found in all of the investigated hepcidin sequences. The expression profile of donkey hepcidin in the liver was high and was similar to the reference gene expression. The donkey hepcidin sequence was deposited in GenBank (HQ902884) and may be useful for additional studies on iron metabolism and the inflammatory process in this species.
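    The two lengths reported in this abstract (an ORF of 261 nucleotides and a deduced protein of 86 amino acids) are consistent if the ORF length is counted including the stop codon. That is an assumption made here, not something the abstract states. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def orf_protein_length(orf_nt: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of orf_nt nucleotides.

    Assumes (hypothetically) that the reported ORF length includes the
    terminal stop codon, which encodes no amino acid.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


print(orf_protein_length(261))  # 86
```

    Here 261 / 3 gives 87 codons, and removing the stop codon leaves 86 residues.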

  12. Identification of miRNA from Porphyra yezoensis by high-throughput sequencing and bioinformatics analysis.

    Directory of Open Access Journals (Sweden)

    Chengwei Liang

    BACKGROUND: miRNAs are a class of non-coding, small RNAs that are approximately 22 nucleotides long and play important roles in the translational-level regulation of gene expression by either directly binding or cleaving target mRNAs. The red alga Porphyra yezoensis is one of the most important marine economic crops worldwide. To date, only a few miRNAs have been identified in unicellular green algae, and there is no report about Porphyra miRNAs. METHODOLOGY/PRINCIPAL FINDINGS: To identify miRNAs in Porphyra yezoensis, a small RNA library was constructed. Solexa technology was used to perform high-throughput sequencing of the library, followed by bioinformatics analysis to identify novel miRNAs. Specifically, 180,557,942 reads produced 13,324 unique miRNAs representing 224 conserved miRNA families that have been identified in other plant species. In addition, seven novel putative miRNAs were predicted from a limited number of ESTs. The potential targets of these putative miRNAs were also predicted based on sequence homology search. CONCLUSIONS/SIGNIFICANCE: This study provides a first large-scale cloning and characterization of Porphyra miRNAs and their potential targets. These miRNAs belong to 224 conserved miRNA families, and 7 miRNAs are novel in Porphyra. These miRNAs add to the growing database of new miRNAs and lay the foundation for further understanding of miRNA function in the regulation of Porphyra yezoensis development.

  13. Small RNA and transcriptome deep sequencing proffers insight into floral gene regulation in Rosa cultivars.

    Science.gov (United States)

    Kim, Jungeun; Park, June Hyun; Lim, Chan Ju; Lim, Jae Yun; Ryu, Jee-Youn; Lee, Bong-Woo; Choi, Jae-Pil; Kim, Woong Bom; Lee, Ha Yeon; Choi, Yourim; Kim, Donghyun; Hur, Cheol-Goo; Kim, Sukweon; Noh, Yoo-Sun; Shin, Chanseok; Kwon, Suk-Yoon

    2012-11-21

    Roses (Rosa sp.), which belong to the family Rosaceae, are the most economically important ornamental plants, making up 30% of the floriculture market. However, despite high demand for roses, rose breeding programs have limited molecular resources, which could greatly enhance and speed breeding efforts. A better understanding of the genes that contribute to floral development and desired phenotypes will lead to improved rose cultivars. For this study, we analyzed rose miRNAs and the rose flower transcriptome in order to generate a database to expound upon current knowledge regarding regulation of important floral characteristics. A rose genetic database will enable comprehensive analysis of gene expression and regulation via miRNA among different Rosa cultivars. We produced more than 0.5 million reads from expressed sequences, totalling more than 110 million bp. From these, we generated 35,657, 31,434, 34,725, and 39,722 flower unigenes from Rosa hybrid: 'Vital', 'Maroussia', and 'Sympathy' and Rosa rugosa Thunb., respectively. The unigenes were assigned functional annotations, domains, metabolic pathways, Gene Ontology (GO) terms, Plant Ontology (PO) terms, and MIPS Functional Catalogue (FunCat) terms. Rose flower transcripts were compared with genes from whole genome sequences of Rosaceae members (apple, strawberry, and peach) and grape. We also produced approximately 40 million small RNA reads from flower tissue for Rosa, representing 267 unique miRNA tags. Among identified miRNAs, 25 were novel and 242 were conserved miRNAs. Statistical analyses of miRNA profiles revealed both shared and species-specific miRNAs, which presumably affect flower development and phenotypes. In this study, we constructed a Rose miRNA and transcriptome database, and we analyzed the miRNAs and transcriptome generated from the flower tissues of four Rosa cultivars. The database provides a comprehensive genetic resource which can be used to better understand

  14. Evaluation of PacBio sequencing for full-length bacterial 16S rRNA gene classification.

    Science.gov (United States)

    Wagner, Josef; Coupland, Paul; Browne, Hilary P; Lawley, Trevor D; Francis, Suzanna C; Parkhill, Julian

    2016-11-14

    Currently, bacterial 16S rRNA gene analyses are based on sequencing of individual variable regions of the 16S rRNA gene (Kozich, et al Appl Environ Microbiol 79:5112-5120, 2013). This short read approach can introduce biases. Thus, full-length bacterial 16S rRNA gene sequencing is needed to reduce biases. A new alternative for full-length bacterial 16S rRNA gene sequencing is offered by PacBio single molecule, real-time (SMRT) technology. The aim of our study was to validate PacBio P6 sequencing chemistry using three approaches: 1) sequencing the full-length bacterial 16S rRNA gene from a single bacterial species, Staphylococcus aureus, to analyze error modes and to optimize the bioinformatics pipeline; 2) sequencing the full-length bacterial 16S rRNA gene from a pool of 50 different bacterial colonies from human stool samples to compare with full-length bacterial 16S rRNA capillary sequence; and 3) sequencing the full-length bacterial 16S rRNA genes from 11 vaginal microbiome samples and comparing them with the in silico selected bacterial 16S rRNA V1V2 gene region and with bacterial 16S rRNA V1V2 gene regions sequenced using the Illumina MiSeq. Our optimized bioinformatics pipeline for PacBio sequence analysis was able to achieve an error rate of 0.007% on the Staphylococcus aureus full-length 16S rRNA gene. Capillary sequencing of the full-length bacterial 16S rRNA gene from the pool of 50 colonies from stool identified 40 bacterial species of which up to 80% could be identified by PacBio full-length bacterial 16S rRNA gene sequencing. Analysis of the human vaginal microbiome using the bacterial 16S rRNA V1V2 gene region on MiSeq generated 129 operational taxonomic units (OTUs) from which 70 species could be identified. For the PacBio, 36,000 sequences from over 58,000 raw reads could be assigned to a barcode, and the in silico selected bacterial 16S rRNA V1V2 gene region generated 154 OTUs grouped into 63 species, of which 62% were shared with the MiSeq dataset. The Pac

  15. Deep RNA sequencing of the skeletal muscle transcriptome in swimming fish.

    Directory of Open Access Journals (Sweden)

    Arjan P Palstra

    Deep RNA sequencing (RNA-seq) was performed to provide an in-depth view of the transcriptome of red and white skeletal muscle of exercised and non-exercised rainbow trout (Oncorhynchus mykiss) with the specific objective to identify expressed genes and quantify the transcriptomic effects of swimming-induced exercise. Pubertal autumn-spawning seawater-raised female rainbow trout were rested (n = 10) or swum (n = 10) for 1176 km at 0.75 body-lengths per second in a 6,000-L swim-flume under reproductive conditions for 40 days. Red and white muscle RNA of exercised and non-exercised fish (4 lanes) was sequenced and resulted in 15-17 million reads per lane that, after de novo assembly, yielded 149,159 red and 118,572 white muscle contigs. Most contigs were annotated using an iterative homology search strategy against salmonid ESTs, the zebrafish Danio rerio genome and general Metazoan genes. When selecting for large contigs (>500 nucleotides), a number of novel rainbow trout gene sequences were identified in this study: 1,085 and 1,228 novel gene sequences for red and white muscle, respectively, which included a number of important molecules for skeletal muscle function. Transcriptomic analysis revealed that sustained swimming increased transcriptional activity in skeletal muscle and specifically an up-regulation of genes involved in muscle growth and developmental processes in white muscle. The unique collection of transcripts will contribute to our understanding of red and white muscle physiology, specifically during the long-term reproductive migration of salmonids.

  16. Sequence-dependent base-stacking stabilities guide tRNA folding energy landscapes.

    Science.gov (United States)

    Li, Rongzhong; Ge, Heming W; Cho, Samuel S

    2013-10-24

    The folding of bacterial tRNAs with disparate sequences has been observed to proceed in distinct folding mechanisms despite their structural similarity. To explore the folding landscapes of tRNA, we performed ion concentration-dependent coarse-grained TIS model MD simulations of several E. coli tRNAs to compare their thermodynamic melting profiles to the classical absorbance spectra of Crothers and co-workers. To independently validate our findings, we also performed atomistic empirical force field MD simulations of tRNAs, and we compared the base-to-base distances from coarse-grained and atomistic MD simulations to empirical base-stacking free energies. We then projected the free energies to the secondary structural elements of tRNA, and we observe distinct, parallel folding mechanisms whose differences can be inferred on the basis of their sequence-dependent base-stacking stabilities. In some cases, a premature, nonproductive folding intermediate corresponding to the Ψ hairpin loop must backtrack to the unfolded state before proceeding to the folded state. This observation suggests a possible explanation for the fast and slow phases observed in tRNA folding kinetics.

  17. Sequence-based discrimination of protein-RNA interacting residues using a probabilistic approach.

    Science.gov (United States)

    Pai, Priyadarshini P; Dash, Tirtharaj; Mondal, Sukanta

    2017-04-07

    Protein interactions with ribonucleic acids (RNA) are well-known to be crucial for a wide range of cellular processes such as transcriptional regulation, protein synthesis or translation, and post-translational modifications. Identification of the RNA-interacting residues can provide insights into these processes and aid in relevant biotechnological manipulations. Owing to their eventual potential in combating diseases and industrial production, several computational attempts have been made over the years using sequence- and structure-based information. Recent comparative studies suggest that despite these developments, many problems are faced with respect to the usability, prerequisites, and accessibility of various tools, thereby calling for an alternative approach and perspective supplementation in the prediction scenario. With this motivation, in this paper, we propose the use of a simple-yet-efficient conditional probabilistic approach based on the application of local occurrence of amino acids in the interacting region in a non-numeric sequence feature space, for discriminating between RNA interacting and non-interacting residues. The proposed method has been meticulously tested for robustness using a cross-estimation method showing an MCC of 0.341 and an F-measure of 66.84%. Upon exploring large scale applications using benchmark datasets available to date, this approach showed an encouraging performance comparable with the state-of-art. The software is available at https://github.com/ABCgrp/DORAEMON. Copyright © 2017 Elsevier Ltd. All rights reserved.
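    For readers unfamiliar with the two metrics quoted in this abstract, the sketch below shows how the Matthews correlation coefficient (MCC) and the F-measure are computed from a binary confusion matrix. The function names and the example counts are hypothetical and are not taken from the paper; they only illustrate the standard definitions.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: interacting residues are the positive class.
print(mcc(tp=50, tn=50, fp=0, fn=0))   # 1.0 for a perfect classifier
print(f_measure(tp=50, fp=0, fn=0))    # 1.0 for a perfect classifier
```

    MCC ranges from -1 to 1 and accounts for true negatives, whereas the F-measure ignores them, which is why the two are often reported together for imbalanced residue-level predictions.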

  18. The Pseudomonas aeruginosa transcriptome in planktonic cultures and static biofilms using RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Andreas Dötsch

    In this study, we evaluated how gene expression differs in mature Pseudomonas aeruginosa biofilms as opposed to planktonic cells by the use of RNA sequencing technology that gives rise to both quantitative and qualitative information on the transcriptome. Although a large proportion of genes were consistently regulated in both the stationary phase and biofilm cultures as opposed to the late exponential growth phase cultures, the global biofilm gene expression pattern was clearly distinct, indicating that biofilms are not just surface-attached cells in stationary phase. Many of the genes found to be biofilm specific were involved in adaptation to microaerophilic growth conditions, repression of type three secretion and production of extracellular matrix components. Additionally, we found many small RNAs to be differentially regulated, most of them similarly in stationary phase cultures and biofilms. A qualitative analysis of the RNA-seq data revealed more than 3000 putative transcriptional start sites (TSS). By the use of rapid amplification of cDNA ends (5'-RACE), we confirmed the presence of three different TSS associated with the pqsABCDE operon, two in the promoter of pqsA and one upstream of the second gene, pqsB. Taken together, this study reports the first transcriptome study on P. aeruginosa that employs RNA sequencing technology and provides insights into the quantitative and qualitative transcriptome including the expression of small RNAs in P. aeruginosa biofilms.

  19. SeqFold: genome-scale reconstruction of RNA secondary structure integrating high-throughput sequencing data.

    Science.gov (United States)

    Ouyang, Zhengqing; Snyder, Michael P; Chang, Howard Y

    2013-02-01

    We present an integrative approach, SeqFold, that combines high-throughput RNA structure profiling data with computational prediction for genome-scale reconstruction of RNA secondary structures. SeqFold transforms experimental RNA structure information into a structure preference profile (SPP) and uses it to select stable RNA structure candidates representing the structure ensemble. Under a high-dimensional classification framework, SeqFold efficiently matches a given SPP to the most likely cluster of structures sampled from the Boltzmann-weighted ensemble. SeqFold is able to incorporate diverse types of RNA structure profiling data, including parallel analysis of RNA structure (PARS), selective 2'-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq), fragmentation sequencing (FragSeq) data generated by deep sequencing, and conventional SHAPE data. Using the known structures of a wide range of mRNAs and noncoding RNAs as benchmarks, we demonstrate that SeqFold outperforms or matches existing approaches in accuracy and is more robust to noise in experimental data. Application of SeqFold to reconstruct the secondary structures of the yeast transcriptome reveals the diverse impact of RNA secondary structure on gene regulation, including translation efficiency, transcription initiation, and protein-RNA interactions. SeqFold can be easily adapted to incorporate any new types of high-throughput RNA structure profiling data and is widely applicable to analyze RNA structures in any transcriptome.

  20. Sequence analysis of RNA 2 and RNA 3 of lilac leaf chlorosis virus: a putative new member of the genus Ilarvirus.

    Science.gov (United States)

    James, D; Varga, A; Leippi, L; Godkin, S; Masters, C

    2010-06-01

    RNA 2 and RNA 3 of lilac leaf chlorosis virus (LLCV) were sequenced and shown to be 2,762 nucleotides (nt) and 2,117 nt in length, respectively. RNA 2 encodes a putative 807-amino-acid (aa) RNA-dependent RNA polymerase associated protein with an estimated Mr of 92.75 kDa. RNA 3 is bicistronic, with ORF1 encoding a putative movement protein (277 aa, Mr 31.45 kDa) and ORF2 encoding the putative coat protein (221 aa, Mr 24.37 kDa). The genome organization is similar to that typical for members of the genus Ilarvirus. Phylogenetic analyses indicate a close evolutionary relationship between LLCV, ApMV, and PNRSV.

  1. Gene length and detection bias in single cell RNA sequencing protocols [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Belinda Phipson

    2017-04-01

    Background: Single cell RNA sequencing (scRNA-seq) has rapidly gained popularity for profiling transcriptomes of hundreds to thousands of single cells. This technology has led to the discovery of novel cell types and revealed insights into the development of complex tissues. However, many technical challenges need to be overcome during data generation. Due to minute amounts of starting material, samples undergo extensive amplification, increasing technical variability. A solution for mitigating amplification biases is to include unique molecular identifiers (UMIs), which tag individual molecules. Transcript abundances are then estimated from the number of unique UMIs aligning to a specific gene, so that PCR duplicates, which carry copies of the same UMI, are not included in expression estimates. Methods: Here we investigate the effect of gene length bias in scRNA-seq across a variety of datasets that differ in terms of capture technology, library preparation, cell types and species. Results: We find that scRNA-seq datasets that have been sequenced using a full-length transcript protocol exhibit gene length bias akin to bulk RNA-seq data. Specifically, shorter genes tend to have lower counts and a higher rate of dropout. In contrast, protocols that include UMIs do not exhibit gene length bias, with a mostly uniform rate of dropout across genes of varying length. Across four different scRNA-seq datasets profiling mouse embryonic stem cells (mESCs), we found that the subset of genes detected only in the UMI datasets tended to be shorter, while the subset of genes detected only in the full-length datasets tended to be longer. Conclusions: We find that the choice of scRNA-seq protocol influences the detection rate of genes, and that full-length datasets exhibit gene-length bias. In addition, despite clear differences between UMI and full-length transcript data, we illustrate that full-length and UMI data can be combined to reveal the underlying biology.
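
    The UMI counting step described in this abstract can be illustrated with a minimal sketch. It assumes each aligned read has already been reduced to a (cell, gene, UMI) triple; the function name `count_umis` and the toy reads are hypothetical and are not taken from the paper. The sketch also ignores sequencing errors within UMIs, which real pipelines usually have to correct for.

```python
from collections import defaultdict


def count_umis(alignments):
    """Count distinct UMIs per (cell, gene) pair.

    `alignments` is an iterable of (cell, gene, umi) tuples, one per
    aligned read. Reads sharing cell, gene and UMI are treated as PCR
    duplicates of one original molecule and counted once.
    """
    seen = defaultdict(set)
    for cell, gene, umi in alignments:
        seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


# Hypothetical toy input: the second read duplicates the first.
reads = [
    ("cell1", "GeneA", "AACG"),
    ("cell1", "GeneA", "AACG"),  # PCR duplicate, not counted again
    ("cell1", "GeneA", "TTGC"),
    ("cell1", "GeneB", "AACG"),
    ("cell2", "GeneA", "GGAT"),
]
counts = count_umis(reads)
```

    Because each molecule is counted once regardless of how many reads it yields, the count does not grow with the number of fragments a long transcript produces, which is consistent with the absence of gene length bias reported for UMI protocols.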

  2. Molecular characterization of intervening sequences in 23S rRNA genes and 23S rRNA fragmentation in Taylorella equigenitalis.

    Science.gov (United States)

    Tazumi, A; Sekizuka, T; Moore, J E; Millar, B C; Taneike, I; Matsuda, M

    2008-01-01

    Using two primer pairs constructed in silico for the amplification of the intervening sequences (IVSs) of the 23S rRNA gene sequences of the genus Taylorella, none of the three representative T. equigenitalis strains NCTC11184(T), Kentucky 188 and EQ59 was shown to contain any IVSs in the first quarter region. In the central region, all three strains possessed one approximately 70 bp IVS (TeIVS2) different from any IVSs found in T. asinigenitalis. The predicted secondary structure model of the IVSs contained stem and loop structures. The central region of the IVS-stem structure contains an identical double-stranded consensus 15-bp sequence. The purified RNA fraction from the three strains contained 16S and 4-5S RNA species but no 23S rRNA species. Thus, the primary 23S rRNA transcripts from the three strains would be cleaved into approximately 1.2- and 1.6-kb rRNA fragments and approximately 70-bp IVS. In addition, 16 other T. equigenitalis isolates were found to carry a similar 70-bp IVS in the central region and to produce fragmented 23S rRNA.

  3. Sequence-controlled RNA self-processing: computational design, biochemical analysis, and visualization by AFM.

    Science.gov (United States)

    Petkovic, Sonja; Badelt, Stefan; Block, Stephan; Flamm, Christoph; Delcea, Mihaela; Hofacker, Ivo; Müller, Sabine

    2015-07-01

    Reversible chemistry allowing for assembly and disassembly of molecular entities is important for biological self-organization. Thus, ribozymes that support both cleavage and formation of phosphodiester bonds may have contributed to the emergence of functional diversity and increasing complexity of regulatory RNAs in early life. We have previously engineered a variant of the hairpin ribozyme that shows how ribozymes may have circularized or extended their own length by forming concatemers. Using the Vienna RNA package, we now optimized this hairpin ribozyme variant and selected four different RNA sequences that were expected to circularize more efficiently or form longer concatemers upon transcription. (Two-dimensional) PAGE analysis confirms that (i) all four selected ribozymes are catalytically active and (ii) high yields of cyclic species are obtained. AFM imaging in combination with RNA structure prediction enabled us to calculate the distributions of monomers and self-concatenated dimers and trimers. Our results show that computationally optimized molecules do form reasonable amounts of trimers, which has not been observed for the original system so far, and we demonstrate that the combination of theoretical prediction, biochemical and physical analysis is a promising approach toward accurate prediction of ribozyme behavior and design of ribozymes with predefined functions. © 2015 Petkovic et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  4. RNA sequencing analysis to demonstrate Erk dependent and independent functions of Mek.

    Science.gov (United States)

    Liu, Chang; Chen, Haixia; Liu, Lin; Chen, Lingyi

    2016-03-01

    Mek inhibition and Erk knockout (KO) have quite distinct effects on pluripotency maintenance in mouse embryonic stem cells (ESCs). To test whether there is an Erk-independent function of Mek, RNA-sequencing (RNA-seq) was carried out on six samples: WT KH2 ESCs treated with or without PD0325901 (PD) for 48 h (KH2_PD and KH2, respectively); iErk1; Erk KO ESCs cultured in the presence of Dox (P0), or 48 and 96 h after Dox withdrawal (P1 and P2, respectively); and iErk1; Erk KO ESCs cultured without Dox for 96 h and treated with PD in the last 48 h (P2_PD). These RNA-seq data demonstrate that Mek inhibition has a quite different effect on the transcriptional profile of mouse ESCs compared to Erk KO. Moreover, a significant fraction of genes is regulated by Mek inhibition regardless of the presence or absence of Erk, indicating an Erk-independent function of Mek. RNA-seq data are deposited in Gene Expression Omnibus (GEO) datasets under accession number GSE70304.

  5. Identification and characterization of rhizospheric microbial diversity by 16S ribosomal RNA gene sequencing

    Directory of Open Access Journals (Sweden)

    Muhammad Naveed

    2014-09-01

    In the present study, samples of rhizosphere and root nodules were collected from different areas of Pakistan to isolate plant growth promoting rhizobacteria. Identification of bacterial isolates was made by 16S rRNA gene sequence analysis and taxonomical confirmation on the EzTaxon server. The identified bacterial strains belonged to 5 genera, i.e. Ensifer, Bacillus, Pseudomonas, Leclercia and Rhizobium. Phylogenetic analysis inferred from 16S rRNA gene sequences showed the evolutionary relationship of the bacterial strains with the respective genera. Based on phylogenetic analysis, some candidate novel species were also identified. The bacterial strains were also characterized by morphological, physiological and biochemical tests and for the glucose dehydrogenase (gdh) gene, which is involved in phosphate solubilization using the cofactor pyrroloquinoline quinone (PQQ). Seven rhizospheric and 3 root-nodulating strains were positive for the gdh gene. Furthermore, this study confirms a novel association between microbes and their hosts, such as field-grown crops and leguminous and non-leguminous plants. It was concluded that a diverse bacterial population exists in the rhizosphere and root nodules that might be useful in evaluating the mechanisms behind plant-microbe interactions, and that strains QAU-63 and QAU-68, with sequence similarities of 97 and 95%, might be declared novel after further taxonomic characterization.

  6. MicroRNA repertoire for functional genome research in tilapia identified by deep sequencing.

    Science.gov (United States)

    Yan, Biao; Wang, Zhen-Hua; Zhu, Chang-Dong; Guo, Jin-Tao; Zhao, Jin-Liang

    2014-08-01

    The Nile tilapia (Oreochromis niloticus; Cichlidae) is an economically important species that occupies a prominent position in the aquaculture industry. MicroRNAs (miRNAs) are a class of noncoding RNAs that post-transcriptionally regulate gene expression involved in diverse biological and metabolic processes. To increase the repertoire of miRNAs characterized in tilapia, we used the Illumina/Solexa sequencing technology to sequence a small RNA library prepared from a pooled RNA sample isolated from different developmental stages of tilapia. Bioinformatic analyses suggest that 197 conserved and 27 novel miRNAs are expressed in tilapia. Sequence alignments indicate that all tested miRNAs and miRNAs* are highly conserved across many species. In addition, we characterized the tissue expression patterns of five miRNAs using real-time quantitative PCR. We found that miR-1/206, miR-7/9, and miR-122 are abundantly expressed in muscle, brain, and liver, respectively, implying a potential role in the regulation of tissue differentiation or the maintenance of tissue identity. Overall, our results expand the number of tilapia miRNAs, and the discovery of miRNAs in the tilapia genome contributes to a better understanding of the role of miRNAs in regulating diverse biological processes.

  7. Small RNA Library Preparation Method for Next-Generation Sequencing Using Chemical Modifications to Prevent Adapter Dimer Formation.

    Science.gov (United States)

    Shore, Sabrina; Henderson, Jordana M; Lebedev, Alexandre; Salcedo, Michelle P; Zon, Gerald; McCaffrey, Anton P; Paul, Natasha; Hogrefe, Richard I

    2016-01-01

    For most sample types, the automation of RNA and DNA sample preparation workflows enables high throughput next-generation sequencing (NGS) library preparation. Greater adoption of small RNA (sRNA) sequencing has been hindered by high sample input requirements and inherent ligation side products formed during library preparation. These side products, known as adapter dimer, are very similar in size to the tagged library. Most sRNA library preparation strategies thus employ a gel purification step to isolate tagged library from adapter dimer contaminants. At very low sample inputs, adapter dimer side products dominate the reaction and limit the sensitivity of this technique. Here we address the need for improved specificity of sRNA library preparation workflows with a novel library preparation approach that uses modified adapters to suppress adapter dimer formation. This workflow allows for lower sample inputs and elimination of the gel purification step, which in turn allows for an automatable sRNA library preparation protocol.

  8. The nucleotide sequence of 4.5S ribosomal RNA from tobacco chloroplasts.

    OpenAIRE

    Takaiwa, F; Sugiura, M

    1980-01-01

    The nucleotide sequence of tobacco chloroplast 4.5S ribosomal RNA has been determined to be: OHG-A-A-G-G-U-C-A-C-G-G-C-G-A-G-A-C-G-A-G-C-C-G-U-U-U-A-U-C-A-U-U-A-C-G-A-U-A-G-G-U-G-U-C-A-A-G-U-G-G-A-A-G-U-G-C-A-G-U-G-A-U-G-U-A-U-G-C-(G-A)-C-U-G-A-G-G-C-A-U-C-C-U-A-A-C-A-G-A-C-C-G-G-U-A-G-A-C-U-U-G-A-A-COH. The 4.5S RNA is 103 nucleotides long and its 5'-terminus is not phosphorylated.
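
    The reported length can be checked directly against the printed sequence. The following is a small sketch, not part of the original record: the helper name `parse_reported_sequence` is hypothetical, and it assumes the leading "OH" and trailing "OH" are terminal hydroxyl labels and that the parenthesized (G-A) counts as two nucleotides.

```python
# Sequence string copied from the abstract, including the terminal
# "OH" labels, hyphens and the parenthesized (G-A) pair.
REPORTED = (
    "OHG-A-A-G-G-U-C-A-C-G-G-C-G-A-G-A-C-G-A-G-C-C-G-U-U-U-A-U-C-A-"
    "U-U-A-C-G-A-U-A-G-G-U-G-U-C-A-A-G-U-G-G-A-A-G-U-G-C-A-G-U-G-A-"
    "U-G-U-A-U-G-C-(G-A)-C-U-G-A-G-G-C-A-U-C-C-U-A-A-C-A-G-A-C-C-G-"
    "G-U-A-G-A-C-U-U-G-A-A-COH"
)


def parse_reported_sequence(text):
    """Keep only the RNA letters A, C, G and U.

    This drops the hyphens, the parentheses and the letters O and H of
    the terminal hydroxyl labels, none of which are nucleotides.
    """
    return "".join(ch for ch in text if ch in "ACGU")


sequence = parse_reported_sequence(REPORTED)
length = len(sequence)
```

    Under these assumptions the parsed sequence contains 103 nucleotides, matching the length stated in the abstract.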

  9. Single-cell RNA-sequencing: The future of genome biology is now.

    Science.gov (United States)

    Picelli, Simone

    2017-05-04

    Genome-wide single-cell analysis represents the ultimate frontier of genomics research. In particular, single-cell RNA-sequencing (scRNA-seq) studies have been boosted in the last few years by an explosion of new technologies enabling the study of the transcriptomic landscape of thousands of single cells in complex multicellular organisms. More sensitive and automated methods are being continuously developed and promise to deliver better data quality and higher throughput with less hands-on time. The outstanding amount of knowledge that is going to be gained from present and future studies will have a profound impact on many aspects of our society, from the introduction of truly tailored cancer treatments to a better understanding of antibiotic resistance and host-pathogen interactions, and from the discovery of the mechanisms regulating stem cell differentiation to the characterization of the early events of human embryogenesis.

  10. Comprehensive processing of high-throughput small RNA sequencing data including quality checking, normalization, and differential expression analysis using the UEA sRNA Workbench.

    Science.gov (United States)

    Beckers, Matthew; Mohorianu, Irina; Stocks, Matthew; Applegate, Christopher; Dalmay, Tamas; Moulton, Vincent

    2017-06-01

    Recently, high-throughput sequencing (HTS) has revealed compelling details about the small RNA (sRNA) population in eukaryotes. These 20 to 25 nt noncoding RNAs can influence gene expression by acting as guides for the sequence-specific regulatory mechanism known as RNA silencing. The increase in sequencing depth and number of samples per project enables a better understanding of the role sRNAs play by facilitating the study of expression patterns. However, the intricacy of the biological hypotheses coupled with a lack of appropriate tools often leads to inadequate mining of the available data and thus an incomplete description of the biological mechanisms involved. To enable a comprehensive study of differential expression in sRNA data sets, we present a new interactive pipeline that guides researchers through the various stages of data preprocessing and analysis. This includes various tools, some of which we specifically developed for sRNA analysis, for quality checking and normalization of sRNA samples as well as tools for the detection of differentially expressed sRNAs and identification of the resulting expression patterns. The pipeline is available within the UEA sRNA Workbench, a user-friendly software package for the processing of sRNA data sets. We demonstrate the use of the pipeline on a H. sapiens data set; additional examples on a B. terrestris data set and on an A. thaliana data set are described in the Supplemental Information. A comparison with existing approaches is also included, which exemplifies some of the issues that need to be addressed for sRNA analysis and how the new pipeline may be used to do this. © 2017 Beckers et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  11. PiRaNhA: A server for the computational prediction of RNA-binding residues in protein sequences

    OpenAIRE

    Murakami, Yoichi; Spriggs, Ruth V; Nakamura, Haruki; Jones, Susan

    2010-01-01

    The PiRaNhA web server is a publicly available online resource that automatically predicts the location of RNA-binding residues (RBRs) in protein sequences. The goal of functional annotation of sequences in the field of RNA binding is to provide predictions of high accuracy that require only small numbers of targeted mutations for verification. The PiRaNhA server uses a support vector machine (SVM), with position-specific scoring matrices, residue interface propensity, predicted residue acces...

  12. Cloning the human lysozyme cDNA: Inverted Alu repeat in the mRNA and in situ hybridization for macrophages and Paneth cells

    International Nuclear Information System (INIS)

    Chung, L.P.; Keshav, S.; Gordon, S.

    1988-01-01

    Lysozyme is a major secretory product of human and rodent macrophages and a useful marker for myelomonocytic cells. Based on the known human lysozyme amino acid sequence, oligonucleotides were synthesized and used as probes to screen a phorbol 12-myristate 13-acetate-treated U937 cDNA library. A full-length human lysozyme cDNA clone, pHL-2, was obtained and characterized. Sequence analysis shows that human lysozyme, like chicken lysozyme, has an 18-amino-acid-long signal peptide, but unlike the chicken lysozyme cDNA, the human lysozyme cDNA has a >1-kilobase-long 3' nontranslated sequence. Interestingly, within this 3' region, an inverted repeat of the Alu family of repetitive sequences was discovered. In RNA blot analyses, DNA probes prepared from pHL-2 can be used to detect lysozyme mRNA not only from human but also from mouse and rat. Moreover, by in situ hybridization, complementary RNA transcripts have been used as probes to detect lysozyme mRNA in mouse macrophages and Paneth cells. This human lysozyme cDNA clone is therefore likely to be a useful molecular probe for studying macrophage distribution and gene expression.

  13. Cloning the human lysozyme cDNA: Inverted Alu repeat in the mRNA and in situ hybridization for macrophages and Paneth cells

    Energy Technology Data Exchange (ETDEWEB)

    Chung, L.P.; Keshav, S.; Gordon, S.

    1988-09-01

    Lysozyme is a major secretory product of human and rodent macrophages and a useful marker for myelomonocytic cells. Based on the known human lysozyme amino acid sequence, oligonucleotides were synthesized and used as probes to screen a phorbol 12-myristate 13-acetate-treated U937 cDNA library. A full-length human lysozyme cDNA clone, pHL-2, was obtained and characterized. Sequence analysis shows that human lysozyme, like chicken lysozyme, has an 18-amino-acid-long signal peptide, but unlike the chicken lysozyme cDNA, the human lysozyme cDNA has a >1-kilobase-long 3' nontranslated sequence. Interestingly, within this 3' region, an inverted repeat of the Alu family of repetitive sequences was discovered. In RNA blot analyses, DNA probes prepared from pHL-2 can be used to detect lysozyme mRNA not only from human but also from mouse and rat. Moreover, by in situ hybridization, complementary RNA transcripts have been used as probes to detect lysozyme mRNA in mouse macrophages and Paneth cells. This human lysozyme cDNA clone is therefore likely to be a useful molecular probe for studying macrophage distribution and gene expression.

  14. High throughput sequencing of small RNA component of leaves and inflorescence revealed conserved and novel miRNAs as well as phasiRNA loci in chickpea.

    Science.gov (United States)

    Srivastava, Sangeeta; Zheng, Yun; Kudapa, Himabindu; Jagadeeswaran, Guru; Hivrale, Vandana; Varshney, Rajeev K; Sunkar, Ramanjulu

    2015-06-01

    Among legumes, chickpea (Cicer arietinum L.) is the second most important crop after soybean. MicroRNAs (miRNAs) play important roles by regulating target gene expression important for plant development and tolerance to stress conditions. Additionally, recently discovered phased siRNAs (phasiRNAs), a new class of small RNAs, are abundantly produced in legumes. Nevertheless, little is known about these regulatory molecules in chickpea. The small RNA population was sequenced from leaves and flowers of chickpea to identify conserved and novel miRNAs as well as phasiRNAs/phasiRNA loci. Bioinformatics analysis revealed 157 miRNA loci for the 96 highly conserved and known miRNA homologs belonging to 38 miRNA families in chickpea. Furthermore, 20 novel miRNAs belonging to 17 miRNA families were identified. Sequence analysis revealed approximately 60 phasiRNA loci. Potential target genes likely to be regulated by these miRNAs were predicted and some were confirmed by modified 5' RACE assay. Predicted targets are mostly transcription factors that might be important for developmental processes, and others include superoxide dismutases, plantacyanin, laccases and F-box proteins that could participate in stress responses and protein degradation. Overall, this study provides an inventory of miRNA-target gene interactions for chickpea, useful for the comparative analysis of small RNAs among legumes. Copyright © 2015 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.

  15. Novel sequence variations in LAMA2 and SGCG genes modulating cis-acting regulatory elements and RNA secondary structure

    Directory of Open Access Journals (Sweden)

    Olfa Siala

    2010-01-01

    In this study, we detected new sequence variations in the LAMA2 and SGCG genes in 5 ethnic populations and analysed their effect on enhancer composition and mRNA structure. PCR amplification and DNA sequencing were performed, followed by bioinformatics analyses using ESEfinder and MFOLD software. We found 3 novel sequence variations in the LAMA2 (c.3174+22_23insAT and c.6085+12delA) and SGCG (c.*102A/C) genes. These variations were present in 210 tested healthy controls from Tunisian, Moroccan, Algerian, Lebanese and French populations, suggesting that they represent novel polymorphisms within the LAMA2 and SGCG gene sequences. ESEfinder showed that the c.*102A/C substitution created a new exon splicing enhancer in the 3'UTR of the SGCG gene, whereas the c.6085+12delA deletion was situated in the base pairing region between LAMA2 mRNA and the U1 snRNA spliceosomal component. The RNA structure analyses showed that both variations modulated RNA secondary structure. Our results are suggestive of correlations between mRNA folding and the recruitment of spliceosomal components mediating splicing, including SR proteins. Understanding the contribution of common sequence variations to mRNA structural and functional diversity will aid the study of gene expression.

  16. Identification of Alternative Splicing and Fusion Transcripts in Non-Small Cell Lung Cancer by RNA Sequencing.

    Science.gov (United States)

    Hong, Yoonki; Kim, Woo Jin; Bang, Chi Young; Lee, Jae Cheol; Oh, Yeon-Mok

    2016-04-01

    Lung cancer is the most common cause of cancer-related death. Alterations in gene sequence, structure, and expression have an important role in the pathogenesis of lung cancer. Fusion genes and alternative splicing of cancer-related genes have the potential to be oncogenic. In the current study, we performed RNA-sequencing (RNA-seq) to investigate potential fusion genes and alternative splicing in non-small cell lung cancer. RNA was isolated from lung tissues obtained from 86 subjects with lung cancer. The RNA samples from lung cancer and normal tissues were processed with RNA-seq using the HiSeq 2000 system. Fusion genes were evaluated using DeFuse and ChimeraScan. Candidate fusion transcripts were validated by Sanger sequencing. Alternative splicing was analyzed using multivariate analysis of transcript sequencing and validated using quantitative real-time polymerase chain reaction. RNA-seq data identified the oncogenic fusion genes EML4-ALK and SLC34A2-ROS1 in three of 86 normal-cancer paired samples. Nine distinct fusion transcripts were selected using DeFuse and ChimeraScan, of which four were validated by Sanger sequencing. In 33 squamous cell carcinomas, 29 tumor-specific skipped exon events and six mutually exclusive exon events were identified. ITGB4 and PYCR1 were the top genes that showed significant tumor-specific splice variants. In conclusion, RNA-seq data identified novel potential fusion transcripts and splice variants. Further evaluation of their functional significance in the pathogenesis of lung cancer is required.

  17. CMSA: a heterogeneous CPU/GPU computing system for multiple similar RNA/DNA sequence alignment.

    Science.gov (United States)

    Chen, Xi; Wang, Chen; Tang, Shanjiang; Yu, Ce; Zou, Quan

    2017-06-24

    The multiple sequence alignment (MSA) is a classic and powerful technique for sequence analysis in bioinformatics. With the rapid growth of biological datasets, MSA parallelization becomes necessary to keep its running time at an acceptable level. Although there is a lot of work on MSA problems, existing approaches are either insufficient or contain implicit assumptions that limit their generality. First, the information about users' sequences, including the sizes of datasets and the lengths of sequences, can take arbitrary values and is generally unknown before submission, which previous work unfortunately ignores. Second, the center star strategy is suited for aligning similar sequences, but its first stage, center sequence selection, is highly time-consuming and requires further optimization. Moreover, given the heterogeneous CPU/GPU platform, prior studies consider MSA parallelization on GPU devices only, leaving the CPUs idle during the computation. Co-run computation, however, can maximize the utilization of computing resources by running the workload on both CPU and GPU simultaneously. This paper presents CMSA, a robust and efficient MSA system for large-scale datasets on the heterogeneous CPU/GPU platform. It performs and optimizes multiple sequence alignment automatically for users' submitted sequences without any assumptions. CMSA adopts the co-run computation model so that both CPU and GPU devices are fully utilized. Moreover, CMSA proposes an improved center star strategy that reduces the time complexity of its center sequence selection process from O(mn^2) to O(mn). The experimental results show that CMSA achieves up to an 11× speedup and outperforms the state-of-the-art software. CMSA focuses on multiple similar RNA/DNA sequence alignment and proposes a novel bitmap-based algorithm to improve the center star strategy. We can conclude that harvesting the high performance of modern GPU is a promising approach to...
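
    For orientation, the center sequence selection stage mentioned in this abstract can be sketched in its textbook form: compute all pairwise distances and pick the sequence with the smallest total. This is a baseline illustration only and is not CMSA's bitmap-based algorithm, whose details the abstract does not give. The function names `edit_distance` and `select_center` are hypothetical, and unit-cost edit distance is an assumed scoring scheme.

```python
def edit_distance(a, b):
    """Unit-cost Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def select_center(seqs):
    """Return the index of the sequence with the smallest summed
    distance to all sequences (ties resolved by lowest index)."""
    totals = [sum(edit_distance(s, t) for t in seqs) for s in seqs]
    return min(range(len(seqs)), key=totals.__getitem__)


# Hypothetical toy input of four similar RNA sequences.
toy = ["ACGU", "ACGA", "ACGU", "UCGU"]
center_index = select_center(toy)
```

    This baseline compares every pair of sequences with a quadratic dynamic program, so its cost grows quickly with both the number and the length of the sequences, which illustrates why the selection stage is described as time-consuming.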

  18. Linking Maternal and Somatic 5S rRNA types with Different Sequence-Specific Non-LTR Retrotransposons

    NARCIS (Netherlands)

    Locati, M.D.; Pagano, J.F.B.; Ensink, W.A.; van Olst, M.; van Leeuwen, S.; Nehrdich, U.; Zhu, K.; Spaink, H.P.; Girard, G.; Rauwerda, H.; Jonker, M.J.; Dekker, R.J.; Breit, T.M.

    5S rRNA is a ribosomal core component, transcribed from many gene copies organized in genomic repeats. Some eukaryotic species have two 5S rRNA types defined by their predominant expression in oogenesis or adult tissue. Our next-generation sequencing study on zebrafish egg, embryo and adult tissue,

  19. A combined sequence and structure based method for discovering enriched motifs in RNA from in vivo binding data.

    Science.gov (United States)

    Polishchuk, Maya; Paz, Inbal; Kohen, Refael; Mesika, Rona; Yakhini, Zohar; Mandel-Gutfreund, Yael

    2017-04-15

    RNA binding proteins (RBPs) play an important role in regulating many processes in the cell. RBPs often recognize their RNA targets in a specific manner. In addition to the RNA primary sequence, the structure of the RNA has been shown to play a central role in RNA recognition by RBPs. In recent years, many experimental approaches, both in vitro and in vivo, were developed and employed to identify and characterize RBP targets and extract their binding specificities. In vivo binding techniques, such as CrossLinking and ImmunoPrecipitation (CLIP)-based methods, enable the characterization of protein binding sites on RNA targets. However, these methods do not provide information regarding the structural preferences of the protein. While methods to obtain the structure of RNA are available, inferring both the sequence and the structure preferences of RBPs remains a challenge. Here we present SMARTIV, a novel computational tool for discovering combined sequence and structure binding motifs from in vivo RNA binding data relying on the sequences of the target sites, the ranking of their binding scores and their predicted secondary structure. The combined motifs are provided in a unified representation that is informative and easy for visual perception. We tested the method on CLIP-seq data from different platforms for a variety of RBPs. Overall, we show that our results are highly consistent with known binding motifs of RBPs, offering additional information on their structural preferences. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. Single-cell full-length total RNA sequencing uncovers dynamics of recursive splicing and enhancer RNAs.

    Science.gov (United States)

    Hayashi, Tetsutaro; Ozaki, Haruka; Sasagawa, Yohei; Umeda, Mana; Danno, Hiroki; Nikaido, Itoshi

    2018-02-12

    Total RNA sequencing has been used to reveal poly(A) and non-poly(A) RNA expression, RNA processing and enhancer activity. To date, no method for full-length total RNA sequencing of single cells has been developed despite the potential of this technology for single-cell biology. Here we describe random displacement amplification sequencing (RamDA-seq), the first full-length total RNA-sequencing method for single cells. Compared with other methods, RamDA-seq shows high sensitivity to non-poly(A) RNA and near-complete full-length transcript coverage. Using RamDA-seq with differentiation time course samples of mouse embryonic stem cells, we reveal hundreds of dynamically regulated non-poly(A) transcripts, including histone transcripts and long noncoding RNA Neat1. Moreover, RamDA-seq profiles recursive splicing in >300-kb introns. RamDA-seq also detects enhancer RNAs and their cell type-specific activity in single cells. Taken together, we demonstrate that RamDA-seq could help investigate the dynamics of gene expression, RNA-processing events and transcriptional regulation in single cells.

  1. Characterization of an extensive rainbow trout miRNA transcriptome by next generation sequencing.

    Science.gov (United States)

    Juanchich, Amelie; Bardou, Philippe; Rué, Olivier; Gabillard, Jean-Charles; Gaspin, Christine; Bobe, Julien; Guiguen, Yann

    2016-03-01

    MicroRNAs (miRNAs) have emerged as important post-transcriptional regulators of gene expression in a wide variety of physiological processes. They can control both temporal and spatial gene expression and are believed to regulate 30 to 70% of genes. Data are, however, limited for fish species, with only 9 out of the 30,000 fish species present in miRBase. The aim of the current study was to discover and characterize rainbow trout (Oncorhynchus mykiss) miRNAs in a large number of tissues using next-generation sequencing in order to provide an extensive repertoire of rainbow trout miRNAs. A total of 38 different samples corresponding to 16 different tissues or organs were individually sequenced and analyzed independently in order to identify a large number of miRNAs with high confidence. This led to the identification of 2946 miRNA loci in the rainbow trout genome, including 445 already known miRNAs. Differential expression analysis was performed in order to identify miRNAs exhibiting specific or preferential expression among the 16 analyzed tissues. In most cases, miRNAs exhibit a specific pattern of expression in only a few tissues. The expression data from sRNA sequencing were confirmed by RT-qPCR. In addition, novel miRNAs are described in rainbow trout that had not been previously reported in other species. This study represents the first characterization of the rainbow trout miRNA transcriptome from a wide variety of tissues and sets out an extensive repertoire of rainbow trout miRNAs. It provides a starting point for future studies aimed at understanding the roles of miRNAs in major physiological processes such as growth, reproduction or adaptation to stress. This rainbow trout miRNA repertoire provides a novel resource to advance genomic research in salmonid species.

  2. Transcriptomic analysis of Petunia hybrida in response to salt stress using high throughput RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Gonzalo H Villarino

    Salinity and drought stress are the primary causes of crop losses worldwide. In sodic saline soils, sodium chloride (NaCl) disrupts normal plant growth and development. The complex interactions of plant systems with abiotic stress have made RNA sequencing a more holistic and appealing approach to study transcriptome-level responses in a single cell and/or tissue. In this work, we determined the Petunia transcriptome response to NaCl stress by sequencing leaf samples and assembling 196 million Illumina reads with Trinity software. Using our reference transcriptome, we identified more than 7,000 genes that were differentially expressed within 24 h of acute NaCl stress. The proposed transcriptome can also be used as an excellent tool for biological and bioinformatics analyses in the absence of an available Petunia genome, and it is available at the SOL Genomics Network (SGN; http://solgenomics.net). Genes related to regulation of reactive oxygen species, transport, and signal transduction, as well as novel and undescribed transcripts, were among those differentially expressed in response to salt stress. The candidate genes identified in this study can be applied as markers for breeding or to genetically engineer plants to enhance salt tolerance. Gene Ontology analyses indicated that most of the NaCl damage happened at 24 h, inducing genotoxicity and affecting transport and organelles due to the high concentration of Na+ ions. Finally, we report a modification to the library preparation protocol whereby cDNA samples were bar-coded with non-HPLC purified primers, without affecting the quality and quantity of the RNA-seq data. The methodological improvement presented here could substantially reduce the cost of sample preparation for future high-throughput RNA sequencing experiments.

  3. Exploring the polyadenylated RNA virome of sweet potato through high-throughput sequencing.

    Science.gov (United States)

    Gu, Ying-Hong; Tao, Xiang; Lai, Xian-Jun; Wang, Hai-Yan; Zhang, Yi-Zheng

    2014-01-01

    Viral diseases are the second most significant biotic stress for sweet potato, with yield losses reaching 20% to 40%. Over 30 viruses have been reported to infect sweet potato around the world, and 11 of these have been detected in China. Most of these viruses were detected by traditional detection approaches that show disadvantages in detection throughput. Next-generation sequencing technology provides a novel, highly sensitive method for virus detection and diagnosis. We report the polyadenylated RNA virome of three sweet potato cultivars using a high throughput RNA sequencing approach. Transcripts of 15 different viruses were detected, 11 of which were detected in cultivar Xushu18, whilst 11 and 4 viruses were detected in Guangshu 87 and Jingshu 6, respectively. Four were detected in sweet potato for the first time, and 4 were found for the first time in China. The most prevalent virus was SPFMV, which constituted 88% of the total viral sequence reads. Virus transcripts with extremely low expression levels were also detected, such as transcripts of SPLCV, CMV and CymMV. Digital gene expression (DGE) and reverse transcription polymerase chain reaction (RT-PCR) analyses showed that the highest viral transcript expression levels were found in fibrous and tuberous roots, which suggests that these tissues should be optimum samples for virus detection. A total of 15 viruses were presumed to be present in three sweet potato cultivars growing in China. This is the first insight into the sweet potato polyadenylated RNA virome. These results can serve as a basis for further work to investigate whether some of the 'new' viruses infecting sweet potato are pathogenic.

  4. Exploring the polyadenylated RNA virome of sweet potato through high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Ying-Hong Gu

    BACKGROUND: Viral diseases are the second most significant biotic stress for sweet potato, with yield losses reaching 20% to 40%. Over 30 viruses have been reported to infect sweet potato around the world, and 11 of these have been detected in China. Most of these viruses were detected by traditional detection approaches that show disadvantages in detection throughput. Next-generation sequencing technology provides a novel, highly sensitive method for virus detection and diagnosis. METHODOLOGY/PRINCIPAL FINDINGS: We report the polyadenylated RNA virome of three sweet potato cultivars using a high throughput RNA sequencing approach. Transcripts of 15 different viruses were detected, 11 of which were detected in cultivar Xushu18, whilst 11 and 4 viruses were detected in Guangshu 87 and Jingshu 6, respectively. Four were detected in sweet potato for the first time, and 4 were found for the first time in China. The most prevalent virus was SPFMV, which constituted 88% of the total viral sequence reads. Virus transcripts with extremely low expression levels were also detected, such as transcripts of SPLCV, CMV and CymMV. Digital gene expression (DGE) and reverse transcription polymerase chain reaction (RT-PCR) analyses showed that the highest viral transcript expression levels were found in fibrous and tuberous roots, which suggests that these tissues should be optimum samples for virus detection. CONCLUSIONS/SIGNIFICANCE: A total of 15 viruses were presumed to be present in three sweet potato cultivars growing in China. This is the first insight into the sweet potato polyadenylated RNA virome. These results can serve as a basis for further work to investigate whether some of the 'new' viruses infecting sweet potato are pathogenic.

  5. Comparison of 18S ribosomal RNA gene sequences of Eurytrema coelmaticum and Eurytrema pancreaticum.

    Science.gov (United States)

    Zheng, Yadong; Luo, Xuenong; Jing, Zhizhong; Hu, Zhimin; Cai, Xuepeng

    2007-02-01

    The partial 18S rRNA sequences of E. coelomaticum and E. pancreaticum were amplified using conserved primers and an evolutionary tree was constructed using Neighbor-Joining. The percent identity of Eurytrema species with other Dicrocoeliidae varied from 97.5 to 98.2, while the percent identity between the two Eurytrema species was up to 99.3. The tree showed that E. coelomaticum and E. pancreaticum were not situated in the same position, and they formed one cluster with L. collurioni. These molecular data confirm that E. coelomaticum and E. pancreaticum are different species, a distinction that apparently was not seriously questioned in the past.
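
    The percent identity values quoted above can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned to equal length and counts matches over ungapped columns only; the abstract does not say which identity definition or software the authors used, so the function name and the gap handling are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns where both sequences agree.

    Assumption: columns containing a gap in either sequence are skipped.
    Other definitions (e.g. dividing by full alignment length) also exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

    Under this definition, two aligned fragments that differ at one of four ungapped positions score 75%.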

  6. Globicatella sanguinis bacteraemia identified by partial 16S rRNA gene sequencing

    DEFF Research Database (Denmark)

    Abdul-Redha, Rawaa Jalil; Balslew, Ulla; Christensen, Jens Jørgen

    2007-01-01

    Globicatella sanguinis is a gram-positive coccus, resembling non-haemolytic streptococci. The organism has been isolated infrequently from normally sterile sites of humans. Three isolates obtained by blood culture could not be identified by Rapid 32 ID Strep, but partial sequencing of the 16S rRNA gene revealed the identity of the isolated bacteria, and supplementary biochemical tests confirmed the species identification. The case histories illustrate the dilemma of finding relevant, newly recognized, opportunistic pathogens and the identification achievement(s) that can be obtained by using...

  7. Next-generation sequencing analysis of miRNA expression in control and FSHD myogenesis.

    Directory of Open Access Journals (Sweden)

    Veronica Colangelo

    Emerging evidence has demonstrated that miRNA sequences can regulate skeletal myogenesis by controlling the process of myoblast proliferation and differentiation. However, at present a deep analysis of miRNA expression in control and FSHD myoblasts during differentiation has not yet been derived. To close this gap, we used a next-generation sequencing (NGS) approach applied to in vitro myogenesis. Furthermore, to minimize sample genetic heterogeneity and muscle-type specific patterns of gene expression, miRNA profiling from NGS data was filtered with FC ≥ 4 (log2 FC ≥ 2) and p-value < 0.05, and its validation was derived by qRT-PCR on myoblasts from seven muscle districts. In particular, control myogenesis showed the modulation of 38 miRNAs, the majority of which (34 out of 38) were up-regulated, including myomiRs (miR-1, -133a, -133b and -206). Approximately one third of the modulated miRNAs were not previously reported to be involved in muscle differentiation, and interestingly some of these (i.e. miR-874, -1290, -95 and -146a) were previously shown to regulate cell proliferation and differentiation. FSHD myogenesis showed fewer modulated miRNAs than healthy muscle cells. The two processes shared nine miRNAs, including myomiRs, although with FC values lower in FSHD than in control cells. In addition, FSHD cells showed the modulation of six miRNAs (miR-1268, -1268b, -1908, -4258, -4508 and -4516) not evidenced in control cells and that therefore could be considered FSHD-specific, likewise three novel miRNAs that seem to be specifically expressed in FSHD myotubes. These data further clarify the impact of miRNA regulation during control myogenesis and strongly suggest that a complex dysregulation of miRNA expression characterizes FSHD, impairing two important features of myogenesis: cell cycle and muscle development. The derived miRNA profiling could represent a novel molecular signature for FSHD that includes diagnostic biomarkers and

  8. The complete genome sequence of a double-stranded RNA mycovirus from Fusarium graminearum strain HN1.

    Science.gov (United States)

    Wang, Luan; Wang, Shuangchao; Yang, Xiufen; Zeng, Hongmei; Qiu, Dewen; Guo, Lihua

    2017-07-01

    The complete nucleotide sequence of a double-stranded RNA (dsRNA) mycovirus, Fusarium graminearum dsRNA virus 5 (FgV5), was identified and characterized. The FgV5 genome comprises two dsRNA genome segments of 2030 bp and 1740 bp. FgV5 dsRNA1 contains a single open reading frame (ORF1), which is predicted to encode a protein of 613 amino acids (aa) with a molecular mass of 70.4 kDa and has a conserved RNA-dependent RNA polymerase (RdRp) motif. FgV5 dsRNA2 is predicted to contain two discontinuous ORFs (ORF2 and ORF3) that code for products of unknown function. Sequence comparisons showed that FgV5 has the highest aa sequence identities to Fusarium graminearum virus 4 (FgV4) (83.01% for ORF1, 78.70% for ORF2, and 76.27% for ORF3), suggesting that FgV5 and FgV4 should be regarded as members of different species. Phylogenetic analysis indicated that FgV5 belongs to a taxonomically unassigned dsRNA mycovirus group that is related to the families Amalgaviridae and Partitiviridae. Here, we propose that FgV5 and related viruses are members of a yet to be named and formally recognized new family.

  9. Polymorphism identification and improved genome annotation of Brassica rapa through Deep RNA sequencing.

    Science.gov (United States)

    Devisetty, Upendra Kumar; Covington, Michael F; Tat, An V; Lekkala, Saradadevi; Maloof, Julin N

    2014-08-12

    The mapping and functional analysis of quantitative traits in Brassica rapa can be greatly improved with the availability of physically positioned, gene-based genetic markers and accurate genome annotation. In this study, deep transcriptome RNA sequencing (RNA-Seq) of Brassica rapa was undertaken with two objectives: SNP detection and improved transcriptome annotation. We performed SNP detection on two varieties that are parents of a mapping population to aid in development of a marker system for this population and subsequent development of high-resolution genetic map. An improved Brassica rapa transcriptome was constructed to detect novel transcripts and to improve the current genome annotation. This is useful for accurate mRNA abundance and detection of expression QTL (eQTLs) in mapping populations. Deep RNA-Seq of two Brassica rapa genotypes-R500 (var. trilocularis, Yellow Sarson) and IMB211 (a rapid cycling variety)-using eight different tissues (root, internode, leaf, petiole, apical meristem, floral meristem, silique, and seedling) grown across three different environments (growth chamber, greenhouse and field) and under two different treatments (simulated sun and simulated shade) generated 2.3 billion high-quality Illumina reads. A total of 330,995 SNPs were identified in transcribed regions between the two genotypes with an average frequency of one SNP in every 200 bases. The deep RNA-Seq reassembled Brassica rapa transcriptome identified 44,239 protein-coding genes. Compared with current gene models of B. rapa, we detected 3537 novel transcripts, 23,754 gene models had structural modifications, and 3655 annotated proteins changed. Gaps in the current genome assembly of B. rapa are highlighted by our identification of 780 unmapped transcripts. All the SNPs, annotations, and predicted transcripts can be viewed at http://phytonetworks.ucdavis.edu/. Copyright © 2014 Devisetty et al.

  10. Import of desired nucleic acid sequences using addressing motif of mitochondrial ribosomal 5S-rRNA for fluorescent in vivo hybridization of mitochondrial DNA and RNA.

    Science.gov (United States)

    Zelenka, Jaroslav; Alán, Lukáš; Jabůrek, Martin; Ježek, Petr

    2014-04-01

    Based on the matrix-addressing sequence of mitochondrial ribosomal 5S-rRNA (termed MAM), which is naturally imported into mitochondria, we have constructed an import system for in vivo targeting of mitochondrial DNA (mtDNA) or mt-mRNA, in order to provide fluorescence hybridization of the desired sequences. Thus DNA oligonucleotides were constructed, containing the 5'-flanked T7 RNA polymerase promoter. After in vitro transcription and fluorescent labeling with Alexa Fluor® 488 or 647 dye, we obtained the fluorescent "L-ND5 probe" containing MAM and exemplar cargo, i.e., annealing sequence to a short portion of ND5 mRNA and to the light-strand mtDNA complementary to the heavy strand nd5 mt gene (5'-end 21 base pair sequence). For mitochondrial in vivo fluorescent hybridization, HepG2 cells were treated with dequalinium micelles, containing the fluorescent probes, bringing the probes proximally to the mitochondrial outer membrane and to the natural import system. A verification of import into the mitochondrial matrix of cultured HepG2 cells was provided by confocal microscopy colocalizations. Transfections using lipofectamine or probes without 5S-rRNA addressing MAM sequence or with MAM only were ineffective. Alternatively, the same DNA oligonucleotides with 5'-CACC overhang (substituting T7 promoter) were transcribed from the tetracycline-inducible pENTRH1/TO vector in human embryonic kidney T-REx®-293 cells, while mitochondrial matrix localization after import of the resulting unlabeled RNA was detected by PCR. The MAM-containing probe was then enriched by three orders of magnitude over the natural ND5 mRNA in the mitochondrial matrix. In conclusion, we present a proof-of-principle for mitochondrial in vivo hybridization and mitochondrial nucleic acid import.

  11. Analysis of microRNA profile of Anopheles sinensis by deep sequencing and bioinformatic approaches.

    Science.gov (United States)

    Feng, Xinyu; Zhou, Xiaojian; Zhou, Shuisen; Wang, Jingwen; Hu, Wei

    2018-03-12

    microRNAs (miRNAs) are small non-coding RNAs widely identified in many mosquitoes. They are reported to play important roles in development, differentiation and innate immunity. However, miRNAs in Anopheles sinensis, one of the Chinese malaria mosquitoes, remain largely unknown. We investigated the global miRNA expression profile of An. sinensis using Illumina Hiseq 2000 sequencing. Meanwhile, we applied a bioinformatic approach to identify potential miRNAs in An. sinensis. The identified miRNA profiles were compared and analyzed by two approaches. The selected miRNAs from the sequencing result and the bioinformatic approach were confirmed with qRT-PCR. Moreover, target prediction, GO annotation and pathway analysis were carried out to understand the role of miRNAs in An. sinensis. We identified 49 conserved miRNAs and 12 novel miRNAs by next-generation high-throughput sequencing technology. In contrast, 43 miRNAs were predicted by the bioinformatic approach, of which two were assigned as novel. Comparative analysis of miRNA profiles by two approaches showed that 21 miRNAs were shared between them. Twelve novel miRNAs did not match any known miRNAs of any organism, indicating that they are possibly species-specific. Forty miRNAs were found in many mosquito species, indicating that these miRNAs are evolutionarily conserved and may have critical roles in the process of life. Both the selected known and novel miRNAs (asi-miR-281, asi-miR-184, asi-miR-14, asi-miR-nov5, asi-miR-nov4, asi-miR-9383, and asi-miR-2a) could be detected by quantitative real-time PCR (qRT-PCR) in the sequenced sample, and the expression patterns of these miRNAs measured by qRT-PCR were in concordance with the original miRNA sequencing data. The predicted targets for the known and the novel miRNAs covered many important biological roles and pathways indicating the diversity of miRNA functions. We also found 21 conserved miRNAs and eight counterparts of target immune pathway genes in An. sinensis

  12. Efficient extraction of small and large RNAs in bacteria for excellent total RNA sequencing and comprehensive transcriptome analysis.

    Science.gov (United States)

    Heera, Rajandas; Sivachandran, Parimannan; Chinni, Suresh V; Mason, Joanne; Croft, Larry; Ravichandran, Manickam; Yin, Lee Su

    2015-12-08

    Next-generation transcriptome sequencing (RNA-Seq) has become the standard practice for studying gene splicing, mutations and changes in gene expression to obtain valuable, accurate biological conclusions. However, obtaining good sequencing coverage and depth to study these is impeded by the difficulties of obtaining high quality total RNA with minimal genomic DNA contamination. With this in mind, we evaluated the performance of Phenol-free total RNA purification kit (Amresco) in comparison with TRI Reagent (MRC) and RNeasy Mini (Qiagen) for the extraction of total RNA of Pseudomonas aeruginosa which was grown in glucose-supplemented (control) and polyethylene-supplemented (growth-limiting condition) minimal medium. All three extraction methods were coupled with an in-house DNase I treatment before the yield, integrity and size distribution of the purified RNA were assessed. RNA samples extracted with the best extraction kit were then sequenced using the Illumina HiSeq 2000 platform. TRI Reagent gave the lowest yield enriched with small RNAs (sRNAs), while RNeasy gave moderate yield of good quality RNA with trace amounts of sRNAs. The Phenol-free kit, on the other hand, gave the highest yield and the best quality RNA (RIN value of 9.85 ± 0.3) with good amounts of sRNAs. Subsequent bioinformatic analysis of the sequencing data revealed that 5435 coding genes, 452 sRNAs and 7 potential novel intergenic sRNAs were detected, indicating excellent sequencing coverage across RNA size ranges. In addition, detection of low abundance transcripts and consistency of their expression profiles across replicates from the same conditions demonstrated the reproducibility of the RNA extraction technique. Amresco's Phenol-free Total RNA purification kit coupled with DNase I treatment yielded the highest quality RNAs containing good ratios of high and low molecular weight transcripts with minimal genomic DNA. These RNA extracts gave excellent non-biased sequencing coverage useful

  13. Technologically important extremophile 16S rRNA sequence Shannon entropy and fractal property comparison with long term dormant microbes

    Science.gov (United States)

    Holden, Todd; Gadura, N.; Dehipawala, S.; Cheung, E.; Tuffour, M.; Schneider, P.; Tremberger, G., Jr.; Lieberman, D.; Cheung, T.

    2011-10-01

    Technologically important extremophiles including oil eating microbes, uranium and rocket fuel perchlorate reduction microbes, electron producing microbes and electrode electrons feeding microbes were compared in terms of their 16S rRNA sequences, a standard targeted sequence in comparative phylogeny studies. Microbes that were reported to have survived a prolonged dormant duration were also studied. Examples included the recently discovered microbe that survives after 34,000 years in a salty environment while feeding off organic compounds from other trapped dead microbes. Shannon entropy of the 16S rRNA nucleotide composition and fractal dimension of the nucleotide sequence in terms of its atomic number fluctuation analyses suggest a selected range for these extremophiles as compared to other microbes, consistent with the experience of relatively mild evolutionary pressure. However, most of the microbes that have been reported to survive in prolonged dormant duration carry sequences with fractal dimension between 1.995 and 2.005 (N = 10 out of 13). Similar results are observed for halophiles, red-shifted chlorophyll and radiation resistant microbes. The results suggest that prolonged dormant duration, analogous to a high-salt or high-radiation environment, would select high fractal 16S rRNA sequences. Path analysis in structural equation modeling supports a causal relation between entropy and fractal dimension for the studied 16S rRNA sequences (N = 7). Candidate choices for high fractal 16S rRNA microbes could offer protection for prolonged spaceflights. BioBrick gene network manipulation could include extremophile 16S rRNA sequences in synthetic biology and shed more light on exobiology and future colonization in shielded spaceflights. Whether the high fractal 16S rRNA sequences contain an asteroid-like extra-terrestrial source could be speculative but interesting.
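
    As a rough illustration of the first measure named above, the Shannon entropy of a sequence's nucleotide composition can be sketched as follows. This assumes the standard definition H = -Σ p_i log2 p_i over single-nucleotide frequencies; the abstract does not state the authors' exact procedure (for example, whether windows or k-mers were used), and the fractal dimension analysis is not attempted here.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy (bits per symbol) of the symbol composition of seq.

    Assumption: every character counts as a symbol, case-insensitively;
    no filtering of ambiguity codes is done in this sketch.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq.upper())
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

    With four nucleotides the value ranges from 0 bits (a single repeated base) to 2 bits (all four bases equally frequent).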

  14. 3' and 5' microRNA-end post-biogenesis modifications in plant transcriptomes: Evidences from small RNA next generation sequencing data analysis.

    Science.gov (United States)

    Saraf, Shradha; Sanan-Mishra, Neeti; Gursanscky, Nial R; Carroll, Bernard J; Gupta, Dinesh; Mukherjee, Sunil Kumar

    2015-11-27

    The processing of miRNA from its precursors is a precisely regulated process and after biogenesis, the miRNAs are amenable to different kinds of modifications by the addition or deletion of nucleotides at the terminal ends. However, the mechanism and functions of such modifications are not well studied in plants. In this study, we have specifically analysed the terminal end non-templated miRNA modifications, using NGS data of rice, tomato and Arabidopsis small RNA transcriptomes from different tissues and physiological conditions. Our analysis reveals template independent terminal end modifications in the mature as well as passenger strands of the miRNA duplex. Interestingly, it is also observed that miRNA sequences terminating with a cytosine (C) at the 3' end undergo a higher percentage of 5' end modifications. The terminal end modifications did not correlate with the miRNA abundances and are independent of tissue types, physiological conditions and plant species. Our analysis indicates that the addition of nucleotides at miRNA ends is not influenced by the absence of RNA dependent RNA polymerase 6. Moreover, the terminal end modified miRNAs are also observed amongst AGO1 bound small RNAs and have the potential to alter targets, indicating an important functional role in repression of gene expression. Copyright © 2015 Elsevier Inc. All rights reserved.

  15. 16S rRNA gene sequencing of mock microbial populations- impact of DNA extraction method, primer choice and sequencing platform.

    Science.gov (United States)

    Fouhy, Fiona; Clooney, Adam G; Stanton, Catherine; Claesson, Marcus J; Cotter, Paul D

    2016-06-24

    Next-generation sequencing platforms have revolutionised our ability to investigate the microbiota composition of complex environments, frequently through 16S rRNA gene sequencing of the bacterial component of the community. Numerous factors, including DNA extraction method, primer sequences and sequencing platform employed, can affect the accuracy of the results achieved. The aim of this study was to determine the impact of these three factors on 16S rRNA gene sequencing results, using mock communities and mock community DNA. The use of different primer sequences (V4-V5, V1-V2 and V1-V2 degenerate primers) resulted in differences in the genera and species detected. The V4-V5 primers gave the most comparable results across platforms. The three Ion PGM primer sets detected more of the 20 mock community species than the equivalent MiSeq primer sets. Data generated from DNA extracted using the 2 extraction methods were very similar. Microbiota compositional data differed depending on the primers and sequencing platform that were used. The results demonstrate the risks in comparing data generated using different sequencing approaches and highlight the merits of choosing a standardised approach for sequencing in situations where a comparison across multiple sequencing runs is required.

  16. An efficient and high fidelity method for amplification, cloning and sequencing of complete tospovirus genomic RNA segments

    Science.gov (United States)

    Amplification and sequencing of the complete M- and S-RNA segments of Tomato spotted wilt virus and Impatiens necrotic spot virus as a single fragment is useful for whole genome sequencing of tospoviruses co-infecting a single host plant. It avoids issues associated with overlapping amplicon-based ...

  17. Flow cytometry-assisted cloning of specific sequence motifs from complex 16S rRNA gene libraries

    DEFF Research Database (Denmark)

    Nielsen, J. L.; Schramm, A.; Engh, G. van den

    2004-01-01

    A flow cytometry method was developed for rapid screening and recovery of cloned DNA containing common sequence motifs. This approach, termed fluorescence-activated cell sorting-assisted cloning, was used to recover sequences affiliated with a unique lineage within the Bacteroidetes not abundant in a clone library of environmental 16S rRNA genes.

  18. 16S rRNA gene sequencing in routine identification of anaerobic bacteria isolated from blood cultures

    DEFF Research Database (Denmark)

    Justesen, Ulrik Stenz; Skov, Marianne Nielsine; Knudsen, Elisa

    2010-01-01

    A comparison between conventional identification and 16S rRNA gene sequencing of anaerobic bacteria isolated from blood cultures in a routine setting was performed (n = 127). With sequencing, 89% were identified to the species level, versus 52% with conventional identification. The times...

  19. Small RNA sequencing profiles of mir-181 and mir-221, the most relevant microRNAs in acute myeloid leukemia.

    Science.gov (United States)

    Lee, Yun-Gyoo; Kim, Inho; Oh, Somi; Shin, Dong-Yeop; Koh, Youngil; Lee, Keun-Wook

    2017-11-27

    To evaluate and select microRNAs relevant to acute myeloid leukemia (AML) pathogenesis, we analyzed differential microRNA expression by quantitative small RNA next-generation sequencing using duplicate marrow samples from individual AML patients. For this study, we obtained paired marrow samples at two different time points (initial diagnosis and first complete remission status) in patients with AML. Bone marrow microRNAs were profiled by next-generation small RNA sequencing. Quantification of microRNA expression was performed by counting aligned reads to microRNA genes. Among 38 samples (32 paired samples from 16 AML patients and 6 normal marrow controls), 27 were eligible for sequencing. Small RNA sequencing showed that 12 microRNAs were selectively expressed at higher levels in AML patients than in normal controls. Among these 12 microRNAs, mir-181, mir-221, and mir-3154 were more highly expressed at initial AML diagnosis as compared to first complete remission. Significant correlations were found between higher expression levels of mir-221, mir-146, and mir-155 and higher marrow blast counts. Our results demonstrate that mir-221 and mir-181 are selectively enriched in AML marrow and reflect disease activity. mir-3154 is a novel microRNA that is relevant to AML but needs further validation.

  20. Selection of mRNA 5'-untranslated region sequence with high translation efficiency through ribosome display

    International Nuclear Information System (INIS)

    Mie, Masayasu; Shimizu, Shun; Takahashi, Fumio; Kobatake, Eiry

    2008-01-01

    The 5'-untranslated region (5'-UTR) of mRNAs functions as a translation enhancer, promoting translation efficiency. Many in vitro translation systems exhibit a reduced efficiency in protein translation due to decreased translation initiation. The use of a 5'-UTR sequence with high translation efficiency greatly enhances protein production in these systems. In this study, we have developed an in vitro selection system that favors 5'-UTRs with high translation efficiency using a ribosome display technique. A 5'-UTR random library, comprised of 5'-UTRs tagged with a His-tag and Renilla luciferase (R-luc) fusion, was translated in vitro in rabbit reticulocytes. By limiting the translation period, only mRNAs with high translation efficiency were translated. During translation, mRNA, ribosome and translated R-luc with His-tag formed ternary complexes. These complexes were collected via the translated His-tag using Ni-particles. mRNA extracted from the ternary complexes was amplified using RT-PCR and sequenced. Finally, a 5'-UTR with high translation efficiency was obtained from the random 5'-UTR library.

  1. REMap: Operon map of M. tuberculosis based on RNA sequence data.

    Science.gov (United States)

    Pelly, Shaaretha; Winglee, Kathryn; Xia, Fang Fang; Stevens, Rick L; Bishai, William R; Lamichhane, Gyanu

    2016-07-01

    A map of the transcriptional organization of genes of an organism is a basic tool that is necessary to understand and facilitate a more accurate genetic manipulation of the organism. Operon maps are largely generated by computational prediction programs that rely on gene conservation and genome architecture and may not be physiologically relevant. With the widespread use of RNA sequencing (RNAseq), the prediction of operons based on actual transcriptome sequencing rather than computational genomics alone is much needed. Here, we report a validated operon map of Mycobacterium tuberculosis, developed using RNAseq data from both the exponential and stationary phases of growth. At least 58.4% of M. tuberculosis genes are organized into 749 operons. Our prediction algorithm, REMap (RNA Expression Mapping of operons), considers the many cases of transcription coverage of intergenic regions, and avoids dependencies on functional annotation and arbitrary assumptions about gene structure. As a result, we demonstrate that REMap is able to more accurately predict operons, especially those that contain long intergenic regions or functionally unrelated genes, than previous operon prediction programs. The REMap algorithm is publicly available as a user-friendly tool that can be readily modified to predict operons in other bacteria. Copyright © 2016 Elsevier Ltd. All rights reserved.
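
    The idea of calling operons from transcription coverage of intergenic regions can be illustrated with a deliberately simplified sketch. It is not the REMap algorithm, whose details the abstract does not give. The `Gene` record, the single unstranded per-base `coverage` list, and the `min_cov` threshold are all hypothetical: adjacent same-strand genes are grouped when every base between them is covered by at least `min_cov` reads.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Gene:
    name: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def group_operons(genes: Sequence[Gene], coverage: Sequence[int],
                  min_cov: int = 1) -> List[List[Gene]]:
    """Group neighbouring same-strand genes whose intergenic gap is fully covered.

    Hypothetical rule for illustration only: a gap counts as transcribed
    when every position in it has coverage >= min_cov.
    """
    operons: List[List[Gene]] = []
    for gene in sorted(genes, key=lambda g: g.start):
        if operons:
            prev = operons[-1][-1]
            gap = coverage[prev.end:gene.start]
            if gene.strand == prev.strand and all(c >= min_cov for c in gap):
                operons[-1].append(gene)  # extend the current operon
                continue
        operons.append([gene])  # start a new operon
    return operons
```

    A real implementation would need strand-specific coverage and a principled threshold; this sketch only shows why a covered intergenic region links two genes and an uncovered one separates them.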

  2. Transcriptome analysis of the model protozoan, Tetrahymena thermophila, using Deep RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Jie Xiong

    BACKGROUND: The ciliated protozoan Tetrahymena thermophila is a well-studied single-celled eukaryote model organism for cellular and molecular biology. However, the lack of extensive T. thermophila cDNA libraries or a large expressed sequence tag (EST) database limited the quality of the original genome annotation. METHODOLOGY/PRINCIPAL FINDINGS: This RNA-seq study describes the first deep sequencing analysis of the T. thermophila transcriptome during the three major stages of the life cycle: growth, starvation and conjugation. Uniquely mapped reads covered more than 96% of the 24,725 predicted gene models in the somatic genome. More than 1,000 new transcribed regions were identified. The great dynamic range of RNA-seq allowed detection of a nearly six order-of-magnitude range of measurable gene expression orchestrated by this cell. RNA-seq also allowed the first prediction of transcript untranslated regions (UTRs) and an updated (larger) size estimate of the T. thermophila transcriptome: 57 Mb, or about 55% of the somatic genome. Our study identified nearly 1,500 alternative splicing (AS) events distributed over 5.2% of T. thermophila genes. This percentage represents a two order-of-magnitude increase over previous EST-based estimates in Tetrahymena. Evidence of stage-specific regulation of alternative splicing was also obtained. Finally, our study allowed us to completely confirm about 26.8% of the genes originally predicted by the gene finder, to correct coding sequence boundaries and intron-exon junctions for about a third, and to reassign microarray probes and correct earlier microarray data. CONCLUSIONS/SIGNIFICANCE: RNA-seq data significantly improve the genome annotation and provide a fully comprehensive view of the global transcriptome of T. thermophila. To our knowledge, 5.2% of T. thermophila genes with AS is the highest percentage of genes showing AS reported in a unicellular eukaryote. Tetrahymena thus becomes an excellent unicellular

  3. Unprecedented high-resolution view of bacterial operon architecture revealed by RNA sequencing.

    Science.gov (United States)

    Conway, Tyrrell; Creecy, James P; Maddox, Scott M; Grissom, Joe E; Conkle, Trevor L; Shadid, Tyler M; Teramoto, Jun; San Miguel, Phillip; Shimada, Tomohiro; Ishihama, Akira; Mori, Hirotada; Wanner, Barry L

    2014-07-08

    We analyzed the transcriptome of Escherichia coli K-12 by strand-specific RNA sequencing at single-nucleotide resolution during steady-state (logarithmic-phase) growth and upon entry into stationary phase in glucose minimal medium. To generate high-resolution transcriptome maps, we developed an organizational schema which showed that in practice only three features are required to define operon architecture: the promoter, terminator, and deep RNA sequence read coverage. We precisely annotated 2,122 promoters and 1,774 terminators, defining 1,510 operons with an average of 1.98 genes per operon. Our analyses revealed an unprecedented view of E. coli operon architecture. A large proportion (36%) of operons are complex with internal promoters or terminators that generate multiple transcription units. For 43% of operons, we observed differential expression of polycistronic genes, despite being in the same operons, indicating that E. coli operon architecture allows fine-tuning of gene expression. We found that 276 of 370 convergent operons terminate inefficiently, generating complementary 3' transcript ends which overlap on average by 286 nucleotides, and 136 of 388 divergent operons have promoters arranged such that their 5' ends overlap on average by 168 nucleotides. We found 89 antisense transcripts of 397-nucleotide average length, 7 unannotated transcripts within intergenic regions, and 18 sense transcripts that completely overlap operons on the opposite strand. Of 519 overlapping transcripts, 75% correspond to sequences that are highly conserved in E. coli (>50 genomes). Our data extend recent studies showing unexpected transcriptome complexity in several bacteria and suggest that antisense RNA regulation is widespread. Importance: We precisely mapped the 5' and 3' ends of RNA transcripts across the E. coli K-12 genome by using a single-nucleotide analytical approach. Our resulting high-resolution transcriptome maps show that ca. one-third of E. 
coli operons are

  4. Community analysis of picocyanobacteria in an oligotrophic lake by cloning 16S rRNA gene and 16S rRNA gene amplicon sequencing.

    Science.gov (United States)

    Fujimoto, Naoshi; Mizuno, Keigo; Yokoyama, Tomoki; Ohnishi, Akihiro; Suzuki, Masaharu; Watanabe, Satoru; Komatsu, Kenji; Sakata, Yoichi; Kishida, Naohiro; Akiba, Michihiro; Matsukura, Satoko

    2015-01-01

    In this study, the picocyanobacterial species composition of Lake Miyagase was examined by analyzing the 16S rRNA gene in a clone library and by amplicon sequencing using a benchtop next-generation sequencer. Five separate samples were analyzed from different days over a ten-month period. In the picocyanobacterial lineage, 9 and 12 OTUs were identified from a clone library and by amplicon sequencing, respectively. Both analyses suggested that a picocyanobacterium related to Synechococcus sp. MW6B4 was dominant in Lake Miyagase. Our findings suggest that 16S rRNA gene amplicon sequencing enables detailed evaluation of picocyanobacteria composition. One OTU identified was found to be a novel cluster that does not group with any of the known freshwater picocyanobacteria.

  5. Discovery of differentially expressed genes in cashmere goat (Capra hircus) hair follicles by RNA sequencing.

    Science.gov (United States)

    Qiao, X; Wu, J H; Wu, R B; Su, R; Li, C; Zhang, Y J; Wang, R J; Zhao, Y H; Fan, Y X; Zhang, W G; Li, J Q

    2016-09-02

    The mammalian hair follicle (HF) is a unique, highly regenerative organ with a distinct developmental cycle. Cashmere goat (Capra hircus) HFs can be divided into two categories based on structure and development time: primary and secondary follicles. To identify differentially expressed genes (DEGs) in the primary and secondary HFs of cashmere goats, the RNA sequencing of six individuals from Arbas, Inner Mongolia, was performed. A total of 617 DEGs were identified; 297 were upregulated while 320 were downregulated. Gene ontology analysis revealed that the main functions of the upregulated genes were electron transport, respiratory electron transport, mitochondrial electron transport, and gene expression. The downregulated genes were mainly involved in cell autophagy, protein complexes, neutrophil aggregation, and bacterial fungal defense reactions. According to the Kyoto Encyclopedia of Genes and Genomes database, these genes are mainly involved in the metabolism of cysteine and methionine, RNA polymerization, and the MAPK signaling pathway, and were enriched in primary follicles. A microRNA-target network revealed that secondary follicles are involved in several important biological processes, such as the synthesis of keratin-associated proteins and enzymes involved in amino acid biosynthesis. In summary, these findings will increase our understanding of the complex molecular mechanisms of HF development and cycling, and provide a basis for the further study of the genes and functions of HF development.

  6. RNA sequencing reveals pronounced changes in the noncoding transcriptome of aging synaptosomes.

    Science.gov (United States)

    Chen, Bei Jun; Ueberham, Uwe; Mills, James D; Kirazov, Ludmil; Kirazov, Evgeni; Knobloch, Mara; Bochmann, Jana; Jendrek, Renate; Takenaka, Konii; Bliim, Nicola; Arendt, Thomas; Janitz, Michael

    2017-08-01

    Normal aging is associated with impairments in cognitive functions. These alterations are caused by diminutive changes in the biology of synapses, and ineffective neurotransmission, rather than loss of neurons. Hitherto, only a few studies, exploring molecular mechanisms of healthy brain aging in higher vertebrates, utilized synaptosomal fractions to survey local changes in aging-related transcriptome dynamics. Here we present, for the first time, a comparative analysis of the synaptosomes transcriptome in the aging mouse brain using RNA sequencing. Our results show changes in the expression of genes contributing to biological pathways related to neurite guidance, synaptosomal physiology, and RNA splicing. More intriguingly, we also discovered alterations in the expression of thousands of novel, unannotated lincRNAs during aging. Further, detailed characterization of the cleavage and polyadenylation factor I subunit 1 (Clp1) mRNA and protein expression indicates its increased expression in neuronal processes of hippocampal stratum radiatum in aging mice. Together, our study uncovers a new layer of transcriptional regulation which is targeted by aging within the local environment of interconnecting neuronal cells. Copyright © 2017 Elsevier Inc. All rights reserved.

  7. Small RNA and transcriptome deep sequencing proffers insight into floral gene regulation in Rosa cultivars

    Directory of Open Access Journals (Sweden)

    Kim Jungeun

    2012-11-01

    Full Text Available Background: Roses (Rosa sp.), which belong to the family Rosaceae, are the most economically important ornamental plants, making up 30% of the floriculture market. However, given high demand for roses, rose breeding programs are limited in molecular resources which can greatly enhance and speed breeding efforts. A better understanding of important genes that contribute to important floral development and desired phenotypes will lead to improved rose cultivars. For this study, we analyzed rose miRNAs and the rose flower transcriptome in order to generate a database to expound upon current knowledge regarding regulation of important floral characteristics. A rose genetic database will enable comprehensive analysis of gene expression and regulation via miRNA among different Rosa cultivars. Results: We produced more than 0.5 million reads from expressed sequences, totalling more than 110 million bp. From these, we generated 35,657, 31,434, 34,725, and 39,722 flower unigenes from Rosa hybrid: ‘Vital’, ‘Maroussia’, and ‘Sympathy’ and Rosa rugosa Thunb., respectively. The unigenes were assigned functional annotations, domains, metabolic pathways, Gene Ontology (GO) terms, Plant Ontology (PO) terms, and MIPS Functional Catalogue (FunCat) terms. Rose flower transcripts were compared with genes from whole genome sequences of Rosaceae members (apple, strawberry, and peach) and grape. We also produced approximately 40 million small RNA reads from flower tissue for Rosa, representing 267 unique miRNA tags. Among identified miRNAs, 25 of them were novel and 242 of them were conserved miRNAs. Statistical analyses of miRNA profiles revealed both shared and species-specific miRNAs, which presumably affect flower development and phenotypes. Conclusions: In this study, we constructed a Rose miRNA and transcriptome database, and we analyzed the miRNAs and transcriptome generated from the flower tissues of four Rosa cultivars. The database provides a

  8. Transcriptome walking: a laboratory-oriented GUI-based approach to mRNA identification from deep-sequenced data.

    Science.gov (United States)

    French, Andrew S

    2012-12-05

    Deep sequencing technology provides efficient and economical production of large numbers of randomly positioned, relatively short, estimates of base identities in DNA molecules. Application of this technology to mRNA samples allows rapid examination of the molecular genetic environment in individual cells or tissues, the transcriptome. However, assembly of such short sequences into complete mRNA creates a challenge that limits the usefulness of the technology, particularly when no, or limited, genomic data is available. Several approaches to this problem have been developed, but there is still no general method to rapidly obtain an mRNA sequence from deep sequence data when a specific molecule, or family of molecules, are of interest. A frequent requirement is to identify specific mRNA molecules from tissues that are being investigated by methods such as electrophysiology, immunocytology and pharmacology. To be widely useful, any approach must be relatively simple to use in the laboratory by operators without extensive statistical or bioinformatics knowledge, and with readily available hardware. An approach was developed that allows de novo assembly of individual mRNA sequences in two linked stages: sequence discovery and sequence completion. Both stages rely on computer assisted, Graphical User Interface (GUI)-guided, user interaction with the data, but proceed relatively efficiently once discovery is complete. The method grows a discovered sequence by repeated passes through the complete raw data in a series of steps, and is hence termed 'transcriptome walking'. All of the operations required for transcriptome analysis are combined in one program that presents a relatively simple user interface and runs on a standard desktop, or laptop computer, but takes advantage of multi-core processors, when available. Complete mRNA sequence identifications usually require less than 24 hours. 
This approach has already identified previously unknown mRNA sequences in two animal
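    The abstract describes growing a discovered sequence by repeated passes through the raw reads but gives no matching rules, so the following is only a minimal sketch of that idea under stated assumptions: exact suffix-prefix overlaps, extension in one direction only, and no user interaction. The function name `walk_right` and the parameters `min_overlap` and `max_passes` are hypothetical and are not taken from the published program.

```python
def walk_right(seed, reads, min_overlap=4, max_passes=100):
    """Greedily extend `seed` to the right using overlapping reads.

    Illustrative only: exact matching, one direction, no error handling
    for sequencing mistakes. Not the published program's algorithm.
    """
    contig = seed
    for _ in range(max_passes):          # one pass = one scan of all reads
        best_ext = ""
        for read in reads:
            # Try the largest possible overlap first, shrinking to min_overlap.
            max_k = min(len(contig), len(read) - 1)
            for k in range(max_k, min_overlap - 1, -1):
                if contig.endswith(read[:k]):
                    ext = read[k:]       # bases the read adds beyond the contig
                    if len(ext) > len(best_ext):
                        best_ext = ext
                    break
        if not best_ext:                 # no read extends the contig: stop
            break
        contig += best_ext
    return contig


if __name__ == "__main__":
    toy_reads = ["GGCTTA", "TTAGCC", "CCCCCC"]
    print(walk_right("ATGGC", toy_reads, min_overlap=3))  # ATGGCTTAGCC
```

    Each pass rescans every read, which mirrors the "repeated passes through the complete raw data" wording; a practical tool would index the reads and tolerate mismatches.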

  9. Comprehensive microRNA profiling in B-cells of human centenarians by massively parallel sequencing

    Directory of Open Access Journals (Sweden)

    Gombar Saurabh

    2012-07-01

    Full Text Available Background: MicroRNAs (miRNAs) are small, non-coding RNAs that regulate gene expression and play a critical role in development, homeostasis, and disease. Despite their demonstrated roles in age-associated pathologies, little is known about the role of miRNAs in human aging and longevity. Results: We employed massively parallel sequencing technology to identify miRNAs expressed in B-cells from Ashkenazi Jewish centenarians, i.e., those living to a hundred and a human model of exceptional longevity, and younger controls without a family history of longevity. With data from 26.7 million reads comprising 9.4 × 10^8 bp from 3 centenarian and 3 control individuals, we discovered a total of 276 known miRNAs and 8 unknown miRNAs ranging several orders of magnitude in expression levels, a typical characteristic of saturated miRNA-sequencing. A total of 22 miRNAs were found to be significantly upregulated, with only 2 miRNAs downregulated, in centenarians as compared to controls. Gene Ontology analysis of the predicted and validated targets of the 24 differentially expressed miRNAs indicated enrichment of functional pathways involved in cell metabolism, cell cycle, cell signaling, and cell differentiation. A cross sectional expression analysis of the differentially expressed miRNAs in B-cells from Ashkenazi Jewish individuals between the 50th and 100th years of age indicated that expression levels of miR-363* declined significantly with age. Centenarians, however, maintained the youthful expression level. This result suggests that miR-363* may be a candidate longevity-associated miRNA. Conclusion: Our comprehensive miRNA data provide a resource for further studies to identify genetic pathways associated with aging and longevity in humans.

  10. Molecular Mechanisms of Mild and Severe Pneumonia: Insights from RNA Sequencing.

    Science.gov (United States)

    Huang, Sai; Feng, Cong; Chen, Li; Huang, Zhi; Zhou, Xuan; Li, Bei; Wang, Li-Li; Chen, Wei; Lv, Fa-Qin; Li, Tan-Shi

    2017-04-06

    BACKGROUND This study aimed to uncover the molecular mechanisms underlying mild and severe pneumonia by use of mRNA sequencing (RNA-seq). MATERIAL AND METHODS RNA was extracted from the peripheral blood of patients with mild pneumonia, severe pneumonia, and healthy controls. Sequencing was performed on the HiSeq4000 platform. After filtering, clean reads were mapped to the human reference genome hg19. Differentially expressed genes (DEGs) were identified between the control group and the mild or severe group. A transcription factor-gene network was constructed for each group. Biological process (BP) terms enriched by DEGs in the network were analyzed and these genes were also mapped to the Connectivity map to search for small-molecule drugs. RESULTS A total of 199 and 560 DEGs were identified from the mild group and severe group, respectively. A transcription factor-gene network consisting of 215 nodes and another network consisting of 451 nodes were constructed in the mild group and severe group, respectively, and 54 DEGs (e.g., S100A9 and S100A12) were found to be common, with consistent differential expression changes in the 2 groups. Genes in the transcription factor-gene network for the mild group were mainly enriched in 13 BP terms, especially defense and inflammatory response (e.g., S100A8) and spermatogenesis, while the top BP terms enriched by genes in the severe group include response to oxidative stress (CCL5), wound healing, and regulation of cell differentiation (CCL5), and of the cellular protein metabolic process. CONCLUSIONS S100A9 and S100A12 may have a role in the pathogenesis of pneumonia: S100A9 and CXCL1 may contribute solely in mild pneumonia, and CCL5 and CXCL11 may contribute in severe pneumonia.

  11. Phylogenetic inference of Coxiella burnetii by 16S rRNA gene sequencing.

    Directory of Open Access Journals (Sweden)

    Heather P McLaughlin

    Full Text Available Coxiella burnetii is a human pathogen that causes the serious zoonotic disease Q fever. It is ubiquitous in the environment and due to its wide host range, long-range dispersal potential and classification as a bioterrorism agent, this microorganism is considered an HHS Select Agent. In the event of an outbreak or intentional release, laboratory strain typing methods can contribute to epidemiological investigations, law enforcement investigation and the public health response by providing critical information about the relatedness between C. burnetii isolates collected from different sources. Laboratory cultivation of C. burnetii is both time-consuming and challenging. Availability of strain collections is often limited and while several strain typing methods have been described over the years, a true gold-standard method is still elusive. Building upon epidemiological knowledge from limited, historical strain collections and typing data is essential to more accurately infer C. burnetii phylogeny. Harmonization of auspicious high-resolution laboratory typing techniques is critical to support epidemiological and law enforcement investigation. The single nucleotide polymorphism (SNP)-based genotyping approach offers simplicity, rapidity and robustness. Herein, we demonstrate SNPs identified within 16S rRNA gene sequences can differentiate C. burnetii strains. Using this method, 55 isolates were assigned to six groups based on six polymorphisms. These 16S rRNA SNP-based genotyping results were largely congruent with those obtained by analyzing restriction-endonuclease (RE)-digested DNA separated by SDS-PAGE and by the high-resolution approach based on SNPs within multispacer sequence typing (MST) loci. The SNPs identified within the 16S rRNA gene can be used as targets for the development of additional SNP-based genotyping assays for C. burnetii.

  12. Emaravirus-specific degenerate PCR primers allowed the identification of partial RNA-dependent RNA polymerase sequences of Maize red stripe virus and Pigeonpea sterility mosaic virus.

    Science.gov (United States)

    Elbeaino, Toufic; Whitfield, Anna; Sharma, Mamta; Digiaro, Michele

    2013-03-01

    Emaravirus is a recently established viral genus that includes two approved virus species: European mountain ash ringspot-associated virus (EMARaV) and Fig mosaic virus (FMV). Other described but unclassified viruses appear to share biological characteristics similar to emaraviruses, including segmented, negative-single stranded RNA genomes with enveloped virions approximately 80-200nm in diameter. Sequence analysis of emaravirus genomes revealed the presence of conserved amino acid sequences in the RNA-dependent RNA polymerase gene (RdRp) denoted as pre-motif A, motifs A and C. Degenerate oligonucleotide primers were developed to these conserved sequences and were shown to amplify in reverse transcription-polymerase chain reaction assay (RT-PCR) DNA fragments of 276bp and 360bp in size. These primers efficiently detected emaraviruses with known sequences available in the database (FMV and EMARaV); they also detected viruses with limited sequence information such as Pigeonpea sterility mosaic virus (PPSMV) and Maize red stripe virus (MRSV). The degenerate primers designed on pre-motif A and motif A sequences successfully amplified the four species used as positive controls (276bp), whereas those of motifs A and C failed to detect only MRSV. The amino acid sequences obtained from PPSMV and MRSV shared the highest identity with those of two other tentative species of the Emaravirus genus, Rose rosette virus (RRV) (69%) and Redbud yellow ringspot virus (RYRV) (60%), respectively. The phylogenetic tree constructed with 92 amino acid-long portions of polypeptide putatively encoded by RNA1 of definitive and tentative emaravirus species clustered PPSMV and MRSV in two separate clades close to RRV and Raspberry leaf blotch virus (RLBV), respectively. The newly developed degenerate primers have proved their efficacy in amplifying new emaravirus-specific sequences; accordingly, they could be useful in identifying new emaravirus-like species in nature. Copyright © 2012
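    The abstract does not list the primer sequences, so the sketch below only illustrates what "degenerate" means in practice: each IUPAC ambiguity code stands for several bases, and a degenerate primer is shorthand for a pool of concrete oligonucleotides. The example primer strings are made up for illustration and are not the emaravirus primers; the function names are likewise hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct concrete sequences a degenerate primer encodes."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    toy_primer = "GAYTGN"           # hypothetical, not from the paper
    print(degeneracy(toy_primer))   # 8
    print(expand("AR"))             # ['AA', 'AG']
```

    Degeneracy grows multiplicatively with each ambiguous position, which is why degenerate primers are usually designed on the most conserved stretches of a motif.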

  13. LPEseq: Local-Pooled-Error Test for RNA Sequencing Experiments with a Small Number of Replicates.

    Science.gov (United States)

    Gim, Jungsoo; Won, Sungho; Park, Taesung

    2016-01-01

    RNA-Sequencing (RNA-Seq) provides valuable information for characterizing the molecular nature of the cells, in particular, identification of differentially expressed transcripts on a genome-wide scale. Unfortunately, cost and limited specimen availability often lead to studies with small sample sizes, and hypothesis testing on differential expression between classes with a small number of samples is generally limited. The problem is especially challenging when only one sample per each class exists. In this case, only a few methods among many that have been developed are applicable for identifying differentially expressed transcripts. Thus, the aim of this study was to develop a method able to accurately test differential expression with a limited number of samples, in particular non-replicated samples. We propose a local-pooled-error method for RNA-Seq data (LPEseq) to account for non-replicated samples in the analysis of differential expression. Our LPEseq method extends the existing LPE method, which was proposed for microarray data, to allow examination of non-replicated RNA-Seq experiments. We demonstrated the validity of the LPEseq method using both real and simulated datasets. By comparing the results obtained using the LPEseq method with those obtained from other methods, we found that the LPEseq method outperformed the others for non-replicated datasets, and showed a similar performance with replicated samples; LPEseq consistently showed high true discovery rate while not increasing the rate of false positives regardless of the number of samples. Our proposed LPEseq method can be effectively used to conduct differential expression analysis as a preliminary design step or for investigation of a rare specimen, for which a limited number of samples is available.

  14. Deep small RNA sequencing from the nematode Ascaris reveals conservation, functional diversification, and novel developmental profiles.

    Science.gov (United States)

    Wang, Jianbin; Czech, Benjamin; Crunk, Amanda; Wallace, Adam; Mitreva, Makedonka; Hannon, Gregory J; Davis, Richard E

    2011-09-01

    Eukaryotic cells express several classes of small RNAs that regulate gene expression and ensure genome maintenance. Endogenous siRNAs (endo-siRNAs) and Piwi-interacting RNAs (piRNAs) mainly control gene and transposon expression in the germline, while microRNAs (miRNAs) generally function in post-transcriptional gene silencing in both somatic and germline cells. To provide an evolutionary and developmental perspective on small RNA pathways in nematodes, we identified and characterized known and novel small RNA classes through gametogenesis and embryo development in the parasitic nematode Ascaris suum and compared them with known small RNAs of Caenorhabditis elegans. piRNAs, Piwi-clade Argonautes, and other proteins associated with the piRNA pathway have been lost in Ascaris. miRNAs are synthesized immediately after fertilization in utero, before pronuclear fusion, and before the first cleavage of the zygote. This is the earliest expression of small RNAs ever described at a developmental stage long thought to be transcriptionally quiescent. A comparison of the two classes of Ascaris endo-siRNAs, 22G-RNAs and 26G-RNAs, to those in C. elegans, suggests great diversification and plasticity in the use of small RNA pathways during spermatogenesis in different nematodes. Our data reveal conserved characteristics of nematode small RNAs as well as features unique to Ascaris that illustrate significant flexibility in the use of small RNA pathways, some of which are likely an adaptation to Ascaris' life cycle and parasitism. The transcriptome assembly has been submitted to NCBI Transcriptome Shotgun Assembly Sequence Database (http://www.ncbi.nlm.nih.gov/genbank/TSA.html) under accession numbers JI163767–JI182837 and JI210738–JI257410.

  15. Both Maintenance and Avoidance of RNA-Binding Protein Interactions Constrain Coding Sequence Evolution.

    Science.gov (United States)

    Savisaar, Rosina; Hurst, Laurence D

    2017-05-01

    While the principal force directing coding sequence (CDS) evolution is selection on protein function, to ensure correct gene expression CDSs must also maintain interactions with RNA-binding proteins (RBPs). Understanding how our genes are shaped by these RNA-level pressures is necessary for diagnostics and for improving transgenes. However, the evolutionary impact of the need to maintain RBP interactions remains unresolved. Are coding sequences constrained by the need to specify RBP binding motifs? If so, what proportion of mutations are affected? Might sequence evolution also be constrained by the need not to specify motifs that might attract unwanted binding, for instance because it would interfere with exon definition? Here, we have scanned human CDSs for motifs that have been experimentally determined to be recognized by RBPs. We observe two sets of motifs: those that are enriched over nucleotide-controlled null and those that are depleted. Importantly, the depleted set is enriched for motifs recognized by non-CDS binding RBPs. Supporting the functional relevance of our observations, we find that motifs that are more enriched are also slower-evolving. The net effect of this selection to preserve is a reduction in the overall rate of synonymous evolution of 2-3% in both primates and rodents. Stronger motif depletion, on the other hand, is associated with stronger selection against motif gain in evolution. The challenge faced by our CDSs is therefore not only one of attracting the right RBPs but also of avoiding the wrong ones, all while also evolving under selection pressures related to protein structure. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  16. Selective amplification and sequencing of cyclic phosphate-containing RNAs by the cP-RNA-seq method.

    Science.gov (United States)

    Honda, Shozo; Morichika, Keisuke; Kirino, Yohei

    2016-03-01

    RNA digestions catalyzed by many ribonucleases generate RNA fragments that contain a 2',3'-cyclic phosphate (cP) at their 3' termini. However, standard RNA-seq methods are unable to accurately capture cP-containing RNAs because the cP inhibits the adapter ligation reaction. We recently developed a method named cP-RNA-seq that is able to selectively amplify and sequence cP-containing RNAs. Here we describe the cP-RNA-seq protocol in which the 3' termini of all RNAs, except those containing a cP, are cleaved through a periodate treatment after phosphatase treatment; hence, subsequent adapter ligation and cDNA amplification steps are exclusively applied to cP-containing RNAs. cP-RNA-seq takes ∼6 d, excluding the time required for sequencing and bioinformatics analyses, which are not covered in detail in this protocol. Biochemical validation of the existence of cP in the identified RNAs takes ∼3 d. Even though the cP-RNA-seq method was developed to identify angiogenin-generating 5'-tRNA halves as a proof of principle, the method should be applicable to global identification of cP-containing RNA repertoires in various transcriptomes.

  17. Complete nucleotide sequence of a South African isolate of Grapevine fanleaf virus and its associated satellite RNA.

    Science.gov (United States)

    Lamprecht, Renate L; Spaltman, Monique; Stephan, Dirk; Wetzel, Thierry; Burger, Johan T

    2013-07-17

    The complete sequences of RNA1, RNA2 and satellite RNA have been determined for a South African isolate of Grapevine fanleaf virus (GFLV-SACH44). The two RNAs of GFLV-SACH44 are 7,341 nucleotides (nt) and 3,816 nt in length, respectively, and its satellite RNA (satRNA) is 1,104 nt in length, all excluding the poly(A) tail. Multiple sequence alignment of these sequences showed that GFLV-SACH44 RNA1 and RNA2 were the closest to the South African isolate, GFLV-SAPCS3 (98.2% and 98.6% nt identity, respectively), followed by the French isolate, GFLV-F13 (87.3% and 90.1% nt identity, respectively). Interestingly, the GFLV-SACH44 satRNA is more similar to three Arabis mosaic virus satRNAs (85%-87.4% nt identity) than to the satRNA of GFLV-F13 (81.8% nt identity) and was most distantly related to the satRNA of GFLV-R2 (71.0% nt identity). Full-length infectious clones of GFLV-SACH44 satRNA were constructed. The infectivity of the clones was tested with three nepovirus isolates, GFLV-NW, Arabis mosaic virus (ArMV)-NW and GFLV-SAPCS3. The clones were mechanically inoculated in Chenopodium quinoa and were infectious when co-inoculated with the two GFLV helper viruses, but not when co-inoculated with ArMV-NW.

  18. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure.

    Science.gov (United States)

    Mathews, D H; Sabina, J; Zuker, M; Turner, D H

    1999-05-21

    An improved dynamic programming algorithm is reported for RNA secondary structure prediction by free energy minimization. Thermodynamic parameters for the stabilities of secondary structure motifs are revised to include expanded sequence dependence as revealed by recent experiments. Additional algorithmic improvements include reduced search time and storage for multibranch loop free energies and improved imposition of folding constraints. An extended database of 151,503 nt in 955 structures determined by comparative sequence analysis was assembled to allow optimization of parameters not based on experiments and to test the accuracy of the algorithm. On average, the predicted lowest free energy structure contains 73% of known base-pairs when domains of fewer than 700 nt are folded; this compares with 64% accuracy for previous versions of the algorithm and parameters. For a given sequence, a set of 750 generated structures contains one structure that, on average, has 86% of known base-pairs. Experimental constraints, derived from enzymatic and flavin mononucleotide cleavage, improve the accuracy of structure predictions. Copyright 1999 Academic Press.
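    The abstract gives neither the recursions nor the thermodynamic parameters, so the sketch below does not reproduce the reported algorithm. As a loose illustration of dynamic programming over RNA subsequences, it uses the much simpler Nussinov-style base-pair maximization in place of free energy minimization. The function name `max_base_pairs`, the allowed pair set (Watson-Crick plus G-U), and the `min_loop` default of 3 unpaired bases in a hairpin are assumptions chosen for illustration.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in `seq` (Nussinov-style DP).

    A stand-in for energy minimization: it counts pairs instead of
    summing free energies, so it ignores stacking and loop penalties.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j], inclusive.
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]                  # case 1: j is unpaired
            for k in range(i, j - min_loop):     # case 2: j pairs with k
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(max_base_pairs("GGGAAACCC"))  # 3
```

    The table is filled by increasing subsequence length, the same bottom-up order an energy-based folding algorithm uses; replacing the "+ 1" with motif-specific energy terms is where the thermodynamic parameters would enter.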

  19. The nucleotide sequence of the RNA-2 of an isolate of the English serotype of tomato black ring virus: RNA recombination in the history of nepoviruses.

    Science.gov (United States)

    Le Gall, O L; Lanneau, M; Candresse, T; Dunez, J

    1995-05-01

    The RNA-2 of a carrot isolate from the English serotype of tomato black ring nepovirus (TBRV-ED) has been sequenced. It is 4618 nucleotides long and contains one open reading frame encoding a polypeptide of 1344 amino acids. The 5' non-coding region contains three repetitions of a stem-loop structure also conserved in TBRV-Scottish and grapevine chrome mosaic nepovirus (GCMV). The coat protein domain was mapped to the carboxy-terminal one-third of the polyprotein. Sequence comparisons indicate that TBRV-ED RNA-2 probably arose by an RNA recombination event that resulted in the exchange of the putative movement protein gene between TBRV and GCMV.

  20. RCK: accurate and efficient inference of sequence- and structure-based protein-RNA binding models from RNAcompete data.

    Science.gov (United States)

    Orenstein, Yaron; Wang, Yuhao; Berger, Bonnie

    2016-06-15

    Protein-RNA interactions, which play vital roles in many processes, are mediated through both RNA sequence and structure. CLIP-based methods, which measure protein-RNA binding in vivo, suffer from experimental noise and systematic biases, whereas in vitro experiments capture a clearer signal of protein-RNA binding. Among them, RNAcompete provides binding affinities of a specific protein to more than 240 000 unstructured RNA probes in one experiment. The computational challenge is to infer RNA structure- and sequence-based binding models from these data. The state-of-the-art in sequence models, Deepbind, does not model structural preferences. RNAcontext models both sequence and structure preferences, but is outperformed by GraphProt. Unfortunately, GraphProt cannot detect structural preferences from RNAcompete data due to the unstructured nature of the data, as noted by its developers, nor can it be tractably run on the full RNAcompete dataset. We develop RCK, an efficient, scalable algorithm that infers both sequence and structure preferences based on a new k-mer based model. Remarkably, even though RNAcompete data is designed to be unstructured, RCK can still learn structural preferences from it. RCK significantly outperforms both RNAcontext and Deepbind in in vitro binding prediction for 244 RNAcompete experiments. Moreover, RCK is also faster and uses less memory, which enables scalability. While currently on par with existing methods in in vivo binding prediction on a small scale test, we demonstrate that RCK will increasingly benefit from experimentally measured RNA structure profiles as compared to computationally predicted ones. By running RCK on the entire RNAcompete dataset, we generate and provide as a resource a set of protein-RNA structure-based models on an unprecedented scale. Software and models are freely available at http://rck.csail.mit.edu/. Contact: bab@mit.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by

  1. Nucleotide sequence of the 18S-26S rRNA intergene region of the sea urchin.

    Science.gov (United States)

    Hindenach, B R; Stafford, D W

    1984-02-10

    The DNA sequence which spans the internal transcribed spacers of a cloned ribosomal transcription unit from the sea urchin, Lytechinus variegatus, has been determined. The region extends from the conserved Eco RI site near the 3' end of the 18S rDNA to a Bam HI site in the 26S rDNA and includes 232 nucleotides coding for 18S rRNA, 367 nucleotides of internal transcribed spacer, 159 nucleotides coding for 5.8S rRNA, 338 nucleotides of internal transcribed spacer, and 505 nucleotides coding for 26S rRNA. The rRNA coding regions were identified by direct analysis of 3'-labeled 18S and 5.8S rRNA and 5'-labeled 5.8S rRNA, and by sequence homology of the 26S rDNA with yeast and vertebrate 26/28S rRNAs. The internal transcribed spacers are GC-rich, similar to those of vertebrates. The 5.8S and 5' 26S rDNA sequences support a proposed model for a structural domain of the yeast large subunit ribosomal RNA (Veldman et al. [1981] Nucleic Acids Res. 9, 6935-6952).

  2. De novo transcriptome sequencing in Anopheles funestus using Illumina RNA-seq technology.

    Directory of Open Access Journals (Sweden)

    Jacob E Crawford

    BACKGROUND: Anopheles funestus is one of the primary vectors of human malaria, which causes a million deaths each year in sub-Saharan Africa. Few scientific resources are available to facilitate studies of this mosquito species and relatively little is known about its basic biology and evolution, making development and implementation of novel disease control efforts more difficult. The An. funestus genome has not been sequenced, so in order to facilitate genome-scale experimental biology, we have sequenced the adult female transcriptome of An. funestus from a newly founded colony in Burkina Faso, West Africa, using the Illumina GAIIx next generation sequencing platform. METHODOLOGY/PRINCIPAL FINDINGS: We assembled short Illumina reads de novo using a novel approach involving iterative de novo assemblies and "target-based" contig clustering. We then selected a conservative set of 15,527 contigs through comparisons to four Dipteran transcriptomes as well as multiple functional and conserved protein domain databases. Comparison to the Anopheles gambiae immune system identified 339 contigs as putative immune genes, thus identifying a large portion of the immune system that can form the basis for subsequent studies of this important malaria vector. We identified 5,434 1:1 orthologues between An. funestus and An. gambiae and found that among these 1:1 orthologues, the protein sequence of those with putative immune function were significantly more diverged than the transcriptome as a whole. Short read alignments to the contig set revealed almost 367,000 genetic polymorphisms segregating in the An. funestus colony and demonstrated the utility of the assembled transcriptome for use in RNA-seq based measurements of gene expression. CONCLUSIONS/SIGNIFICANCE: We developed a pipeline that makes de novo transcriptome sequencing possible in virtually any organism at a very reasonable cost ($6,300 in sequencing costs in our case). We anticipate that our approach

  3. Improved annotation of 3' untranslated regions and complex loci by combination of strand-specific direct RNA sequencing, RNA-Seq and ESTs.

    Directory of Open Access Journals (Sweden)

    Nicholas J Schurch

    The reference annotations made for a genome sequence provide the framework for all subsequent analyses of the genome. Correct and complete annotation in addition to the underlying genomic sequence is particularly important when interpreting the results of RNA-seq experiments where short sequence reads are mapped against the genome and assigned to genes according to the annotation. Inconsistencies in annotations between the reference and the experimental system can lead to incorrect interpretation of the effect on RNA expression of an experimental treatment or mutation in the system under study. Until recently, the genome-wide annotation of 3' untranslated regions received less attention than coding regions and the delineation of intron/exon boundaries. In this paper, data produced for samples in Human, Chicken and A. thaliana by the novel single-molecule, strand-specific, Direct RNA Sequencing technology from Helicos Biosciences, which locates 3' polyadenylation sites to within +/- 2 nt, were combined with archival EST and RNA-Seq data. Nine examples are illustrated where this combination of data allowed: (1) gene and 3' UTR re-annotation (including extension of one 3' UTR by 5.9 kb); (2) disentangling of gene expression in complex regions; (3) clearer interpretation of small RNA expression; and (4) identification of novel genes. While the specific examples displayed here may become obsolete as genome sequences and their annotations are refined, the principles laid out in this paper will be of general use both to those annotating genomes and those seeking to interpret existing publicly available annotations in the context of their own experimental data.

  4. Probe-Directed Degradation (PDD) for Flexible Removal of Unwanted cDNA Sequences from RNA-Seq Libraries.

    Science.gov (United States)

    Archer, Stuart K; Shirokikh, Nikolay E; Preiss, Thomas

    2015-04-01

    Most applications for RNA-seq require the depletion of abundant transcripts to gain greater coverage of the underlying transcriptome. The sequences to be targeted for depletion depend on application and species and in many cases may not be supported by commercial depletion kits. This unit describes a method for generating RNA-seq libraries that incorporates probe-directed degradation (PDD), which can deplete any unwanted sequence set, with the low-bias split-adapter method of library generation (although many other library generation methods are in principle compatible). The overall strategy is suitable for applications requiring customized sequence depletion or where faithful representation of fragment ends and lack of sequence bias is paramount. We provide guidelines to rapidly design specific probes against the target sequence, and a detailed protocol for library generation using the split-adapter method including several strategies for streamlining the technique and reducing adapter dimer content. Copyright © 2015 John Wiley & Sons, Inc.

  5. Transcriptome Analysis of Ceriops tagal in Saline Environments Using RNA-Sequencing.

    Directory of Open Access Journals (Sweden)

    Xiaorong Xiao

    Identification of genes involved in mangrove species' adaptation to salt stress can provide valuable information for developing salt-tolerant crops and understanding the molecular evolution of salt tolerance in halophiles. Ceriops tagal is a salt-tolerant mangrove tree growing in mudflats and marshes in tropical and subtropical areas, without any prior genome information. In this study, we assessed the biochemical and transcriptional responses of C. tagal to high salt treatment (500 mmol/L NaCl) by hydroponic experiments and RNA-seq. In C. tagal root tissues under salt stress, proline accumulated strongly from 3 to 12 h of treatment; meanwhile, malondialdehyde content progressively increased from 0 to 9 h, then dropped to lower than control levels by 24 h. These implied that C. tagal plants could survive salt stress through biochemical modification. Using the Illumina sequencing platform, approximately 27.39 million RNA-seq reads were obtained from three salt-treated and control (untreated) root samples. These reads were assembled into 47,111 transcripts with an average length of 514 bp and an N50 of 632 bp. Approximately 78% of the transcripts were annotated, and a total of 437 genes were putative transcription factors. Digital gene expression analysis was conducted by comparing transcripts from the untreated control to the three salt treated samples, and 7,330 differentially expressed transcripts were identified. Using k-means clustering, these transcripts were divided into six clusters that differed in their expression patterns across four treatment time points. The genes identified as being up- or downregulated are involved in salt stress responses, signal transduction, and DNA repair. Our study shows the main adaptive pathway of C. tagal in saline environments, under short-term and long-term treatments of salt stress. This provides vital clues as to which genes may be candidates for breeding salt-tolerant crops and clarifying molecular
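
    The assembly statistics quoted in this abstract (average length and N50) follow standard definitions, which can be sketched as below. This is a generic illustration and not the authors' pipeline; the function names are chosen here for clarity.

```python
def n50(lengths):
    """Return the N50 of a set of transcript or contig lengths.

    N50 is the largest length L such that sequences of length >= L
    together contain at least half of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


def mean_length(lengths):
    """Arithmetic mean of the sequence lengths."""
    return sum(lengths) / len(lengths)
```

    Because N50 weights long sequences more heavily than the mean does, an N50 above the average length (632 bp versus 514 bp here) is the usual pattern for a transcriptome assembly.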

  6. CleanTag Adapters Improve Small RNA Next-Generation Sequencing Library Preparation by Reducing Adapter Dimers.

    Science.gov (United States)

    Shore, Sabrina; Henderson, Jordana M; McCaffrey, Anton P

    2018-01-01

    Next-generation small RNA sequencing is a valuable tool which is increasing our knowledge regarding small noncoding RNAs and their function in regulating genetic information. Library preparation protocols for small RNA have thus far been restricted due to higher RNA input requirements (>10 ng), long workflows, and tedious manual gel purifications. Small RNA library preparation methods focus largely on the prevention or depletion of a side product known as adapter dimer that tends to dominate the reaction. Adapter dimer is the ligation of two adapters to one another without an intervening library RNA insert or any useful sequencing information. The amplification of this side reaction is favored over the amplification of tagged library since it is shorter. The small size discrepancy between these two species makes separation and purification of the tagged library very difficult. Adapter dimer hinders the use of low input samples and the ability to automate the workflow so we introduce an improved library preparation protocol which uses chemically modified adapters (CleanTag) to significantly reduce the adapter dimer. CleanTag small RNA library preparation workflow decreases adapter dimer to allow for ultra-low input samples (down to approx. 10 pg total RNA), elimination of the gel purification step, and automation. We demonstrate how to carry out this streamlined protocol to improve NGS data quality and allow for the use of sample types with limited RNA material.
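
    The definition of adapter dimer given above (two adapters ligated with no intervening RNA insert) suggests a simple computational check on sequenced reads: a read that starts directly with the 3' adapter has an insert of length zero. The sketch below is an illustration under that assumption and is not part of the CleanTag protocol; the adapter sequence used in the usage note is a made-up placeholder.

```python
def insert_length(read, adapter_3p, min_match=8):
    """Return the number of bases that precede the 3' adapter in a read.

    Only the first `min_match` bases of the adapter are searched for.
    Returns None when no adapter prefix is found.
    """
    seed = adapter_3p[:min_match]
    pos = read.find(seed)
    return None if pos < 0 else pos


def is_adapter_dimer(read, adapter_3p, max_insert=0, min_match=8):
    """Flag reads whose insert is no longer than `max_insert` bases."""
    length = insert_length(read, adapter_3p, min_match)
    return length is not None and length <= max_insert
```

    For example, with the placeholder adapter "ACGTTGCAGGTC", the read "ACGTTGCAGGTCAAAA" would be flagged as a dimer, whereas a read carrying a 22-nt insert before the adapter would not.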

  7. Deep sequencing and proteomic analysis of the microRNA-induced silencing complex in human red blood cells.

    Science.gov (United States)

    Azzouzi, Imane; Moest, Hansjoerg; Wollscheid, Bernd; Schmugge, Markus; Eekels, Julia J M; Speer, Oliver

    2015-05-01

    During maturation, erythropoietic cells extrude their nuclei but retain their ability to respond to oxidant stress by tightly regulating protein translation. Several studies have reported microRNA-mediated regulation of translation during terminal stages of erythropoiesis, even after enucleation. In the present study, we performed a detailed examination of the endogenous microRNA machinery in human red blood cells using a combination of deep sequencing analysis of microRNAs and proteomic analysis of the microRNA-induced silencing complex. Among the 197 different microRNAs detected, miR-451a was the most abundant, representing more than 60% of all read sequences. In addition, miR-451a and its known target, 14-3-3ζ mRNA, were bound to the microRNA-induced silencing complex, implying their direct interaction in red blood cells. The proteomic characterization of endogenous Argonaute 2-associated microRNA-induced silencing complex revealed 26 cofactor candidates. Among these cofactors, we identified several RNA-binding proteins, as well as motor proteins and vesicular trafficking proteins. Our results demonstrate that red blood cells contain complex microRNA machinery, which might enable immature red blood cells to control protein translation independent of de novo nuclei information. Copyright © 2015 ISEH - International Society for Experimental Hematology. Published by Elsevier Inc. All rights reserved.

  8. Translation of the flavivirus kunjin NS3 gene in cis but not its RNA sequence or secondary structure is essential for efficient RNA packaging.

    Science.gov (United States)

    Pijlman, Gorben P; Kondratieva, Natasha; Khromykh, Alexander A

    2006-11-01

    Our previous studies using trans-complementation analysis of Kunjin virus (KUN) full-length cDNA clones harboring in-frame deletions in the NS3 gene demonstrated the inability of these defective complemented RNAs to be packaged into virus particles (W. J. Liu, P. L. Sedlak, N. Kondratieva, and A. A. Khromykh, J. Virol. 76:10766-10775). In this study we aimed to establish whether this requirement for NS3 in RNA packaging is determined by the secondary RNA structure of the NS3 gene or by the essential role of the translated NS3 gene product. Multiple silent mutations of three computer-predicted stable RNA structures in the NS3 coding region of KUN replicon RNA aimed at disrupting RNA secondary structure without affecting amino acid sequence did not affect RNA replication and packaging into virus-like particles in the packaging cell line, thus demonstrating that the predicted conserved RNA structures in the NS3 gene do not play a role in RNA replication and/or packaging. In contrast, double frameshift mutations in the NS3 coding region of full-length KUN RNA, producing scrambled NS3 protein but retaining secondary RNA structure, resulted in the loss of ability of these defective RNAs to be packaged into virus particles in complementation experiments in KUN replicon-expressing cells. Furthermore, the more robust complementation-packaging system based on established stable cell lines producing large amounts of complemented replicating NS3-deficient replicon RNAs and infection with KUN virus to provide structural proteins also failed to detect any secreted virus-like particles containing packaged NS3-deficient replicon RNAs. These results have now firmly established the requirement of KUN NS3 protein translated in cis for genome packaging into virus particles.

  9. RNA sequencing of trigeminal ganglia in Rattus Norvegicus after glyceryl trinitrate infusion with relevance to migraine

    DEFF Research Database (Denmark)

    Pedersen, Sara Hougaard; Sørensen, Lasse Maretty; Ramachandran, Roshni

    2016-01-01

    INTRODUCTION: Infusion of glyceryl trinitrate (GTN), a donor of nitric oxide, induces immediate headache in humans that in migraineurs is followed by a delayed migraine attack. In order to achieve increased knowledge of mechanisms activated during GTN-infusion this present study aims to investigate...... transcriptional responses to GTN-infusion in the rat trigeminal ganglia. METHODS: Rats were infused with GTN or vehicle and trigeminal ganglia were isolated either 30 or 90 minutes post infusion. RNA sequencing was used to investigate transcriptomic changes in response to the treatment. Furthermore, we developed...... a novel method for Gene Set Analysis Of Variance (GSANOVA) to identify gene sets associated with transcriptional changes across time. RESULTS: 15 genes displayed significant changes in transcription levels in response to GTN-infusion. Ten of these genes showed either sustained up- or down...

  10. The nucleotide sequence of histidine tRNA gamma of Drosophila melanogaster.

    OpenAIRE

    Altwegg, M; Kubli, E

    1980-01-01

    The nucleotide sequence of D. melanogaster histidine tRNA gamma was determined to be: pG-G-C-C-G-U-G-A-U-C-G-U-C-psi-A-G-D-G-G-D-D-A-G-G-A-C-C-C-C-A-C-G-psi-U-G-U-G-m1G-C-C-G-U-G-G-U-A-A-C-C-m5C-A-G-G-U-psi-C-G-m1A-A-U-C-C-U-G-G-U-C-A-C-G-G-m5C-A-C-C-AOH. An additional unpaired G is found at the 5' end, and the T in the TpsiC loop is replaced by a U.

  11. Conserved properties of dentate gyrus neurogenesis across postnatal development revealed by single-cell RNA sequencing.

    Science.gov (United States)

    Hochgerner, Hannah; Zeisel, Amit; Lönnerberg, Peter; Linnarsson, Sten

    2018-02-01

    The dentate gyrus of the hippocampus is a brain region in which neurogenesis persists into adulthood; however, the relationship between developmental and adult dentate gyrus neurogenesis has not been examined in detail. Here we used single-cell RNA sequencing to reveal the molecular dynamics and diversity of dentate gyrus cell types in perinatal, juvenile, and adult mice. We found distinct quiescent and proliferating progenitor cell types, linked by transient intermediate states to neuroblast stages and fully mature granule cells. We observed shifts in the molecular identity of quiescent and proliferating radial glia and granule cells during the postnatal period that were then maintained through adult stages. In contrast, intermediate progenitor cells, neuroblasts, and immature granule cells were nearly indistinguishable at all ages. These findings demonstrate the fundamental similarity of postnatal and adult neurogenesis in the hippocampus and pinpoint the early postnatal transformation of radial glia from embryonic progenitors to adult quiescent stem cells.

  12. Expression profiles of mRNA and long noncoding RNA in the ovaries of letrozole-induced polycystic ovary syndrome rat model through deep sequencing.

    Science.gov (United States)

    Fu, Lu-Lu; Xu, Ying; Li, Dan-Dan; Dai, Xiao-Wei; Xu, Xin; Zhang, Jing-Shun; Ming, Hao; Zhang, Xue-Ying; Zhang, Guo-Qing; Ma, Ya-Lan; Zheng, Lian-Wen

    2018-05-30

    Polycystic ovary syndrome (PCOS) is one of the most common endocrine disorders in reproductive-aged women. However, the exact pathophysiology of PCOS remains largely unclear. We performed deep sequencing to investigate the mRNA and long noncoding RNA (lncRNA) expression profiles in the ovarian tissues of a letrozole-induced PCOS rat model and of control rats. A total of 2147 mRNAs and 158 lncRNAs were differentially expressed between the PCOS models and control. Gene ontology analysis indicated that differentially expressed mRNAs were associated with biological adhesion, reproduction, and metabolic process. Pathway analysis results indicated that these aberrantly expressed mRNAs were related to several specific signaling pathways, including insulin resistance, steroid hormone biosynthesis, PPAR signaling pathway, cell adhesion molecules, autoimmune thyroid disease, and AMPK signaling pathway. The relative expression levels of mRNAs and lncRNAs were validated through qRT-PCR. An lncRNA-miRNA-mRNA network was constructed to explore ceRNAs involved in the PCOS model, and these were also verified by qRT-PCR experiments. These findings may provide insight into the pathogenesis of PCOS and clues to find key diagnostic and therapeutic roles of lncRNA in PCOS. Copyright © 2018 Elsevier B.V. All rights reserved.

  13. Physiological and Pathological Transcriptional Activation of Endogenous Retroelements Assessed by RNA-Sequencing of B Lymphocytes

    Directory of Open Access Journals (Sweden)

    Jan Attig

    2017-12-01

    In addition to evolutionarily-accrued sequence mutation or deletion, endogenous retroelements (EREs) in eukaryotic genomes are subject to epigenetic silencing, preventing or reducing their transcription, particularly in the germplasm. Nevertheless, transcriptional activation of EREs, including endogenous retroviruses (ERVs) and long interspersed nuclear elements (LINEs), is observed in somatic cells, variably upon cellular differentiation and frequently upon cellular transformation. ERE transcription is modulated during physiological and pathological immune cell activation, as well as in immune cell cancers. However, our understanding of the potential consequences of such modulation remains incomplete, partly due to the relative scarcity of information regarding genome-wide ERE transcriptional patterns in immune cells. Here, we describe a methodology that allows probing RNA-sequencing (RNA-seq) data for genome-wide expression of EREs in murine and human cells. Our analysis of B cells reveals that their transcriptional response during immune activation is dominated by induction of gene transcription, and that EREs respond to a much lesser extent. The transcriptional activity of the majority of EREs is either unaffected or reduced by B cell activation both in mice and humans, albeit LINEs appear considerably more responsive in the latter host. Nevertheless, a small number of highly distinct ERVs are strongly and consistently induced during B cell activation. Importantly, this pattern contrasts starkly with B cell transformation, which exhibits widespread induction of EREs, including ERVs that minimally overlap with those responsive to immune stimulation. The distinctive patterns of ERE induction suggest different underlying mechanisms and will help separate physiological from pathological expression.

  14. Optimizing a massive parallel sequencing workflow for quantitative miRNA expression analysis.

    Directory of Open Access Journals (Sweden)

    Francesca Cordero

    BACKGROUND: Massive Parallel Sequencing methods (MPS) can extend and improve the knowledge obtained by conventional microarray technology, both for mRNAs and short non-coding RNAs, e.g. miRNAs. The processing methods used to extract and interpret the information are an important aspect of dealing with the vast amounts of data generated from short read sequencing. Although the number of computational tools for MPS data analysis is constantly growing, their strengths and weaknesses as part of a complex analytical pipeline have not yet been well investigated. PRIMARY FINDINGS: A benchmark MPS miRNA dataset, resembling a situation in which miRNAs are spiked in biological replication experiments, was assembled by merging a publicly available MPS spike-in miRNAs data set with MPS data derived from healthy donor peripheral blood mononuclear cells. Using this data set we observed that short-read counts are strongly underestimated for duplicated miRNAs if the whole genome is used as the reference. Furthermore, the sensitivity of miRNA detection is strongly dependent on the primary tool used in the analysis. Among the six aligners tested, specifically devoted to miRNA detection, SHRiMP and MicroRazerS show the highest sensitivity. Differential expression estimation is quite efficient. Among the five tools investigated, two (DESeq, baySeq) show very good specificity and sensitivity in the detection of differential expression. CONCLUSIONS: The results provided by our analysis allow the definition of a clear, simple, optimized analytical workflow for digital quantitative analysis of miRNAs.

  15. Optimizing a massive parallel sequencing workflow for quantitative miRNA expression analysis.

    Science.gov (United States)

    Cordero, Francesca; Beccuti, Marco; Arigoni, Maddalena; Donatelli, Susanna; Calogero, Raffaele A

    2012-01-01

    Massive Parallel Sequencing methods (MPS) can extend and improve the knowledge obtained by conventional microarray technology, both for mRNAs and short non-coding RNAs, e.g. miRNAs. The processing methods used to extract and interpret the information are an important aspect of dealing with the vast amounts of data generated from short read sequencing. Although the number of computational tools for MPS data analysis is constantly growing, their strengths and weaknesses as part of a complex analytical pipeline have not yet been well investigated. A benchmark MPS miRNA dataset, resembling a situation in which miRNAs are spiked in biological replication experiments, was assembled by merging a publicly available MPS spike-in miRNAs data set with MPS data derived from healthy donor peripheral blood mononuclear cells. Using this data set we observed that short-read counts are strongly underestimated for duplicated miRNAs if the whole genome is used as the reference. Furthermore, the sensitivity of miRNA detection is strongly dependent on the primary tool used in the analysis. Among the six aligners tested, specifically devoted to miRNA detection, SHRiMP and MicroRazerS show the highest sensitivity. Differential expression estimation is quite efficient. Among the five tools investigated, two (DESeq, baySeq) show very good specificity and sensitivity in the detection of differential expression. The results provided by our analysis allow the definition of a clear, simple, optimized analytical workflow for digital quantitative analysis of miRNAs.
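
    Sensitivity and specificity on a spike-in benchmark such as the one described here can be computed from three sets: the miRNAs a tool calls as differentially expressed, the miRNAs that were truly spiked in, and all miRNAs tested. The sketch below uses the standard definitions; the set names are assumptions for illustration and do not come from the paper.

```python
def confusion_counts(called, truth, universe):
    """Return (tp, fp, fn, tn) for one tool on a spike-in benchmark.

    called   -- set of miRNAs the tool reports as differentially expressed
    truth    -- set of miRNAs that were actually spiked in
    universe -- set of all miRNAs tested
    """
    tp = len(called & truth)
    fp = len(called - truth)
    fn = len(truth - called)
    tn = len(universe - called - truth)
    return tp, fp, fn, tn


def sensitivity(tp, fn):
    """Fraction of true positives that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true negatives that were correctly left uncalled."""
    return tn / (tn + fp)
```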

  16. Viroids: from genotype to phenotype just relying on RNA sequence and structural motifs

    Directory of Open Access Journals (Sweden)

    Ricardo Flores

    2012-06-01

    As a consequence of two unique physical properties, small size and circularity, viroid RNAs do not code for proteins and thus depend on RNA sequence/structural motifs for interacting with host proteins that mediate their invasion, replication, spread, and circumvention of defensive barriers. Viroid genomes fold up on themselves adopting collapsed secondary structures wherein stretches of nucleotides stabilized by Watson-Crick pairs are flanked by apparently unstructured loops. However, compelling data show that they are instead stabilized by alternative non-canonical pairs and that specific loops in the rod-like secondary structure, characteristic of Potato spindle tuber viroid and most other members of the family Pospiviroidae, are critical for replication and systemic trafficking. In contrast, rather than folding into a rod-like secondary structure, most members of the family Avsunviroidae adopt multibranched conformations occasionally stabilized by kissing loop interactions critical for viroid viability in vivo. Besides these most stable secondary structures, viroid RNAs alternatively adopt during replication transient metastable conformations containing elements of local higher-order structure, prominent among which are the hammerhead ribozymes catalyzing a key replicative step in the family Avsunviroidae, and certain conserved hairpins that also mediate replication steps in the family Pospiviroidae. Therefore, different RNA structures, either global or local, determine different functions, thus highlighting the need for in-depth structural studies on viroid RNAs.

  17. Prokaryotic community profiling of local algae wastewaters using advanced 16S rRNA gene sequencing.

    Science.gov (United States)

    Limayem, Alya; Micciche, Andrew; Nayak, Bina; Mohapatra, Shyam

    2018-01-01

    Algae biomass-fed wastewaters are a promising source of lipid and bioenergy manufacture, revealing substantial end-product investment returns. However, wastewaters would contain lytic pathogens carrying drug resistance detrimental to algae yield and environmental safety. This study was conducted to simultaneously decipher through high-throughput advanced Illumina 16S ribosomal RNA (rRNA) gene sequencing, the cultivable and uncultivable bacterial community profile found in a single sample that was directly recovered from the local wastewater systems. Samples were collected from two previously documented sources including anaerobically digested (AD) municipal wastewater and swine wastewater with algae namely Chlorella spp. in addition to control samples, swine wastewater, and municipal wastewater without algae. Results indicated the presence of a significant level of Bacteria in all samples with an average of approximately 95.49% followed by Archaea 2.34%, in local wastewaters designed for algae cultivation. Taxonomic genus identification indicated the presence of Calothrix, Pseudomonas, and Clostridium as the most prevalent strains in both local municipal and swine wastewater samples containing algae with an average of 17.37, 12.19, and 7.84%, respectively. Interestingly, swine wastewater without algae displayed the lowest level of Pseudomonas strains algae indicates potential coexistence between these strains and algae microenvironment, suggesting further investigations. This finding was particularly relevant for the earlier documented adverse effects of some nosocomial Pseudomonas strains on algae growth and their multidrug resistance potential, requiring the development of targeted bioremediation with regard to the beneficial flora.

  18. Deep sequencing reveals unique small RNA repertoire that is regulated during head regeneration in Hydra magnipapillata.

    Science.gov (United States)

    Krishna, Srikar; Nair, Aparna; Cheedipudi, Sirisha; Poduval, Deepak; Dhawan, Jyotsna; Palakodeti, Dasaradhi; Ghanekar, Yashoda

    2013-01-07

    Small non-coding RNAs such as miRNAs, piRNAs and endo-siRNAs fine-tune gene expression through post-transcriptional regulation, modulating important processes in development, differentiation, homeostasis and regeneration. Using deep sequencing, we have profiled small non-coding RNAs in Hydra magnipapillata and investigated changes in small RNA expression pattern during head regeneration. Our results reveal a unique repertoire of small RNAs in hydra. We have identified 126 miRNA loci; 123 of these miRNAs are unique to hydra. Less than 50% are conserved across two different strains of Hydra vulgaris tested in this study, indicating a highly diverse nature of hydra miRNAs in contrast to bilaterian miRNAs. We also identified siRNAs derived from precursors with perfect stem-loop structure and that arise from inverted repeats. piRNAs were the most abundant small RNAs in hydra, mapping to transposable elements, the annotated transcriptome and unique non-coding regions on the genome. piRNAs that map to transposable elements and the annotated transcriptome display a ping-pong signature. Further, we have identified several miRNAs and piRNAs whose expression is regulated during hydra head regeneration. Our study defines different classes of small RNAs in this cnidarian model system, which may play a role in orchestrating gene expression essential for hydra regeneration.

  19. Sequence of the chloroplast 16S rRNA gene and its surrounding regions of Chlamydomonas reinhardii.

    Science.gov (United States)

    Dron, M; Rahire, M; Rochaix, J D

    1982-01-01

    The sequence of a 2 kb DNA fragment containing the chloroplast 16S ribosomal RNA gene from Chlamydomonas reinhardii and its flanking regions has been determined. The algal 16S rRNA sequence (1475 nucleotides) and secondary structure are highly related to those found in bacteria and in the chloroplasts of higher plants. In contrast, the flanking regions are very different. In C. reinhardii the 16S rRNA gene is surrounded by AT rich segments of about 180 bases, which are followed by a long stretch of complementary bases separated from each other by 1833 nucleotides. It is likely that these structures play an important role in the folding and processing of the precursor of 16S rRNA. The primary and secondary structures of the binding sites of two ribosomal proteins in the 16S rRNAs of E. coli and C. reinhardii are considerably related. PMID:6296784
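
    The "complementary bases" flanking the gene are two stretches that can pair with each other across the intervening sequence. A minimal way to test whether two DNA segments could form such a stem is to compare one with the reverse complement of the other. The sketch below assumes perfect Watson-Crick pairing only and is not the analysis used in the paper.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def can_form_stem(left, right):
    """True if `right` is the exact reverse complement of `left`.

    Both segments are read 5' to 3' on the same strand, so a perfect
    stem requires right == reverse_complement(left).
    """
    return reverse_complement(left) == right.upper()
```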

  20. Sequence variation of the human immunodeficiency virus primer-binding site suggests the use of an alternative tRNA(Lys) molecule in reverse transcription

    NARCIS (Netherlands)

    Das, A. T.; Klaver, B.; Berkhout, B.

    1997-01-01

    Retroviruses use a cellular tRNA molecule as primer for reverse transcription. The complementarity between the 3' end of this tRNA and a sequence near the 5' end of the viral RNA, the primer-binding site (PBS), allows the primer to anneal onto the viral RNA. During reverse transcription 18

  1. 16S rRNA Amplicon Sequencing for Epidemiological Surveys of Bacteria in Wildlife.

    Science.gov (United States)

    Galan, Maxime; Razzauti, Maria; Bard, Emilie; Bernard, Maria; Brouat, Carine; Charbonnel, Nathalie; Dehne-Garcia, Alexandre; Loiseau, Anne; Tatard, Caroline; Tamisier, Lucie; Vayssier-Taussat, Muriel; Vignes, Helene; Cosson, Jean-François

    2016-01-01

    The human impact on natural habitats is increasing the complexity of human-wildlife interactions and leading to the emergence of infectious diseases worldwide. Highly successful synanthropic wildlife species, such as rodents, will undoubtedly play an increasingly important role in transmitting zoonotic diseases. We investigated the potential for recent developments in 16S rRNA amplicon sequencing to facilitate the multiplexing of the large numbers of samples needed to improve our understanding of the risk of zoonotic disease transmission posed by urban rodents in West Africa. In addition to listing pathogenic bacteria in wild populations, as in other high-throughput sequencing (HTS) studies, our approach can estimate essential parameters for studies of zoonotic risk, such as prevalence and patterns of coinfection within individual hosts. However, the estimation of these parameters requires cleaning of the raw data to mitigate the biases generated by HTS methods. We present here an extensive review of these biases and of their consequences, and we propose a comprehensive trimming strategy for managing these biases. We demonstrated the application of this strategy using 711 commensal rodents, including 208 Mus musculus domesticus, 189 Rattus rattus, 93 Mastomys natalensis, and 221 Mastomys erythroleucus, collected from 24 villages in Senegal. Seven major genera of pathogenic bacteria were detected in their spleens: Borrelia, Bartonella, Mycoplasma, Ehrlichia, Rickettsia, Streptobacillus, and Orientia. Mycoplasma, Ehrlichia, Rickettsia, Streptobacillus, and Orientia have never before been detected in West African rodents. Bacterial prevalence ranged from 0% to 90% of individuals per site, depending on the bacterial taxon, rodent species, and site considered, and 26% of rodents displayed coinfection. The 16S rRNA amplicon sequencing strategy presented here has the advantage over other molecular surveillance tools of dealing with a large spectrum of

  2. 16S rRNA gene sequencing as a tool to study microbial populations in foods and process environments

    DEFF Research Database (Denmark)

    Buschhardt, Tasja; Hansen, Tina Beck; Bahl, Martin Iain

    2015-01-01

    Introduction: Methodological constraints during culturing and biochemical testing have left the true microbiological diversity of foods and process environments unexplored. Culture-independent molecular methods, such as 16S rRNA gene sequencing, may provide deeper insight into microbial communities...... reference. Results: Taxonomic assignments and abundances of sequences in the total community and in the Enterobacteriaceae subpopulation were affected by the 16S rRNA gene variable region, DNA extraction methods, and polymerases chosen. However, community compositions were very reproducible when the same...... methods were used. Conclusions: Altogether, we have shown that conclusions from population studies based on 16S rRNA gene sequencing need to be made with caution. Overcoming the constraints, we believe that population studies can give new research possibilities for e.g. interaction studies, identification...

  3. Co-transcriptomic Analysis by RNA Sequencing to Simultaneously Measure Regulated Gene Expression in Host and Bacterial Pathogen

    KAUST Repository

    Ravasi, Timothy

    2016-01-24

    Intramacrophage pathogens subvert antimicrobial defence pathways using various mechanisms, including the targeting of host TLR-mediated transcriptional responses. Conversely, TLR-inducible host defence mechanisms subject intramacrophage pathogens to stress, thus altering pathogen gene expression programs. Important biological insights can thus be gained through the analysis of gene expression changes in both the host and the pathogen during an infection. Traditionally, research methods have involved the use of qPCR, microarrays and/or RNA sequencing to identify transcriptional changes in either the host or the pathogen. Here we describe the application of RNA sequencing using samples obtained from in vitro infection assays to simultaneously quantify both host and bacterial pathogen gene expression changes, as well as general approaches that can be undertaken to interpret the RNA sequencing data that is generated. These methods can be used to provide insights into host TLR-regulated transcriptional responses to microbial challenge, as well as pathogen subversion mechanisms against such responses.

  4. Demonstration of the absence of intervening sequences (IVSs) within 16S rRNA genes of Taylorella equigenitalis and Taylorella asinigenitalis isolates.

    Science.gov (United States)

    Tazumi, Akihiro; Nakanishi, Shigeyuki; Hayashi, Kyohei; Petry, Sandrine; Tasaki, Erina; Nakajima, Takuya; Ueno, Hitomi; Moore, John E; Millar, Beverley C; Matsuda, Motoo

    2012-06-01

    A total of 57 Taylorella equigenitalis (n=22) and Taylorella asinigenitalis (n=35) isolates was shown not to carry any intervening sequences (IVSs) within 16S rRNA gene sequences. By contrast, we have already shown the genus Taylorella group to carry several kinds of IVSs within the 23S rRNA gene sequences. Copyright © 2011 Elsevier Ltd. All rights reserved.

  5. Comparison of 16S ribosomal RNA gene sequence analysis and conventional culture in the environmental survey of a hospital

    OpenAIRE

    Manaka, Akihiro; Tokue, Yutaka; Murakami, Masami

    2017-01-01

    Background Nosocomial infection is one of the most common complications within health care facilities. Certain studies have reported outbreaks resulting from contaminated hospital environments. Although the identification of bacteria in the environment can readily be achieved using culturing methods, these methods detect live bacteria. Sequencing of the 16S ribosomal RNA (16S rRNA) gene is recognized to be effective for bacterial identification. In this study, we surveyed wards where drug-res...

  6. A framework for establishing predictive relationships between specific bacterial 16S rRNA sequence abundances and biotransformation rates.

    Science.gov (United States)

    Helbling, Damian E; Johnson, David R; Lee, Tae Kwon; Scheidegger, Andreas; Fenner, Kathrin

    2015-03-01

    The rates at which wastewater treatment plant (WWTP) microbial communities biotransform specific substrates can differ by orders of magnitude among WWTP communities. Differences in taxonomic compositions among WWTP communities may predict differences in the rates of some types of biotransformations. In this work, we present a novel framework for establishing predictive relationships between specific bacterial 16S rRNA sequence abundances and biotransformation rates. We selected ten WWTPs with substantial variation in their environmental and operational metrics and measured the in situ ammonia biotransformation rate constants in nine of them. We isolated total RNA from samples from each WWTP and analyzed 16S rRNA sequence reads. We then developed multivariate models between the measured abundances of specific bacterial 16S rRNA sequence reads and the ammonia biotransformation rate constants. We constructed model scenarios that systematically explored the effects of model regularization, model linearity and non-linearity, and aggregation of 16S rRNA sequences into operational taxonomic units (OTUs) as a function of sequence dissimilarity threshold (SDT). A large percentage (greater than 80%) of model scenarios resulted in well-performing and significant models at intermediate SDTs of 0.13-0.14 and 0.26. The 16S rRNA sequences consistently selected into the well-performing and significant models at those SDTs were classified as Nitrosomonas and Nitrospira groups. We then extend the framework by applying it to the biotransformation rate constants of ten micropollutants measured in batch reactors seeded with the ten WWTP communities. We identified phylogenetic groups that were robustly selected into all well-performing and significant models constructed with biotransformation rates of isoproturon, propachlor, ranitidine, and venlafaxine. These phylogenetic groups can be used as predictive biomarkers of WWTP microbial community activity towards these specific

  7. Transcriptional Responses in root and leaf of Prunus persica Under Drought Stress Using RNA Sequencing

    Directory of Open Access Journals (Sweden)

    Najla Ksouri

    2016-11-01

    Prunus persica L. Batsch, or peach, is one of the most important crops and it is widely established in irrigated arid and semi-arid regions. However, due to variations in the climate and the increased aridity, drought has become a major constraint, causing crop losses worldwide. The use of drought-tolerant rootstocks in modern fruit production appears to be a useful method of alleviating water deficit problems. However, the transcriptomic variation and the major molecular mechanisms that underlie the adaptation of drought-tolerant rootstocks to water shortage remain unclear. Hence, in this study, high-throughput sequencing (RNA-seq) was performed to assess the transcriptomic changes and the key genes involved in the response to drought in root tissues (GF677 rootstock) and leaf tissues (graft, var. Catherina) subjected to 16 days of drought stress. In total, 12 RNA libraries were constructed and sequenced. This generated a total of 315M raw reads from both tissues, which allowed the assembly of 22,079 and 17,854 genes associated with the root and leaf tissues, respectively. Subsets of 500 differentially expressed genes (DEGs) in roots and 236 in leaves were identified and functionally annotated with 56 gene ontology (GO) terms and 99 metabolic pathways, which were mostly associated with aminobenzoate degradation and phenylpropanoid biosynthesis. The GO analysis highlighted the biological functions that were exclusive to the root tissue, such as locomotion, hormone metabolic process, and detection of stimulus, indicating the stress-buffering role of the GF677 rootstock. Furthermore, the complex regulatory network involved in the drought response was revealed, involving proteins that are associated with signaling transduction, transcription and hormone regulation, redox homeostasis, and frontline barriers. We identified two poorly characterized genes in P. persica: growth-regulating factor 5 (GRF5), which may be involved in cellular expansion, and AtHB12

  8. Transcriptional Responses in Root and Leaf of Prunus persica under Drought Stress Using RNA Sequencing

    Science.gov (United States)

    Ksouri, Najla; Jiménez, Sergio; Wells, Christina E.; Contreras-Moreira, Bruno; Gogorcena, Yolanda

    2016-01-01

    Prunus persica L. Batsch, or peach, is one of the most important crops and it is widely established in irrigated arid and semi-arid regions. However, due to variations in the climate and the increased aridity, drought has become a major constraint, causing crop losses worldwide. The use of drought-tolerant rootstocks in modern fruit production appears to be a useful method of alleviating water deficit problems. However, the transcriptomic variation and the major molecular mechanisms that underlie the adaptation of drought-tolerant rootstocks to water shortage remain unclear. Hence, in this study, high-throughput sequencing (RNA-seq) was performed to assess the transcriptomic changes and the key genes involved in the response to drought in root tissues (GF677 rootstock) and leaf tissues (graft, var. Catherina) subjected to 16 days of drought stress. In total, 12 RNA libraries were constructed and sequenced. This generated a total of 315 M raw reads from both tissues, which allowed the assembly of 22,079 and 17,854 genes associated with the root and leaf tissues, respectively. Subsets of 500 differentially expressed genes (DEGs) in roots and 236 in leaves were identified and functionally annotated with 56 gene ontology (GO) terms and 99 metabolic pathways, which were mostly associated with aminobenzoate degradation and phenylpropanoid biosynthesis. The GO analysis highlighted the biological functions that were exclusive to the root tissue, such as “locomotion,” “hormone metabolic process,” and “detection of stimulus,” indicating the stress-buffering role of the GF677 rootstock. Furthermore, the complex regulatory network involved in the drought response was revealed, involving proteins that are associated with signaling transduction, transcription and hormone regulation, redox homeostasis, and frontline barriers. We identified two poorly characterized genes in P. persica: growth-regulating factor 5 (GRF5), which may be involved in cellular expansion, and

  9. Archaea box C/D enzymes methylate two distinct substrate rRNA sequences with different efficiency

    Science.gov (United States)

    Graziadei, Andrea; Masiewicz, Pawel; Lapinaite, Audrone; Carlomagno, Teresa

    2016-01-01

    RNA modifications confer complexity to the 4-nucleotide polymer; nevertheless, their exact function is mostly unknown. rRNA 2′-O-ribose methylation concentrates to ribosome functional sites and is important for ribosome biogenesis. The methyl group is transferred to rRNA by the box C/D RNPs: The rRNA sequence to be methylated is recognized by a complementary sequence on the guide RNA, which is part of the enzyme. In contrast to their eukaryotic homologs, archaeal box C/D enzymes can be assembled in vitro and are used to study the mechanism of 2′-O-ribose methylation. In Archaea, each guide RNA directs methylation to two distinct rRNA sequences, posing the question whether this dual architecture of the enzyme has a regulatory role. Here we use methylation assays and low-resolution structural analysis with small-angle X-ray scattering to study the methylation reaction guided by the sR26 guide RNA from Pyrococcus furiosus. We find that the methylation efficacy at sites D and D′ differ substantially, with substrate D′ turning over more efficiently than substrate D. This observation correlates well with structural data: The scattering profile of the box C/D RNP half-loaded with substrate D′ is similar to that of the holo complex, which has the highest activity. Unexpectedly, the guide RNA secondary structure is not responsible for the functional difference at the D and D′ sites. Instead, this difference is recapitulated by the nature of the first base pair of the guide-substrate duplex. We suggest that substrate turnover may occur through a zip mechanism that initiates at the 5′-end of the product. PMID:26925607

  10. Using machine learning and high-throughput RNA sequencing to classify the precursors of small non-coding RNAs.

    Science.gov (United States)

    Ryvkin, Paul; Leung, Yuk Yee; Ungar, Lyle H; Gregory, Brian D; Wang, Li-San

    2014-05-01

    Recent advances in high-throughput sequencing allow researchers to examine the transcriptome in more detail than ever before. Using a method known as high-throughput small RNA-sequencing, we can now profile the expression of small regulatory RNAs such as microRNAs and small interfering RNAs (siRNAs) with a great deal of sensitivity. However, there are many other types of small RNAs, such as snoRNAs (small nucleolar RNAs), snRNAs (small nuclear RNAs), scRNAs (small cytoplasmic RNAs), tRNAs (transfer RNAs), and transposon-derived RNAs. Here, we present a user's guide for CoRAL (Classification of RNAs by Analysis of Length), a computational method for discriminating between different classes of RNA using high-throughput small RNA-sequencing data. Not only can CoRAL distinguish between RNA classes with high accuracy, but it also uses features that are relevant to small RNA biogenesis pathways. By doing so, CoRAL can give biologists a glimpse into the characteristics of different RNA processing pathways and how these might differ between tissue types, biological conditions, or even different species. CoRAL is available at http://wanglab.pcbi.upenn.edu/coral/. Copyright © 2013 Elsevier Inc. All rights reserved.

  11. Sequence organization and control of transcription in the bacteriophage T4 tRNA region.

    Science.gov (United States)

    Broida, J; Abelson, J

    1985-10-05

    Bacteriophage T4 contains genes for eight transfer RNAs and two stable RNAs of unknown function. These are found in two clusters at 70 × 10^3 base-pairs on the T4 genetic map. To understand the control of transcription in this region we have completed the sequencing of 5000 base-pairs in this region. The sequence contains a part of gene 3, gene 1, gene 57, internal protein I, the tRNA genes and five open reading frames which most likely code for heretofore unidentified proteins. We have used subclones of the region to investigate the kinetics of transcription in vivo. The results show that transcription in this region consists of overlapping early, middle and late transcripts. Transcription is directed from two early promoters, one or two middle promoters and perhaps two late promoters. This region contains all of the features that are seen in T4 transcription and as such is a good place to study the phenomenon in more detail.

  12. A Comparison of mRNA Sequencing with Random Primed and 3'-Directed Libraries.

    Science.gov (United States)

    Xiong, Yuguang; Soumillon, Magali; Wu, Jie; Hansen, Jens; Hu, Bin; van Hasselt, Johan G C; Jayaraman, Gomathi; Lim, Ryan; Bouhaddou, Mehdi; Ornelas, Loren; Bochicchio, Jim; Lenaeus, Lindsay; Stocksdale, Jennifer; Shim, Jaehee; Gomez, Emilda; Sareen, Dhruv; Svendsen, Clive; Thompson, Leslie M; Mahajan, Milind; Iyengar, Ravi; Sobie, Eric A; Azeloglu, Evren U; Birtwistle, Marc R

    2017-11-07

    Creating a cDNA library for deep mRNA sequencing (mRNAseq) is generally done by random priming, creating multiple sequencing fragments along each transcript. A 3'-end-focused library approach cannot detect differential splicing, but has potentially higher throughput at a lower cost, along with the ability to improve quantification by using transcript molecule counting with unique molecular identifiers (UMI) that correct PCR bias. Here, we compare an implementation of such a 3'-digital gene expression (3'-DGE) approach with "conventional" random primed mRNAseq. Given our particular datasets on cultured human cardiomyocyte cell lines, we find that, while conventional mRNAseq detects ~15% more genes and needs ~500,000 fewer reads per sample for equivalent statistical power, the resulting differentially expressed genes, biological conclusions, and gene signatures are highly concordant between two techniques. We also find good quantitative agreement at the level of individual genes between two techniques for both read counts and fold changes between given conditions. We conclude that, for high-throughput applications, the potential cost savings associated with 3'-DGE approach are likely a reasonable tradeoff for modest reduction in sensitivity and inability to observe alternative splicing, and should enable many larger scale studies focusing on not only differential expression analysis, but also quantitative transcriptome profiling.
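
    The abstract above mentions transcript molecule counting with unique molecular identifiers (UMIs) to correct PCR bias. A minimal sketch of that idea follows, assuming reads have already been assigned to genes and that each read is represented by a hypothetical (gene, UMI) pair. It is not the authors' pipeline, and the gene names and UMI strings are made-up illustration data.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per gene.

    `reads` is an iterable of (gene, umi) pairs, one per aligned read.
    PCR duplicates share both gene and UMI, so they collapse to one molecule.
    """
    umis_per_gene = defaultdict(set)
    for gene, umi in reads:
        umis_per_gene[gene].add(umi)
    return {gene: len(umis) for gene, umis in umis_per_gene.items()}


# Hypothetical toy input: the first two reads are PCR duplicates.
reads = [
    ("MYH6", "AACGT"),
    ("MYH6", "AACGT"),
    ("MYH6", "GGTCA"),
    ("TNNT2", "TTAGC"),
]
counts = count_molecules(reads)
print(counts)
```

    This sketch treats UMIs as exact labels; real data usually also needs tolerance for sequencing errors within the UMI, which is left out here.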

  13. Widespread Polycistronic Transcripts in Fungi Revealed by Single-Molecule mRNA Sequencing.

    Directory of Open Access Journals (Sweden)

    Sean P Gordon

    Genes in prokaryotic genomes are often arranged into clusters and co-transcribed into polycistronic RNAs. Isolated examples of polycistronic RNAs were also reported in some higher eukaryotes but their presence was generally considered rare. Here we developed a long-read sequencing strategy to identify polycistronic transcripts in several mushroom forming fungal species including Plicaturopsis crispa, Phanerochaete chrysosporium, Trametes versicolor, and Gloeophyllum trabeum. We found genome-wide prevalence of polycistronic transcription in these Agaricomycetes, involving up to 8% of the transcribed genes. Unlike polycistronic mRNAs in prokaryotes, these co-transcribed genes are also independently transcribed. We show that polycistronic transcription may interfere with expression of the downstream tandem gene. Further comparative genomic analysis indicates that polycistronic transcription is conserved among a wide range of mushroom forming fungi. In summary, our study revealed, for the first time, the genome prevalence of polycistronic transcription in a phylogenetic range of higher fungi. Furthermore, we systematically show that our long-read sequencing approach and combined bioinformatics pipeline is a generic powerful tool for precise characterization of complex transcriptomes that enables identification of mRNA isoforms not recovered via short-read assembly.

  14. AllelicImbalance: An R/ bioconductor package for detecting, managing, and visualizing allele expression imbalance data from RNA sequencing

    DEFF Research Database (Denmark)

    Gådin, Jesper R.; van't Hooft, Ferdinand M.; Eriksson, Per

    2015-01-01

    the possible biases. Results: We present AllelicImbalance, a software program that is designed to detect, manage, and visualize allelic imbalances comprehensively. The purpose of this software is to allow users to pose genetic questions in any RNA sequencing experiment quickly, enhancing the general utility......Background: One aspect in which RNA sequencing is more valuable than microarray-based methods is the ability to examine the allelic imbalance of the expression of a gene. This process is often a complex task that entails quality control, alignment, and the counting of reads over heterozygous single...

  15. Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data

    OpenAIRE

    Degner, Jacob F.; Marioni, John C.; Pai, Athma A.; Pickrell, Joseph K.; Nkadori, Everlyne; Gilad, Yoav; Pritchard, Jonathan K.

    2009-01-01

    Motivation: Next-generation sequencing has become an important tool for genome-wide quantification of DNA and RNA. However, a major technical hurdle lies in the need to map short sequence reads back to their correct locations in a reference genome. Here, we investigate the impact of SNP variation on the reliability of read-mapping in the context of detecting allele-specific expression (ASE). Results: We generated 16 million 35 bp reads from mRNA of each of two HapMap Yoruba individuals. When ...

  16. RNA Sequencing of Formalin-Fixed, Paraffin-Embedded Specimens for Gene Expression Quantification and Data Mining

    Directory of Open Access Journals (Sweden)

    Yan Guo

    2016-01-01

    Background. Proper rRNA depletion is crucial for the successful utilization of FFPE specimens when studying gene expression. We performed a study to evaluate two major rRNA depletion methods: Ribo-Zero and RNase H. RNAs extracted from 4 samples were treated with the two rRNA depletion methods in duplicate and sequenced (N=16). We evaluated their reducibility, ability to detect RNA, and ability to molecularly subtype these triple negative breast cancer specimens. Results. Both rRNA depletion methods produced consistent data between the technical replicates. We found that the RNase H method produced higher quality RNAseq data as compared to the Ribo-Zero method. In addition, we evaluated the RNAseq data generated from the FFPE tissue samples for noncoding RNA, including lncRNA, enhancer/super enhancer RNA, and single nucleotide variation (SNV). We found that the RNase H is more suitable for detecting high-quality, noncoding RNAs as compared to the Ribo-Zero and provided more consistent molecular subtype identification between replicates. Unfortunately, neither method produced reliable SNV data. Conclusions. In conclusion, for FFPE specimens, the RNase H rRNA depletion method performed better than the Ribo-Zero. Neither method generates data sufficient for SNV detection.

  17. Bioinformatic analysis of barcoded cDNA libraries for small RNA profiling by next-generation sequencing.

    Science.gov (United States)

    Farazi, Thalia A; Brown, Miguel; Morozov, Pavel; Ten Hoeve, Jelle J; Ben-Dov, Iddo Z; Hovestadt, Volker; Hafner, Markus; Renwick, Neil; Mihailović, Aleksandra; Wessels, Lodewyk F A; Tuschl, Thomas

    2012-10-01

    The characterization of post-transcriptional gene regulation by small regulatory RNAs of 20-30 nt length, particularly miRNAs and piRNAs, has become a major focus of research in recent years. A prerequisite for the characterization of small RNAs is their identification and quantification across different developmental stages, normal and diseased tissues, as well as model cell lines. Here we present a step-by-step protocol for the bioinformatic analysis of barcoded cDNA libraries for small RNA profiling generated by Illumina sequencing, thereby facilitating miRNA and other small RNA profiling of large sample collections. Copyright © 2012 Elsevier Inc. All rights reserved.

  18. Sequence-specific RNA binding by a Nova KH domain: implications for paraneoplastic disease and the fragile X syndrome.

    Science.gov (United States)

    Lewis, H A; Musunuru, K; Jensen, K B; Edo, C; Chen, H; Darnell, R B; Burley, S K

    2000-02-04

    The structure of a Nova protein K homology (KH) domain recognizing single-stranded RNA has been determined at 2.4 Å resolution. Mammalian Nova antigens (1 and 2) constitute an important family of regulators of RNA metabolism in neurons, first identified using sera from cancer patients with the autoimmune disorder paraneoplastic opsoclonus-myoclonus ataxia (POMA). The structure of the third KH domain (KH3) of Nova-2 bound to a stem loop RNA resembles a molecular vise, with 5'-Ura-Cyt-Ade-Cyt-3' pinioned between an invariant Gly-X-X-Gly motif and the variable loop. Tetranucleotide recognition is supported by an aliphatic alpha helix/beta sheet RNA-binding platform, which mimics 5'-Ura-Gua-3' by making Watson-Crick-like hydrogen bonds with 5'-Cyt-Ade-3'. Sequence conservation suggests that fragile X mental retardation results from perturbation of RNA binding by the FMR1 protein.

  19. Deep RNA sequencing analysis of readthrough gene fusions in human prostate adenocarcinoma and reference samples

    Directory of Open Access Journals (Sweden)

    Nacu Serban

    2011-01-01

    Background: Readthrough fusions across adjacent genes in the genome, or transcription-induced chimeras (TICs), have been estimated using expressed sequence tag (EST) libraries to involve 4-6% of all genes. Deep transcriptional sequencing (RNA-Seq) now makes it possible to study the occurrence and expression levels of TICs in individual samples across the genome. Methods: We performed single-end RNA-Seq on three human prostate adenocarcinoma samples and their corresponding normal tissues, as well as brain and universal reference samples. We developed two bioinformatics methods to specifically identify TIC events: a targeted alignment method using artificial exon-exon junctions within 200,000 bp from adjacent genes, and genomic alignment allowing splicing within individual reads. We performed further experimental verification and characterization of selected TIC and fusion events using quantitative RT-PCR and comparative genomic hybridization microarrays. Results: Targeted alignment against artificial exon-exon junctions yielded 339 distinct TIC events, including 32 gene pairs with multiple isoforms. The false discovery rate was estimated to be 1.5%. Spliced alignment to the genome was less sensitive, finding only 18% of those found by targeted alignment in 33-nt reads and 59% of those in 50-nt reads. However, spliced alignment revealed 30 cases of TICs with intervening exons, in addition to distant inversions, scrambled genes, and translocations. Our findings increase the catalog of observed TIC gene pairs by 66%. We verified 6 of 6 predicted TICs in all prostate samples, and 2 of 5 predicted novel distant gene fusions, both private events among 54 prostate tumor samples tested. Expression of TICs correlates with that of the upstream gene, which can explain the prostate-specific pattern of some TIC events and the restriction of the SLC45A3-ELK4 e4-e2 TIC to ERG-negative prostate samples, as confirmed in 20 matched prostate tumor and normal

  20. Direct detection of RNA in vitro and in situ by target-primed RCA: The impact of E. coli RNase III on the detection efficiency of RNA sequences distanced far from the 3'-end.

    Science.gov (United States)

    Merkiene, Egle; Gaidamaviciute, Edita; Riauba, Laurynas; Janulaitis, Arvydas; Lagunavicius, Arunas

    2010-08-01

    We improved the target RNA-primed RCA technique for direct detection and analysis of RNA in vitro and in situ. Previously we showed that the 3'→5' single-stranded RNA exonucleolytic activity of Phi29 DNA polymerase converts the target RNA into a primer and uses it for RCA initiation. However, in some cases, the single-stranded RNA exoribonucleolytic activity of the polymerase is hindered by strong double-stranded structures at the 3'-end of target RNAs. We demonstrate that in such hampered cases, the double-stranded RNA-specific Escherichia coli RNase III efficiently assists Phi29 DNA polymerase in converting the target RNA into a primer. These observations extend the target RNA-primed RCA possibilities to test RNA sequences distanced far from the 3'-end and customize this technique for the inner RNA sequence analysis.

  1. Advantages and Limitations of Ribosomal RNA PCR and DNA Sequencing for Identification of Bacteria in Cardiac Valves of Danish Patients

    DEFF Research Database (Denmark)

    Kemp, Michael; Bangsborg, Jette; Kjerulf, Anne

    2013-01-01

    of direct molecular identification should also address weaknesses, their relevance in the given setting, and possible improvements. In this study cardiac valves from 56 Danish patients referred for surgery for infective endocarditis were analysed by microscopy and culture as well as by PCR targeting part...... of the bacterial 16S rRNA gene followed by DNA sequencing of the PCR product. PCR and DNA sequencing identified significant bacteria in 49 samples from 43 patients, including five out of 13 culture-negative cases. No rare, exotic, or intracellular bacteria were identified. There was a general agreement between...... bacterial identity obtained by ribosomal PCR and DNA sequencing from the valves and bacterial isolates from blood culture. However, DNA sequencing of the 16S rRNA gene did not discriminate well among non-haemolytic streptococci, especially within the Streptococcus mitis group. Ribosomal PCR with subsequent...

  2. Identification of human microRNA-like sequences embedded within the protein-encoding genes of the human immunodeficiency virus.

    Directory of Open Access Journals (Sweden)

    Bryan Holland

    BACKGROUND: MicroRNAs (miRNAs) are highly conserved, short (18-22 nts), non-coding RNA molecules that regulate gene expression by binding to the 3' untranslated regions (3'UTRs) of mRNAs. While numerous cellular microRNAs have been associated with the progression of various diseases including cancer, miRNAs associated with retroviruses have not been well characterized. Herein we report identification of microRNA-like sequences in coding regions of several HIV-1 genomes. RESULTS: Based on our earlier proteomics and bioinformatics studies, we have identified 8 cellular miRNAs that are predicted to bind to the mRNAs of multiple proteins that are dysregulated during HIV-infection of CD4+ T-cells in vitro. In silico analysis of the full length and mature sequences of these 8 miRNAs and comparisons with all the genomic and subgenomic sequences of HIV-1 strains in global databases revealed that the first 18/18 sequences of the mature hsa-miR-195 sequence (including the short seed sequence), matched perfectly (100%), or with one nucleotide mismatch, within the envelope (env) genes of five HIV-1 genomes from Africa. In addition, we have identified 4 other miRNA-like sequences (hsa-miR-30d, hsa-miR-30e, hsa-miR-374a and hsa-miR-424) within the env and the gag-pol encoding regions of several HIV-1 strains, albeit with reduced homology. Mapping of the miRNA-homologues of env within HIV-1 genomes localized these sequences to the functionally significant variable regions of the env glycoprotein gp120 designated V1, V2, V4 and V5. CONCLUSIONS: We conclude that microRNA-like sequences are embedded within the protein-encoding regions of several HIV-1 genomes. 
Given that the V1 to V5 regions of HIV-1 envelopes contain specific, well-characterized domains that are critical for immune responses, virus neutralization and disease progression, we propose that the newly discovered miRNA-like sequences within the HIV-1 genomes may have evolved to self-regulate survival of the
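The matching criterion described in this record (a perfect match, or a match with one nucleotide mismatch, of an 18-nt miRNA segment within a viral gene) can be illustrated with a sliding-window comparison. This is only a minimal sketch: the function name `find_near_matches` and both toy sequences are hypothetical and are not taken from the study, whose actual search procedure is not described in the abstract.

```python
def find_near_matches(query, target, max_mismatches=1):
    """Return (start, mismatches) for each window of `target` that differs
    from `query` at no more than `max_mismatches` positions."""
    # Compare in the DNA alphabet so RNA (U) and DNA (T) inputs can be mixed.
    query = query.upper().replace("U", "T")
    target = target.upper().replace("U", "T")
    hits = []
    for start in range(len(target) - len(query) + 1):
        window = target[start:start + len(query)]
        mismatches = sum(a != b for a, b in zip(query, window))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Hypothetical 18-nt query and a toy target with one exact copy and one
# copy carrying a single substitution (illustrative data only).
toy_query = "UAGCAGCACAGAAAUAUU"
toy_target = "GGG" + "TAGCAGCACAGAAATATT" + "CCC" + "TAGCAGCACCGAAATATT" + "AAA"
toy_hits = find_near_matches(toy_query, toy_target)
```

A real analysis would typically rely on an alignment tool rather than a naive scan, but the scan makes the "perfect or one mismatch" criterion explicit.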

  3. Using machine learning and high-throughput RNA sequencing to classify the precursors of small non-coding RNAs

    OpenAIRE

    Ryvkin, Paul; Leung, Yuk Yee; Ungar, Lyle H.; Gregory, Brian D.; Wang, Li-San

    2013-01-01

    Recent advances in high-throughput sequencing allow researchers to examine the transcriptome in more detail than ever before. Using a method known as high-throughput small RNA-sequencing, we can now profile the expression of small regulatory RNAs such as microRNAs and small interfering RNAs (siRNAs) with a great deal of sensitivity. However, there are many other types of small RNAs (

  4. Correlation between sequence conservation and structural thermodynamics of microRNA precursors from human, mouse, and chicken genomes

    Directory of Open Access Journals (Sweden)

    Wang Shengqi

    2010-10-01

    Full Text Available Abstract Background Previous studies have shown that microRNA precursors (pre-miRNAs) have considerably more stable secondary structures than other native RNAs (tRNA, rRNA, and mRNA) and artificial RNA sequences. However, pre-miRNAs with ultra stable secondary structures have not been investigated. It is not known whether there is a tendency in pre-miRNA sequences towards or against ultra stable structures. Furthermore, the relationship between the structural thermodynamic stability of pre-miRNA and their evolution remains unclear. Results We investigated the correlation between pre-miRNA sequence conservation and structural stability as measured by adjusted minimum folding free energies in pre-miRNAs isolated from human, mouse, and chicken. The analysis revealed that conserved and non-conserved pre-miRNA sequences had structures with similar average stabilities. However, the relatively ultra stable and unstable pre-miRNAs were more likely to be non-conserved than pre-miRNAs with moderate stability. Non-conserved pre-miRNAs had more G+C than A+U nucleotides, while conserved pre-miRNAs contained more A+U nucleotides. Notably, the U content of conserved pre-miRNAs was especially higher than that of non-conserved pre-miRNAs. Further investigations showed that conserved and non-conserved pre-miRNAs exhibited different structural element features, even though they had comparable levels of stability. Conclusions We proposed that there is a correlation between structural thermodynamic stability and sequence conservation for pre-miRNAs from human, mouse, and chicken genomes. Our analyses suggested that pre-miRNAs with relatively ultra stable or unstable structures were less favoured by natural selection than those with moderately stable structures. Comparison of nucleotide compositions between non-conserved and conserved pre-miRNAs indicated the importance of U nucleotides in the pre-miRNA evolutionary process. 
Several characteristic structural elements were
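The two quantities compared in this record, nucleotide composition (G+C, A+U and U content) and length-adjusted folding free energy, can be sketched as follows. The abstract does not define its adjustment, so the formula used here (MFE divided by length, times 100) is an assumption based on a commonly used definition; the MFE value itself would come from an RNA folding program and is supplied as a plain number. The function names are hypothetical.

```python
from collections import Counter


def composition(seq):
    """Fractions of G+C, A+U and U in an RNA (or DNA) sequence."""
    seq = seq.upper().replace("T", "U")
    counts = Counter(seq)
    n = len(seq)
    return {
        "GC": (counts["G"] + counts["C"]) / n,
        "AU": (counts["A"] + counts["U"]) / n,
        "U": counts["U"] / n,
    }


def adjusted_mfe(mfe_kcal_per_mol, length):
    """Length-normalised folding energy per 100 nt (assumed definition)."""
    return mfe_kcal_per_mol / length * 100


toy_composition = composition("GGCCAAUU")  # illustrative sequence only
toy_amfe = adjusted_mfe(-40.0, 80)         # illustrative MFE and length
```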

  5. Comparison of MALDI-TOF MS, housekeeping gene sequencing, and 16S rRNA gene sequencing for identification of Aeromonas clinical isolates.

    Science.gov (United States)

    Shin, Hee Bong; Yoon, Jihoon; Lee, Yangsoon; Kim, Myung Sook; Lee, Kyungwon

    2015-03-01

    The genus Aeromonas is a pathogen that is well known to cause severe clinical illnesses, ranging from gastroenteritis to sepsis. Accurate identification of A. hydrophila, A. caviae, and A. veronii is important for the care of patients. However, species identification remains difficult using conventional methods. The aim of this study was to compare the accuracy of different methods of identifying Aeromonas at the species level: a biochemical method, matrix-assisted laser desorption ionization mass spectrometry-time of flight (MALDI-TOF MS), 16S rRNA sequencing, and housekeeping gene sequencing (gyrB, rpoB). We analyzed 65 Aeromonas isolates recovered from patients at a university hospital in Korea between 1996 and 2012. The isolates were recovered from frozen states and tested using the following four methods: a conventional biochemical method, 16S rRNA sequencing, housekeeping gene sequencing with phylogenetic analysis, and MALDI-TOF MS. The conventional biochemical method and 16S rRNA sequencing identified Aeromonas at the genus level very accurately, although species level identification was unsatisfactory. MALDI-TOF MS system correctly identified 60 (92.3%) isolates at the species level and an additional four (6.2%) at the genus level. Overall, housekeeping gene sequencing with phylogenetic analysis was found to be the most accurate in identifying Aeromonas at the species level. The most accurate method of identification of Aeromonas to species level is by housekeeping gene sequencing, although high cost and technical difficulty hinder its usage in clinical settings. An easy-to-use identification method is needed for clinical laboratories, for which MALDI-TOF MS could be a strong candidate.

  6. Selective and flexible depletion of problematic sequences from RNA-seq libraries at the cDNA stage.

    Science.gov (United States)

    Archer, Stuart K; Shirokikh, Nikolay E; Preiss, Thomas

    2014-05-26

    A major hurdle to transcriptome profiling by deep-sequencing technologies is that abundant transcripts, such as rRNAs, can overwhelm the libraries, severely reducing transcriptome-wide coverage. Methods for depletion of such unwanted sequences typically require treatment of RNA samples prior to library preparation, are costly and not suited to unusual species and applications. Here we describe Probe-Directed Degradation (PDD), an approach that employs hybridisation to DNA oligonucleotides at the single-stranded cDNA library stage and digestion with Duplex-Specific Nuclease (DSN). Targeting Saccharomyces cerevisiae rRNA sequences in Illumina HiSeq libraries generated by the split adapter method we show that PDD results in efficient removal of rRNA. The probes generate extended zones of depletion as a function of library insert size and the requirements for DSN cleavage. Using intact total RNA as starting material, probes can be spaced at the minimum anticipated library size minus 20 nucleotides to achieve continuous depletion. No off-target bias is detectable when comparing PDD-treated with untreated libraries. We further provide a bioinformatics tool to design suitable PDD probe sets. We find that PDD is a rapid procedure that results in effective and specific depletion of unwanted sequences from deep-sequencing libraries. Because PDD acts at the cDNA stage, handling of fragile RNA samples can be minimised and it should further be feasible to remediate existing libraries. Importantly, PDD preserves the original RNA fragment boundaries as is required for nucleotide-resolution footprinting or base-cleavage studies. Finally, as PDD utilises unmodified DNA oligonucleotides it can provide a low-cost option for large-scale projects, or be flexibly customised to suit different depletion targets, sample types and organisms.
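The spacing rule stated in this record (probes spaced at the minimum anticipated library size minus 20 nucleotides) can be turned into a small calculation. The sketch below is not the authors' probe design tool: the function name, the 40-nt probe length and the reading of "spacing" as start-to-start distance are all assumptions made for illustration.

```python
def probe_starts(target_length, min_insert_size, probe_length=40, margin=20):
    """Return (spacing, start positions) for probes tiled along a target.

    spacing = min_insert_size - margin, following the rule quoted in the
    abstract; probe_length is a hypothetical value.
    """
    spacing = min_insert_size - margin
    if spacing <= 0:
        raise ValueError("minimum insert size must exceed the margin")
    if target_length < probe_length:
        raise ValueError("target is shorter than one probe")
    last_start = target_length - probe_length
    starts = list(range(0, last_start + 1, spacing))
    # Add a final probe flush with the 3' end if the tiling stops short.
    if starts[-1] != last_start:
        starts.append(last_start)
    return spacing, starts


# Toy example: a 1000-nt target and a 120-nt minimum library insert size.
toy_spacing, toy_starts = probe_starts(1000, 120)
```

With these toy numbers the spacing is 100 nt, so eleven probes cover the target.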

  7. miRseqViewer: multi-panel visualization of sequence, structure and expression for analysis of microRNA sequencing data.

    Science.gov (United States)

    Jang, Insu; Chang, Hyeshik; Jun, Yukyung; Park, Seongjin; Yang, Jin Ok; Lee, Byungwook; Kim, Wankyu; Kim, V Narry; Lee, Sanghyuk

    2015-02-15

    Deep sequencing of small RNAs has become a routine process in recent years, but no dedicated viewer is as yet available to explore the sequence features simultaneously along with secondary structure and gene expression of microRNA (miRNA). We present a highly interactive application that visualizes the sequence alignment, secondary structure and normalized read counts in synchronous multipanel windows. This helps users to easily examine the relationships between the structure of precursor and the sequences and abundance of final products and thereby will facilitate the studies on miRNA biogenesis and regulation. The project manager handles multiple samples of multiple groups. The read alignment is imported in BAM file format. Implemented features comprise sorting, zooming, highlighting, editing, filtering, saving, exporting, etc. Currently, miRseqViewer supports 84 organisms whose annotation is available at miRBase. miRseqViewer, implemented in Java, is available at https://github.com/insoo078/mirseqviewer or at http://msv.kobic.re.kr. Contact: sanghyuk@ewha.ac.kr. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  8. tRNA sequence data, annotation data and curation data - tRNADB-CE | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data name: tRNA sequence data, annotation data and curation data. DOI: 10.18908/lsdba.nbdc00720-001. Description of data contents: Data of tRNA... search results and curation data. Three prediction programs (tRNAScan-SE, Aragorn and tRNA finder) were used together to search tRNA genes. If the prediction results did not

  9. A Bayesian taxonomic classification method for 16S rRNA gene sequences with improved species-level accuracy.

    Science.gov (United States)

    Gao, Xiang; Lin, Huaiying; Revanna, Kashi; Dong, Qunfeng

    2017-05-10

    Species-level classification for 16S rRNA gene sequences remains a serious challenge for microbiome researchers, because existing taxonomic classification tools for 16S rRNA gene sequences either do not provide species-level classification, or their classification results are unreliable. The unreliable results are due to the limitations in the existing methods which either lack solid probabilistic-based criteria to evaluate the confidence of their taxonomic assignments, or use nucleotide k-mer frequency as the proxy for sequence similarity measurement. We have developed a method that shows significantly improved species-level classification results over existing methods. Our method calculates true sequence similarity between query sequences and database hits using pairwise sequence alignment. Taxonomic classifications are assigned from the species to the phylum levels based on the lowest common ancestors of multiple database hits for each query sequence, and further classification reliabilities are evaluated by bootstrap confidence scores. The novelty of our method is that the contribution of each database hit to the taxonomic assignment of the query sequence is weighted by a Bayesian posterior probability based upon the degree of sequence similarity of the database hit to the query sequence. Our method does not need any training datasets specific for different taxonomic groups. Instead only a reference database is required for aligning to the query sequences, making our method easily applicable for different regions of the 16S rRNA gene or other phylogenetic marker genes. Reliable species-level classification for 16S rRNA or other phylogenetic marker genes is critical for microbiome research. Our software shows significantly higher classification accuracy than the existing tools and we provide probabilistic-based confidence scores to evaluate the reliability of our taxonomic classification assignments based on multiple database matches to query sequences. 
Despite
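The idea of weighting each database hit when assigning taxonomy at successive ranks can be illustrated with a simple weighted vote. This is a deliberately simplified sketch and not the published method: the weights are assumed to be precomputed (for example, posterior probabilities derived from alignment similarity), bootstrap confidence scoring is omitted, and all names and lineages are hypothetical.

```python
from collections import defaultdict

RANKS = ["phylum", "genus", "species"]  # shortened rank list for illustration


def weighted_assignment(hits):
    """hits: list of (lineage dict, weight). Returns {rank: (taxon, support)}."""
    total = sum(weight for _, weight in hits)
    result = {}
    for rank in RANKS:
        support = defaultdict(float)
        for lineage, weight in hits:
            # Each hit contributes its normalised weight to its taxon.
            support[lineage[rank]] += weight / total
        best = max(support, key=support.get)
        result[rank] = (best, support[best])
    return result


# Hypothetical database hits for one query sequence.
toy_hits = [
    ({"phylum": "P1", "genus": "G1", "species": "G1 x"}, 2.0),
    ({"phylum": "P1", "genus": "G1", "species": "G1 y"}, 1.0),
    ({"phylum": "P1", "genus": "G2", "species": "G2 z"}, 1.0),
]
toy_call = weighted_assignment(toy_hits)
```

In this toy case support falls from 1.0 at phylum to 0.75 at genus and 0.5 at species, which shows why a species-level call needs an explicit confidence threshold.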

  10. Transcriptomic changes during tuber dormancy release process revealed by RNA sequencing in potato.

    Science.gov (United States)

    Liu, Bailin; Zhang, Ning; Wen, Yikai; Jin, Xin; Yang, Jiangwei; Si, Huaijun; Wang, Di

    2015-03-20

    Potato tuber dormancy release is a critical developmental process that allows the potato to produce a new plant. The first Illumina RNA sequencing of the mRNAs expressed in dormancy tuber (DT), dormancy release tuber (DRT) and sprouting tuber (ST) was performed. We identified 26,639 genes, of which 5,912 (3,450 up-regulated and 2,462 down-regulated) and 3,885 (2,141 up-regulated and 1,744 down-regulated) were differentially expressed in DT vs DRT and DRT vs ST, respectively. The RNA-Seq results were further verified using qRT-PCR. We found that reserve mobilization events were activated before bud emergence (DT vs DRT) and were more pronounced after dormancy release (DRT vs ST). Overexpressed genes related to the metabolism of auxin, gibberellic acid, cytokinin and brassinosteroid predominated in DT vs DRT, whereas overexpressed genes involved in the metabolism of ethylene, jasmonate and salicylate were prominent in DRT vs ST. Genes associated with various histone and cyclin isoforms and involved in cell division/cycle were mainly up-regulated in DT vs DRT. The dormancy release process was also accompanied by stress response and redox regulation; genes related to biotic stress, cell wall and secondary metabolism were preferentially overexpressed in DRT vs ST, which might accelerate dormancy breaking and sprout outgrowth. The metabolic processes activated during tuber dormancy release were also supported by plant seed models. These results represent the first comprehensive picture of a large number of genes involved in the tuber dormancy release process. Copyright © 2015 Elsevier B.V. All rights reserved.

  11. Transcriptome profiling of bovine milk oligosaccharide metabolism genes using RNA-sequencing.

    Directory of Open Access Journals (Sweden)

    Saumya Wickramasinghe

    2011-04-01

    Full Text Available This study examines the genes coding for enzymes involved in bovine milk oligosaccharide metabolism by comparing the oligosaccharide profiles with the expressions of glycosylation-related genes. Fresh milk samples (n = 32) were collected from four Holstein and Jersey cows at days 1, 15, 90 and 250 of lactation and free milk oligosaccharide profiles were analyzed. RNA was extracted from milk somatic cells at days 15 and 250 of lactation (n = 12) and gene expression analysis was conducted by RNA-Sequencing. A list was created of 121 glycosylation-related genes involved in oligosaccharide metabolism pathways in bovine by analyzing the oligosaccharide profiles and performing an extensive literature search. No significant differences were observed in either oligosaccharide profiles or expressions of glycosylation-related genes between Holstein and Jersey cows. The highest concentrations of free oligosaccharides were observed in the colostrum samples and a sharp decrease was observed in the concentration of free oligosaccharides on day 15, followed by progressive decrease on days 90 and 250. Ninety-two glycosylation-related genes were expressed in milk somatic cells. Most of these genes exhibited higher expression in day 250 samples indicating increases in net glycosylation-related metabolism in spite of decreases in free milk oligosaccharides in late lactation milk. Even though fucosylated free oligosaccharides were not identified, gene expression indicated the likely presence of fucosylated oligosaccharides in bovine milk. Fucosidase genes were expressed in milk and a possible explanation for not detecting fucosylated free oligosaccharides is the degradation of large fucosylated free oligosaccharides by the fucosidases. Detailed characterization of enzymes encoded by the 92 glycosylation-related genes identified in this study will provide the basic knowledge for metabolic network analysis of oligosaccharides in mammalian milk. These candidate

  12. Deep sequencing analysis of small noncoding RNA and mRNA targets of the global post-transcriptional regulator, Hfq

    DEFF Research Database (Denmark)

    Sittka, A; Lucchini, S; Papenfort, K

    2008-01-01

    Recent advances in high-throughput pyrosequencing (HTPS) technology now allow a thorough analysis of RNA bound to cellular proteins, and, therefore, of post-transcriptional regulons. We used HTPS to discover the Salmonella RNAs that are targeted by the common bacterial Sm-like protein, Hfq. Initial...... transcriptomic analysis revealed that Hfq controls the expression of almost a fifth of all Salmonella genes, including several horizontally acquired pathogenicity islands (SPI-1, -2, -4, -5), two sigma factor regulons, and the flagellar gene cascade. Subsequent HTPS analysis of 350,000 cDNAs, derived from RNA co...... would be rescued by overexpression of HilD and FlhDC, and we proved this to be correct. The combination of epitope-tagging and HTPS of immunoprecipitated RNA detected the expression of many intergenic chromosomal regions of Salmonella. Our approach overcomes the limited availability of high...

  13. Sample size calculation while controlling false discovery rate for differential expression analysis with RNA-sequencing experiments.

    Science.gov (United States)

    Bi, Ran; Liu, Peng

    2016-03-31

    RNA-Sequencing (RNA-seq) experiments have been popularly applied to transcriptome studies in recent years. Such experiments are still relatively costly. As a result, RNA-seq experiments often employ a small number of replicates. Power analysis and sample size calculation are challenging in the context of differential expression analysis with RNA-seq data. One challenge is that there are no closed-form formulae to calculate power for the popularly applied tests for differential expression analysis. In addition, false discovery rate (FDR), instead of family-wise type I error rate, is controlled for the multiple testing error in RNA-seq data analysis. So far, there are very few proposals on sample size calculation for RNA-seq experiments. In this paper, we propose a procedure for sample size calculation while controlling FDR for RNA-seq experimental design. Our procedure is based on the weighted linear model analysis facilitated by the voom method which has been shown to have competitive performance in terms of power and FDR control for RNA-seq differential expression analysis. We derive a method that approximates the average power across the differentially expressed genes, and then calculate the sample size to achieve a desired average power while controlling FDR. Simulation results demonstrate that the actual power of several popularly applied tests for differential expression is achieved and is close to the desired power for RNA-seq data with sample size calculated based on our method. Our proposed method provides an efficient algorithm to calculate sample size while controlling FDR for RNA-seq experimental design. We also provide an R package ssizeRNA that implements our proposed method and can be downloaded from the Comprehensive R Archive Network ( http://cran.r-project.org ).
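For readers unfamiliar with what "controlling FDR" means in practice, the sketch below shows the Benjamini-Hochberg step-up procedure, one standard way to control the false discovery rate across many tests. The abstract does not say which FDR procedure the authors use, so this choice is an assumption, and the sketch does not reproduce the sample size calculation or the ssizeRNA package. The function name and p-values are illustrative.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    at false discovery rate `alpha` (Benjamini-Hochberg step-up)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose p-value is at or below (k / m) * alpha.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            largest_k = rank
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Illustrative p-values for eight genes (not from the study).
toy_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
toy_rejected = benjamini_hochberg(toy_pvalues)
```

Only the two smallest p-values pass here, even though five are below 0.05, which is the kind of power loss that sample size planning under FDR control has to account for.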

  14. Microbial Dark Matter: Unusual intervening sequences in 16S rRNA genes of candidate phyla from the deep subsurface

    Energy Technology Data Exchange (ETDEWEB)

    Jarett, Jessica; Stepanauskas, Ramunas; Kieft, Thomas; Onstott, Tullis; Woyke, Tanja

    2014-03-17

    The Microbial Dark Matter project has sequenced genomes from over 200 single cells from candidate phyla, greatly expanding our knowledge of the ecology, inferred metabolism, and evolution of these widely distributed, yet poorly understood lineages. The second phase of this project aims to sequence an additional 800 single cells from known as well as potentially novel candidate phyla derived from a variety of environments. In order to identify whole genome amplified single cells, screening based on phylogenetic placement of 16S rRNA gene sequences is being conducted. Briefly, derived 16S rRNA gene sequences are aligned to a custom version of the Greengenes reference database and added to a reference tree in ARB using parsimony. In multiple samples from deep subsurface habitats but not from other habitats, a large number of sequences proved difficult to align and therefore to place in the tree. Based on comparisons to reference sequences and structural alignments using SSU-ALIGN, many of these "difficult" sequences appear to originate from candidate phyla, and contain intervening sequences (IVSs) within the 16S rRNA genes. These IVSs are short (39 - 79 nt) and do not appear to be self-splicing or to contain open reading frames. IVSs were found in the loop regions of stem-loop structures in several different taxonomic groups. Phylogenetic placement of sequences is strongly affected by IVSs; two out of three groups investigated were classified as different phyla after their removal. Based on data from samples screened in this project, IVSs appear to be more common in microbes occurring in deep subsurface habitats, although the reasons for this remain elusive.

  15. Ontogeny of hepatic energy metabolism genes in mice as revealed by RNA-sequencing.

    Directory of Open Access Journals (Sweden)

    Helen J Renaud

    Full Text Available The liver plays a central role in metabolic homeostasis by coordinating synthesis, storage, breakdown, and redistribution of nutrients. Hepatic energy metabolism is dynamically regulated throughout different life stages due to different demands for energy during growth and development. However, changes in gene expression patterns throughout ontogeny for factors important in hepatic energy metabolism are not well understood. We performed detailed transcript analysis of energy metabolism genes during various stages of liver development in mice. Livers from male C57BL/6J mice were collected at twelve ages, including perinatal and postnatal time points (n = 3/age). The mRNA was quantified by RNA-Sequencing, with transcript abundance estimated by Cufflinks. One thousand sixty energy metabolism genes were examined; 794 were above detection, of which 627 were significantly changed during at least one developmental age compared to adult liver. Two-way hierarchical clustering revealed three major clusters dependent on age: GD17.5-Day 5 (perinatal-enriched), Day 10-Day 20 (pre-weaning-enriched), and Day 25-Day 60 (adolescence/adulthood-enriched). Clustering analysis of cumulative mRNA expression values for individual pathways of energy metabolism revealed three patterns of enrichment: glycolysis, ketogenesis, and glycogenesis were all perinatally-enriched; glycogenolysis was the only pathway enriched during pre-weaning ages; whereas lipid droplet metabolism, cholesterol and bile acid metabolism, gluconeogenesis, and lipid metabolism were all enriched in adolescence/adulthood. This study reveals novel findings such as the divergent expression of the fatty acid β-oxidation enzymes Acyl-CoA oxidase 1 and Carnitine palmitoyltransferase 1a, indicating a switch from mitochondrial to peroxisomal β-oxidation after weaning; as well as the dynamic ontogeny of genes implicated in obesity such as Stearoyl-CoA desaturase 1 and Elongation of very long chain fatty

  16. Analysis and Prediction of Exon Skipping Events from RNA-Seq with Sequence Information Using Rotation Forest

    Directory of Open Access Journals (Sweden)

    Xiuquan Du

    2017-12-01

    Full Text Available In bioinformatics, exon skipping (ES) event prediction is an essential part of alternative splicing (AS) event analysis. Although many methods have been developed to predict ES events, a solution has yet to be found. In this study, given the limitations of machine learning algorithms with RNA-Seq data or genome sequences, a new feature, called RS (RNA-seq and sequence) features, was constructed. These features include RNA-Seq features derived from the RNA-Seq data and sequence features derived from genome sequences. We propose a novel Rotation Forest classifier to predict ES events with the RS features (RotaF-RSES). To validate the efficacy of RotaF-RSES, a dataset from two human tissues was used, and RotaF-RSES achieved an accuracy of 98.4%, a specificity of 99.2%, a sensitivity of 94.1%, and an area under the curve (AUC) of 98.6%. When compared to the other available methods, the results indicate that RotaF-RSES is efficient and can predict ES events with RS features.
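The accuracy, specificity and sensitivity reported in this record follow from the counts in a binary confusion matrix. The sketch below shows the standard definitions; the counts are invented for illustration and are not the study's data, and the function name is hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Illustrative counts: 10 true exon-skipping events, 20 non-events.
toy_metrics = classification_metrics(tp=8, fp=2, tn=18, fn=2)
```

When positives are much rarer than negatives, accuracy is dominated by specificity, which is consistent with the pattern of the three percentages quoted above.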

  17. 16S rRNA partial gene sequencing for the differentiation and molecular subtyping of Listeria species.

    Science.gov (United States)

    Hellberg, Rosalee S; Martin, Keely G; Keys, Ashley L; Haney, Christopher J; Shen, Yuelian; Smiley, R Derike

    2013-12-01

    Use of 16S rRNA partial gene sequencing within the regulatory workflow could greatly reduce the time and labor needed for confirmation and subtyping of Listeria monocytogenes. The goal of this study was to build a 16S rRNA partial gene reference library for Listeria spp. and investigate the potential for 16S rRNA molecular subtyping. A total of 86 isolates of Listeria representing L. innocua, L. seeligeri, L. welshimeri, and L. monocytogenes were obtained for use in building the custom library. Seven non-Listeria species and three additional strains of Listeria were obtained for use in exclusivity and food spiking tests. Isolates were sequenced for the partial 16S rRNA gene using the MicroSeq ID 500 Bacterial Identification Kit (Applied Biosystems). High-quality sequences were obtained for 84 of the custom library isolates and 23 unique 16S sequence types were discovered for use in molecular subtyping. All of the exclusivity strains were negative for Listeria and the three Listeria strains used in food spiking were consistently recovered and correctly identified at the species level. The spiking results also allowed for differentiation beyond the species level, as 87% of replicates for one strain and 100% of replicates for the other two strains consistently matched the same 16S type. Copyright © 2013 Elsevier Ltd. All rights reserved.

  18. Identifying transcriptional miRNA biomarkers by integrating high-throughput sequencing and real-time PCR data

    NARCIS (Netherlands)

    S. Rahmann (Sven); M. Martin; J.H. Schulte (Johannes); J. Köster (Johannes); T. Marschall (Tobias); A. Schramm (Alexander)

    2013-01-01

    Using both high-throughput sequencing and real-time PCR, the miRNA transcriptome can be analyzed in complementary ways. We describe the necessary bioinformatics pipeline, including software tools, and key methodological steps in the process, such as adapter removal, read mapping,

  19. DeAnnIso: a tool for online detection and annotation of isomiRs from small RNA sequencing data.

    Science.gov (United States)

    Zhang, Yuanwei; Zang, Qiguang; Zhang, Huan; Ban, Rongjun; Yang, Yifan; Iqbal, Furhan; Li, Ao; Shi, Qinghua

    2016-07-08

    Small RNA (sRNA) sequencing technology has revealed that microRNAs (miRNAs) are capable of exhibiting frequent variations from their canonical sequences, generating multiple variants: the isoforms of miRNAs (isomiRs). However, an integrated tool to precisely detect and systematically annotate isomiRs from sRNA sequencing data is still in great demand. Here, we present an online tool, DeAnnIso (Detection and Annotation of IsomiRs from sRNA sequencing data). DeAnnIso can detect all the isomiRs in an uploaded sample, and can extract the differentially expressed isomiRs from paired or multiple samples. Once isomiR detection is accomplished, detailed annotation information, including isomiR expression, isomiR classification, SNPs in miRNAs and tissue-specific isomiR expression, is provided to users. Furthermore, DeAnnIso provides a comprehensive module of target analysis and enrichment analysis for the selected isomiRs. Taken together, DeAnnIso is convenient for users to screen for isomiRs of their interest and useful for further functional studies. The server is implemented in PHP + Perl + R and available to all users for free at: http://mcg.ustc.edu.cn/bsc/deanniso/ and http://mcg2.ustc.edu.cn/bsc/deanniso/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Microbial community structure of Arctic multiyear sea ice and surface seawater by 454 sequencing of the 16S RNA gene

    DEFF Research Database (Denmark)

    Bowman, Jeff S.; Rasmussen, Simon; Blom, Nikolaj

    2011-01-01

    community in MYI at two sites near the geographic North Pole using parallel tag sequencing of the 16S rRNA gene. Although the composition of the MYI microbial community has been characterized by previous studies, microbial community structure has not been. Although richness was lower in MYI than...

  1. Direct 16S rRNA gene sequencing of polymicrobial culture-negative samples with analysis of mixed chromatograms

    DEFF Research Database (Denmark)

    Hartmeyer, Gitte N; Justesen, Ulrik S

    2010-01-01

    Two cases involving polymicrobial culture-negative samples were investigated by 16S rRNA gene sequencing, with analysis of mixed chromatograms. Fusobacterium necrophorum, Prevotella intermedia and Streptococcus constellatus were identified from pleural fluid in a patient with Lemierre's syndrome...

  2. A high-throughput method to detect RNA profiling by integration of RT-MLPA with next generation sequencing technology.

    Science.gov (United States)

    Wang, Jing; Yang, Xue; Chen, Haofeng; Wang, Xuewei; Wang, Xiangyu; Fang, Yi; Jia, Zhenyu; Gao, Jidong

    2017-07-11

    RNA in formalin-fixed and paraffin-embedded (FFPE) tissues provides a large amount of information indicating disease stages, histological tumor types and grades, as well as clinical outcomes. However, detection of RNA expression levels in formalin-fixed and paraffin-embedded samples is extremely difficult due to poor RNA quality. Here we developed a high-throughput method, Reverse Transcription-Multiple Ligation-dependent Probe Sequencing (RT-MLPSeq), to determine expression levels of multiple transcripts in FFPE samples. By combining the Reverse Transcription-Multiple Ligation-dependent Amplification method and next generation sequencing technology, RT-MLPSeq overcomes the limit of probe length in the multiplex ligation-dependent probe amplification assay and thus could detect expression levels of transcripts without quantitative limitations. We proved that different RT-MLPSeq probes targeting the same transcripts have highly consistent results and that the starting RNA/cDNA input could be as little as 1 ng. RT-MLPSeq also presented relative RNA levels of 13 selected genes that were consistent with reverse transcription quantitative PCR. Finally, we demonstrated the application of the new RT-MLPSeq method by measuring the mRNA expression levels of 21 genes which can be used for accurate calculation of the breast cancer recurrence score - an index that has been widely used for managing breast cancer patients.

  3. Plant organellar DNA primase-helicase synthesizes RNA primers for organellar DNA polymerases using a unique recognition sequence.

    Science.gov (United States)

    Peralta-Castro, Antolín; Baruch-Torres, Noe; Brieba, Luis G

    2017-10-13

    DNA primases recognize single-stranded DNA (ssDNA) sequences to synthesize RNA primers during lagging-strand replication. Arabidopsis thaliana encodes an ortholog of the DNA primase-helicase from bacteriophage T7, dubbed AtTwinkle, that localizes in chloroplasts and mitochondria. Herein, we report that AtTwinkle synthesizes RNA primers from a 5'-(G/C)GGA-3' template sequence. Within this sequence, the underlined nucleotides are cryptic, meaning that they are essential for template recognition but are not instructional during RNA synthesis. Thus, in contrast to all primases characterized to date, the sequence recognized by AtTwinkle requires two nucleotides (5'-GA-3') as a cryptic element. The divergent zinc finger binding domain (ZBD) of the primase module of AtTwinkle may be responsible for template sequence recognition. During oligoribonucleotide synthesis, AtTwinkle shows a strong preference for rCTP as its initial ribonucleotide and a moderate preference for rGMP or rCMP incorporation during elongation. RNA products synthetized by AtTwinkle are efficiently used as primers for plant organellar DNA polymerases. In sum, our data strongly suggest that AtTwinkle primes organellar DNA polymerases during lagging strand synthesis in plant mitochondria and chloroplast following a primase-mediated mechanism. This mechanism contrasts to lagging-strand DNA replication in metazoan mitochondria, in which transcripts synthesized by mitochondrial RNA polymerase prime mitochondrial DNA polymerase γ. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
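The template recognition sequence reported in this record, 5'-(G/C)GGA-3', can be located in a single-stranded template with a simple pattern search. The sketch below only finds candidate sites by sequence; it makes no claim about actual priming efficiency. The function name and the toy template are hypothetical.

```python
import re

# (G/C)GGA written 5'->3'; the lookahead lets overlapping sites be reported.
RECOGNITION_SITE = re.compile(r"(?=([GC]GGA))")


def find_recognition_sites(template):
    """Return (0-based start, matched site) for each 5'-(G/C)GGA-3' occurrence."""
    return [(m.start(), m.group(1))
            for m in RECOGNITION_SITE.finditer(template.upper())]


# Toy ssDNA template containing one GGGA and one CGGA site.
toy_sites = find_recognition_sites("TTGGGATTCGGAAA")
```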

  4. Partial Sequencing of 16S rRNA Gene of Selected Staphylococcus aureus Isolates and its Antibiotic Resistance

    Directory of Open Access Journals (Sweden)

    Harsi Dewantari Kusumaningrum

    2016-08-01

    The choice of primer used in 16S rRNA sequencing for identification of Staphylococcus species found in food is important. This study aimed to characterize Staphylococcus aureus isolates by partial sequencing of the 16S rRNA gene employing primers 16sF, 63F or 1387R. The isolates were obtained from milk, egg dishes and chicken dishes and selected based on the presence of the sea gene, which is responsible for formation of enterotoxin-A. Antibiotic susceptibility of the isolates towards six antibiotics was also tested. The use of 16sF generally resulted in higher identity percentage and query coverage compared to sequencing with 63F or 1387R. BLAST results of all isolates, sequenced with 16sF, showed 99% homology to the complete genomes of four S. aureus strains with different characteristics of enterotoxin production and antibiotic resistance. Considering that all isolates carried the sea gene, as indicated by the occurrence of a 120 bp amplicon after PCR amplification using primers SEA1/SEA2, the isolates agreed most closely with S. aureus subsp. aureus ST288. This study indicated that 4 out of 8 selected isolates were resistant to streptomycin. 16S rRNA gene sequencing using 16sF is useful for identification of S. aureus. However, additional analysis, such as PCR employing a specific gene target, should give valuable supplementary information when a specific characteristic is expected.

  5. [Nucleotide sequences of 5S rRNA genes of polyploid species of wheat and Aegilops species].

    Science.gov (United States)

    Vakhitov, V A; Gimalov, F R; Shumiatskiĭ, G P

    1989-01-01

    Primary structures of 5S rRNA genes and of non-transcribed spacers between them were determined in families of 5S DNA repeats 420 and 500 b.p. long in 8 wheat and Aegilops species. The high conservatism of sequences coding for 5S rRNA, 3'- and 5'-ends of non-transcribed spacers was shown not to depend on the evolutional position, ploidy level and genomic composition of species. The activity of transcription of 5S rRNA cloned genes was determined in vitro. The functional heterogeneity was revealed in each family of repeats due to the existence of exchanges of separate nucleotides within the internal transcription control region. A greater deficiency of CpG dinucleotide was revealed in 5S rRNA genes than in non-transcribed spacers.

  6. Small RNA and Transcriptome Sequencing Reveal a Potential miRNA-Mediated Interaction Network That Functions during Somatic Embryogenesis in Lilium pumilum DC. Fisch.

    Science.gov (United States)

    Zhang, Jing; Xue, Bingyang; Gai, Meizhu; Song, Shengli; Jia, Nana; Sun, Hongmei

    2017-01-01

    Plant somatic embryos are widely used in the fields of germplasm conservation, breeding for genetic engineering and artificial seed production. MicroRNAs (miRNAs) play pivotal roles in somatic embryogenesis (SE) regulation. However, their regulatory roles during various stages of SE remain unclear. In this study, six types of embryogenic samples of Lilium pumilum DC. Fisch., including organogenic callus, embryogenic callus induced for 4 weeks, embryogenic callus induced for 6 weeks, globular embryos, torpedo embryos and cotyledon embryos, were prepared for small RNA sequencing. The results revealed a total of 2,378,760 small RNA reads, among which the most common size was 24 nt. Four hundred and fifty-two known miRNAs, belonging to more than 86 families, 57 novel miRNAs and 40 miRNA*s were identified. The 86 known miRNA families were sorted according to an alignment with their homologs across 24 land plants into the following four categories: 23 highly conserved, 4 moderately conserved, 15 less conserved and 44 species-specific miRNAs. Differentially expressed known miRNAs were identified during various stages of SE. Subsequently, the expression levels of 12 differentially expressed miRNAs and 4 targets were validated using qRT-PCR. In addition, six samples were mixed in equal amounts for transcript sequencing, and the sequencing data were used as transcripts for miRNA target prediction. A total of 66,422 unigenes with an average length of 800 bp were assembled from 56,258,974 raw reads. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment indicated that 38,004 and 15,497 unigenes were successfully assigned to GO terms and KEGG pathways, respectively. Among the unigenes, 2,182 transcripts were predicted to be targets for 396 known miRNAs. The potential targets of the identified miRNAs were mostly classified into the following GO terms: cell, binding and metabolic process. 
Enriched KEGG analysis demonstrated that carbohydrate metabolism

  7. Patterns of homoeologous gene expression shown by RNA sequencing in hexaploid bread wheat.

    KAUST Repository

    Leach, Lindsey J

    2014-04-11

    BACKGROUND: Bread wheat (Triticum aestivum) has a large, complex and hexaploid genome consisting of A, B and D homoeologous chromosome sets. Therefore each wheat gene potentially exists as a trio of A, B and D homoeoloci, each of which may contribute differentially to wheat phenotypes. We describe a novel approach combining wheat cytogenetic resources (chromosome substitution 'nullisomic-tetrasomic' lines) with next generation deep sequencing of gene transcripts (RNA-Seq), to directly and accurately identify homoeologue-specific single nucleotide variants and quantify the relative contribution of individual homoeoloci to gene expression. RESULTS: We discover, based on a sample comprising ~5-10% of the total wheat gene content, that at least 45% of wheat genes are expressed from all three distinct homoeoloci. Most of these genes show strikingly biased expression patterns in which expression is dominated by a single homoeolocus. The remaining ~55% of wheat genes are expressed from either one or two homoeoloci only, through a combination of extensive transcriptional silencing and homoeolocus loss. CONCLUSIONS: We conclude that wheat is tending towards functional diploidy, through a variety of mechanisms causing single homoeoloci to become the predominant source of gene transcripts. This discovery has profound consequences for wheat breeding and our understanding of wheat evolution.

  8. Single-Cell RNA Sequencing Identifies Extracellular Matrix Gene Expression by Pancreatic Circulating Tumor Cells

    Directory of Open Access Journals (Sweden)

    David T. Ting

    2014-09-01

    Circulating tumor cells (CTCs) are shed from primary tumors into the bloodstream, mediating the hematogenous spread of cancer to distant organs. To define their composition, we compared genome-wide expression profiles of CTCs with matched primary tumors in a mouse model of pancreatic cancer, isolating individual CTCs using epitope-independent microfluidic capture, followed by single-cell RNA sequencing. CTCs clustered separately from primary tumors and tumor-derived cell lines, showing low-proliferative signatures, enrichment for the stem-cell-associated gene Aldh1a2, biphenotypic expression of epithelial and mesenchymal markers, and expression of Igfbp5, a gene transcript enriched at the epithelial-stromal interface. Mouse as well as human pancreatic CTCs exhibit a very high expression of stromal-derived extracellular matrix (ECM) proteins, including SPARC, whose knockdown in cancer cells suppresses cell migration and invasiveness. The aberrant expression by CTCs of stromal ECM genes points to their contribution of microenvironmental signals for the spread of cancer to distant organs.

  9. Single-cell RNA sequencing identifies extracellular matrix gene expression by pancreatic circulating tumor cells.

    Science.gov (United States)

    Ting, David T; Wittner, Ben S; Ligorio, Matteo; Vincent Jordan, Nicole; Shah, Ajay M; Miyamoto, David T; Aceto, Nicola; Bersani, Francesca; Brannigan, Brian W; Xega, Kristina; Ciciliano, Jordan C; Zhu, Huili; MacKenzie, Olivia C; Trautwein, Julie; Arora, Kshitij S; Shahid, Mohammad; Ellis, Haley L; Qu, Na; Bardeesy, Nabeel; Rivera, Miguel N; Deshpande, Vikram; Ferrone, Cristina R; Kapur, Ravi; Ramaswamy, Sridhar; Shioda, Toshi; Toner, Mehmet; Maheswaran, Shyamala; Haber, Daniel A

    2014-09-25

    Circulating tumor cells (CTCs) are shed from primary tumors into the bloodstream, mediating the hematogenous spread of cancer to distant organs. To define their composition, we compared genome-wide expression profiles of CTCs with matched primary tumors in a mouse model of pancreatic cancer, isolating individual CTCs using epitope-independent microfluidic capture, followed by single-cell RNA sequencing. CTCs clustered separately from primary tumors and tumor-derived cell lines, showing low-proliferative signatures, enrichment for the stem-cell-associated gene Aldh1a2, biphenotypic expression of epithelial and mesenchymal markers, and expression of Igfbp5, a gene transcript enriched at the epithelial-stromal interface. Mouse as well as human pancreatic CTCs exhibit a very high expression of stromal-derived extracellular matrix (ECM) proteins, including SPARC, whose knockdown in cancer cells suppresses cell migration and invasiveness. The aberrant expression by CTCs of stromal ECM genes points to their contribution of microenvironmental signals for the spread of cancer to distant organs. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.

  10. Diversity of DNA and RNA Viruses in Indoor Air As Assessed via Metagenomic Sequencing.

    Science.gov (United States)

    Rosario, Karyna; Fierer, Noah; Miller, Shelly; Luongo, Julia; Breitbart, Mya

    2018-02-06

    Diverse bacterial and fungal communities inhabit human-occupied buildings and circulate in indoor air; however, viral diversity in these man-made environments remains largely unknown. Here we investigated DNA and RNA viruses circulating in the air of 12 university dormitory rooms by analyzing dust accumulated over a one-year period on heating, ventilation, and air conditioning (HVAC) filters. A metagenomic sequencing approach was used to determine the identity and diversity of viral particles extracted from the HVAC filters. We detected a broad diversity of viruses associated with a range of hosts, including animals, arthropods, bacteria, fungi, humans, plants, and protists, suggesting that disparate organisms can contribute to indoor airborne viral communities. Viral community composition and the distribution of human-infecting papillomaviruses and polyomaviruses were distinct in the different dormitory rooms, indicating that airborne viral communities are variable in human-occupied spaces and appear to reflect differential rates of viral shedding from room occupants. This work significantly expands the known airborne viral diversity found indoors, enabling the design of sensitive and quantitative assays to further investigate specific viruses of interest and providing new insight into the likely sources of viruses found in indoor air.

  11. Evolution of green plants as deduced from 5S rRNA sequences.

    Science.gov (United States)

    Hori, H; Lim, B L; Osawa, S

    1985-02-01

    We have constructed a phylogenic tree for green plants by comparing 5S rRNA sequences. The tree suggests that the emergence of most of the uni- and multicellular green algae such as Chlamydomonas, Spirogyra, Ulva, and Chlorella occurred in the early stage of green plant evolution. The branching point of Nitella is a little earlier than that of land plants and much later than that of the above green algae, supporting the view that Nitella-like green algae may be the direct precursor to land plants. The Bryophyta and the Pteridophyta separated from each other after emergence of the Spermatophyta. The result is consistent with the view that the Bryophyta evolved from ferns by degeneration. In the Pteridophyta, Psilotum (whisk fern) separated first, and a little later Lycopodium (club moss) separated from the ancestor common to Equisetum (horsetail) and Dryopteris (fern). This order is in accordance with the classical view. During the Spermatophyta evolution, the gymnosperms (Cycas, Ginkgo, and Metasequoia have been studied here) and the angiosperms (flowering plants) separated, and this was followed by the separation of Metasequoia and Cycas (cycad)/Ginkgo (maidenhair tree) on one branch and various flowering plants on the other.

  12. Transcriptome analysis of the Chinese giant salamander (Andrias davidianus) using RNA-sequencing

    Directory of Open Access Journals (Sweden)

    Yong Huang

    2017-12-01

    The Chinese giant salamander (Andrias davidianus) is an economically important animal with academic value. However, the genomic information of this species has been little studied. In our study, the transcripts of A. davidianus were obtained by RNA-seq to conduct a transcriptomic analysis. In total, 132,912 unigenes were generated with an average length of 690 bp and an N50 of 1263 bp by de novo assembly using Trinity software. Using a sequence similarity search against nine public databases (CDD, KOG, NR, NT, PFAM, Swiss-prot, TrEMBL, GO and KEGG), a total of 24,049, 18,406, 36,711, 15,858, 20,500, 27,515, 36,705, 28,879 and 10,958 unigenes were annotated, respectively. Of these, 6323 unigenes were annotated in all databases and 39,672 unigenes were annotated in at least one database. Searched against KEGG pathways, 10,958 unigenes were annotated and divided into 343 categories according to different pathways. In addition, we identified 29,790 SSRs. This study provides a valuable resource for understanding the transcriptomic information of A. davidianus and lays a foundation for further research on functional gene cloning, genomics, genetic diversity analysis and molecular marker exploitation in A. davidianus.
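
    The N50 statistic quoted for this assembly can be illustrated with a minimal sketch. It assumes the common definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length); the function name and the toy lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (common definition)."""
    ordered = sorted(lengths, reverse=True)   # longest sequences first
    half_total = sum(ordered) / 2             # half of the total assembled length
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # cumulative length reaches 50%
            return length
    raise ValueError("lengths must be non-empty")


# Toy example with made-up unigene lengths (bp), not data from the paper.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))
```

    Because long sequences dominate the cumulative sum, N50 can exceed the mean length, as it does for the 690 bp average and 1263 bp N50 reported above.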

  13. Phylogenetic relationships of the genus Eurytrema from domestic and wild animal based on 18S rRNA sequences.

    Science.gov (United States)

    Cai, Zhihua; Zhang, Yueling; Ye, Xiangqun

    2012-10-01

    Since Loss (1907) established the genus Eurytrema, there were more than eleven species found worldwide from America, Europe to Asia. Adult worms are generally found in pancreatic and bile ducts of wild and domestic ruminants. Some species from wild animal and domestic animal have already differentiated. In this study, we amplified and sequenced the partial 18S rRNA sequences of some Eurytrema species found in wild and domestic animals. The phylogenetic analysis was conducted to show the genetic relationship of these Eurytrema species. The results demonstrated that same species of Eurytrema from domestic animal and wild animal or from separated geological region have a considerable degree of genetic differentiation. Analysis of the 18S rRNA sequences indicated that Eurytrema fukienensis is an independent species and suggested that it may represent the intermediate species between wild and domestic animal.

  14. Herpesvirus telomerase RNA (vTR) with a mutated template sequence abrogates herpesvirus-induced lymphomagenesis.

    Directory of Open Access Journals (Sweden)

    Benedikt B Kaufer

    2011-10-01

    Telomerase reverse transcriptase (TERT) and telomerase RNA (TR) represent the enzymatically active components of telomerase. In the complex, TR provides the template for the addition of telomeric repeats to telomeres, a protective structure at the end of linear chromosomes. Human TR with a mutation in the template region has been previously shown to inhibit proliferation of cancer cells in vitro. In this report, we examined the effects of a mutation in the template of a virus encoded TR (vTR) on herpesvirus-induced tumorigenesis in vivo. For this purpose, we used the oncogenic avian herpesvirus Marek's disease virus (MDV) as a natural virus-host model for lymphomagenesis. We generated recombinant MDV in which the vTR template sequence was mutated from AATCCCAATC to ATATATATAT (vAU5) by two-step Red-mediated mutagenesis. Recombinant viruses harboring the template mutation replicated with kinetics comparable to parental and revertant viruses in vitro. However, mutation of the vTR template sequence completely abrogated virus-induced tumor formation in vivo, although the virus was able to undergo low-level lytic replication. To confirm that the absence of tumors was dependent on the presence of mutant vTR in the telomerase complex, a second mutation was introduced in vAU5 that targeted the P6.1 stem loop, a conserved region essential for vTR-TERT interaction. Absence of vTR-AU5 from the telomerase complex restored virus-induced lymphoma formation. To test if the attenuated vAU5 could be used as an effective vaccine against MDV, we performed vaccination-challenge studies and determined that vaccination with vAU5 completely protected chickens from lethal challenge with highly virulent MDV. Taken together, our results demonstrate (1) that mutation of the vTR template sequence can completely abrogate virus-induced tumorigenesis, likely by the inhibition of cancer cell proliferation, and (2) that this strategy could be used to generate novel vaccine candidates

  15. Forced selection of a human immunodeficiency virus type 1 variant that uses a non-self tRNA primer for reverse transcription: Involvement of viral RNA sequences and the reverse transcriptase enzyme

    NARCIS (Netherlands)

    Abbink, Truus E. M.; Beerens, Nancy; Berkhout, Ben

    2004-01-01

    Human immunodeficiency virus type 1 uses the tRNA(3)(Lys) molecule as a selective primer for reverse transcription. This primer specificity is imposed by sequence complementarity between the tRNA primer and two motifs in the viral RNA genome: the primer-binding site (PBS) and the primer activation

  16. Identification of Genetic Variation between Obligate Plant Pathogens Pseudoperonospora cubensis and P. humuli Using RNA Sequencing and Genotyping-By-Sequencing.

    Directory of Open Access Journals (Sweden)

    Carly F Summers

    RNA sequencing (RNA-seq) and genotyping-by-sequencing (GBS) were used for single nucleotide polymorphism (SNP) identification from two economically important obligate plant pathogens, Pseudoperonospora cubensis and P. humuli. Twenty isolates of P. cubensis and 19 isolates of P. humuli were genotyped using RNA-seq and GBS. Principal component analysis (PCA) of each data set showed genetic separation between the two species. Additionally, results supported previous findings that P. cubensis isolates from squash are genetically distinct from cucumber and cantaloupe isolates. A PCA-based procedure was used to identify SNPs correlated with the separation of the two species, with 994 and 4,231 PCA-correlated SNPs found within the RNA-seq and GBS data, respectively. The corresponding unigenes (n = 800) containing these potential species-specific SNPs were then annotated and 135 putative pathogenicity genes, including 3 effectors, were identified. The characterization of genes containing SNPs differentiating these two closely related downy mildew species may contribute to the development of improved detection and diagnosis strategies and improve our understanding of host specificity pathways.

  17. Identification of Genetic Variation between Obligate Plant Pathogens Pseudoperonospora cubensis and P. humuli Using RNA Sequencing and Genotyping-By-Sequencing.

    Science.gov (United States)

    Summers, Carly F; Gulliford, Colwyn M; Carlson, Craig H; Lillis, Jacquelyn A; Carlson, Maryn O; Cadle-Davidson, Lance; Gent, David H; Smart, Christine D

    2015-01-01

    RNA sequencing (RNA-seq) and genotyping-by-sequencing (GBS) were used for single nucleotide polymorphism (SNP) identification from two economically important obligate plant pathogens, Pseudoperonospora cubensis and P. humuli. Twenty isolates of P. cubensis and 19 isolates of P. humuli were genotyped using RNA-seq and GBS. Principal component analysis (PCA) of each data set showed genetic separation between the two species. Additionally, results supported previous findings that P. cubensis isolates from squash are genetically distinct from cucumber and cantaloupe isolates. A PCA-based procedure was used to identify SNPs correlated with the separation of the two species, with 994 and 4,231 PCA-correlated SNPs found within the RNA-seq and GBS data, respectively. The corresponding unigenes (n = 800) containing these potential species-specific SNPs were then annotated and 135 putative pathogenicity genes, including 3 effectors, were identified. The characterization of genes containing SNPs differentiating these two closely related downy mildew species may contribute to the development of improved detection and diagnosis strategies and improve our understanding of host specificity pathways.

  18. Metallothionein coding sequence identification and seasonal mRNA expression of detoxification genes in the bivalve Corbicula fluminea.

    Science.gov (United States)

    Bigot, Aurélie; Doyen, Périne; Vasseur, Paule; Rodius, François

    2009-02-01

    The aim of this study was to identify a metallothionein (MT) coding sequence from the freshwater bivalve Corbicula fluminea and to measure the seasonal transcriptional pattern of MT in parallel with several detoxification genes: superoxide dismutase (SOD), catalase (CAT), glutathione S-transferases (GST) and glutathione peroxidases (GPx), in the digestive gland and the gills of this bivalve during a 1-year period. We identified a C. fluminea MT complete cDNA sequence using RT-PCR and RACE-PCR. The amino acid sequence deduced from the coding sequence encodes for a protein of 73 amino acids containing 21 cysteine residues. This protein exhibits high identities and similarities with the MT sequences of numerous bivalves. MT, SOD, CAT, pi-GST and Se-GPx expression patterns did not exhibit major seasonal variations. A slight increase of MT was observed in July. Therefore, the mRNA expression of these five genes could be used as biomarkers for monitoring studies.

  19. microRNA Biomarker Discovery and High-Throughput DNA Sequencing Are Possible Using Long-term Archived Serum Samples.

    Science.gov (United States)

    Rounge, Trine B; Lauritzen, Marianne; Langseth, Hilde; Enerly, Espen; Lyle, Robert; Gislefoss, Randi E

    2015-09-01

    The impacts of long-term storage and varying preanalytical factors on the quality and quantity of DNA and miRNA from archived serum have not been fully assessed. Preanalytical and analytical variations and degradation may introduce bias in representation of DNA and miRNA and may result in loss or corruption of quantitative data. We have evaluated DNA and miRNA quantity, quality, and variability in samples stored up to 40 years using one of the oldest prospective serum collections in the world, the Janus Serumbank, a biorepository dedicated to cancer research. miRNAs are present and stable in archived serum samples frozen at -25°C for at least 40 years. Long-time storage did not reduce miRNA yields; however, varying preanalytical conditions had a significant effect and should be taken into consideration during project design. Of note, 500 μL serum yielded sufficient miRNA for qPCR and small RNA sequencing and on average 650 unique miRNAs were detected in samples from presumably healthy donors. Of note, 500 μL serum yielded sufficient DNA for whole-genome sequencing and subsequent SNP calling, giving a uniform representation of the genomes. DNA and miRNA are stable during long-term storage, making large prospectively collected serum repositories an invaluable source for miRNA and DNA biomarker discovery. Large-scale biomarker studies with long follow-up time are possible utilizing biorepositories with archived serum and state-of-the-art technology. ©2015 American Association for Cancer Research.

  20. Deep RNA sequencing reveals dynamic regulation of myocardial noncoding RNAs in failing human heart and remodeling with mechanical circulatory support.

    Science.gov (United States)

    Yang, Kai-Chien; Yamada, Kathryn A; Patel, Akshar Y; Topkara, Veli K; George, Isaac; Cheema, Faisal H; Ewald, Gregory A; Mann, Douglas L; Nerbonne, Jeanne M

    2014-03-04

    Microarrays have been used extensively to profile transcriptome remodeling in failing human heart, although the genomic coverage provided is limited and fails to provide a detailed picture of the myocardial transcriptome landscape. Here, we describe sequencing-based transcriptome profiling, providing comprehensive analysis of myocardial mRNA, microRNA (miRNA), and long noncoding RNA (lncRNA) expression in failing human heart before and after mechanical support with a left ventricular (LV) assist device (LVAD). Deep sequencing of RNA isolated from paired nonischemic (NICM; n=8) and ischemic (ICM; n=8) human failing LV samples collected before and after LVAD and from nonfailing human LV (n=8) was conducted. These analyses revealed high abundance of mRNA (37%) and lncRNA (71%) of mitochondrial origin. miRNASeq revealed 160 and 147 differentially expressed miRNAs in ICM and NICM, respectively, compared with nonfailing LV. Among these, only 2 (ICM) and 5 (NICM) miRNAs are normalized with LVAD. RNASeq detected 18 480, including 113 novel, lncRNAs in human LV. Among the 679 (ICM) and 570 (NICM) lncRNAs differentially expressed with heart failure, ≈10% are improved or normalized with LVAD. In addition, the expression signature of lncRNAs, but not miRNAs or mRNAs, distinguishes ICM from NICM. Further analysis suggests that cis-gene regulation represents a major mechanism of action of human cardiac lncRNAs. The myocardial transcriptome is dynamically regulated in advanced heart failure and after LVAD support. The expression profiles of lncRNAs, but not mRNAs or miRNAs, can discriminate failing hearts of different pathologies and are markedly altered in response to LVAD support. These results suggest an important role for lncRNAs in the pathogenesis of heart failure and in reverse remodeling observed with mechanical support.

  1. A new approach for separating low-molecular-weight RNA molecules by staircase electrophoresis in non-sequencing gels.

    Science.gov (United States)

    Velázquez, Encarna; Rivas, Raúl; del Villar, María; Valverde, Angel; Peix, Alvaro; Mateos, Pedro F; Velázquez, Enrique; Martínez-Molina, Eustoquio

    2006-05-01

    Low-molecular-weight (LMW) RNA profiles, which include ribosomal and transfer RNA molecules with similar small sizes, are molecular signatures of microorganisms with great potential in microbial identification. The greatest resolution of these profiles was achieved by staircase electrophoresis in sequencing gels. Nevertheless, this technique is difficult to use because it takes 7 h, the gels are large, and it is necessary to heat the system and to recycle the buffer to maintain the denaturing conditions and avoid smile effects. Most available sequencing slabs have no internal temperature control or homogenizing devices, which by contrast are present in some newly designed non-sequencing slabs. Nevertheless, these slabs present two important problems for separating LMW RNA molecules: the size of the gels is only 20 cm (instead of 40 cm) and the maximum voltage that can be reached is only 840 V (instead of 2400 V). Staircase electrophoresis follows a model in which the external polarization is incrementally modified with a constant time step value. In the present work, we experimentally confirmed that by reducing the time step and increasing the total number of steps a suitable resolution is achieved. Under these conditions, despite the smaller size of the gels and the lower values of the electric field, the intensity reaches higher values than in sequencing gels and the LMW RNA profiles are correctly separated in 5 h. The resolution of these profiles obtained in non-sequencing gels is similar to that obtained in sequencing ones, facilitating the analysis of large populations of microorganisms in any laboratory.
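
    The staircase model described in this abstract (the applied voltage is changed in increments, each held for the same time step) can be sketched as a simple schedule generator. This is only an illustration under stated assumptions: the equal voltage increments, the 100 V starting value and the number of steps are hypothetical; only the 840 V ceiling and the 5 h run time come from the abstract.

```python
def staircase_schedule(v_start, v_end, n_steps, total_minutes):
    """Return (start_minute, voltage) pairs for an evenly stepped voltage ramp.

    Assumes equal voltage increments and a constant time step; the actual
    protocol in the paper may use different increments.
    """
    if n_steps < 2:
        raise ValueError("need at least two steps to form a staircase")
    step_minutes = total_minutes / n_steps       # constant time step
    dv = (v_end - v_start) / (n_steps - 1)       # equal voltage increment
    return [(i * step_minutes, v_start + i * dv) for i in range(n_steps)]


# Hypothetical run: 5 steps over 300 min (5 h), ending at the 840 V limit.
for start, volts in staircase_schedule(100, 840, 5, 300):
    print(f"t = {start:5.0f} min  ->  {volts:5.0f} V")
```

    Raising n_steps while keeping total_minutes fixed shortens each step, which mirrors the adjustment the authors report (a shorter time step and more steps).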

  2. A comprehensive evaluation of the sl1p pipeline for 16S rRNA gene sequencing analysis.

    Science.gov (United States)

    Whelan, Fiona J; Surette, Michael G

    2017-08-14

    Advances in next-generation sequencing technologies have allowed for detailed, molecular-based studies of microbial communities such as the human gut, soil, and ocean waters. Sequencing of the 16S rRNA gene, specific to prokaryotes, using universal PCR primers has become a common approach to studying the composition of these microbiota. However, the bioinformatic processing of the resulting millions of DNA sequences can be challenging, and a standardized protocol would aid in reproducible analyses. The short-read library 16S rRNA gene sequencing pipeline (sl1p, pronounced "slip") was designed with the purpose of mitigating this lack of reproducibility by combining pre-existing tools into a computational pipeline. This pipeline automates the processing of raw 16S rRNA gene sequencing data to create human-readable tables, graphs, and figures to make the collected data more readily accessible. Data generated from mock communities were compared using eight OTU clustering algorithms, two taxon assignment approaches, and three 16S rRNA gene reference databases. While all of these algorithms and options are available to sl1p users, through testing with human-associated mock communities, AbundantOTU+, the RDP Classifier, and the Greengenes 2011 reference database were chosen as sl1p's defaults based on their ability to best represent the known input communities. sl1p promotes reproducible research by providing a comprehensive log file, and reduces the computational knowledge needed by the user to process next-generation sequencing data. sl1p is freely available at https://bitbucket.org/fwhelan/sl1p .

  3. Concordance between RNA-sequencing data and DNA microarray data in transcriptome analysis of proliferative and quiescent fibroblasts

    Science.gov (United States)

    Trost, Brett; Moir, Catherine A.; Gillespie, Zoe E.; Kusalik, Anthony; Mitchell, Jennifer A.; Eskiw, Christopher H.

    2015-01-01

    DNA microarrays and RNA sequencing (RNA-seq) are major technologies for performing high-throughput analysis of transcript abundance. Recently, concerns have been raised regarding the concordance of data derived from the two techniques. Using cDNA libraries derived from normal human foreskin fibroblasts, we measured changes in transcript abundance as cells transitioned from proliferative growth to quiescence using both DNA microarrays and RNA-seq. The internal reproducibility of the RNA-seq data was greater than that of the microarray data. Correlations between the RNA-seq data and the individual microarrays were low, but correlations between the RNA-seq values and the geometric mean of the microarray values were moderate. The two technologies had good agreement when considering probes with the largest (both positive and negative) fold change (FC) values. An independent technique, quantitative reverse-transcription PCR (qRT-PCR), was used to measure the FC of 76 genes between proliferative and quiescent samples, and a higher correlation was observed between the qRT-PCR data and the RNA-seq data than between the qRT-PCR data and the microarray data. PMID:26473061
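
    The comparison described in this abstract (fold change per gene, geometric mean across replicate microarrays, then correlation between platforms) can be sketched as follows. The abstract does not state which correlation coefficient was used, so Pearson correlation is an assumption here, and all function names and numbers are hypothetical illustrations rather than data from the study.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive values, e.g. replicate microarray intensities."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def log2_fold_change(quiescent, proliferative):
    """log2 ratio of abundance in quiescent versus proliferative cells."""
    return math.log2(quiescent / proliferative)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Made-up log2 fold changes for four genes on each platform.
rnaseq_fc = [2.1, -1.4, 0.3, 3.0]
array_fc = [1.8, -1.0, 0.1, 2.2]
print(round(pearson(rnaseq_fc, array_fc), 3))
```

    Working in log2 space treats up- and down-regulation symmetrically, which is why fold changes are usually compared on that scale.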

  4. pEVL: A Linear Plasmid for Generating mRNA IVT Templates With Extended Encoded Poly(A) Sequences

    Directory of Open Access Journals (Sweden)

    Alexandra E Grier

    2016-01-01

    Increasing demand for large-scale synthesis of in vitro transcribed (IVT) mRNA is being driven by the increasing use of mRNA for transient gene expression in cell engineering and therapeutic applications. An important determinant of IVT mRNA potency is the 3′ polyadenosine (poly(A)) tail, the length of which correlates with translational efficiency. However, present methods for generation of IVT mRNA rely on templates derived from circular plasmids or PCR products, in which homopolymeric tracts are unstable, thus limiting encoded poly(A) tail lengths to ≃120 base pairs (bp). Here, we have developed a novel method for generation of extended poly(A) tracts using a previously described linear plasmid system, pJazz. We find that linear plasmids can successfully propagate poly(A) tracts up to ≃500 bp in length for IVT mRNA production. We then modified pJazz by removing extraneous restriction sites, adding a T7 promoter sequence upstream from an extended multiple cloning site, and adding a unique type-IIS restriction site downstream from the encoded poly(A) tract to facilitate generation of IVT mRNA with precisely defined encoded poly(A) tracts and 3′ termini. The resulting plasmid, designated pEVL, can be used to generate IVT mRNA with consistent defined lengths and terminal residue(s).

  5. Morpholino spin-labeling for base-pair sequencing of a 3'-terminal RNA stem by proton homonuclear Overhauser enhancements: yeast ribosomal 5S RNA

    International Nuclear Information System (INIS)

    Lee, K.M.; Marshall, A.G.

    1987-01-01

    Base-pair sequences for 5S and 5.8S RNAs are not readily extracted from proton homonuclear nuclear Overhauser enhancement (NOE) connectivity experiments alone, due to extensive peak overlap in the downfield (11-15 ppm) proton NMR spectrum. In this paper, we introduce a new method for base-pair proton peak assignment for ribosomal RNAs, based upon the distance-dependent broadening of the resonances of base-pair protons spatially proximal to a paramagnetic group. Introduction of a nitroxide spin-label covalently attached to the 3'-terminal ribose provides an unequivocal starting point for base-pair hydrogen-bond proton NMR assignment. Subsequent NOE connectivities then establish the base-pair sequence for the terminal stem of a 5S RNA. Periodate oxidation of yeast 5S RNA, followed by reaction with 4-amino-2,2,6,6-tetramethylpiperidinyl-1-oxy (TEMPO-NH2) and sodium borohydride reduction, produces yeast 5S RNA specifically labeled with a paramagnetic nitroxide group at the 3'-terminal ribose. Comparison of the 500-MHz 1H NMR spectra of native and 3'-terminal spin-labeled yeast 5S RNA serves to identify the terminal base pair (G1 . C120) and its adjacent base pair (G2 . U119) on the basis of their proximity to the 3'-terminal spin-label. From that starting point, we have then identified (G . C, A . U, or G . U) and sequenced eight of the nine base pairs in the terminal helix via primary and secondary NOE's

  6. StralSV: assessment of sequence variability within similar 3D structures and application to polio RNA-dependent RNA polymerase

    Energy Technology Data Exchange (ETDEWEB)

    Zemla, A; Lang, D; Kostova, T; Andino, R; Zhou, C

    2010-11-29

    Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory - still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could overcome these difficulties and facilitate the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV, a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus and demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique or that shared structural similarity with structures that are distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected

  7. In vivo detection of RNA-binding protein interactions with cognate RNA sequences by fluorescence resonance energy transfer

    Czech Academy of Sciences Publication Activity Database

    Huranová, Martina; Jablonski, J.A.; Benda, Aleš; Hof, Martin; Staněk, David; Caputi, M.

    2009-01-01

    Vol. 15, No. 11 (2009), pp. 2063-2071. ISSN 1355-8382. R&D Projects: GA AV ČR KAN200520801; GA MŠk(CZ) LC06063. Institutional research plan: CEZ:AV0Z50520514; CEZ:AV0Z40400503. Keywords: FRET; FLIM; RNA-protein interactions. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 5.198 (2009).

  8. Evolutionary relationships among Japanese pond frogs inferred from mitochondrial DNA sequences of cytochrome b and 12S ribosomal RNA genes.

    Science.gov (United States)

    Sumida, M; Ogata, M; Kaneda, H; Yonekawa, H

    1998-04-01

    The evolutionary relationships among Japanese pond frogs (Rana nigromaculata, R. porosa porosa, and R. p. brevipoda) were investigated by analyzing nucleotide sequences of the mitochondrial cytochrome b (cyt b) and 12S rRNA genes. The nucleotide sequences of a 444-bp segment of the cyt b gene and a 410-bp segment of the 12S rRNA gene were determined by the PCR-direct sequencing method using 18 frogs from 13 populations of Japanese pond frogs, and phylogenetic trees were constructed by the neighbor-joining and maximum likelihood methods using R. catesbeiana as an outgroup. The sequenced 444-bp segment of the cyt b gene provided 69 variable sites, and the sequenced 410-bp segment of the 12S rRNA gene provided 21 variable sites. The numbers of nucleotide substitutions per site of the cyt b gene within the ingroup were 0.0022-0.0205 at the populational level, 0.0368-0.0462 at the racial or subspecific level, and 0.1038-0.1244 at the specific level, whereas those of the 12S rRNA gene were 0-0.0074 at the populational or subspecific level and 0.0378-0.0456 at the specific level. Most nucleotide substitutions within the ingroup occurred at the third codon position of the cyt b gene and were silent mutations. High frequencies of transitions relative to transversions were observed in the cyt b and 12S rRNA genes within the ingroup. The phylogenetic trees constructed from the nucleotide sequences of the cyt b gene showed that, after the outgroup R. catesbeiana separated from the ingroup frogs, the Japanese pond frogs diverged into R. nigromaculata and R. porosa, and the latter then diverged into R. p. porosa, R. p. brevipoda (the typical Okayama race), and the Nagoya race of R. p. porosa. The phylogenetic trees constructed from the nucleotide sequences of the 12S rRNA gene also showed distinct divergence between the two species, but no divergence within species.
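
    The quantities reported above (substitutions per site, and transitions relative to transversions) can be illustrated with a minimal sketch. It assumes two pre-aligned sequences and computes the uncorrected proportion of differing sites; the abstract does not say which distance correction was used, so the values in the study need not equal this simple p-distance. The function name and example sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def substitution_summary(seq_a, seq_b):
    """Count transitions and transversions between two aligned DNA sequences.

    Returns (transitions, transversions, p_distance), where p_distance is the
    uncorrected proportion of differing sites among the compared positions.
    Positions with gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A transition stays within purines (A<->G) or within pyrimidines (C<->T).
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    p_distance = (transitions + transversions) / compared if compared else 0.0
    return transitions, transversions, p_distance

# Hypothetical 8-bp alignment: one A->G transition and one G->T transversion.
print(substitution_summary("ACGTACGT", "GCGTACTT"))
```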

  9. Molecular phylogenetic studies on an unnamed bovine Babesia sp. based on small subunit ribosomal RNA gene sequences.

    Science.gov (United States)

    Luo, Jianxun; Yin, Hong; Liu, Zhijie; Yang, Dongying; Guan, Guiquan; Liu, Aihong; Ma, Miling; Dang, Shengzhi; Lu, Bingyi; Sun, Caiqin; Bai, Qi; Lu, Wenshun; Chen, Puyan

    2005-10-10

    The 18S small subunit ribosomal RNA (18S rRNA) gene of an unnamed Babesia species (designated B. U sp.) was sequenced and analyzed in an attempt to distinguish it from other Babesia species in China. The target DNA segment was amplified by polymerase chain reaction (PCR). The PCR product was ligated to the pGEM-T Easy vector for sequencing. It was found that the lengths of the 18S rRNA genes of B. U sp. Kashi 1 and B. U sp. Kashi 2 were 1699 bp and 1689 bp, respectively. Two phylogenetic trees were inferred, based respectively on the 18S rRNA sequences of the Chinese bovine Babesia isolates and of all Babesia species available in GenBank. The first tree showed that B. U sp. was situated in the branch between B. major Yili and B. bovis Shannxian, and the second tree revealed that B. U sp. was confined to the same group as B. caballi. The percent identity of B. U sp. with other Chinese Babesia species was between 74.2 and 91.8, while the percent identity between the two B. U sp. isolates was 99.7. These results demonstrated that this B. U sp. is different from other Babesia species, but that the two B. U sp. isolates obtained with nymphal and adult Hyalomma anatolicum anatolicum ticks belong to the same species.
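
    Percent identity values such as those quoted above are computed from a pairwise alignment. The sketch below shows one common convention, in which columns with a gap in one sequence count as mismatches; alignment tools differ in how they treat gaps, and the abstract does not state which convention was used, so this is an assumption. The function name and sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; columns with a gap
    in only one sequence are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        columns += 1
        if a == b:
            matches += 1
    return 100.0 * matches / columns if columns else 0.0

# Hypothetical 10-column alignment with a single mismatch.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```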

  10. DNA sequencing reveals limited heterogeneity in the 16S rRNA gene from the rrnB operon among five Mycoplasma hominis isolates

    DEFF Research Database (Denmark)

    Mygind, T; Birkelund, Svend; Christiansen, Gunna

    1998-01-01

    To investigate the intraspecies heterogeneity within the 16S rRNA gene of Mycoplasma hominis, five isolates with diverse antigenic profiles, variable/identical P120 hypervariable domains, and different 16S rRNA gene RFLP patterns were analysed. The 16S rRNA gene from the rrnB operon was amplified by PCR and the PCR products were sequenced. Three isolates had identical 16S rRNA sequences and two isolates had sequences that differed from the others by only one nucleotide.

  11. Complete sequence of a double-stranded RNA from the phytopathogenic fungus Erysiphe cichoracearum that might represent a novel endornavirus.

    Science.gov (United States)

    Du, Zhenguo; Lin, Wenzhong; Qiu, Ping; Liu, Xiaojuan; Guo, Lingfang; Wu, Kangcheng; Zhang, Songbai; Wu, Zujian

    2016-08-01

    A double-stranded RNA (dsRNA), HBJZ1506, recovered from the phytopathogenic fungus Erysiphe cichoracearum infecting Calendula officinalis in Jingzhou, Hubei Province, China, was sequenced. HBJZ1506 comprises 11,908 nucleotides (nt) and contains an 11,859-nt-long open reading frame (ORF) coding for a polypeptide that is 61% identical to that of a putative endornavirus named grapevine endophyte endornavirus (GeEV). The putative polyprotein has an RNA-dependent RNA polymerase (RdRp) domain and an RNA helicase domain, which show homology to and have an arrangement that is similar to that of their counterparts in approved or putative endornaviruses. In a phylogenetic tree constructed using amino acid sequences of the RdRp region of HBJZ1506 and selected endornaviruses, HBJZ1506 clustered with endornaviruses and formed a well-supported monophyletic branch with GeEV. These results suggest that HBJZ1506 might represent a novel endornavirus, for which the name Erysiphe cichoracearum endornavirus (EcEV) is proposed.

  12. A comprehensive database of high-throughput sequencing-based RNA secondary structure probing data (Structure Surfer).

    Science.gov (United States)

    Berkowitz, Nathan D; Silverman, Ian M; Childress, Daniel M; Kazan, Hilal; Wang, Li-San; Gregory, Brian D

    2016-05-17

    RNA molecules fold into complex three-dimensional shapes, guided by the pattern of hydrogen bonding between nucleotides. This pattern of base pairing, known as RNA secondary structure, is critical to their cellular function. Recently several diverse methods have been developed to assay RNA secondary structure on a transcriptome-wide scale using high-throughput sequencing. Each approach has its own strengths and caveats, however there is no widely available tool for visualizing and comparing the results from these varied methods. To address this, we have developed Structure Surfer, a database and visualization tool for inspecting RNA secondary structure in six transcriptome-wide data sets from human and mouse ( http://tesla.pcbi.upenn.edu/strucuturesurfer/ ). The data sets were generated using four different high-throughput sequencing based methods. Each one was analyzed with a scoring pipeline specific to its experimental design. Users of Structure Surfer have the ability to query individual loci as well as detect trends across multiple sites. Here, we describe the included data sets and their differences. We illustrate the database's function by examining known structural elements and we explore example use cases in which combined data is used to detect structural trends. In total, Structure Surfer provides an easy-to-use database and visualization interface for allowing users to interrogate the currently available transcriptome-wide RNA secondary structure information for mammals.

  13. Analysis and prediction of translation rate based on sequence and functional features of the mRNA.

    Directory of Open Access Journals (Sweden)

    Tao Huang

    Protein concentrations depend not only on the mRNA level, but also on the translation rate and the degradation rate. Prediction of an mRNA's translation rate would provide valuable information for in-depth understanding of the translation mechanism and the dynamic proteome. In this study, we developed a new computational model to predict the translation rate, characterized by (1) integrating various sequence-derived and functional features, (2) applying the maximum relevance & minimum redundancy method and incremental feature selection to select features to optimize the prediction model, and (3) being able to classify the translation rate of an mRNA into a high or low category. The prediction accuracies under rich and starvation conditions were 68.8% and 70.0%, respectively, evaluated by jackknife cross-validation. It was found that the following features were correlated with translation rate: codon usage frequency, some gene ontology enrichment scores, number of RNA binding proteins known to bind its mRNA product, coding sequence length, protein abundance and 5'UTR free energy. These findings might provide useful information for understanding the mechanisms of translation and the dynamic proteome. Our translation rate prediction model might become a high-throughput tool for annotating the translation rate of mRNAs on a large scale.
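
    Jackknife (leave-one-out) cross-validation, the evaluation scheme named above, holds out each sample in turn, trains on the rest, and reports the fraction of held-out samples predicted correctly. The sketch below uses a 1-nearest-neighbour classifier purely as a stand-in, since the abstract does not name the classifier; the feature vectors and labels are hypothetical.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the training vector closest to `query` (squared Euclidean)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((p - q) ** 2 for p, q in zip(train_x[i], query)),
    )
    return train_y[best]

def jackknife_accuracy(features, labels):
    """Leave-one-out accuracy: each sample is predicted from all the others."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_neighbor_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)

# Hypothetical two-feature vectors (e.g. scaled codon usage and 5'UTR free energy).
features = [[0.9, 0.8], [0.8, 0.9], [0.85, 0.7], [0.1, 0.2], [0.2, 0.1], [0.15, 0.3]]
labels = ["high", "high", "high", "low", "low", "low"]
accuracy = jackknife_accuracy(features, labels)
print(f"jackknife accuracy: {accuracy:.3f}")
```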

  14. Thousands of primer-free, high-quality, full-length SSU rRNA sequences from all domains of life

    DEFF Research Database (Denmark)

    Karst, Soeren M; Dueholm, Morten S; McIlroy, Simon J

    2016-01-01

    Ribosomal RNA (rRNA) genes are the consensus marker for determination of microbial diversity on the planet, invaluable in studies of evolution and, for the past decade, high-throughput sequencing of variable regions of ribosomal RNA genes has become the backbone of most microbial ecology studies...

  15. Retrieval of a million high-quality, full-length microbial 16S and 18S rRNA gene sequences without primer bias

    DEFF Research Database (Denmark)

    Karst, Søren Michael; Dueholm, Morten Simonsen; McIlroy, Simon Jon

    2018-01-01

    Small subunit ribosomal RNA (SSU rRNA) genes, 16S in bacteria and 18S in eukaryotes, have been the standard phylogenetic markers used to characterize microbial diversity and evolution for decades. However, the reference databases of full-length SSU rRNA gene sequences are skewed to well-studied e...

  16. Differentially expressed gene transcripts using RNA sequencing from the blood of immunosuppressed kidney allograft recipients.

    Directory of Open Access Journals (Sweden)

    Casey Dorr

    We performed RNA sequencing (RNAseq) on peripheral blood mononuclear cells (PBMCs) to identify differentially expressed gene transcripts (DEGs) after kidney transplantation and after the start of immunosuppressive drugs. RNAseq is superior to microarray for determining DEGs because it is not limited to available probes, has increased sensitivity, and detects alternative and previously unknown transcripts. DEGs were determined in 32 adult kidney recipients, without clinical acute rejection (AR), treated with antibody induction, calcineurin inhibitor, mycophenolate, with and without steroids. Blood was obtained pre-transplant (baseline), and at week 1 and months 3 and 6 post-transplant. PBMCs were isolated, RNA extracted and gene expression measured using RNAseq. Principal components (PCs) were computed using a surrogate variable approach. DEGs post-transplant were identified by controlling the false discovery rate (FDR) at < 0.01 with at least a 2-fold change in expression from pre-transplant. The top 5 DEGs with higher levels of transcripts in blood at week 1 were TOMM40L, TMEM205, OLFM4, MMP8, and OSBPL9 compared to baseline. The top 5 DEGs with lower levels at week 1 post-transplant were IL7R, KLRC3, CD3E, CD3D, and KLRC2 (Striking Image) compared to baseline. The top pathways from genes with lower levels at 1 week post-transplant compared to baseline were T cell receptor signaling and iCOS-iCOSL signaling, while the top pathways from genes with higher levels than baseline were axonal guidance signaling and LXR/RXR activation. Gene expression signatures at month 3 were similar to week 1. DEGs at 6 months post-transplant create a different gene signature than week 1 or month 3 post-transplant. RNAseq analysis identified more DEGs with lower than higher levels in blood compared to baseline at week 1 and month 3. The number of DEGs decreased with time post-transplant. Further investigations to determine the specific lymphocyte(s) responsible for differential gene
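
    The selection rule stated above (FDR below 0.01 and at least a 2-fold change) can be sketched as follows. The abstract does not say which FDR procedure was used; the Benjamini-Hochberg adjustment here is an assumption, and the gene names, fold changes and p-values are hypothetical.

```python
import math

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_degs(genes, fdr_cutoff=0.01, min_fold=2.0):
    """Keep genes with adjusted p < fdr_cutoff and |log2 fold change| >= log2(min_fold).

    `genes` is a list of (name, fold_change, p_value) tuples, where fold_change
    is the post-transplant / baseline expression ratio.
    """
    q_values = benjamini_hochberg([p for _, _, p in genes])
    threshold = math.log2(min_fold)
    return [
        name
        for (name, fold_change, _), q in zip(genes, q_values)
        if q < fdr_cutoff and abs(math.log2(fold_change)) >= threshold
    ]

# Hypothetical input: two genes pass, one fails on fold change, one on FDR.
genes = [
    ("GENE_A", 4.0, 0.0001),
    ("GENE_B", 0.25, 0.0002),
    ("GENE_C", 1.2, 0.0001),
    ("GENE_D", 3.0, 0.5),
]
print(select_degs(genes))
```

    Taking the absolute log2 fold change treats a 2-fold decrease (ratio 0.5) the same as a 2-fold increase, which matches the abstract's reporting of DEGs with both higher and lower levels.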

  17. Characterization of the small RNA transcriptomes of androgen dependent and independent prostate cancer cell line by deep sequencing.

    Directory of Open Access Journals (Sweden)

    Gang Xu

    2010-11-01

    Given the important roles of miRNA in post-transcriptional regulation and its implications for cancer, characterization of miRNAs helps to uncover the molecular mechanisms underlying the progression of androgen-independent prostate cancer (PCa). The emergence of next-generation sequencing technologies has made sequencing rapid and cost-effective, which permits an unbiased, quantitative and in-depth investigation of the small RNA transcriptome. In this study, we used high-throughput Illumina sequencing to comprehensively represent the full complement of individual small RNAs and to characterize miRNA expression profiles in both the androgen-dependent and androgen-independent PCa cell lines. At least 83 miRNAs are significantly differentially expressed, of which 41 are up-regulated and 42 are down-regulated, indicating that these miRNAs may be involved in the transition of LNCaP to an androgen-independent phenotype. In addition, we have identified 43 novel miRNAs from the androgen-dependent and androgen-independent PCa libraries, 3 of which are specific to the androgen-independent PCa. Functional annotation of target genes indicated that most of these differentially expressed miRNAs tend to target genes involved in signal transduction and cell communication, especially the MAPK signaling pathway. The small RNA transcriptomes obtained in this study provide considerable insight into the expression and function of small RNAs in the development of androgen-independent prostate cancer.

  18. Phylogenetic relationships within the family Halomonadaceae based on comparative 23S and 16S rRNA gene sequence analysis.

    Science.gov (United States)

    de la Haba, Rafael R; Arahal, David R; Márquez, M Carmen; Ventosa, Antonio

    2010-04-01

    A phylogenetic study of the family Halomonadaceae was carried out based on complete 16S rRNA and 23S rRNA gene sequences. Several 16S rRNA genes of type strains were resequenced, and 28 new sequences of the 23S rRNA gene were obtained. Currently, the family includes nine genera (Carnimonas, Chromohalobacter, Cobetia, Halomonas, Halotalea, Kushneria, Modicisalibacter, Salinicola and Zymobacter). These genera are phylogenetically coherent except Halomonas, which is polyphyletic. This genus comprises two clearly distinguished clusters: group 1 includes Halomonas elongata (the type species) and the species Halomonas eurihalina, H. caseinilytica, H. halmophila, H. sabkhae, H. almeriensis, H. halophila, H. salina, H. organivorans, H. koreensis, H. maura and H. nitroreducens. Group 2 comprises the species Halomonas aquamarina, H. meridiana, H. axialensis, H. magadiensis, H. hydrothermalis, H. alkaliphila, H. venusta, H. boliviensis, H. neptunia, H. variabilis, H. sulfidaeris, H. subterranea, H. janggokensis, H. gomseomensis, H. arcis and H. subglaciescola. Halomonas salaria forms a cluster with Chromohalobacter salarius and the recently described genus Salinicola, and their taxonomic affiliation requires further study. More than 20 Halomonas species are phylogenetically not within the core constituted by the Halomonas sensu stricto cluster (group 1) or group 2 and, since their positions on the different phylogenetic trees are not stable, they cannot be recognized as additional groups either. In general, there is excellent agreement between the phylogenies based on the two rRNA gene sequences, but the 23S rRNA gene showed higher resolution in the differentiation of species of the family Halomonadaceae.

  19. Avian reovirus L2 genome segment sequences and predicted structure/function of the encoded RNA-dependent RNA polymerase protein

    Directory of Open Access Journals (Sweden)

    Xu Wanhong

    2008-12-01

    Background: The orthoreoviruses are infectious agents that possess a genome comprised of 10 double-stranded RNA segments encased in two concentric protein capsids. Like virtually all RNA viruses, an RNA-dependent RNA polymerase (RdRp) enzyme is required for viral propagation. RdRp sequences have been determined for the prototype mammalian orthoreoviruses and for several other closely-related reoviruses, including aquareoviruses, but have not yet been reported for any avian orthoreoviruses. Results: We determined the L2 genome segment nucleotide sequences, which encode the RdRp proteins, of two different avian reoviruses, strains ARV138 and ARV176, in order to define conserved and variable regions within reovirus RdRp proteins and to better delineate structure/function of this important enzyme. The ARV138 L2 genome segment was 3829 base pairs long, whereas the ARV176 L2 segment was 3830 nucleotides long. Both segments were predicted to encode λB RdRp proteins 1259 amino acids in length. Alignments of these newly-determined ARV genome segments, and their corresponding proteins, were performed with all currently available homologous mammalian reovirus (MRV) and aquareovirus (AqRV) genome segment and protein sequences. There was ~55% amino acid identity between ARV λB and MRV λ3 proteins, making the RdRp protein the most highly conserved of currently known orthoreovirus proteins, and there was ~28% identity between ARV λB and homologous MRV and AqRV RdRp proteins. Predictive structure/function mapping of identical and conserved residues within the known MRV λ3 atomic structure indicated most identical amino acids and conservative substitutions were located near and within predicted catalytic domains and lining RdRp channels, whereas non-identical amino acids were generally located on the molecule's surfaces. Conclusion: The ARV λB and MRV λ3 proteins showed the highest ARV:MRV identity values (~55%) amongst all currently known ARV and MRV

  20. Sequence analysis of RNA3 of Maize stripe virus associated with stripe disease of sorghum (Sorghum bicolor) in India

    Directory of Open Access Journals (Sweden)

    Kalanghad Puthankalam SRINIVAS

    2014-05-01

    Maize stripe virus (MSpV), one of the distinct species of the genus Tenuivirus, has been associated with stripe disease of sorghum in India. In this study, we report the complete sequence analysis of ambisense RNA3 of four MSpV isolates associated with this disease, to confirm its correct identity. The RNA3 of the four MSpV-Sorg isolates is 2357 nucleotides in length with two ORFs, one in virion sense (594 nucleotides, non-structural protein 3, NS3) and the other in complementary sense (951 nucleotides, coat protein, CP). The intergenic region between these two ORFs is 653 nucleotides in length and is rich in U and A residues. The deduced molecular weights of NS3 and CP are ≈22 and ≈34 kDa, respectively. RNA3 has ≈82% sequence identity at the nucleotide level with RNA3 of MSpV infecting maize in Florida, USA and Reunion. The NS3 and CP ORFs shared ≈94% and ≈95% identity at the amino acid level, respectively, with MSpV isolates of maize from Florida and Reunion. The internal non-coding region between the two ORFs has 67–68% identity at the nucleotide level with the reported MSpV isolates from Florida and Reunion. The sequence identity was more than ≈98% among the four isolates of MSpV-Sorg. Compared to maize-infecting MSpV isolates in USA and Reunion, the sorghum-infecting MSpV isolates in India had more amino acid substitutions in both NS3 and CP. This is the first report of complete sequence analysis of MSpV RNA3 from Asia.

  1. Sequences in the intergenic spacer influence RNA Pol I transcription from the human rRNA promoter

    Energy Technology Data Exchange (ETDEWEB)

    Li, W.M.; Sylvester, J.E. [Hahnemann Univ., Philadelphia, PA (United States)

    1994-09-01

    In most eucaryotic species, ribosomal genes are tandemly repeated about 100-5000 times per haploid genome. The 43 Kb human rDNA repeat consists of a 13 Kb coding region for the 18S, 5.8S, 28S ribosomal RNAs (rRNAs) and transcribed spacers separated by a 30 Kb intergenic spacer. For species such as frog, mouse and rat, sequences in the intergenic spacer other than the gene promoter have been shown to modulate transcription of the ribosomal gene. These sequences are spacer promoters, enhancers and the terminator for spacer transcription. We are addressing whether the human ribosomal gene promoter is similarly influenced. In-vitro transcription run-off assays have revealed that the 4.5 kb region (CBE), directly upstream of the gene promoter, has cis-stimulation and trans-competition properties. This suggests that the CBE fragment contains an enhancer(s) for ribosomal gene transcription. Further experiments have shown that a fragment (approximately 1.6 kb) within the CBE fragment also has trans-competition function. Deletion subclones of this region are being tested to delineate the exact sequences responsible for these modulating activities. Previous sequence analysis and functional studies have revealed that CBE contains regions of DNA capable of adopting alternative structures such as bent DNA, Z-DNA, and triple-stranded DNA. Whether these structures are required for modulating transcription remains to be determined as does the specific DNA-protein interaction involved.

  2. Deep sequencing uncovers commonality in small RNA profiles between transgene-induced and naturally occurring RNA silencing of chalcone synthase-A gene in petunia

    Science.gov (United States)

    2013-01-01

    Background: Introduction of a transgene that transcribes RNA homologous to an endogenous gene in the plant genome can induce silencing of both genes, a phenomenon termed cosuppression. Cosuppression was first discovered in transgenic petunia plants transformed with the CHS-A gene encoding chalcone synthase, in which nonpigmented sectors in flowers or completely white flowers are produced. Some of the flower-color patterns observed in transgenic petunias having CHS-A cosuppression resemble those in existing nontransgenic varieties. Although the mechanism by which white sectors are generated in nontransgenic petunia is known to be due to RNA silencing of the CHS-A gene as in cosuppression, whether the same trigger(s) and/or pattern of RNA degradation are involved in these phenomena has not been known. Here, we addressed this question using deep-sequencing and bioinformatic analyses of small RNAs. Results: We analyzed short interfering RNAs (siRNAs) produced in nonpigmented sectors of petal tissues in transgenic petunia plants that have CHS-A cosuppression and a nontransgenic petunia variety, Red Star, that has naturally occurring CHS-A RNA silencing. In both silencing systems, 21-nt and 22-nt siRNAs were the most and the second-most abundant size classes, respectively. CHS-A siRNA production was confined to exon 2, indicating that RNA degradation through the RNA silencing pathway occurred in this exon. Common siRNAs were detected in cosuppression and naturally occurring RNA silencing, and their ranks based on the number of siRNAs in these plants were correlated with each other. Noticeably, highly abundant siRNAs were common in these systems. Phased siRNAs were detected in multiple phases at multiple sites, and some of the ends of the regions that produced phased siRNAs were conserved. Conclusions: The features of siRNA production found to be common to cosuppression and naturally occurring silencing of the CHS-A gene indicate mechanistic similarities between these
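
    Ranking siRNA size classes by abundance, as in the 21-nt and 22-nt result above, amounts to tallying read lengths. The sketch below assumes adapter-trimmed reads already mapped to the gene of interest; the 18-26 nt window, the function name and the reads themselves are hypothetical choices made for illustration.

```python
from collections import Counter

def size_class_ranking(reads, min_len=18, max_len=26):
    """Return (length, count) pairs for small RNA reads, most abundant first.

    Reads outside the [min_len, max_len] window are ignored.
    """
    counts = Counter(len(read) for read in reads if min_len <= len(read) <= max_len)
    return counts.most_common()

# Hypothetical reads: three of 21 nt, two of 22 nt, one of 24 nt, one too short.
reads = ["A" * 21, "C" * 21, "G" * 21, "A" * 22, "U" * 22, "G" * 24, "A" * 10]
print(size_class_ranking(reads))
```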

  3. Alternative splicing of human elastin mRNA indicated by sequence analysis of cloned genomic and complementary DNA

    International Nuclear Information System (INIS)

    Indik, Z.; Yeh, H.; Ornstein-Goldstein, N.; Sheppard, P.; Anderson, N.; Rosenbloom, J.C.; Peltonen, L.; Rosenbloom, J.

    1987-01-01

    Poly(A) + RNA, isolated from a single 7-mo fetal human aorta, was used to synthesize cDNA by the RNase H method, and the cDNA was inserted into λgt10. Recombinant phage containing elastin sequences were identified by hybridization with cloned, exon-containing fragments of the human elastin gene. Three clones containing inserts of 3.3, 2.7, and 2.3 kilobases were selected for further analysis. Three overlapping clones containing 17.8 kilobases of the human elastin gene were also isolated from genomic libraries. Complete sequence analysis of the six clones demonstrated that: (i) the cDNA encompassed the entire translated portion of the mRNA encoding 786 amino acids, including several unusual hydrophilic amino acid sequences not previously identified in porcine tropoelastin, (ii) exons encoding either hydrophobic or crosslinking domains in the protein alternated in the gene, and (iii) a great abundance of Alu repetitive sequences occurred throughout the introns. The data also indicated substantial alternative splicing of the mRNA. These results suggest the potential for significant variation in the precise molecular structure of the elastic fiber in the human population

  4. Synthetic spike-in standards for high-throughput 16S rRNA gene amplicon sequencing.

    Science.gov (United States)

    Tourlousse, Dieter M; Yoshiike, Satowa; Ohashi, Akiko; Matsukura, Satoko; Noda, Naohiro; Sekiguchi, Yuji

    2017-02-28

    High-throughput sequencing of 16S rRNA gene amplicons (16S-seq) has become a widely deployed method for profiling complex microbial communities but technical pitfalls related to data reliability and quantification remain to be fully addressed. In this work, we have developed and implemented a set of synthetic 16S rRNA genes to serve as universal spike-in standards for 16S-seq experiments. The spike-ins represent full-length 16S rRNA genes containing artificial variable regions with negligible identity to known nucleotide sequences, permitting unambiguous identification of spike-in sequences in 16S-seq read data from any microbiome sample. Using defined mock communities and environmental microbiota, we characterized the performance of the spike-in standards and demonstrated their utility for evaluating data quality on a per-sample basis. Further, we showed that staggered spike-in mixtures added at the point of DNA extraction enable concurrent estimation of absolute microbial abundances suitable for comparative analysis. Results also underscored that template-specific Illumina sequencing artifacts may lead to biases in the perceived abundance of certain taxa. Taken together, the spike-in standards represent a novel bioanalytical tool that can substantially improve 16S-seq-based microbiome studies by enabling comprehensive quality control along with absolute quantification. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
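
    The idea of converting read counts to absolute abundances with a spike-in added at DNA extraction can be sketched in its simplest form: a single scaling factor derived from the known number of spike-in copies and the spike-in reads recovered. The study used staggered spike-in mixtures, so its actual calibration is likely more involved; the function name, taxon labels and numbers below are hypothetical.

```python
def absolute_abundance(read_counts, spike_in_reads, spike_in_copies):
    """Scale per-taxon read counts to estimated 16S rRNA gene copies.

    read_counts: mapping of taxon name to reads assigned to that taxon.
    spike_in_reads: reads assigned to the spike-in standard in the same sample.
    spike_in_copies: known number of spike-in copies added to the sample.
    """
    if spike_in_reads <= 0:
        raise ValueError("no spike-in reads recovered; cannot calibrate")
    copies_per_read = spike_in_copies / spike_in_reads
    return {taxon: reads * copies_per_read for taxon, reads in read_counts.items()}

# Hypothetical sample: 10,000 spike-in copies added, 100 spike-in reads recovered.
estimates = absolute_abundance({"taxon_A": 500, "taxon_B": 1500}, 100, 10000)
print(estimates)
```

    Because the factor is computed per sample, two samples with the same relative composition but different total microbial load yield different absolute estimates, which is what makes the values comparable across samples.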