WorldWideScience

Sample records for directional transcriptome sequencing

  1. De novo transcriptome sequence assembly from coconut leaves and seeds with a focus on factors involved in RNA-directed DNA methylation.

    Science.gov (United States)

    Huang, Ya-Yi; Lee, Chueh-Pai; Fu, Jason L; Chang, Bill Chia-Han; Matzke, Antonius J M; Matzke, Marjori

    2014-09-04

    Coconut palm (Cocos nucifera) is a symbol of the tropics and a source of numerous edible and nonedible products of economic value. Despite its nutritional and industrial significance, coconut remains under-represented in public repositories for genomic and transcriptomic data. We report de novo transcript assembly from RNA-seq data and analysis of gene expression in seed tissues (embryo and endosperm) and leaves of a dwarf coconut variety. Assembly of 10 GB sequencing data for each tissue resulted in 58,211 total unigenes in embryo, 61,152 in endosperm, and 33,446 in leaf. Within each unigene pool, 24,857 could be annotated in embryo, 29,731 could be annotated in endosperm, and 26,064 could be annotated in leaf. A KEGG analysis identified 138, 138, and 139 pathways, respectively, in transcriptomes of embryo, endosperm, and leaf tissues. Given the extraordinarily large size of coconut seeds and the importance of small RNA-mediated epigenetic regulation during seed development in model plants, we used homology searches to identify putative homologs of factors required for RNA-directed DNA methylation in coconut. The findings suggest that RNA-directed DNA methylation is important during coconut seed development, particularly in maturing endosperm. This dataset will expand the genomics resources available for coconut and provide a foundation for more detailed analyses that may assist molecular breeding strategies aimed at improving this major tropical crop. Copyright © 2014 Huang et al.

  2. Transcriptome Profiling Using Single-Molecule Direct RNA Sequencing Approach for In-depth Understanding of Genes in Secondary Metabolism Pathways of Camellia sinensis

    Directory of Open Access Journals (Sweden)

    Qingshan Xu

    2017-07-01

    Characteristic secondary metabolites, including flavonoids, theanine and caffeine, are important components of Camellia sinensis, and their biosynthesis has attracted widespread interest. Previous studies on the biosynthesis of these major secondary metabolites using next-generation sequencing technologies were limited in the accurate prediction of full-length (FL) splice isoforms. Herein, we applied single-molecule sequencing to pooled tea plant tissues to provide a more complete transcriptome of C. sinensis. Moreover, we identified 94 FL transcripts and four alternative splicing events for enzyme-coding genes involved in the biosynthesis of flavonoids, theanine and caffeine. By comparing long-read isoforms with assembled transcripts, we improved the quality and accuracy of genes sequenced by short-read next-generation sequencing technology. The resulting FL transcripts, together with the improved assembled transcripts and identified alternative splicing events, enhance our understanding of genes involved in the biosynthesis of characteristic secondary metabolites in C. sinensis.

  3. Survey of the transcriptome of Aspergillus oryzae via massively parallel mRNA sequencing

    OpenAIRE

    Wang, Bin; Guo, Guangwu; Wang, Chao; Lin, Ying; Wang, Xiaoning; Zhao, Mouming; Guo, Yong; He, Minghui; Zhang, Yong; Pan, Li

    2010-01-01

    Aspergillus oryzae, an important filamentous fungus used in food fermentation and the enzyme industry, has been shown through genome sequencing and various other tools to have prominent features in its genomic composition. However, the functional complexity of the A. oryzae transcriptome has not yet been fully elucidated. Here, we applied direct high-throughput paired-end RNA-sequencing (RNA-Seq) to the transcriptome of A. oryzae under four different culture conditions. With the high resoluti...

  4. Sequencing and analysis of the Mediterranean amphioxus (Branchiostoma lanceolatum) transcriptome.

    Directory of Open Access Journals (Sweden)

    Silvan Oulion

    BACKGROUND: The basally divergent phylogenetic position of amphioxus (Cephalochordata), as well as its conserved morphology, development and genetics, make it the best proxy for the chordate ancestor. In particular, studies using the amphioxus model help our understanding of vertebrate evolution and development. Thus, interest in the amphioxus model led to the characterization of both the transcriptome and complete genome sequence of the American species, Branchiostoma floridae. However, recent technical improvements allowing induction of spawning in the laboratory during the breeding season on a daily basis with the Mediterranean species Branchiostoma lanceolatum have encouraged European Evo-Devo researchers to adopt this species as a model even though no genomic or transcriptomic data have been available. To fill this need we used the pyrosequencing method to characterize the B. lanceolatum transcriptome and then compared our results with the published transcriptome of B. floridae. RESULTS: Starting with total RNA from nine different developmental stages of B. lanceolatum, a normalized cDNA library was constructed and sequenced on Roche GS FLX (Titanium mode). Around 1.4 million reads were produced and assembled into 70,530 contigs (average length of 490 bp). Overall 37% of the assembled sequences were annotated by BlastX and their Gene Ontology terms were determined. These results were then compared to genomic and transcriptomic data of B. floridae to assess similarities and specificities of each species. CONCLUSION: We obtained a high-quality amphioxus (B. lanceolatum) reference transcriptome using a high throughput sequencing approach. We found that 83% of the predicted genes in the B. floridae complete genome sequence are also found in the B. lanceolatum transcriptome, while only 41% were found in the B. floridae transcriptome obtained with traditional Sanger based sequencing. Therefore, given the high degree of sequence conservation

  5. Transcriptome sequences resolve deep relationships of the grape family.

    Science.gov (United States)

    Wen, Jun; Xiong, Zhiqiang; Nie, Ze-Long; Mao, Likai; Zhu, Yabing; Kan, Xian-Zhao; Ickert-Bond, Stefanie M; Gerrath, Jean; Zimmer, Elizabeth A; Fang, Xiao-Dong

    2013-01-01

    Previous phylogenetic studies of the grape family (Vitaceae) yielded poorly resolved deep relationships, thus impeding our understanding of the evolution of the family. Next-generation sequencing now offers access to protein coding sequences very easily, quickly and cost-effectively. To improve upon earlier work, we extracted 417 orthologous single-copy nuclear genes from the transcriptomes of 15 species of the Vitaceae, covering its phylogenetic diversity. The resulting transcriptome phylogeny provides robust support for the deep relationships, showing the phylogenetic utility of transcriptome data for plants over a time scale at least since the mid-Cretaceous. The pros and cons of transcriptome data for phylogenetic inference in plants are also evaluated.

  6. Transcriptome sequences resolve deep relationships of the grape family.

    Directory of Open Access Journals (Sweden)

    Jun Wen

    Previous phylogenetic studies of the grape family (Vitaceae) yielded poorly resolved deep relationships, thus impeding our understanding of the evolution of the family. Next-generation sequencing now offers access to protein coding sequences very easily, quickly and cost-effectively. To improve upon earlier work, we extracted 417 orthologous single-copy nuclear genes from the transcriptomes of 15 species of the Vitaceae, covering its phylogenetic diversity. The resulting transcriptome phylogeny provides robust support for the deep relationships, showing the phylogenetic utility of transcriptome data for plants over a time scale at least since the mid-Cretaceous. The pros and cons of transcriptome data for phylogenetic inference in plants are also evaluated.

  7. Sequencing and characterization of the guppy (Poecilia reticulata) transcriptome

    Directory of Open Access Journals (Sweden)

    Rodd F Helen

    2011-04-01

    Background: Next-generation sequencing is providing researchers with a relatively fast and affordable option for developing genomic resources for organisms that are not among the traditional genetic models. Here we present a de novo assembly of the guppy (Poecilia reticulata) transcriptome using 454 sequence reads, and we evaluate potential uses of this transcriptome, including detection of sex-specific transcripts and deployment as a reference for gene expression analysis in guppies and a related species. Guppies have been model organisms in ecology, evolutionary biology, and animal behaviour for over 100 years. An annotated transcriptome and other genomic tools will facilitate understanding the genetic and molecular bases of adaptation and variation in a vertebrate species with a uniquely well known natural history. Results: We generated approximately 336 Mbp of mRNA sequence data from male brain, male body, female brain, and female body. The resulting 1,162,670 reads assembled into 54,921 contigs, creating a reference transcriptome for the guppy with an average read depth of 28×. We annotated nearly 40% of this reference transcriptome by searching protein and gene ontology databases. Using this annotated transcriptome database, we identified candidate genes of interest to the guppy research community, putative single nucleotide polymorphisms (SNPs), and male-specific expressed genes. We also showed that our reference transcriptome can be used for RNA-sequencing-based analysis of differential gene expression. We identified transcripts that, in juveniles, are regulated differently in the presence and absence of an important predator, Rivulus hartii, including two genes implicated in stress response. For each sample in the RNA-seq study, >50% of high-quality reads mapped to unique sequences in the reference database with high confidence. In addition, we evaluated the use of the guppy reference transcriptome for gene expression analyses in

  8. Transcriptome sequencing of the Microarray Quality Control (MAQC) RNA reference samples using next generation sequencing

    Directory of Open Access Journals (Sweden)

    Thierry-Mieg Danielle

    2009-06-01

    Background: Transcriptome sequencing using next-generation sequencing platforms will soon be competing with DNA microarray technologies for global gene expression analysis. As a preliminary evaluation of these promising technologies, we performed deep sequencing of cDNA synthesized from the Microarray Quality Control (MAQC) reference RNA samples using Roche's 454 Genome Sequencer FLX. Results: We generated more than 3.6 million sequence reads of average length 250 bp for the MAQC A and B samples and introduced a data analysis pipeline for translating cDNA read counts into gene expression levels. Using BLAST, 90% of the reads mapped to the human genome and 64% of the reads mapped to the RefSeq database of well annotated genes with e-values ≤ 10^-20. We measured gene expression levels in the A and B samples by counting the numbers of reads that mapped to individual RefSeq genes in multiple sequencing runs to evaluate the MAQC quality metrics for reproducibility, sensitivity, specificity, and accuracy and compared the results with DNA microarrays and Quantitative RT-PCR (QRTPCR) from the MAQC studies. In addition, 88% of the reads were successfully aligned directly to the human genome using the AceView alignment programs with an average 90% sequence similarity to identify 137,899 unique exon junctions, including 22,193 new exon junctions not yet contained in the RefSeq database. Conclusion: Using the MAQC metrics for evaluating the performance of gene expression platforms, the ExpressSeq results for gene expression levels showed excellent reproducibility, sensitivity, and specificity that improved systematically with increasing shotgun sequencing depth, and quantitative accuracy that was comparable to DNA microarrays and QRTPCR. In addition, a careful mapping of the reads to the genome using the AceView alignment programs shed new light on the complexity of the human transcriptome including the discovery of thousands of new splice variants.

  9. Illumina-based de novo transcriptome sequencing and analysis

    Indian Academy of Sciences (India)

    In the present study, we used Illumina HiSeq technology to perform de novo assembly of heart and musk gland transcriptomes from the Chinese forest musk deer. A total of 239,383 transcripts and 176,450 unigenes were obtained, of which 37,329 unigenes were matched to known sequences in the NCBI nonredundant ...

  10. Comparison of next generation sequencing technologies for transcriptome characterization

    Directory of Open Access Journals (Sweden)

    Soltis Douglas E

    2009-08-01

    Background: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for the most complete and cost-effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis. Results: The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc (http://fgp.huck.psu.edu/NG_Sims/ngsim.pl), an online webtool, which allows users to explore the results of this study by specifying individualized costs and sequencing characteristics. Conclusion: NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. In terms of sequence coverage alone, the NG sequencing is a dramatic advance

  11. Bacillus anthracis genome organization in light of whole transcriptome sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Jeffrey; Zhu, Wenhan; Passalacqua, Karla D.; Bergman, Nicholas; Borodovsky, Mark

    2010-03-22

    Emerging knowledge of whole prokaryotic transcriptomes could validate a number of theoretical concepts introduced in the early days of genomics. What are the rules connecting gene expression levels with sequence determinants such as quantitative scores of promoters and terminators? Are translation efficiency measures, e.g., codon adaptation index and RBS score, related to gene expression? We used whole transcriptome shotgun sequencing of the bacterial pathogen Bacillus anthracis to assess the correlation of gene expression level with promoter, terminator and RBS scores, codon adaptation index, as well as with a new measure of gene translational efficiency, average translation speed. We compared computational predictions of operon topologies with the transcript borders inferred from RNA-Seq reads. Transcriptome mapping may also improve existing gene annotation. Upon assessing the accuracy of the current annotation of protein-coding genes in the B. anthracis genome, we have shown that the transcriptome data indicate the existence of more than a hundred genes missing from the annotation though predicted by an ab initio gene finder. Interestingly, we observed that many pseudogenes possess not only a sequence with detectable coding potential but also promoters that maintain transcriptional activity.

  12. Transcriptome sequencing and comparative transcriptome analysis of the scleroglucan producer Sclerotium rolfsii

    Directory of Open Access Journals (Sweden)

    Stahl Ulf

    2010-05-01

    Background: The plant pathogenic basidiomycete Sclerotium rolfsii produces the industrially exploited exopolysaccharide scleroglucan, a polymer that consists of (1 → 3)-β-linked glucose with a (1 → 6)-β-glycosyl branch on every third unit. Although the physicochemical properties of scleroglucan are well understood, almost nothing is known about the genetics of scleroglucan biosynthesis. Similarly, the biosynthetic pathway of oxalate, the main by-product during scleroglucan production, has not been elucidated yet. In order to provide a basis for genetic and metabolic engineering approaches, we studied scleroglucan and oxalate biosynthesis in S. rolfsii using different transcriptomic approaches. Results: Two S. rolfsii transcriptomes obtained from scleroglucan-producing and scleroglucan-nonproducing conditions were pooled and sequenced using the 454 pyrosequencing technique yielding ~350,000 reads. These could be assembled into 21,937 contigs and 171,833 singletons, of which 6,951 had significant matches in public protein databases. Sequence data were used to obtain first insights into the genomics of scleroglucan and oxalate production and to predict putative proteins involved in the synthesis of both metabolites. Using comparative transcriptomics, namely Agilent microarray hybridization and suppression subtractive hybridization, we identified ~800 unigenes which are differentially expressed under scleroglucan-producing and non-producing conditions. From these, candidate genes were identified which could represent potential leads for targeted modification of the S. rolfsii metabolism for increased scleroglucan yields. Conclusions: The results presented in this paper provide for the first time genomic and transcriptomic data about S. rolfsii and demonstrate the power and usefulness of combined transcriptome sequencing and comparative microarray analysis. The data obtained allowed us to predict the biosynthetic pathways of scleroglucan and

  13. Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes

    Directory of Open Access Journals (Sweden)

    Rebecca M. Davidson

    2011-11-01

    Transcriptome sequencing is a powerful method for studying global expression patterns in large, complex genomes. Evaluation of sequence-based expression profiles during reproductive development would provide functional annotation to genes underlying agronomic traits. We generated transcriptome profiles for 12 diverse maize (Zea mays L.) reproductive tissues representing male, female, developing seed, and leaf tissues using high throughput transcriptome sequencing. Overall, ∼80% of annotated genes were expressed. Comparative analysis between sequence and hybridization-based methods demonstrated the utility of ribonucleic acid sequencing (RNA-seq) for expression determination and differentiation of paralogous genes (∼85% of maize genes). Analysis of 4975 gene families across reproductive tissues revealed expression divergence is proportional to family size. In all pairwise comparisons between tissues, 7 (pre- vs. postemergence cobs) to 48% (pollen vs. ovule) of genes were differentially expressed. Genes with expression restricted to a single tissue within this study were identified with the highest numbers observed in leaves, endosperm, and pollen. Coexpression network analysis identified 17 gene modules with complex and shared expression patterns containing many previously described maize genes. The data and analyses in this study provide valuable tools through improved gene annotation, gene family characterization, and a core set of candidate genes to further characterize maize reproductive development and improve grain yield potential.

  14. Sequencing and De Novo Transcriptome Assembly of Brachypodium sylvaticum (Poaceae)

    Directory of Open Access Journals (Sweden)

    Samuel E. Fox

    2013-03-01

    Premise of the study: We report the de novo assembly and characterization of the transcriptomes of Brachypodium sylvaticum (slender false-brome) accessions from native populations of Spain and Greece, and an invasive population west of Corvallis, Oregon, USA. Methods and Results: More than 350 million sequence reads from the mRNA libraries prepared from three B. sylvaticum genotypes were assembled into 120,091 (Corvallis), 104,950 (Spain), and 177,682 (Greece) transcript contigs. In comparison with the B. distachyon Bd21 reference genome and GenBank protein sequences, we estimate >90% exome coverage for B. sylvaticum. The transcripts were assigned Gene Ontology and InterPro annotations. Brachypodium sylvaticum sequence reads aligned against the Bd21 genome revealed 394,654 single-nucleotide polymorphisms (SNPs) and >20,000 simple sequence repeat (SSR) DNA sites. Conclusions: To our knowledge, this is the first report of transcriptome sequencing of invasive plant species with a closely related sequenced reference genome. The sequences and identified SNP variant and SSR sites will provide tools for developing novel genetic markers for use in genotyping and characterization of invasive behavior of B. sylvaticum.

  15. Survey of the transcriptome of Aspergillus oryzae via massively parallel mRNA sequencing.

    Science.gov (United States)

    Wang, Bin; Guo, Guangwu; Wang, Chao; Lin, Ying; Wang, Xiaoning; Zhao, Mouming; Guo, Yong; He, Minghui; Zhang, Yong; Pan, Li

    2010-08-01

    Aspergillus oryzae, an important filamentous fungus used in food fermentation and the enzyme industry, has been shown through genome sequencing and various other tools to have prominent features in its genomic composition. However, the functional complexity of the A. oryzae transcriptome has not yet been fully elucidated. Here, we applied direct high-throughput paired-end RNA-sequencing (RNA-Seq) to the transcriptome of A. oryzae under four different culture conditions. With the high resolution and sensitivity afforded by RNA-Seq, we were able to identify a substantial number of novel transcripts, new exons, untranslated regions, alternative upstream initiation codons and upstream open reading frames, which provide remarkable insight into the A. oryzae transcriptome. We were also able to assess the alternative mRNA isoforms in A. oryzae and found a large number of genes undergoing alternative splicing. Many genes and pathways that might be involved in higher levels of protein production in solid-state culture than in liquid culture were identified by comparing gene expression levels between different cultures. Our analysis indicated that the transcriptome of A. oryzae is much more complex than previously anticipated, and these results may provide a blueprint for further study of the A. oryzae transcriptome.

  16. Nuclear RNA sequencing of the mouse erythroid cell transcriptome.

    Directory of Open Access Journals (Sweden)

    Jennifer A Mitchell

    In addition to protein coding genes, a substantial proportion of mammalian genomes is transcribed. However, most transcriptome studies investigate steady-state mRNA levels, ignoring a considerable fraction of the transcribed genome. In addition, steady-state mRNA levels are influenced by both transcriptional and posttranscriptional mechanisms, and thus do not provide a clear picture of transcriptional output. Here, using deep sequencing of nuclear RNAs (nucRNA-Seq) in parallel with chromatin immunoprecipitation sequencing (ChIP-Seq) of active RNA polymerase II, we compared the nuclear transcriptome of mouse anemic spleen erythroid cells with polymerase occupancy on a genome-wide scale. We demonstrate that unspliced transcripts quantified by nucRNA-seq correlate with primary transcript frequencies measured by RNA FISH, but differ from steady-state mRNA levels measured by poly(A)-enriched RNA-seq. Highly expressed protein coding genes showed good correlation between RNAPII occupancy and transcriptional output; however, genome-wide we observed a poor correlation between transcriptional output and RNAPII association. This poor correlation is due to intergenic regions associated with RNAPII which correspond with transcription factor bound regulatory regions and a group of stable, nuclear-retained long non-coding transcripts. In conclusion, sequencing the nuclear transcriptome provides an opportunity to investigate the transcriptional landscape in a given cell type through quantification of unspliced primary transcripts and the identification of nuclear-retained long non-coding RNAs.

  17. Massively parallel sequencing and analysis of the Necator americanus transcriptome.

    Directory of Open Access Journals (Sweden)

    Cinzia Cantacessi

    2010-05-01

    The blood-feeding hookworm Necator americanus infects hundreds of millions of people worldwide. In order to elucidate fundamental molecular biological aspects of this hookworm, the transcriptome of the adult stage of Necator americanus was explored using next-generation sequencing and bioinformatic analyses. A total of 19,997 contigs were assembled from the sequence data; 6,771 of these contigs had known orthologues in the free-living nematode Caenorhabditis elegans, and most of them encoded proteins with WD40 repeats (10.6%), proteinase inhibitors (7.8%) or calcium-binding EF-hand proteins (6.7%). Bioinformatic analyses inferred that the C. elegans homologues are involved mainly in biological pathways linked to ribosome biogenesis (70%), oxidative phosphorylation (63%) and/or proteases (60%); most of these molecules were predicted to be involved in more than one biological pathway. Comparative analyses of the transcriptomes of N. americanus and the canine hookworm, Ancylostoma caninum, revealed qualitative and quantitative differences. For instance, proteinase inhibitors were inferred to be highly represented in the former species, whereas SCP/Tpx-1/Ag5/PR-1/Sc7 proteins (= SCP/TAPS or Ancylostoma-secreted proteins) were predominant in the latter. In N. americanus, essential molecules were predicted using a combination of orthology mapping and functional data available for C. elegans. Further analyses allowed the prioritization of 18 predicted drug targets which did not have homologues in the human host. These candidate targets were inferred to be linked to mitochondrial (e.g., processing proteins) or amino acid metabolism (e.g., asparagine tRNA synthetase). This study has provided detailed insights into the transcriptome of the adult stage of N. americanus and examines similarities and differences between this species and A. caninum. Future efforts should focus on comparative transcriptomic and proteomic investigations of the other predominant human

  18. Using next generation transcriptome sequencing to predict an ectomycorrhizal metabolome

    Directory of Open Access Journals (Sweden)

    Cseke Leland J

    2011-05-01

    Background: Mycorrhizae, symbiotic interactions between soil fungi and tree roots, are ubiquitous in terrestrial ecosystems. The fungi contribute phosphorus, nitrogen and mobilized nutrients from organic matter in the soil and in return the fungus receives photosynthetically-derived carbohydrates. This union of plant and fungal metabolisms is the mycorrhizal metabolome. Understanding this symbiotic relationship at a molecular level provides important contributions to the understanding of forest ecosystems and global carbon cycling. Results: We generated next generation short-read transcriptomic sequencing data from fully-formed ectomycorrhizae between Laccaria bicolor and aspen (Populus tremuloides) roots. The transcriptomic data were used to identify statistically significantly expressed gene models using a bootstrap-style approach, and these expressed genes were mapped to specific metabolic pathways. Integration of expressed genes that code for metabolic enzymes and the set of expressed membrane transporters generates a predictive model of the ectomycorrhizal metabolome. The generated model of the mycorrhizal metabolome predicts that the specific compounds glycine, glutamate, and allantoin are synthesized by L. bicolor and that these compounds or their metabolites may be used for the benefit of aspen in exchange for the photosynthetically-derived sugars fructose and glucose. Conclusions: The analysis illustrates an approach to generate testable biological hypotheses to investigate the complex molecular interactions that drive ectomycorrhizal symbiosis. These models are consistent with experimental environmental data and provide insight into the molecular exchange processes for organisms in this complex ecosystem. The method used here for predicting metabolomic models of mycorrhizal systems from deep RNA sequencing data can be generalized and is broadly applicable to transcriptomic data derived from complex systems.

  19. High-throughput sequencing of black pepper root transcriptome

    Science.gov (United States)

    2012-01-01

    Background: Black pepper (Piper nigrum L.) is one of the most popular spices in the world. It is used in cooking and the preservation of food and even has medicinal properties. Losses in production from disease are a major limitation in the culture of this crop. The major diseases are root rot and foot rot, which are results of root infection by Fusarium solani and Phytophthora capsici, respectively. Understanding the molecular interaction between the pathogens and the host’s root region is important for obtaining resistant cultivars by biotechnological breeding. Genetic and molecular data for this species, though, are limited. In this paper, RNA-Seq technology has been employed, for the first time, to describe the root transcriptome of black pepper. Results: The root transcriptome of black pepper was sequenced by the NGS SOLiD platform and assembled using the multiple-k method. Blast2Go and orthoMCL methods were used to annotate 10338 unigenes. The 4472 predicted proteins showed about 52% homology with the Arabidopsis proteome. Two root proteomes identified 615 proteins, which seem to define the plant’s root pattern. Simple-sequence repeats were identified that may be useful in studies of genetic diversity and may have applications in biotechnology and ecology. Conclusions: This dataset of 10338 unigenes is crucially important for the biotechnological breeding of black pepper and the ecogenomics of the Magnoliids, a major group of basal angiosperms. PMID:22984782

  20. High-throughput sequencing of black pepper root transcriptome

    Directory of Open Access Journals (Sweden)

    Gordo Sheila MC

    2012-09-01

    Background: Black pepper (Piper nigrum L.) is one of the most popular spices in the world. It is used in cooking and the preservation of food and even has medicinal properties. Losses in production from disease are a major limitation in the culture of this crop. The major diseases are root rot and foot rot, which are results of root infection by Fusarium solani and Phytophthora capsici, respectively. Understanding the molecular interaction between the pathogens and the host’s root region is important for obtaining resistant cultivars by biotechnological breeding. Genetic and molecular data for this species, though, are limited. In this paper, RNA-Seq technology has been employed, for the first time, to describe the root transcriptome of black pepper. Results: The root transcriptome of black pepper was sequenced by the NGS SOLiD platform and assembled using the multiple-k method. Blast2Go and orthoMCL methods were used to annotate 10338 unigenes. The 4472 predicted proteins showed about 52% homology with the Arabidopsis proteome. Two root proteomes identified 615 proteins, which seem to define the plant’s root pattern. Simple-sequence repeats were identified that may be useful in studies of genetic diversity and may have applications in biotechnology and ecology. Conclusions: This dataset of 10338 unigenes is crucially important for the biotechnological breeding of black pepper and the ecogenomics of the Magnoliids, a major group of basal angiosperms.

  1. Direct, rapid RNA sequence analysis

    International Nuclear Information System (INIS)

    Peattie, D.A.

    1987-01-01

    The original methods of RNA sequence analysis were based on enzymatic production and chromatographic separation of overlapping oligonucleotide fragments from within an RNA molecule followed by identification of the mononucleotides comprising the oligomer. Over the past decade the field of nucleic acid sequencing has changed dramatically, however, and RNA molecules now can be sequenced in a variety of more streamlined fashions. Most of the more recent advances in RNA sequencing have involved one-dimensional electrophoretic separation of 32P-end-labeled oligoribonucleotides on polyacrylamide gels. In this chapter the author discusses two of these methods for determining the nucleotide sequences of RNA molecules rapidly: the chemical method and the enzymatic method. Both methods are direct and degradative, i.e., they rely on fragmatic and chemical approaches should be utilized. The single-strand-specific ribonucleases (A, T1, T2, and S1) provide an efficient means to locate double-helical regions rapidly, and the chemical reactions provide a means to determine the RNA sequence within these regions. In addition, the chemical reactions allow one to assign interactions to specific atoms and to distinguish secondary interactions from tertiary ones. If the RNA molecule is small enough to be sequenced directly by the enzymatic or chemical method, the probing reactions can be done easily at the same time as sequencing reactions.

  2. De novo transcriptome sequencing and assembly from apomictic and sexual Eragrostis curvula genotypes.

    Directory of Open Access Journals (Sweden)

    Ingrid Garbus

    Full Text Available A long-standing goal in plant breeding has been the ability to confer apomixis to agriculturally relevant species, which would require a deeper comprehension of the molecular basis of apomictic regulatory mechanisms. Eragrostis curvula (Schrad.) Nees is a perennial grass that includes both sexual and apomictic cytotypes. The availability of a reference transcriptome for this species would constitute a very important tool toward the identification of genes controlling key steps of the apomictic pathway. Here, we used Roche/454 sequencing technologies to generate reads from inflorescences of E. curvula apomictic and sexual genotypes that were de novo assembled into a reference transcriptome. Nearly 90% of the 49568 assembled isotigs showed sequence similarity to sequences deposited in the public databases. A gene ontology analysis categorized 27448 isotigs into at least one of the three main GO categories. We identified 11475 SSRs, and several of them were assayed in E. curvula germplasm using SSR-based primers, providing a valuable set of molecular markers that could allow direct allele selection. The differential contribution to each library of the spliced forms of several transcripts revealed the existence of several isotigs produced via alternative splicing of single genes. The reference transcriptome presented and validated in this work will be useful for the identification of a wide range of gene(s) related to agronomic traits of E. curvula, including those controlling key steps of the apomictic pathway in this species, allowing the extrapolation of the findings to other plant species.

  3. Researches on Transcriptome Sequencing in the Study of Traditional Chinese Medicine

    Science.gov (United States)

    Xin, Jie; Zhang, Rong-chao; Wang, Lei

    2017-01-01

    Owing to its distinct advantages, transcriptome sequencing is attracting increasing attention from researchers studying traditional Chinese medicine and has greatly promoted the development of the field. This paper summarizes the applications of transcriptome sequencing in traditional Chinese medicine by reviewing recent related papers. PMID:28900463

  4. Somatic sex-specific transcriptome differences in Drosophila revealed by whole transcriptome sequencing

    Directory of Open Access Journals (Sweden)

    Arbeitman Michelle N

    2011-07-01

    Full Text Available Abstract Background Understanding animal development and physiology at a molecular-biological level has been advanced by the ability to determine at high resolution the repertoire of mRNA molecules by whole transcriptome resequencing. This includes the ability to detect and quantify rare abundance transcripts and isoform-specific mRNA variants produced from a gene. The sex hierarchy consists of a pre-mRNA splicing cascade that directs the production of sex-specific transcription factors that specify nearly all sexual dimorphism. We have used deep RNA sequencing to gain insight into how the Drosophila sex hierarchy generates somatic sex differences, by examining gene and transcript isoform expression differences between the sexes in adult head tissues. Results Here we find 1,381 genes that differ in overall expression levels and 1,370 isoform-specific transcripts that differ between males and females. Additionally, we find 512 genes not regulated downstream of transformer that are significantly more highly expressed in males than females. These 512 genes are enriched on the X chromosome and reside adjacent to dosage compensation complex entry sites, which taken together suggests that their residence on the X chromosome might be sufficient to confer male-biased expression. There are no transcription unit structural features, from a set of features, that are robustly significantly different in the genes with significant sex differences in the ratio of isoform-specific transcripts, as compared to random isoform-specific transcripts, suggesting that there is no single molecular mechanism that generates isoform-specific transcript differences between the sexes, even though the sex hierarchy is known to include three pre-mRNA splicing factors. Conclusions We identify thousands of genes that show sex-specific differences in overall gene expression levels, and identify hundreds of additional genes that have differences in the abundance of isoform

  5. Transcriptome sequencing in prostate cancer identifies inter-tumor heterogeneity

    Directory of Open Access Journals (Sweden)

    Janet Mendonca

    2015-06-01

    Full Text Available Given the dearth of gene mutations in prostate cancer, [1],[2] it is likely that genomic rearrangements play a significant role in the evolution of prostate cancer. However, in the search for recurrent genomic alterations, "private alterations" have received less attention. Such alterations may provide insights into the evolution, behavior, and clinical outcome of an individual tumor. In a recent report in "Genome Biology", Wyatt et al. [3] define unique alterations in a cohort of high-risk prostate cancer patients with a lethal phenotype. Utilizing a transcriptome sequencing approach, they observe high inter-tumor heterogeneity; however, the genes altered distill into three distinct cancer-relevant pathways. Their analysis reveals the presence of several non-ETS fusions, which may contribute to the phenotype of individual tumors and have significance for disease progression.

  6. De novo transcriptome sequencing and sequence analysis of the malaria vector Anopheles sinensis (Diptera: Culicidae)

    Science.gov (United States)

    2014-01-01

    Background Anopheles sinensis is the major malaria vector in China and Southeast Asia. Vector control is one of the most effective measures to prevent malaria transmission. However, there is little transcriptome information available for the malaria vector. To better understand the biological basis of malaria transmission and to develop novel and effective means of vector control, there is a need to build a transcriptome dataset for functional genomics analysis by large-scale RNA sequencing (RNA-seq). Methods To provide a more comprehensive and complete transcriptome of An. sinensis, RNA from eggs, larvae, pupae, male adults and female adults was pooled for cDNA preparation, sequenced using the Illumina paired-end sequencing technology and assembled into unigenes. These unigenes were then analyzed for genome mapping, functional annotation, homology, codon usage bias and simple sequence repeats (SSRs). Results Approximately 51.6 million clean reads were obtained, trimmed, and assembled into 38,504 unigenes with an average length of 571 bp, an N50 of 711 bp, and an average GC content of 51.26%. Among them, 98.4% of unigenes could be mapped onto the reference genome, and 69% of unigenes could be annotated with known biological functions. Homology analysis identified a number of An. sinensis unigenes that showed homology to, or were putative 1:1 orthologues of, sequences in the genomes of other Dipteran species. Codon usage bias was analyzed and 1,904 SSRs were detected, which will provide effective molecular markers for the population genetics of this species. Conclusions Our data and analysis provide the most comprehensive transcriptomic resource and characteristics currently available for An. sinensis, and will facilitate genetic and genomic studies and further vector control of An. sinensis. PMID:25000941
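
    The assembly statistics quoted in this record (N50 and average GC content) are reported without a description of how they were computed. The following is a minimal sketch under the usual definitions, not the authors' pipeline: N50 is taken as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases, and GC content as the percentage of G and C among unambiguous bases. The function names n50 and gc_content are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (assumed definition)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembled bases is the N50.
        if running >= half_total:
            return length


def gc_content(seq):
    """Return the percentage of G and C among the A, C, G and T bases of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    # Ambiguous bases such as N are ignored; an empty sequence yields 0.0.
    return 100.0 * gc / acgt if acgt else 0.0
```

    For example, five contigs of 2, 3, 4, 5 and 6 bp total 20 bp; the two longest (6 + 5 = 11 bp) already cover half of that, so n50([2, 3, 4, 5, 6]) is 5.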

  7. Transcriptome sequencing from diverse human populations reveals differentiated regulatory architecture.

    Directory of Open Access Journals (Sweden)

    Alicia R Martin

    2014-08-01

    Full Text Available Large-scale sequencing efforts have documented extensive genetic variation within the human genome. However, our understanding of the origins, global distribution, and functional consequences of this variation is far from complete. While regulatory variation influencing gene expression has been studied within a handful of populations, the breadth of transcriptome differences across diverse human populations has not been systematically analyzed. To better understand the spectrum of gene expression variation, alternative splicing, and the population genetics of regulatory variation in humans, we have sequenced the genomes, exomes, and transcriptomes of EBV transformed lymphoblastoid cell lines derived from 45 individuals in the Human Genome Diversity Panel (HGDP). The populations sampled span the geographic breadth of human migration history and include Namibian San, Mbuti Pygmies of the Democratic Republic of Congo, Algerian Mozabites, Pathan of Pakistan, Cambodians of East Asia, Yakut of Siberia, and Mayans of Mexico. We discover that approximately 25.0% of the variation in gene expression found amongst individuals can be attributed to population differences. However, we find few genes that are systematically differentially expressed among populations. Of this population-specific variation, 75.5% is due to expression rather than splicing variability, and we find few genes with strong evidence for differential splicing across populations. Allelic expression analyses indicate that previously mapped common regulatory variants identified in eight populations from the International Haplotype Map Phase 3 project have similar effects in our seven sampled HGDP populations, suggesting that the cellular effects of common variants are shared across diverse populations. Together, these results provide a resource for studies analyzing functional differences across populations by estimating the degree of shared gene expression, alternative splicing, and

  8. Transcriptome

    Science.gov (United States)

    ... Also: Talking Glossary of Genetic Terms Definitions for genetic terms used on this page En Español: Transcriptoma Transcriptome What is a transcriptome? What can a transcriptome tell us? How can transcriptome data be used to explore gene function? What is ...

  9. Deep RNA Sequencing of the Skeletal Muscle Transcriptome in Swimming Fish

    NARCIS (Netherlands)

    Palstra, A.P.; Beltran, S.; Burgerhout, E.; Brittijn, S.A.; Magnoni, L.J.; Henkel, C.V.; Jansen, A.; Thillart, G.E.E.J.M.; Spaink, H.P.; Planas, J.V.

    2013-01-01

    Deep RNA sequencing (RNA-seq) was performed to provide an in-depth view of the transcriptome of red and white skeletal muscle of exercised and non-exercised rainbow trout (Oncorhynchus mykiss) with the specific objective to identify expressed genes and quantify the transcriptomic effects of

  10. Transcriptome sequencing and positive selected genes analysis of Bombyx mandarina.

    Directory of Open Access Journals (Sweden)

    Tingcai Cheng

    Full Text Available The wild silkworm Bombyx mandarina is widely believed to be an ancestor of the domesticated silkworm, Bombyx mori. Silkworms are often used as a model for studying the mechanism of species domestication. Here, we performed transcriptome sequencing of the wild silkworm using an Illumina HiSeq2000 platform. We produced 100,004,078 high-quality reads and assembled them into 50,773 contigs with an N50 length of 1764 bp and a mean length of 941.62 bp. A total of 33,759 unigenes were identified, with 12,805 annotated in the Nr database, 8273 in the Pfam database, and 9093 in the Swiss-Prot database. Expression profile analysis found significant differential expression of 1308 unigenes between the middle silk gland (MSG) and posterior silk gland (PSG). Three sericin genes (sericin 1, sericin 2, and sericin 3) were expressed specifically in the MSG and three fibroin genes (fibroin-H, fibroin-L, and fibroin/P25) were expressed specifically in the PSG. In addition, 32,297 single-nucleotide polymorphisms (SNPs) and 361 insertion-deletions (INDELs) were detected. Comparison with the domesticated silkworm p50/Dazao identified 5,295 orthologous genes, among which 400 might have experienced or be experiencing positive selection according to Ka/Ks analysis. The data and analyses presented here provide insights into silkworm domestication and an invaluable resource for wild silkworm genomics research.

  11. The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome

    Science.gov (United States)

    Camargo, Anamaria A.; Samaia, Helena P. B.; Dias-Neto, Emmanuel; Simão, Daniel F.; Migotto, Italo A.; Briones, Marcelo R. S.; Costa, Fernando F.; Aparecida Nagai, Maria; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; Sonati, Maria de Fátima; Tajara, Eloiza H.; Valentini, Sandro R.; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Arnaldi, Liliane A. T.; de Assis, Angela M.; Bengtson, Mário Henrique; Bergamo, Nadia Aparecida; Bombonato, Vanessa; de Camargo, Maria E. R.; Canevari, Renata A.; Carraro, Dirce M.; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Corrêa, Rosana F. R.; Costa, Maria Cristina R.; Curcio, Cyntia; Hokama, Paula O. M.; Ferreira, Ari J. S.; Furuzawa, Gilberto K.; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Krieger, José E.; Leite, Luciana C. C.; Majumder, Paromita; Marins, Mozart; Marques, Everaldo R.; Melo, Analy S. A.; Melo, Monica; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana G.; Prevedel, Aline C.; Rahal, Paula; Rainho, Claudia A.; Reis, Eduardo M. R.; Ribeiro, Marcelo L.; da Rós, Nancy; de Sá, Renata G.; Sales, Magaly M.; Sant'anna, Simone Cristina; dos Santos, Mariana L.; da Silva, Aline M.; da Silva, Neusa P.; Silva, Wilson A.; da Silveira, Rosana A.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Soares, Fernando; Moreira, Eloisa S.; Nunes, Diana N.; Correa, Ricardo G.; Zalcberg, Heloisa; Carvalho, Alex F.; Reis, Luis F. L.; Brentani, Ricardo R.; Simpson, Andrew J. G.; de Souza, Sandro J.

    2001-01-01

    Open reading frame expressed sequences tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that correspond to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription–PCR, represents a direct route to transcript finishing that may represent a useful alternative to full-length cDNA cloning. PMID:11593022

  12. The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome.

    Science.gov (United States)

    Camargo, A A; Samaia, H P; Dias-Neto, E; Simão, D F; Migotto, I A; Briones, M R; Costa, F F; Nagai, M A; Verjovski-Almeida, S; Zago, M A; Andrade, L E; Carrer, H; El-Dorry, H F; Espreafico, E M; Habr-Gama, A; Giannella-Neto, D; Goldman, G H; Gruber, A; Hackel, C; Kimura, E T; Maciel, R M; Marie, S K; Martins, E A; Nobrega, M P; Paco-Larson, M L; Pardini, M I; Pereira, G G; Pesquero, J B; Rodrigues, V; Rogatto, S R; da Silva, I D; Sogayar, M C; Sonati, M F; Tajara, E H; Valentini, S R; Alberto, F L; Amaral, M E; Aneas, I; Arnaldi, L A; de Assis, A M; Bengtson, M H; Bergamo, N A; Bombonato, V; de Camargo, M E; Canevari, R A; Carraro, D M; Cerutti, J M; Correa, M L; Correa, R F; Costa, M C; Curcio, C; Hokama, P O; Ferreira, A J; Furuzawa, G K; Gushiken, T; Ho, P L; Kimura, E; Krieger, J E; Leite, L C; Majumder, P; Marins, M; Marques, E R; Melo, A S; Melo, M B; Mestriner, C A; Miracca, E C; Miranda, D C; Nascimento, A L; Nobrega, F G; Ojopi, E P; Pandolfi, J R; Pessoa, L G; Prevedel, A C; Rahal, P; Rainho, C A; Reis, E M; Ribeiro, M L; da Ros, N; de Sa, R G; Sales, M M; Sant'anna, S C; dos Santos, M L; da Silva, A M; da Silva, N P; Silva, W A; da Silveira, R A; Sousa, J F; Stecconi, D; Tsukumo, F; Valente, V; Soares, F; Moreira, E S; Nunes, D N; Correa, R G; Zalcberg, H; Carvalho, A F; Reis, L F; Brentani, R R; Simpson, A J; de Souza, S J; Melo, M

    2001-10-09

    Open reading frame expressed sequences tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that correspond to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription-PCR, represents a direct route to transcript finishing that may represent a useful alternative to full-length cDNA cloning.

  13. Illumina–based de novo transcriptome sequencing and analysis of ...

    Indian Academy of Sciences (India)

    Administrator

    2017-10-25

    Oct 25, 2017 ... (Shanghai, China) following manufacturer's protocols (Illumina, San .... suggests that pathways involved in musk production are expressed at a ..... Strickler S. R., Aureliano B. and Mueller L. A. 2012 Designing a transcriptome.

  14. Transcriptome sequencing and characterization for the sea cucumber Apostichopus japonicus (Selenka, 1867).

    Directory of Open Access Journals (Sweden)

    Huixia Du

    Full Text Available BACKGROUND: Sea cucumbers are a special group of marine invertebrates. They occupy a taxonomic position that is believed to be important for understanding the origin and evolution of deuterostomes. Some of them, such as Apostichopus japonicus, represent commercially important aquaculture species in Asian countries. Many efforts have been devoted to increasing the number of expressed sequence tags (ESTs) for A. japonicus, but a comprehensive characterization of its transcriptome remains lacking. Here, we performed large-scale transcriptome profiling and characterization by pyrosequencing diverse cDNA libraries from A. japonicus. RESULTS: In total, 1,061,078 reads were obtained by 454 sequencing of eight cDNA libraries representing different developmental stages and adult tissues in A. japonicus. These reads were assembled into 29,666 isotigs, which were further clustered into 21,071 isogroups. Nearly 40% of the isogroups showed significant matches to known proteins based on sequence similarity. Gene ontology (GO) and KEGG pathway analyses recovered diverse biological functions and processes. Candidate genes that were potentially involved in aestivation were identified. Transcriptome comparison with the sea urchin Strongylocentrotus purpuratus revealed similar patterns of GO term representation. In addition, 4,882 putative orthologous genes were identified, of which 202 were not present in the non-echinoderm organisms. More than 700 simple sequence repeats (SSRs) and 54,000 single nucleotide polymorphisms (SNPs) were detected in the A. japonicus transcriptome. CONCLUSION: Pyrosequencing was proven to be efficient in rapidly identifying a large set of genes for the sea cucumber A. japonicus. Through the large-scale transcriptome sequencing as well as public EST data integration, we performed a comprehensive characterization of the A. japonicus transcriptome and identified candidate aestivation-related genes. A large number of potential genetic

  15. Analysis of the Macaca mulatta transcriptome and the sequence divergence between Macaca and human.

    Science.gov (United States)

    Magness, Charles L; Fellin, P Campion; Thomas, Matthew J; Korth, Marcus J; Agy, Michael B; Proll, Sean C; Fitzgibbon, Matthew; Scherer, Christina A; Miner, Douglas G; Katze, Michael G; Iadonato, Shawn P

    2005-01-01

    We report the initial sequencing and comparative analysis of the Macaca mulatta transcriptome. Cloned sequences from 11 tissues, nine animals, and three species (M. mulatta, M. fascicularis, and M. nemestrina) were sampled, resulting in the generation of 48,642 sequence reads. These data represent an initial sampling of the putative rhesus orthologs for 6,216 human genes. Mean nucleotide diversity within M. mulatta and sequence divergence among M. fascicularis, M. nemestrina, and M. mulatta are also reported.

  16. Complete Genome Sequence of Sporisorium scitamineum and Biotrophic Interaction Transcriptome with Sugarcane.

    Directory of Open Access Journals (Sweden)

    Lucas M Taniguti

    Full Text Available Sporisorium scitamineum is a biotrophic fungus responsible for sugarcane smut, a disease spread worldwide. This study provides the complete sequence of individual chromosomes of S. scitamineum from telomere to telomere achieved by a combination of PacBio long reads and Illumina short reads sequence data, as well as a draft sequence of a second fungal strain. Comparative analysis with the previously available sequence of another strain detected few polymorphisms among the three genomes. The novel complete sequence described herein allowed us to identify and annotate extended subtelomeric regions, repetitive elements and the mitochondrial DNA sequence. The genome comprises 19,979,571 bases, 6,677 genes encoding proteins, 111 tRNAs and 3 assembled copies of rDNA, out of an estimated 130 copies. Chromosomal reorganizations were detected when comparing to sequences of S. reilianum, the closest smut relative, potentially influenced by repeats of transposable elements. Repetitive elements may have also directed the linkage of the two mating-type loci. The fungal transcriptome profiling from in vitro and from interaction with sugarcane at two time points (early infection and whip emergence) revealed that 13.5% of the genes were differentially expressed in planta and particular to each developmental stage. Among them are plant cell wall degrading enzymes, proteases, lipases, chitin modification and lignin degradation enzymes, sugar transporters and transcriptional factors. The fungus also modulates transcription of genes related to surviving against reactive oxygen species and other toxic metabolites produced by the plant. Previously described effectors in smut/plant interactions were detected but some new candidates are proposed. Ten genomic islands harboring some of the candidate genes unique to S. scitamineum were expressed only in planta. RNAseq data were also used to confirm gene predictions.

  17. Characterization of Liaoning cashmere goat transcriptome: sequencing, de novo assembly, functional annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Hongliang Liu

    Full Text Available Liaoning cashmere goat is a famous goat breed for cashmere wool. In order to increase the transcriptome data and accelerate genetic improvement for this breed, we performed de novo transcriptome sequencing to generate the first expressed sequence tag dataset for the Liaoning cashmere goat, using next-generation sequencing technology. Transcriptome sequencing of Liaoning cashmere goat on a Roche 454 platform yielded 804,601 high-quality reads. Clustering and assembly of these reads produced a non-redundant set of 117,854 unigenes, comprising 13,194 isotigs and 104,660 singletons. Based on similarity searches with known proteins, 17,356 unigenes were assigned to 6,700 GO categories, and the terms were summarized into three main GO categories and 59 sub-categories. 3,548 and 46,778 unigenes had significant similarity to existing sequences in the KEGG and COG databases, respectively. Comparative analysis revealed that 42,254 unigenes were aligned to 17,532 different sequences in NCBI non-redundant nucleotide databases. 97,236 (82.51%) unigenes were mapped to the 30 goat chromosomes. 35,551 (30.17%) unigenes were matched to 11,438 reported goat protein-coding genes. The remaining non-matched unigenes were further compared with cattle and human reference genes, and 67 putative new goat genes were discovered. Additionally, 2,781 potential simple sequence repeats were initially identified from all unigenes. The transcriptome of Liaoning cashmere goat was deep sequenced, de novo assembled, and annotated, providing abundant data to better understand the Liaoning cashmere goat transcriptome. The potential simple sequence repeats provide a material basis for future genetic linkage and quantitative trait loci analyses.
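
    Several records in this listing report counts of simple sequence repeats (SSRs) mined from assembled unigenes, but none states the detection criteria. As a rough illustration only, the sketch below finds perfect di-, tri- and tetranucleotide repeats with a regular-expression backreference. The function names and the minimum repeat counts (6, 5 and 4) are assumptions chosen for the example; they are not taken from any of the studies listed here, which may have used dedicated tools with different thresholds.

```python
import re


def _is_compound(unit):
    """True if unit is itself a repeat of a shorter motif (e.g. AAA or ATAT)."""
    n = len(unit)
    return any(n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq.

    min_repeats maps motif length to the minimum number of tandem copies.
    The default thresholds are hypothetical and only for illustration.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for unit_len, reps in sorted(min_repeats.items()):
        # A captured motif followed by at least (reps - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, reps - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip homopolymer runs and motifs already covered by a shorter unit.
            if _is_compound(unit):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)
```

    With these assumed thresholds, a run of six AT copies flanked by other bases is reported once as a dinucleotide repeat, while a homopolymer stretch is ignored.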

  18. Haematobia irritans dataset of raw sequence reads from Illumina-based transcriptome sequencing of specific tissues and life stages

    Science.gov (United States)

    Illumina HiSeq technology was used to sequence the transcriptome from various dissected tissues and life stages from the horn fly, Haematobia irritans. These samples include eggs (0, 2, 4, and 9 hours post-oviposition), adult fly gut, adult fly legs, adult fly malpighian tubule, adult fly ovary, adu...

  19. Sequencing and De Novo Assembly of the Toxicodendron radicans (Poison Ivy) Transcriptome.

    Science.gov (United States)

    Weisberg, Alexandra J; Kim, Gunjune; Westwood, James H; Jelesko, John G

    2017-11-10

    Contact with poison ivy plants is widely dreaded because they produce a natural product called urushiol that is responsible for allergenic contact delayed-dermatitis symptoms lasting for weeks. For this reason, the catchphrase most associated with poison ivy is "leaves of three, let it be", which serves the purpose of both identification and an appeal for avoidance. Ironically, despite this notoriety, there is a dearth of specific knowledge about nearly all other aspects of poison ivy physiology and ecology. As a means of gaining a more molecular-oriented understanding of poison ivy physiology and ecology, Next Generation DNA sequencing technology was used to develop poison ivy root and leaf RNA-seq transcriptome resources. De novo assembled transcriptomes were analyzed to generate a core set of high quality expressed transcripts present in poison ivy tissue. The predicted protein sequences were evaluated for similarity to SwissProt homologs and InterProScan domains, as well as assigned both GO terms and KEGG annotations. Over 23,000 simple sequence repeats were identified in the transcriptome, and corresponding oligo nucleotide primer pairs were designed. A pan-transcriptome analysis of existing Anacardiaceae transcriptomes revealed conserved and unique transcripts among these species.

  20. SNP discovery in the transcriptome of white Pacific shrimp Litopenaeus vannamei by next generation sequencing.

    Directory of Open Access Journals (Sweden)

    Yang Yu

    Full Text Available The application of next generation sequencing technology has greatly facilitated high throughput single nucleotide polymorphism (SNP) discovery and genotyping in genetic research. In the present study, SNPs were discovered based on two transcriptomes of Litopenaeus vannamei (L. vannamei) generated from the Illumina sequencing platform HiSeq 2000. One transcriptome of L. vannamei was obtained by sequencing RNA from larvae at the mysis stage, and its reference sequence was de novo assembled. The data from another transcriptome were downloaded from NCBI, and the reads of the two transcriptomes were mapped separately to the assembled reference by BWA. SNP calling was performed using SAMtools. A total of 58,717 and 36,277 SNPs with high quality were predicted from the two transcriptomes, respectively. SNP calling was also performed using the reads of the two transcriptomes together, and a total of 96,040 SNPs with high quality were predicted. Among these 96,040 SNPs, 5,242 and 29,129 were predicted as non-synonymous and synonymous SNPs, respectively. Characterization analysis of the predicted SNPs in L. vannamei showed that the estimated SNP frequency was 0.21% (one SNP per 476 bp) and the estimated ratio of transitions to transversions was 2.0. Fifty SNPs were randomly selected for validation by Sanger sequencing after PCR amplification and 76% of SNPs were confirmed, which indicated that the SNPs predicted in this study were reliable. These SNPs will be very useful for genetic study in L. vannamei, especially for high density linkage map construction and genome-wide association studies.
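
    The summary figures in this record are linked by simple arithmetic: one SNP per 476 bp corresponds to 100 / 476, or about 0.21%, and a transition-to-transversion ratio of 2.0 means two transitions (A<->G or C<->T) for every transversion. The sketch below shows one way such figures could be derived from a list of (reference, alternate) base pairs. It is an illustration under those standard definitions, not the authors' code, and the function names are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref, alt):
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError("expected two different bases from A, C, G, T")
    # Transitions stay within purines or within pyrimidines.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_ratio(snps):
    """Return transitions divided by transversions for a list of (ref, alt) pairs."""
    transitions = sum(1 for ref, alt in snps
                      if substitution_class(ref, alt) == "transition")
    transversions = len(snps) - transitions
    if transversions == 0:
        raise ValueError("ratio is undefined without any transversions")
    return transitions / transversions


def snp_frequency_percent(bp_per_snp):
    """Convert 'one SNP per N bp' into a percentage of sites."""
    return 100.0 / bp_per_snp
```

    Here round(snp_frequency_percent(476), 2) gives 0.21, matching the frequency quoted above.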

  1. Genome and Transcriptome Sequencing of the Ostreid herpesvirus 1 From Tomales Bay, California

    Science.gov (United States)

    Burge, C. A.; Langevin, S.; Closek, C. J.; Roberts, S. B.; Friedman, C. S.

    2016-02-01

    Mass mortalities of larval and seed bivalve molluscs attributed to the Ostreid herpesvirus 1 (OsHV-1) occur globally. OsHV-1 was fully sequenced and characterized as a member of the Family Malacoherpesviridae. Multiple strains of OsHV-1 exist and may vary in virulence, i.e. OsHV-1 µvar. For most global variants of OsHV-1, sequence data is limited to PCR-based sequencing of segments, including two recent genomes. In the United States, OsHV-1 is limited to detection in adjacent embayments in California, Tomales and Drakes bays. Limited DNA sequence data of OsHV-1 infecting oysters in Tomales Bay indicates the virus detected in Tomales Bay is similar but not identical to any one global variant of OsHV-1. In order to better understand both strain variation and virulence of OsHV-1 infecting oysters in Tomales Bay, we used genomic and transcriptomic sequencing. Meta-genomic sequencing (Illumina MiSeq) was conducted from infected oysters (n=4 per year) collected in 2003, 2007, and 2014, where full OsHV-1 genome sequences and low overall microbial diversity were achieved from highly infected oysters. Increased microbial diversity was detected in three of four samples sequenced from 2003, where qPCR based genome copy numbers of OsHV-1 were lower. Expression analysis (SOLiD RNA sequencing) of OsHV-1 genes expressed in oyster larvae at 24 hours post exposure revealed a nearly complete transcriptome, with several highly expressed genes, which are similar to recent transcriptomic analyses of other OsHV-1 variants. Taken together, our results indicate that genome and transcriptome sequencing may be powerful tools in understanding both strain variation and virulence of non-culturable marine viruses.

  2. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, the transcriptional consequences of these genomic alterations in cancer genomes remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on the affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with or without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate that genomic alterations in cancer genomes have diverse transcriptomic effects, and that integrated analysis of WGS and RNA-Seq can facilitate the interpretation of a large number of genomic alterations detected in cancer genomes.

  3. Comparative analyses of two Geraniaceae transcriptomes using next-generation sequencing.

    Science.gov (United States)

    Zhang, Jin; Ruhlman, Tracey A; Mower, Jeffrey P; Jansen, Robert K

    2013-12-29

    Organelle genomes of Geraniaceae exhibit several unusual evolutionary phenomena compared to other angiosperm families including accelerated nucleotide substitution rates, widespread gene loss, reduced RNA editing, and extensive genomic rearrangements. Since most organelle-encoded proteins function in multi-subunit complexes that also contain nuclear-encoded proteins, it is likely that the atypical organellar phenomena affect the evolution of nuclear genes encoding organellar proteins. To begin to unravel the complex co-evolutionary interplay between organellar and nuclear genomes in this family, we sequenced nuclear transcriptomes of two species, Geranium maderense and Pelargonium x hortorum. Normalized cDNA libraries of G. maderense and P. x hortorum were used for transcriptome sequencing. Five assemblers (MIRA, Newbler, SOAPdenovo, SOAPdenovo-trans [SOAPtrans], Trinity) and two next-generation technologies (454 and Illumina) were compared to determine the optimal transcriptome sequencing approach. Trinity provided the highest quality assembly of Illumina data with the deepest transcriptome coverage. An analysis to determine the amount of sequencing needed for de novo assembly revealed diminishing returns of coverage and quality with data sets larger than sixty million Illumina paired end reads for both species. The G. maderense and P. x hortorum transcriptomes contained fewer transcripts encoding the PLS subclass of PPR proteins relative to other angiosperms, consistent with reduced mitochondrial RNA editing activity in Geraniaceae. In addition, transcripts for all six plastid targeted sigma factors were identified in both transcriptomes, suggesting that one of the highly divergent rpoA-like ORFs in the P. x hortorum plastid genome is functional. The findings support the use of the Illumina platform and assemblers optimized for transcriptome assembly, such as Trinity or SOAPtrans, to generate high-quality de novo transcriptomes with broad coverage. In addition

  4. A transcriptome resource for the koala (Phascolarctos cinereus): insights into koala retrovirus transcription and sequence diversity.

    Science.gov (United States)

    Hobbs, Matthew; Pavasovic, Ana; King, Andrew G; Prentis, Peter J; Eldridge, Mark D B; Chen, Zhiliang; Colgan, Donald J; Polkinghorne, Adam; Wilkins, Marc R; Flanagan, Cheyne; Gillett, Amber; Hanger, Jon; Johnson, Rebecca N; Timms, Peter

    2014-09-11

    The koala, Phascolarctos cinereus, is a biologically unique and evolutionarily distinct Australian arboreal marsupial. The goal of this study was to sequence the transcriptome from several tissues of two geographically separate koalas, and to create the first comprehensive catalog of annotated transcripts for this species, enabling detailed analysis of the unique attributes of this threatened native marsupial, including infection by the koala retrovirus. RNA-Seq data was generated from a range of tissues from one male and one female koala and assembled de novo into transcripts using Velvet-Oases. Transcript abundance in each tissue was estimated. Transcripts were searched for likely protein-coding regions and a non-redundant set of 117,563 putative protein sequences was produced. In similarity searches there were 84,907 (72%) sequences that aligned to at least one sequence in the NCBI nr protein database. The best alignments were to sequences from other marsupials. After applying a reciprocal best hit requirement of koala sequences to those from tammar wallaby, Tasmanian devil and the gray short-tailed opossum, we estimate that our transcriptome dataset represents approximately 15,000 koala genes. The marsupial alignment information was used to look for potential gene duplications and we report evidence for copy number expansion of the alpha amylase gene, and of an aldehyde reductase gene. Koala retrovirus (KoRV) transcripts were detected in the transcriptomes. These were analysed in detail and the structure of the spliced envelope gene transcript was determined. There was appreciable sequence diversity within KoRV, with 233 sites in the KoRV genome showing small insertions/deletions or single nucleotide polymorphisms. Both koalas had sequences from the KoRV-A subtype, but the male koala transcriptome has, in addition, sequences more closely related to the KoRV-B subtype. This is the first report of a KoRV-B-like sequence in a wild population. This transcriptomic
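
The "reciprocal best hit requirement" mentioned in this abstract can be sketched in a few lines of Python. This is an illustration of the general idea only, assuming similarity-search results are available as (query, subject, score) tuples; the function names, identifiers and scores below are hypothetical and are not taken from the study.

```python
# Minimal sketch of reciprocal best hit (RBH) filtering.
# Input format, names, and scores are assumptions made for illustration.

def best_hits(hits):
    """Map each query to its single highest-scoring subject."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Keep pairs (a, b) where b is a's best hit and a is b's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy searches in both directions between two species.
koala_vs_devil = [("k1", "d1", 200), ("k1", "d2", 150), ("k2", "d2", 90), ("k3", "d1", 80)]
devil_vs_koala = [("d1", "k1", 198), ("d2", "k2", 95), ("d2", "k1", 60)]

# k3's best hit is d1, but d1's best hit is k1, so k3 is dropped.
print(reciprocal_best_hits(koala_vs_devil, devil_vs_koala))
```

Requiring the best hit to hold in both directions is what makes the filter conservative: sequences such as `k3` that only match one way do not contribute to the gene count.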

  5. Sequencing and de novo assembly of the transcriptome of the glassy-winged sharpshooter (Homalodisca vitripennis).

    Directory of Open Access Journals (Sweden)

    Raja Sekhar Nandety

    BACKGROUND: The glassy-winged sharpshooter Homalodisca vitripennis (Hemiptera: Cicadellidae) is a xylem-feeding leafhopper and an important vector of the bacterium Xylella fastidiosa, the causal agent of Pierce's disease of grapevines. The functional complexity of the transcriptome of H. vitripennis has not been elucidated thus far. It is a necessary blueprint for an understanding of the development of H. vitripennis and for designing efficient biorational control strategies including those based on RNA interference. RESULTS: Here we elucidate and explore the transcriptome of adult H. vitripennis using high-throughput paired end deep sequencing and de novo assembly. A total of 32,803,656 paired-end reads were obtained with an average transcript length of 624 nucleotides. We assembled 32.9 Mb of the transcriptome of H. vitripennis that spanned across 47,265 loci and 52,708 transcripts. Comparison of our non-redundant database showed that 45% of the deduced proteins of H. vitripennis exhibit identity (e-value ≤ 1e-5) with known proteins. We assigned Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, and potential Pfam domains to each transcript isoform. In order to gain insight into the molecular basis of key regulatory genes of H. vitripennis, we characterized predicted proteins involved in the metabolism of juvenile hormone, and biogenesis of small RNAs (Dicer and Piwi sequences) from the transcriptomic sequences. Analysis of transposable element sequences of H. vitripennis indicated that the genome is less expanded in comparison to many other insects with approximately 1% of the transcriptome carrying transposable elements. CONCLUSIONS: Our data significantly enhance the molecular resources available for future study and control of this economically important hemipteran. This transcriptional information not only provides a more nuanced understanding of the underlying biological and physiological mechanisms that

  6. Analysis of a native whitefly transcriptome and its sequence divergence with two invasive whitefly species

    Directory of Open Access Journals (Sweden)

    Wang Xiao-Wei

    2012-10-01

    Background: Genomic divergence between invasive and native species may provide insight into the molecular basis underlying specific characteristics that drive the invasion and displacement of closely related species. In this study, we sequenced the transcriptome of an indigenous species, Asia II 3, of the Bemisia tabaci complex and compared its genetic divergence with the transcriptomes of two invasive whitefly species, Middle East Asia Minor 1 (MEAM1) and Mediterranean (MED), respectively. Results: More than 16 million reads of 74 base pairs in length were obtained for the Asia II 3 species using the Illumina sequencing platform. These reads were assembled into 52,535 distinct sequences (mean size: 466 bp) and 16,596 sequences were annotated with an E-value above 10^-5. Protein family comparisons revealed obvious diversification among the transcriptomes of these species suggesting species-specific adaptations during whitefly evolution. On the contrary, substantial conservation of the whitefly transcriptomes was also evident, despite their differences. The overall divergence of coding sequences between the orthologous gene pairs of Asia II 3 and MEAM1 is 1.73%, which is comparable to the average divergence of Asia II 3 and MED transcriptomes (1.84%) and much higher than that of MEAM1 and MED (0.83%). This is consistent with the previous phylogenetic analyses and crossing experiments suggesting these are distinct species. We also identified hundreds of highly diverged genes and compiled sequence identity data into gene functional groups and found the most divergent gene classes are Cytochrome P450, Glutathione metabolism and Oxidative phosphorylation. These results strongly suggest that the divergence of genes related to metabolism might be the driving force of the MEAM1 and Asia II 3 differentiation. We also analyzed single nucleotide polymorphisms within the orthologous gene pairs of indigenous and invasive whiteflies which are helpful for
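
The abstract reports percent divergence between orthologous coding sequences but does not say how it was computed. As one simple possibility, the uncorrected proportion of differing sites (p-distance) over an existing pairwise alignment can be sketched as follows; the function name and the toy sequences are hypothetical, and the study may have used a different measure.

```python
# Minimal sketch, assuming two already-aligned sequences of equal length in
# which "-" marks an alignment gap. This is an uncorrected p-distance and is
# not claimed to be the method used in the study.

def percent_divergence(seq_a, seq_b, gap="-"):
    """Percentage of aligned, ungapped positions at which the sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * mismatches / compared


# One mismatch in eight compared sites.
print(percent_divergence("ATGCATGC", "ATGCATGA"))

# The gapped column is ignored: one mismatch in five compared sites.
print(percent_divergence("ATG-CA", "ATGGCT"))
```

Averaging this value over all orthologous pairs would give a single figure per species pair, comparable in form to the percentages quoted in the abstract.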

  7. Interference management using direct sequence spread spectrum ...

    African Journals Online (AJOL)

    Interference management using direct sequence spread spectrum (DSSS) technique ... Journal of Fundamental and Applied Sciences ... Keywords: DSSS, LTE network; Wi-Fi network; SINR; interference management and interference power.

  8. Transcriptome sequencing and de novo analysis of the copepod Calanus sinicus using 454 GS FLX.

    Directory of Open Access Journals (Sweden)

    Juan Ning

    BACKGROUND: Despite their species abundance and primary economic importance, genomic information about copepods is still limited. In particular, genomic resources are lacking for the copepod Calanus sinicus, which is a dominant species in the coastal waters of East Asia. In this study, we performed de novo transcriptome sequencing to produce a large number of expressed sequence tags for the copepod C. sinicus. RESULTS: Copepodid larvae and adults were used as the basic material for transcriptome sequencing. Using 454 pyrosequencing, a total of 1,470,799 reads were obtained, which were assembled into 56,809 high quality expressed sequence tags. Based on their sequence similarity to known proteins, about 14,000 different genes were identified, including members of all major conserved signaling pathways. Transcripts that were putatively involved with growth, lipid metabolism, molting, and diapause were also identified among these genes. Differentially expressed genes related to several processes were found in C. sinicus copepodid larvae and adults. We detected 284,154 single nucleotide polymorphisms (SNPs) that provide a resource for gene function studies. CONCLUSION: Our data provide the most comprehensive transcriptome resource available for C. sinicus. This resource allowed us to identify genes associated with primary physiological processes and SNPs in coding regions, which facilitated the quantitative analysis of differential gene expression. These data should provide a foundation for future genetic and genomic studies of this and related species.

  9. Whole transcriptome sequencing enables discovery and analysis of viruses in archived primary central nervous system lymphomas.

    Directory of Open Access Journals (Sweden)

    Christopher DeBoever

    Primary central nervous system lymphomas (PCNSL) have a dramatically increased prevalence among persons living with AIDS and are known to be associated with human Epstein Barr virus (EBV) infection. Previous work suggests that in some cases, co-infection with other viruses may be important for PCNSL pathogenesis. Viral transcription in tumor samples can be measured using next generation transcriptome sequencing. We demonstrate the ability of transcriptome sequencing to identify viruses, characterize viral expression, and identify viral variants by sequencing four archived AIDS-related PCNSL tissue samples and analyzing raw sequencing reads. EBV was detected in all four PCNSL samples and cytomegalovirus (CMV), JC polyomavirus (JCV), and HIV were also discovered, consistent with clinical diagnoses. CMV was found to express three long non-coding RNAs recently reported as expressed during active infection. Single nucleotide variants were observed in each of the viruses detected and three indels were found in CMV. No viruses were found in several control tumor types including 32 diffuse large B-cell lymphoma samples. This study demonstrates the ability of next generation transcriptome sequencing to accurately identify viruses, including DNA viruses, in solid human cancer tissue samples.

  10. [Study on quality evaluation of sequence and SSR information in transcriptome of Astragalus membranaceus].

    Science.gov (United States)

    Chang, Yue; Yang, Song; Liu, Zhen-Peng; Ren, Wei-Chao; Liu, Jie; Ma, Wei

    2016-04-01

    In this study, 454/Roche GS FLX sequencing technology was used to obtain transcriptome data for Astragalus membranaceus. The 454 Sequencing System Software was applied to assemble the transcriptome de novo. Using the MISA tool, 9,893 unigenes of A. membranaceus were screened, and the SSR loci were analyzed. According to the results, the average read length was 413 bp, about 86% of the reads were incorporated into the assembly, the N50 length was 1,205 bp, and the number of unigenes was measured by the whole transcript. A total of 1,729 SSR loci were found in the A. membranaceus transcriptome; the occurrence frequency of SSR was 9.24%, the frequency of SSR in the whole transcriptome was 13.42%, and the average length of SSR was 7.97 kb. One hundred and twenty-seven kinds of core repeat sequences were found; the dominant type was the TG/AC dinucleotide, which accounted for 4.25% of the total SSR loci. The sequencing of the A. membranaceus transcriptome revealed its overall expression and yielded a large number of unigene sequences; SSR loci in A. membranaceus occur at high frequency, are diverse in type, and are highly polymorphic. Copyright© by the Chinese Pharmaceutical Association.
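
The abstract names MISA as the SSR search tool but gives no search parameters. The following Python sketch shows the general idea of a perfect-repeat SSR scan with regular expressions; it is not MISA itself, and the minimum-repeat thresholds (6 for dinucleotides, 5 for trinucleotides) are assumptions chosen for illustration rather than the settings used in the study.

```python
import re

# Minimal sketch of a perfect-SSR scan. MIN_REPEATS maps motif length to the
# smallest number of tandem copies that counts as an SSR; these thresholds are
# assumed for the example and are not taken from the abstract.
MIN_REPEATS = {2: 6, 3: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'AA')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One motif of unit_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence with a (TG)7 repeat at offset 3 and an (AAG)5 repeat at offset 20.
toy = "GGC" + "TG" * 7 + "CCA" + "AAG" * 5 + "T"
print(find_ssrs(toy))
```

Counting the sequences with at least one hit and dividing by the number of sequences scanned would give an occurrence frequency of the kind the abstract reports, although the exact definition used there is not stated.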

  11. Genetic signatures of adaptation revealed from transcriptome sequencing of Arctic and red foxes.

    Science.gov (United States)

    Kumar, Vikas; Kutschera, Verena E; Nilsson, Maria A; Janke, Axel

    2015-08-07

    The genus Vulpes (true foxes) comprises numerous species that inhabit a wide range of habitats and climatic conditions, including one species, the Arctic fox (Vulpes lagopus) which is adapted to the arctic region. A close relative to the Arctic fox, the red fox (Vulpes vulpes), occurs in subarctic to subtropical habitats. To study the genetic basis of their adaptations to different environments, transcriptome sequences from two Arctic foxes and one red fox individual were generated and analyzed for signatures of positive selection. In addition, the data allowed for a phylogenetic analysis and divergence time estimate between the two fox species. The de novo assembly of reads resulted in more than 160,000 contigs/transcripts per individual. Approximately 17,000 homologous genes were identified using human and the non-redundant databases. Positive selection analyses revealed several genes involved in various metabolic and molecular processes such as energy metabolism, cardiac gene regulation, apoptosis and blood coagulation to be under positive selection in foxes. Branch site tests identified four genes to be under positive selection in the Arctic fox transcriptome, two of which are fat metabolism genes. In the red fox transcriptome eight genes are under positive selection, including molecular process genes, notably genes involved in ATP metabolism. Analysis of the three transcriptomes and five Sanger re-sequenced genes in additional individuals identified a lower genetic variability within Arctic foxes compared to red foxes, which is consistent with distribution range differences and demographic responses to past climatic fluctuations. A phylogenomic analysis estimated that the Arctic and red fox lineages diverged about three million years ago. Transcriptome data are an economic way to generate genomic resources for evolutionary studies. Despite not representing an entire genome, this transcriptome analysis identified numerous genes that are relevant to arctic

  12. De novo transcriptome sequencing of axolotl blastema for identification of differentially expressed genes during limb regeneration

    Science.gov (United States)

    2013-01-01

    Background: Salamanders are unique among vertebrates in their ability to completely regenerate amputated limbs through the mediation of blastema cells located at the stump ends. This regeneration is nerve-dependent because blastema formation and regeneration does not occur after limb denervation. To obtain the genomic information of blastema tissues, de novo transcriptomes from both blastema tissues and denervated stump ends of Ambystoma mexicanum (axolotls) 14 days post-amputation were sequenced and compared using Solexa DNA sequencing. Results: The sequencing done for this study produced 40,688,892 reads that were assembled into 307,345 transcribed sequences. The N50 of transcribed sequence length was 562 bases. A similarity search with known proteins identified 39,200 different genes to be expressed during limb regeneration with a cut-off E-value exceeding 10^-5. We annotated assembled sequences by using gene descriptions, gene ontology, and clusters of orthologous group terms. Targeted searches using these annotations showed that the majority of the genes were in the categories of essential metabolic pathways, transcription factors and conserved signaling pathways, and novel candidate genes for regenerative processes. We discovered and confirmed numerous sequences of the candidate genes by using quantitative polymerase chain reaction and in situ hybridization. Conclusion: The results of this study demonstrate that de novo transcriptome sequencing allows gene expression analysis in a species lacking genome information and provides the most comprehensive mRNA sequence resources for axolotls. The characterization of the axolotl transcriptome can help elucidate the molecular mechanisms underlying blastema formation during limb regeneration. PMID:23815514
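
The N50 statistic quoted in this abstract (562 bases) is, under the usual definition, the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembled bases. A minimal Python sketch of that definition follows; the function name and the toy lengths are illustrative and not taken from the study.

```python
# Minimal sketch of the standard N50 definition for a list of sequence lengths.

def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # reached half of the total assembled bases
            return length


# Toy assembly: total is 20 bases; 6 + 5 = 11 covers at least half, so N50 is 5.
print(n50([2, 3, 4, 5, 6]))
```

Because the longest sequences are counted first, a few very long transcripts can raise N50 well above the mean length, which is why assemblies usually report both figures.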

  13. Transcriptome Sequencing of Chemically Induced Aquilaria sinensis to Identify Genes Related to Agarwood Formation.

    Science.gov (United States)

    Ye, Wei; Wu, Hongqing; He, Xin; Wang, Lei; Zhang, Weimin; Li, Haohua; Fan, Yunfei; Tan, Guohui; Liu, Taomei; Gao, Xiaoxia

    2016-01-01

    Agarwood is a traditional Chinese medicine used as a clinical sedative, carminative, and antiemetic drug. Agarwood is formed in Aquilaria sinensis when A. sinensis trees are threatened by external physical, chemical injury or endophytic fungal irritation. However, the mechanism of agarwood formation via chemical induction remains unclear. In this study, we characterized the transcriptome of different parts of a chemically induced A. sinensis trunk sample with agarwood. The Illumina sequencing platform was used to identify the genes involved in agarwood formation. A five-year-old Aquilaria sinensis treated by formic acid was selected. The white wood part (B1 sample), the transition part between agarwood and white wood (W2 sample), the agarwood part (J3 sample), and the rotten wood part (F5 sample) were collected for transcriptome sequencing. Accordingly, 54,685,634 clean reads, which were assembled into 83,467 unigenes, were obtained with a Q20 value of 97.5%. A total of 50,565 unigenes were annotated using the Nr, Nt, SWISS-PROT, KEGG, COG, and GO databases. In particular, 171,331,352 unigenes were annotated by various pathways, including the sesquiterpenoid (ko00909) and plant-pathogen interaction (ko03040) pathways. These pathways were related to sesquiterpenoid biosynthesis and defensive responses to chemical stimulation. The transcriptome data of the different parts of the chemically induced A. sinensis trunk provide a rich source of materials for discovering and identifying the genes involved in sesquiterpenoid production and in defensive responses to chemical stimulation. This study is the first to use de novo sequencing and transcriptome assembly for different parts of chemically induced A. sinensis. Results demonstrate that the sesquiterpenoid biosynthesis pathway and WRKY transcription factor play important roles in agarwood formation via chemical induction. The comparative analysis of the transcriptome data of agarwood and A. sinensis lays the foundation

  14. Transcriptome sequencing and De Novo analysis of Youngia japonica using the illumina platform.

    Directory of Open Access Journals (Sweden)

    Yulan Peng

    Youngia japonica, a weed species distributed worldwide, has been widely used in traditional Chinese medicine. It is an ideal plant for studying the evolution of Asteraceae plants because of its short life history and abundant source. However, little is known about its evolution and genetic diversity. In this study, de novo transcriptome sequencing was conducted for the first time for the comprehensive analysis of the genetic diversity of Y. japonica. The Y. japonica transcriptome was sequenced using Illumina paired-end sequencing technology. We produced 21,847,909 high-quality reads for Y. japonica and assembled them into contigs. A total of 51,850 unigenes were identified, among which 46,087 were annotated in the NCBI non-redundant protein database and 41,752 were annotated in the Swiss-Prot database. We mapped 9,125 unigenes onto 163 pathways using the Kyoto Encyclopedia of Genes and Genomes Pathway database. In addition, 3,648 simple sequence repeats (SSRs) were detected. Our data provide the most comprehensive transcriptome resource currently available for Y. japonica. C4 photosynthesis unigenes were found in the biological process of Y. japonica. There were 5,596 unigenes related to defense response and 1,344 unigenes related to signal transduction mechanisms (10.95%). These data provide insights into the genetic diversity of Y. japonica. Numerous SSRs contributed to the development of novel markers. These data may serve as a new valuable resource for genomic studies on Youngia and, more generally, Cichoraceae.

  15. Transcriptome analysis of blueberry using 454 EST sequencing

    Science.gov (United States)

    Blueberry (Vaccinium corymbosum) is a major berry crop in the United States, and one that has great nutritional and economical value. Next generation sequencing methodologies, such as 454, have been demonstrated to be successful and efficient in producing a snap-shot of transcriptional activities du...

  16. Sequencing and de novo analysis of a coral larval transcriptome using 454 GSFlx

    Directory of Open Access Journals (Sweden)

    Colbourne John K

    2009-05-01

    Background: New methods are needed for genomic-scale analysis of emerging model organisms that exemplify important biological questions but lack fully sequenced genomes. For example, there is an urgent need to understand the potential for corals to adapt to climate change, but few molecular resources are available for studying these processes in reef-building corals. To facilitate genomics studies in corals and other non-model systems, we describe methods for transcriptome sequencing using 454, as well as strategies for assembling a useful catalog of genes from the output. We have applied these methods to sequence the transcriptome of planulae larvae from the coral Acropora millepora. Results: More than 600,000 reads produced in a single 454 sequencing run were assembled into ~40,000 contigs with five-fold average sequencing coverage. Based on sequence similarity with known proteins, these analyses identified ~11,000 different genes expressed in a range of conditions including thermal stress and settlement induction. Assembled sequences were annotated with gene names, conserved domains, and Gene Ontology terms. Targeted searches using these annotations identified the majority of genes associated with essential metabolic pathways and conserved signaling pathways, as well as novel candidate genes for stress-related processes. Comparisons with the genome of the anemone Nematostella vectensis revealed ~8,500 pairs of orthologs and ~100 candidate coral-specific genes. More than 30,000 SNPs were detected in the coral sequences, and a subset of these validated by re-sequencing. Conclusion: The methods described here for deep sequencing of the transcriptome should be widely applicable to generate catalogs of genes and genetic markers in emerging model organisms. Our data provide the most comprehensive sequence resource currently available for reef-building corals, and include an extensive collection of potential genetic markers for association and

  17. Transcriptome sequencing of different narrow-leafed lupin tissue types provides a comprehensive uni-gene assembly and extensive gene-based molecular markers

    Science.gov (United States)

    Kamphuis, Lars G; Hane, James K; Nelson, Matthew N; Gao, Lingling; Atkins, Craig A; Singh, Karam B

    2015-01-01

    Narrow-leafed lupin (NLL; Lupinus angustifolius L.) is an important grain legume crop that is valuable for sustainable farming and is becoming recognized as a human health food. NLL breeding is directed at improving grain production, disease resistance, drought tolerance and health benefits. However, genetic and genomic studies have been hindered by a lack of extensive genomic resources for the species. Here, the generation, de novo assembly and annotation of transcriptome datasets derived from five different NLL tissue types of the reference accession cv. Tanjil are described. The Tanjil transcriptome was compared to transcriptomes of an early domesticated cv. Unicrop, a wild accession P27255, as well as accession 83A:476, together being the founding parents of two recombinant inbred line (RIL) populations. In silico predictions for transcriptome-derived gene-based length and SNP polymorphic markers were conducted and corroborated using a survey assembly sequence for NLL cv. Tanjil. This yielded extensive indel and SNP polymorphic markers for the two RIL populations. A total of 335 transcriptome-derived markers and 66 BAC-end sequence-derived markers were evaluated, and 275 polymorphic markers were selected to genotype the reference NLL 83A:476 × P27255 RIL population. This significantly improved the completeness, marker density and quality of the reference NLL genetic map. PMID:25060816

  18. Soybean (Glycine max) SWEET gene family: insights through comparative genomics, transcriptome profiling and whole genome re-sequence analysis.

    Science.gov (United States)

    Patil, Gunvant; Valliyodan, Babu; Deshmukh, Rupesh; Prince, Silvas; Nicander, Bjorn; Zhao, Mingzhe; Sonah, Humira; Song, Li; Lin, Li; Chaudhary, Juhi; Liu, Yang; Joshi, Trupti; Xu, Dong; Nguyen, Henry T

    2015-07-11

    SWEET (MtN3_saliva) domain proteins, a recently identified group of efflux transporters, play an indispensable role in sugar efflux, phloem loading, plant-pathogen interaction and reproductive tissue development. The SWEET gene family is predominantly studied in Arabidopsis and members of the family are being investigated in rice. To date, no transcriptome or genomics analysis of soybean SWEET genes has been reported. In the present investigation, we explored the evolutionary aspect of the SWEET gene family in diverse plant species, from primitive single cell algae to angiosperms, with a major emphasis on Glycine max. Evolutionary features showed expansion and duplication of the SWEET gene family in land plants. Homology searches with BLAST tools and Hidden Markov Model-directed sequence alignments identified 52 SWEET genes that were mapped to 15 chromosomes in the soybean genome as tandem duplication events. Soybean SWEET (GmSWEET) genes showed a wide range of expression profiles in different tissues and developmental stages. Analysis of public transcriptome data and expression profiling using quantitative real time PCR (qRT-PCR) showed that a majority of the GmSWEET genes were confined to reproductive tissue development. Several natural genetic variants (non-synonymous SNPs, premature stop codons and haplotype) were identified in the GmSWEET genes using whole genome re-sequencing data analysis of 106 soybean genotypes. A significant association was observed between SNP-haplogroup and seed sucrose content in three gene clusters on chromosome 6. The present investigation utilized comparative genomics, transcriptome profiling and whole genome re-sequencing approaches, provided a systematic description of soybean SWEET genes, and identified putative candidates with probable roles in reproductive tissue development. Gene expression profiling at different developmental stages and genomic variation data will serve as an important resource for the soybean research

  19. Genetic signatures of adaptation revealed from transcriptome sequencing of Arctic and red foxes

    OpenAIRE

    Kumar, Vikas; Kutschera, Verena E.; Nilsson, Maria A.; Janke, Axel

    2015-01-01

    Background The genus Vulpes (true foxes) comprises numerous species that inhabit a wide range of habitats and climatic conditions, including one species, the Arctic fox (Vulpes lagopus) which is adapted to the arctic region. A close relative to the Arctic fox, the red fox (Vulpes vulpes), occurs in subarctic to subtropical habitats. To study the genetic basis of their adaptations to different environments, transcriptome sequences from two Arctic foxes and one red fox individual were generated...

  20. Comparative transcriptome sequencing and de novo analysis of Vaccinium corymbosum during fruit and color development.

    Science.gov (United States)

    Li, Lingli; Zhang, Hehua; Liu, Zhongshuai; Cui, Xiaoyue; Zhang, Tong; Li, Yanfang; Zhang, Lingyun

    2016-10-12

    Blueberry is an economically important fruit crop in the Ericaceae family. The substantial quantities of flavonoids in blueberry have been implicated in a broad range of health benefits. However, transcriptome-level information on fruit development and flavonoid metabolites is still limited. In the present study, the transcriptome and gene expression were profiled over berry development, especially during color development. A total of approximately 13.67 Gbp of data were obtained and assembled into 186,962 transcripts and 80,836 unigenes from three stages of blueberry fruit and color development. A large number of simple sequence repeats (SSRs) and candidate genes, which are potentially involved in plant development, metabolic and hormone pathways, were identified. A total of 6429 sequences containing 8796 SSRs were characterized from 15,457 unigenes, and 1763 unigenes contained more than one SSR. The expression profiles of key genes involved in anthocyanin biosynthesis were also studied. In addition, a comparison between our dataset and other published results was carried out. The high quality reads produced in this study are an important advancement and provide a new resource for the interpretation of high-throughput data for blueberry species, in terms of both sequencing depth and species coverage. This transcriptome data will serve as a valuable public resource for studies of the blueberry genome and should greatly boost research on fruit and color development, flavonoid metabolism and regulation, and the breeding of more healthful blueberries.

  1. Transcriptome sequencing and annotation for the Jamaican fruit bat (Artibeus jamaicensis).

    Directory of Open Access Journals (Sweden)

    Timothy I Shaw

    The Jamaican fruit bat (Artibeus jamaicensis) is one of the most common bats in the tropical Americas. It is thought to be a potential reservoir host of Tacaribe virus, an arenavirus closely related to the South American hemorrhagic fever viruses. We performed transcriptome sequencing and annotation from lung, kidney and spleen tissues using 454 and Illumina platforms to develop this species as an animal model. More than 100,000 contigs were assembled, with 25,000 genes that were functionally annotated. Of the remaining unannotated contigs, 80% were found within bat genomes or transcriptomes. Annotated genes are involved in a broad range of activities ranging from cellular metabolism to genome regulation through ncRNAs. Reciprocal BLAST best hits yielded 8,785 sequences that are orthologous to mouse, rat, cattle, horse and human. Species tree analysis of sequences from 2,378 loci was used to achieve 95% bootstrap support for the placement of bat as sister to the clade containing horse, dog, and cattle. Through substitution rate estimation between bat and human, 32 genes were identified with evidence for positive selection. We also identified 466 immune-related genes, which may be useful for studying Tacaribe virus infection of this species. The Jamaican fruit bat transcriptome dataset is a resource that should provide additional candidate markers for studying bat evolution and ecology, and tools for analysis of the host response and pathology of disease.
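
    The abstract above reports orthologs found through reciprocal BLAST best hits. As a minimal sketch of that general idea (not the authors' pipeline), the Python below assumes each search result has already been reduced to hypothetical (query, subject, bit score) tuples; the function names and the sample identifiers are illustrative only.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples, an
    assumed simplification of tabular similarity-search output.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical identifiers, for illustration only.
bat_vs_human = [("bat1", "hs1", 200), ("bat1", "hs2", 150), ("bat2", "hs2", 90)]
human_vs_bat = [("hs1", "bat1", 210), ("hs2", "bat1", 160), ("hs2", "bat2", 80)]
pairs = reciprocal_best_hits(bat_vs_human, human_vs_bat)
```

    Here bat2 is dropped because its best human hit (hs2) prefers bat1 in the reverse search, which is the filtering behaviour that makes reciprocal best hits a conservative ortholog criterion.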

  2. De novo assembly, characterization and functional annotation of pineapple fruit transcriptome through massively parallel sequencing.

    Science.gov (United States)

    Ong, Wen Dee; Voo, Lok-Yung Christopher; Kumar, Vijay Subbiah

    2012-01-01

    Pineapple (Ananas comosus var. comosus) is an important tropical non-climacteric fruit with high commercial potential. Understanding the mechanisms and processes underlying fruit ripening would enable scientists to improve quality traits such as flavor, texture, appearance and fruit sweetness. Although the pineapple is an important fruit, insufficient transcriptomic or genomic information is available in public databases. Application of high throughput transcriptome sequencing to profile the pineapple fruit transcripts is therefore needed. To facilitate this, we have performed transcriptome sequencing of ripe yellow pineapple fruit flesh using Illumina technology. About 4.7 million Illumina paired-end reads were generated and assembled using the Velvet de novo assembler. The assembly produced 28,728 unique transcripts with a mean length of approximately 200 bp. Sequence similarity search against the non-redundant NCBI database identified a total of 16,932 unique transcripts (58.93%) with significant hits. Out of these, 15,507 unique transcripts were assigned to gene ontology terms. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes pathway database identified 13,598 unique transcripts (47.33%), which were mapped to 126 pathways. The assembly revealed many transcripts that were previously unknown. The unique transcripts derived from this work have rapidly increased the number of pineapple fruit mRNA transcripts available in public databases. This information can be further utilized in gene expression, genomics and other functional genomics studies in pineapple.

  3. "Transcriptomics": molecular diagnosis of inborn errors of metabolism via RNA-sequencing.

    Science.gov (United States)

    Kremer, Laura S; Wortmann, Saskia B; Prokisch, Holger

    2018-01-25

    Exome-wide sequencing techniques have revolutionized molecular diagnostics in patients with suspected inborn errors of metabolism or neuromuscular disorders. However, the diagnostic yield of 25-60% still leaves a large fraction of individuals without a diagnosis. This indicates a causative role for non-exonic regulatory variants not covered by whole exome sequencing. Here we review how systematic RNA-sequencing analysis (RNA-seq, "transcriptomics") led to a molecular diagnosis in 10-35% of patients in whom whole exome sequencing failed to do so. Importantly, RNA-sequencing based discoveries can not only guide molecular diagnosis but might also reveal therapeutic intervention points such as antisense oligonucleotide treatment for splicing defects, as recently reported for spinal muscular atrophy.

  4. De novo sequencing, assembly and characterization of antennal transcriptome of Anomala corpulenta Motschulsky (Coleoptera: Rutelidae).

    Directory of Open Access Journals (Sweden)

    Haoliang Chen

    Anomala corpulenta is an important insect pest and can cause enormous economic losses in agriculture, horticulture and forestry. It is widely distributed in China, and both larvae and adults can cause serious damage. It is difficult to control this pest because the larvae live underground. Any new control strategy should exploit alternatives to heavily and frequently used chemical insecticides. However, little genetic research has been carried out on A. corpulenta due to the lack of genomic resources. Genomic resources could be produced by next generation sequencing technologies at low cost and in a short time. In this study, we performed de novo sequencing, assembly and characterization of the antennal transcriptome of A. corpulenta. Illumina sequencing technology was used to sequence the antennal transcriptome of A. corpulenta. Approximately 76.7 million total raw reads and about 68.9 million total clean reads were obtained, and then 35,656 unigenes were assembled. Of these unigenes, 21,463 could be annotated in the NCBI nr database, and, among the annotated unigenes, 11,154 and 6,625 unigenes could be assigned to GO and COG, respectively. Additionally, 16,350 unigenes could be annotated in the Swiss-Prot database, and 14,499 unigenes could be mapped onto 258 pathways in the KEGG Pathway database. We also found 24 unigenes related to OBPs, 6 to CSPs, and in total 167 unigenes related to chemodetection. We analyzed 4 OBP and 3 CSP sequences, and their RT-qPCR results agreed well with their FPKM values. We produced the first large-scale antennal transcriptome of A. corpulenta, which is a species that has little genomic information in public databases. The identified chemodetection unigenes can promote the molecular mechanistic study of behavior in A. corpulenta. These findings provide a general sequence resource for molecular genetics research on A. corpulenta.

  5. Optimizing and benchmarking de novo transcriptome sequencing: from library preparation to assembly evaluation.

    Science.gov (United States)

    Hara, Yuichiro; Tatsumi, Kaori; Yoshida, Michio; Kajikawa, Eriko; Kiyonari, Hiroshi; Kuraku, Shigehiro

    2015-11-18

    RNA-seq enables gene expression profiling in selected spatiotemporal windows and yields massive sequence information with relatively low cost and time investment, even for non-model species. However, there remains considerable room for optimizing its workflow, in order to take full advantage of continuously developing sequencing capacity. Transcriptome sequencing for three embryonic stages of Madagascar ground gecko (Paroedura picta) was performed with the Illumina platform. The output reads were assembled de novo for reconstructing transcript sequences. In order to evaluate the completeness of transcriptome assemblies, we prepared a reference gene set consisting of vertebrate one-to-one orthologs. To take advantage of increased read length of >150 nt, we demonstrated shortened RNA fragmentation time, which resulted in a dramatic shift of insert size distribution. To evaluate products of multiple de novo assembly runs incorporating reads with different RNA sources, read lengths, and insert sizes, we introduce a new reference gene set, core vertebrate genes (CVG), consisting of 233 genes that are shared as one-to-one orthologs by all vertebrate genomes examined (29 species). The completeness assessment performed by the computational pipelines CEGMA and BUSCO referring to CVG demonstrated higher accuracy and resolution than with the gene set previously established for this purpose. As a result of the assessment with CVG, we have derived the most comprehensive transcript sequence set of the Madagascar ground gecko by means of assembling individual libraries followed by clustering the assembled sequences based on their overall similarities. Our results provide several insights into optimizing de novo RNA-seq workflow, including the coordination between library insert size and read length, which manifested in improved connectivity of assemblies. The approach and assembly assessment with CVG demonstrated here would be applicable to transcriptome analysis of other species as

  6. Transcriptomic analysis of Petunia hybrida in response to salt stress using high throughput RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Gonzalo H Villarino

    Salinity and drought stress are the primary cause of crop losses worldwide. In sodic saline soils sodium chloride (NaCl) disrupts normal plant growth and development. The complex interactions of plant systems with abiotic stress have made RNA sequencing a more holistic and appealing approach to study transcriptome level responses in a single cell and/or tissue. In this work, we determined the Petunia transcriptome response to NaCl stress by sequencing leaf samples and assembling 196 million Illumina reads with Trinity software. Using our reference transcriptome we identified more than 7,000 genes that were differentially expressed within 24 h of acute NaCl stress. The proposed transcriptome can also be used as an excellent tool for biological and bioinformatics in the absence of an available Petunia genome and it is available at the SOL Genomics Network (SGN) http://solgenomics.net. Genes related to regulation of reactive oxygen species, transport, and signal transductions as well as novel and undescribed transcripts were among those differentially expressed in response to salt stress. The candidate genes identified in this study can be applied as markers for breeding or to genetically engineer plants to enhance salt tolerance. Gene Ontology analyses indicated that most of the NaCl damage happened at 24 h inducing genotoxicity, affecting transport and organelles due to the high concentration of Na+ ions. Finally, we report a modification to the library preparation protocol whereby cDNA samples were bar-coded with non-HPLC purified primers, without affecting the quality and quantity of the RNA-seq data. The methodological improvement presented here could substantially reduce the cost of sample preparation for future high-throughput RNA sequencing experiments.

  7. Transcriptomic analysis of Petunia hybrida in response to salt stress using high throughput RNA sequencing.

    Science.gov (United States)

    Villarino, Gonzalo H; Bombarely, Aureliano; Giovannoni, James J; Scanlon, Michael J; Mattson, Neil S

    2014-01-01

    Salinity and drought stress are the primary cause of crop losses worldwide. In sodic saline soils sodium chloride (NaCl) disrupts normal plant growth and development. The complex interactions of plant systems with abiotic stress have made RNA sequencing a more holistic and appealing approach to study transcriptome level responses in a single cell and/or tissue. In this work, we determined the Petunia transcriptome response to NaCl stress by sequencing leaf samples and assembling 196 million Illumina reads with Trinity software. Using our reference transcriptome we identified more than 7,000 genes that were differentially expressed within 24 h of acute NaCl stress. The proposed transcriptome can also be used as an excellent tool for biological and bioinformatics in the absence of an available Petunia genome and it is available at the SOL Genomics Network (SGN) http://solgenomics.net. Genes related to regulation of reactive oxygen species, transport, and signal transductions as well as novel and undescribed transcripts were among those differentially expressed in response to salt stress. The candidate genes identified in this study can be applied as markers for breeding or to genetically engineer plants to enhance salt tolerance. Gene Ontology analyses indicated that most of the NaCl damage happened at 24 h inducing genotoxicity, affecting transport and organelles due to the high concentration of Na+ ions. Finally, we report a modification to the library preparation protocol whereby cDNA samples were bar-coded with non-HPLC purified primers, without affecting the quality and quantity of the RNA-seq data. The methodological improvement presented here could substantially reduce the cost of sample preparation for future high-throughput RNA sequencing experiments.

  8. Transcriptome profiling of testis during sexual maturation stages in Eriocheir sinensis using Illumina sequencing.

    Directory of Open Access Journals (Sweden)

    Lin He

    The testis is a highly specialized tissue that plays dual roles in ensuring fertility by producing spermatozoa and hormones. Spermatogenesis is a complex process, resulting in the production of mature sperm from primordial germ cells. Significant structural and biochemical changes take place in the seminiferous epithelium of the adult testis during spermatogenesis. The gene expression pattern of testis in Chinese mitten crab (Eriocheir sinensis) has not been extensively studied, and limited genetic research has been performed on this species. The advent of high-throughput sequencing technologies enables the generation of genomic resources within a short period of time and at minimal cost. In the present study, we performed de novo transcriptome sequencing to produce a comprehensive transcript dataset for testis of E. sinensis. In two runs, we produced 25,698,778 sequencing reads corresponding to 2.31 Gb total nucleotides. These reads were assembled into 342,753 contigs or 141,861 scaffold sequences, which identified 96,311 unigenes. Based on similarity searches with known proteins, 39,995 unigenes were annotated based on having a Blast hit in the non-redundant database or ESTscan results with a cut-off E-value above 10^-5. This is the first report of a mitten crab transcriptome using high-throughput sequencing technology, and all these testis transcripts can help us understand the molecular mechanisms involved in spermatogenesis and testis maturation.

  9. Evaluating Methods for Isolating Total RNA and Predicting the Success of Sequencing Phylogenetically Diverse Plant Transcriptomes

    Science.gov (United States)

    Bruskiewich, Richard; Burris, Jason N.; Carrigan, Charlotte T.; Chase, Mark W.; Clarke, Neil D.; Covshoff, Sarah; dePamphilis, Claude W.; Edger, Patrick P.; Goh, Falicia; Graham, Sean; Greiner, Stephan; Hibberd, Julian M.; Jordon-Thaden, Ingrid; Kutchan, Toni M.; Leebens-Mack, James; Melkonian, Michael; Miles, Nicholas; Myburg, Henrietta; Patterson, Jordan; Pires, J. Chris; Ralph, Paula; Rolf, Megan; Sage, Rowan F.; Soltis, Douglas; Soltis, Pamela; Stevenson, Dennis; Stewart, C. Neal; Surek, Barbara; Thomsen, Christina J. M.; Villarreal, Juan Carlos; Wu, Xiaolei; Zhang, Yong; Deyholos, Michael K.; Wong, Gane Ka-Shu

    2012-01-01

    Next-generation sequencing plays a central role in the characterization and quantification of transcriptomes. Although numerous metrics are purported to quantify the quality of RNA, there have been no large-scale empirical evaluations of the major determinants of sequencing success. We used a combination of existing and newly developed methods to isolate total RNA from 1115 samples from 695 plant species in 324 families, which represents >900 million years of phylogenetic diversity from green algae through flowering plants, including many plants of economic importance. We then sequenced 629 of these samples on Illumina GAIIx and HiSeq platforms and performed a large comparative analysis to identify predictors of RNA quality and the diversity of putative genes (scaffolds) expressed within samples. Tissue types (e.g., leaf vs. flower) varied in RNA quality, sequencing depth and the number of scaffolds. Tissue age also influenced RNA quality but not the number of scaffolds ≥1000 bp. Overall, 36% of the variation in the number of scaffolds was explained by metrics of RNA integrity (RIN score), RNA purity (OD 260/230), sequencing platform (GAIIx vs HiSeq) and the amount of total RNA used for sequencing. However, our results show that the most commonly used measures of RNA quality (e.g., RIN) are weak predictors of the number of scaffolds because Illumina sequencing is robust to variation in RNA quality. These results provide novel insight into the methods that are most important in isolating high quality RNA for sequencing and assembling plant transcriptomes. The methods and recommendations provided here could increase the efficiency and decrease the cost of RNA sequencing for individual labs and genome centers. PMID:23185583

  10. Sequencing and de novo transcriptome assembly of the Chinese giant salamander (Andrias davidianus)

    Directory of Open Access Journals (Sweden)

    Yong Huang

    2017-06-01

    Next-generation technologies for determining genomic and transcriptomic composition have a wide range of applications. Andrias davidianus, a salamander endemic to China, has become an endangered amphibian species. However, molecular information on the species is lacking. In this study, we obtained RNA-Seq data from a pool of A. davidianus tissues including spleen, liver, muscle, kidney, skin, testis, gut and heart using the Illumina HiSeq 2500 platform. A total of 15,398,997,600 bp were obtained, corresponding to 102,659,984 raw reads. A total of 102,659,984 reads were filtered after removing low-quality reads and trimming the adapter sequences. The Trinity program was used to de novo assemble 132,912 unigenes with an average length of 690 bp and N50 of 1263 bp. Unigenes were annotated through a number of databases. These transcriptomic data of A. davidianus should open the door to molecular evolution studies based on the entire transcriptome or targeted genes of interest. The raw data from this study are available in the NCBI SRA database under accession number SRP099564.
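
    The record above summarizes its assembly by average unigene length and N50. For readers unfamiliar with the statistic, the following Python is a minimal sketch of the usual N50 definition: the length L such that sequences of length L or longer contain at least half of all assembled bases. The function name and the example lengths are illustrative and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the
    length of the sequence at which the running total first reaches
    half of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Illustrative unigene lengths in bp (not data from the record above).
example_lengths = [200, 300, 500, 1000, 2000]
example_n50 = n50(example_lengths)
```

    Because N50 weights long sequences heavily, it typically exceeds the mean length, which is consistent with the record reporting an N50 of 1263 bp against an average of 690 bp.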

  11. The Pseudomonas aeruginosa transcriptome in planktonic cultures and static biofilms using RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Andreas Dötsch

    In this study, we evaluated how gene expression differs in mature Pseudomonas aeruginosa biofilms as opposed to planktonic cells by the use of RNA sequencing technology that gives rise to both quantitative and qualitative information on the transcriptome. Although a large proportion of genes were consistently regulated in both the stationary phase and biofilm cultures as opposed to the late exponential growth phase cultures, the global biofilm gene expression pattern was clearly distinct, indicating that biofilms are not just surface-attached cells in stationary phase. A large number of the genes found to be biofilm specific were involved in adaptation to microaerophilic growth conditions, repression of type three secretion and production of extracellular matrix components. Additionally, we found many small RNAs to be differentially regulated, most of them similarly in stationary phase cultures and biofilms. A qualitative analysis of the RNA-seq data revealed more than 3000 putative transcriptional start sites (TSS). By the use of rapid amplification of cDNA ends (5'-RACE) we confirmed the presence of three different TSS associated with the pqsABCDE operon, two in the promoter of pqsA and one upstream of the second gene, pqsB. Taken together, this study reports the first transcriptome study on P. aeruginosa that employs RNA sequencing technology and provides insights into the quantitative and qualitative transcriptome including the expression of small RNAs in P. aeruginosa biofilms.

  12. Transcriptome sequencing of two phenotypic mosaic Eucalyptus trees reveals large scale transcriptome re-modelling.

    Directory of Open Access Journals (Sweden)

    Amanda Padovan

    Phenotypic mosaic trees offer an ideal system for studying differential gene expression. We have investigated two mosaic eucalypt trees from two closely related species (Eucalyptus melliodora and E. sideroxylon), which each support two types of leaves: one part of the canopy is resistant to insect herbivory and the remaining leaves are susceptible. Driving this ecological distinction are differences in plant secondary metabolites. We used these phenotypic mosaics to investigate genome wide patterns of foliar gene expression with the aim of identifying patterns of differential gene expression and the somatic mutation(s) that lead to this phenotypic mosaicism. We sequenced the mRNA pool from leaves of the resistant and susceptible ecotypes from both mosaic eucalypts using the Illumina HiSeq 2000 platform. We found large differences in pathway regulation and gene expression between the ecotypes of each mosaic. The expression of the genes in the MVA and MEP pathways is reflected by variation in leaf chemistry; however, this is not the case for the terpene synthases. Apart from the terpene biosynthetic pathway, there are several other metabolic pathways that are differentially regulated between the two ecotypes, suggesting there is much more phenotypic diversity than has been described. Despite the close relationship between the two species, they show large differences in the global patterns of gene and pathway regulation.

  13. RNA sequencing of the exercise transcriptome in equine athletes.

    Directory of Open Access Journals (Sweden)

    Stefano Capomaccio

    The horse is an optimal model organism for studying the genomic response to exercise-induced stress, due to its natural aptitude for athletic performance and the relative homogeneity of its genetic and environmental backgrounds. Here, we applied RNA-sequencing analysis through the use of SOLiD technology in an experimental framework centered on exercise-induced stress during endurance races in equine athletes. We monitored the transcriptional landscape by comparing gene expression levels between animals at rest and after competition. Overall, we observed a shift from coding to non-coding regions, suggesting that the stress response involves the differential expression of unannotated regions. Notably, we observed significant post-race increases of reads that correspond to repeats, especially the intergenic and intronic L1 and L2 transposable elements. We also observed increased expression of the antisense strands compared to the sense strands in intronic and regulatory regions (1 kb up- and downstream of the genes), suggesting that antisense transcription could be one of the main mechanisms for transposon regulation in the horse under stress conditions. We identified a large number of transcripts corresponding to intergenic and intronic regions putatively associated with new transcriptional elements. Gene expression and pathway analysis allowed us to identify several biological processes and molecular functions that may be involved with exercise-induced stress. Ontology clustering reflected mechanisms that are already known to be stress activated (e.g., chemokine-type cytokines, Toll-like receptors, and kinases), as well as "nucleic acid binding" and "signal transduction activity" functions. There was also a general and transient decrease in the global rates of protein synthesis, which would be expected after strenuous global stress. In sum, our network analysis points toward the involvement of specific gene clusters in equine exercise

  14. Sequencing and Characterization of Divergent Marbling Levels in the Beef Cattle ( Muscle Transcriptome

    Directory of Open Access Journals (Sweden)

    Dong Chen

    2015-02-01

    Marbling is an important trait regarding the quality of beef. Analysis of the beef cattle transcriptome and its expression profile data is essential to extend the genetic information resources and would support further studies on beef cattle. RNA sequencing was performed in beef cattle using the Illumina HiSeq 2000 platform. Approximately 251.58 million clean reads were generated from a high marbling (H) group and a low marbling (L) group. Approximately 80.12% of the 19,994 bovine genes (protein coding) were detected in all samples, and 749 genes exhibited differential expression between the H and L groups based on fold change (>1.5-fold, p<0.05). Multiple gene ontology terms and biological pathways were found significantly enriched among the differentially expressed genes. The transcriptome data will facilitate future functional studies on marbling formation in beef cattle and may be applied to improve breeding programs for cattle and closely related mammals.
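
    The abstract above selects differentially expressed genes by a fold-change threshold combined with a p-value cutoff (>1.5-fold, p<0.05). The Python below is a minimal sketch of such a filter, assuming (the abstract does not say) that fold change is taken symmetrically between the two group means and that a p-value has already been computed by some upstream test; the function and parameter names are hypothetical.

```python
def is_differential(mean_h, mean_l, p_value, min_fold=1.5, alpha=0.05):
    """Flag a gene as differentially expressed between two groups.

    mean_h and mean_l are mean expression values for the high and low
    marbling groups. The fold change is taken in whichever direction is
    larger (an assumption), so up- and down-regulation are both caught.
    Genes with non-positive expression in either group are skipped to
    avoid division by zero.
    """
    if mean_h <= 0 or mean_l <= 0:
        return False
    fold = max(mean_h / mean_l, mean_l / mean_h)
    return fold > min_fold and p_value < alpha


# Illustrative values only: a threefold difference with p = 0.01.
flagged = is_differential(30.0, 10.0, 0.01)
```

    Requiring both criteria guards against calling genes that are statistically significant but barely changed, or strongly changed but poorly supported.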

  15. Deep Sequencing Reveals Uncharted Isoform Heterogeneity of the Protein-Coding Transcriptome in Cerebral Ischemia.

    Science.gov (United States)

    Bhattarai, Sunil; Aly, Ahmed; Garcia, Kristy; Ruiz, Diandra; Pontarelli, Fabrizio; Dharap, Ashutosh

    2018-06-03

    Gene expression in cerebral ischemia has been a subject of intense investigations for several years. Studies utilizing probe-based high-throughput methodologies such as microarrays have contributed significantly to our existing knowledge but lacked the capacity to dissect the transcriptome in detail. Genome-wide RNA-sequencing (RNA-seq) enables comprehensive examinations of transcriptomes for attributes such as strandedness, alternative splicing, alternative transcription start/stop sites, and sequence composition, thus providing a very detailed account of gene expression. Leveraging this capability, we conducted an in-depth, genome-wide evaluation of the protein-coding transcriptome of the adult mouse cortex after transient focal ischemia at 6, 12, or 24 h of reperfusion using RNA-seq. We identified a total of 1007 transcripts at 6 h, 1878 transcripts at 12 h, and 1618 transcripts at 24 h of reperfusion that were significantly altered as compared to sham controls. With isoform-level resolution, we identified 23 splice variants arising from 23 genes that were novel mRNA isoforms. For a subset of genes, we detected reperfusion time-point-dependent splice isoform switching, indicating an expression and/or functional switch for these genes. Finally, for 286 genes across all three reperfusion time-points, we discovered multiple, distinct, simultaneously expressed and differentially altered isoforms per gene that were generated via alternative transcription start/stop sites. Of these, 165 isoforms derived from 109 genes were novel mRNAs. Together, our data unravel the protein-coding transcriptome of the cerebral cortex at an unprecedented depth to provide several new insights into the flexibility and complexity of stroke-related gene transcription and transcript organization.

  16. Development of genic-SSR markers by deep transcriptome sequencing in pigeonpea [Cajanus cajan (L.) Millspaugh]

    Science.gov (United States)

    2011-01-01

    Background Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. Results In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population. 
Conclusion We developed 550 validated genic
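
    The polymorphism information content (PIC) values quoted in this record can be illustrated with a short sketch. The abstract does not say which PIC formula was applied; the code below assumes the commonly used Botstein et al. definition, and the function name and example allele frequencies are hypothetical.

```python
from itertools import combinations


def polymorphism_information_content(allele_freqs):
    """PIC for one marker, assuming the Botstein et al. definition:
    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.
    `allele_freqs` are the allele frequencies at the locus (must sum to 1).
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for matings that are uninformative despite heterozygosity.
    correction = sum(2 * (p * q) ** 2 for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - correction


if __name__ == "__main__":
    # Hypothetical marker with four equally frequent alleles.
    print(polymorphism_information_content([0.25, 0.25, 0.25, 0.25]))
```

    Under this definition a marker with more, evenly distributed alleles scores higher, which is why PIC is often reported alongside the allele count per locus.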

  17. Transcriptome sequencing of the Antarctic vascular plant Deschampsia antarctica Desv. under abiotic stress.

    Science.gov (United States)

    Lee, Jungeun; Noh, Eun Kyeung; Choi, Hyung-Seok; Shin, Seung Chul; Park, Hyun; Lee, Hyoungseok

    2013-03-01

    Antarctic hairgrass (Deschampsia antarctica Desv.) is the only natural grass species in the maritime Antarctic. It has been studied as an extremophile that has successfully adapted to marginal land with the harshest environment for terrestrial plants. However, limited genetic research has focused on this species due to the lack of genomic resources. Here, we present the first de novo assembly of its transcriptome by massive parallel sequencing and its expression profile using D. antarctica grown under various stress conditions. Total sequence reads generated by pyrosequencing were assembled into 60,765 unigenes (28,177 contigs and 32,588 singletons). A total of 29,173 unique protein-coding genes were identified based on sequence similarities to known proteins. The combined results from all three stress conditions indicated differential expression of 3,110 genes. Quantitative reverse transcription polymerase chain reaction showed that several well-known stress-responsive genes encoding late embryogenesis abundant protein, dehydrin 1, and ice recrystallization inhibition protein were induced dramatically and that genes encoding U-box-domain-containing protein, electron transfer flavoprotein-ubiquinone, and F-box-containing protein were induced by abiotic stressors in a manner conserved with other plant species. We identified more than 2,000 simple sequence repeats that can be developed as functional molecular markers. This dataset is the most comprehensive transcriptome resource currently available for D. antarctica and is therefore expected to be an important foundation for future genetic studies of grasses and extremophiles.
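
    Simple sequence repeat (SSR) discovery in assembled unigenes, as mentioned in this record, can be sketched with a regular expression. This is a deliberately simplistic illustration and not the tool used by the authors, which the abstract does not name; the function name, the unit-size range and the minimum length are assumptions, and compound or imperfect repeats are not handled.

```python
import re


def find_ssrs(sequence, min_unit=2, max_unit=6, min_length=12):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    Units of min_unit..max_unit bases are considered; runs shorter than
    min_length bases and homopolymer runs are discarded.
    """
    sequence = sequence.upper()
    # Lazy quantifier: prefer the shortest unit that repeats in tandem.
    pattern = re.compile(r"([ACGT]{%d,%d}?)\1+" % (min_unit, max_unit))
    hits = []
    for match in pattern.finditer(sequence):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # homopolymer matched as a multi-base unit
        run_length = len(match.group(0))
        if run_length >= min_length:
            hits.append((match.start(), unit, run_length // len(unit)))
    return hits


if __name__ == "__main__":
    # Hypothetical contig carrying a dinucleotide repeat.
    print(find_ssrs("GG" + "AC" * 9 + "TT"))
```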

  18. Deep RNA sequencing of the skeletal muscle transcriptome in swimming fish.

    Directory of Open Access Journals (Sweden)

    Arjan P Palstra

    Full Text Available Deep RNA sequencing (RNA-seq) was performed to provide an in-depth view of the transcriptome of red and white skeletal muscle of exercised and non-exercised rainbow trout (Oncorhynchus mykiss) with the specific objective to identify expressed genes and quantify the transcriptomic effects of swimming-induced exercise. Pubertal autumn-spawning seawater-raised female rainbow trout were rested (n = 10) or swum (n = 10) for 1176 km at 0.75 body-lengths per second in a 6,000-L swim-flume under reproductive conditions for 40 days. Red and white muscle RNA of exercised and non-exercised fish (4 lanes) was sequenced and resulted in 15-17 million reads per lane that, after de novo assembly, yielded 149,159 red and 118,572 white muscle contigs. Most contigs were annotated using an iterative homology search strategy against salmonid ESTs, the zebrafish Danio rerio genome and general Metazoan genes. When selecting for large contigs (>500 nucleotides), a number of novel rainbow trout gene sequences were identified in this study: 1,085 and 1,228 novel gene sequences for red and white muscle, respectively, which included a number of important molecules for skeletal muscle function. Transcriptomic analysis revealed that sustained swimming increased transcriptional activity in skeletal muscle and specifically an up-regulation of genes involved in muscle growth and developmental processes in white muscle. The unique collection of transcripts will contribute to our understanding of red and white muscle physiology, specifically during the long-term reproductive migration of salmonids.

  19. Sequencing and Characterization of the Invasive Sycamore Lace Bug Corythucha ciliata (Hemiptera: Tingidae) Transcriptome

    Science.gov (United States)

    Qu, Cheng; Fu, Ningning; Xu, Yihua

    2016-01-01

    The sycamore lace bug, Corythucha ciliata (Hemiptera: Tingidae), is an invasive forestry pest rapidly expanding in many countries. This pest poses a considerable threat to the urban forestry ecosystem, especially to Platanus spp. However, its molecular biology and biochemistry are poorly understood. This study reports the first C. ciliata transcriptome, encompassing three different life stages (nymphs, adult females (AF) and adult males (AM)). In total, 26.53 GB of clean data and 60,879 unigenes were obtained from three RNA-seq libraries. These unigenes were annotated and classified by Nr (NCBI non-redundant protein sequences), Nt (NCBI non-redundant nucleotide sequences), Pfam (Protein family), KOG/COG (Clusters of Orthologous Groups of proteins), Swiss-Prot (a manually annotated and reviewed protein sequence database), and KO (KEGG Ortholog database). After all pairwise comparisons between these three different samples, a large number of differentially expressed genes were revealed. Dramatic differences in global gene expression profiles were found between distinct life stages (nymphs and AF, nymphs and AM) and between sexes (AF and AM), with some of the significantly differentially expressed genes (DEGs) being related to metamorphosis, digestion, immunity and sex differences. The differential expression of unigenes was validated through quantitative Real-Time PCR (qRT-PCR) for 16 randomly selected unigenes. In addition, 17,462 potential simple sequence repeat molecular markers were identified in these transcriptome resources. This comprehensive C. ciliata transcriptomic information can be utilized to promote the development of environmentally friendly methodologies to disrupt the processes of metamorphosis, digestion, immunity and sex differentiation. PMID:27494615

  20. Analysis of Litopenaeus vannamei transcriptome using the next-generation DNA sequencing technique.

    Directory of Open Access Journals (Sweden)

    Chaozheng Li

    Full Text Available BACKGROUND: Pacific white shrimp (Litopenaeus vannamei), the major species of farmed shrimps in the world, has been attracting extensive studies, which require more and more genome background knowledge. The now available transcriptome data of L. vannamei are insufficient for research requirements, and have not been adequately assembled and annotated. METHODOLOGY/PRINCIPAL FINDINGS: This is the first study that used a next-generation high-throughput DNA sequencing technique, the Solexa/Illumina GA II method, to analyze the transcriptome from whole bodies of L. vannamei larvae. More than 2.4 Gb of raw data were generated, and 109,169 unigenes with a mean length of 396 bp were assembled using the SOAPdenovo software. 73,505 unigenes (>200 bp) with good quality sequences were selected and subjected to annotation analysis, among which 37.80% can be matched in NCBI Nr database, 37.3% matched in Swissprot, and 44.1% matched in TrEMBL. Using BLAST and Blast2GO software, 11,153 unigenes were classified into 25 Clusters of Orthologous Groups of proteins (COG) categories, 8,171 unigenes were assigned into 51 Gene ontology (GO) functional groups, and 18,154 unigenes were divided into 220 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. To preliminarily verify part of the results of assembly and annotations, 12 assembled unigenes that are homologous to many embryo development-related genes were chosen and subjected to RT-PCR for electrophoresis and Sanger sequencing analyses, and to real-time PCR for expression profile analyses during embryo development. CONCLUSIONS/SIGNIFICANCE: The L. vannamei transcriptome analyzed using the next-generation sequencing technique enriches the information of L. vannamei genes, which will facilitate our understanding of the genome background of crustaceans, and promote the studies on L. vannamei.

  1. Alignment-free Transcriptomic and Metatranscriptomic Comparison Using Sequencing Signatures with Variable Length Markov Chains.

    Science.gov (United States)

    Liao, Weinan; Ren, Jie; Wang, Kun; Wang, Shun; Zeng, Feng; Wang, Ying; Sun, Fengzhu

    2016-11-23

    The comparison between microbial sequencing data is critical to understand the dynamics of microbial communities. The alignment-based tools analyzing metagenomic datasets require reference sequences and read alignments. The available alignment-free dissimilarity approaches model the background sequences with Fixed Order Markov Chain (FOMC) yielding promising results for the comparison of microbial communities. However, in FOMC, the number of parameters grows exponentially with the increase of the order of Markov Chain (MC). Under a fixed high order of MC, the parameters might not be accurately estimated owing to the limitation of sequencing depth. In our study, we investigate an alternative to FOMC to model background sequences with the data-driven Variable Length Markov Chain (VLMC) in metatranscriptomic data. The VLMC originally designed for long sequences was extended to apply to high-throughput sequencing reads and the strategies to estimate the corresponding parameters were developed. The flexible number of parameters in VLMC avoids estimating the vast number of parameters of high-order MC under limited sequencing depth. Different from the manual selection in FOMC, VLMC determines the MC order adaptively. Several beta diversity measures based on VLMC were applied to compare the bacterial RNA-Seq and metatranscriptomic datasets. Experiments show that VLMC outperforms FOMC to model the background sequences in transcriptomic and metatranscriptomic samples. A software pipeline is available at https://d2vlmc.codeplex.com.
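
    The remark that a fixed-order Markov chain needs exponentially many parameters can be made concrete with a small sketch. It is not the authors' pipeline; the function names and the toy reads are hypothetical, and the estimator shown is a plain maximum-likelihood count ratio over a DNA alphabet of four letters.

```python
from collections import Counter, defaultdict


def fomc_free_parameters(order, alphabet_size=4):
    """Free transition probabilities in a fixed-order Markov chain:
    alphabet_size**order contexts, each with alphabet_size - 1 free values."""
    return (alphabet_size - 1) * alphabet_size ** order


def estimate_transitions(reads, order):
    """Maximum-likelihood transition probabilities from a list of reads."""
    counts = defaultdict(Counter)
    for read in reads:
        for i in range(len(read) - order):
            context, nxt = read[i:i + order], read[i + order]
            counts[context][nxt] += 1
    return {
        context: {base: c / sum(nexts.values()) for base, c in nexts.items()}
        for context, nexts in counts.items()
    }


if __name__ == "__main__":
    for k in (1, 5, 10):
        print(k, fomc_free_parameters(k))
    # Only contexts actually seen in the reads receive an estimate.
    print(estimate_transitions(["ACGT", "ACGA"], order=2))
```

    At order 10 the chain already has over three million free parameters, which is the situation a variable-length model is meant to avoid when sequencing depth is limited.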

  2. Transcriptome characterization of the South African abalone Haliotis midae using sequencing-by-synthesis

    Directory of Open Access Journals (Sweden)

    Roodt-Wilding Rouvay

    2011-03-01

    Full Text Available Abstract Background Worldwide, the genus Haliotis is represented by 56 extant species and several of these are commercially cultured. Among the six abalone species found in South Africa, Haliotis midae is the only aquacultured species. Despite its economic importance, genomic sequence resources for H. midae, and for abalone in general, are still scarce. Next generation sequencing technologies provide a fast and efficient tool to generate large sequence collections that can be used to characterize the transcriptome and identify expressed genes associated with economically important traits like growth and disease resistance. Results More than 25 million short reads generated by the Illumina Genome Analyzer were de novo assembled into 22,761 contigs with an average size of 260 bp. With a stringent E-value threshold of 10^-10, 3,841 contigs (16.8%) had a BLAST homologous match against the Genbank non-redundant (NR) protein database. Most of these sequences were annotated using the gene ontology (GO) and eukaryotic orthologous groups of proteins (KOG) databases and assigned to various functional categories. According to annotation results, many gene families involved in immune response were identified. Thousands of simple sequence repeats (SSR) and single nucleotide polymorphisms (SNP) were detected. Setting stringent parameters to ensure a high probability of amplification, 420 primer pairs in 181 contigs containing SSR loci were designed. Conclusion This data represents the most comprehensive genomic resource for the South African abalone H. midae to date. The amount of assembled sequences demonstrated the utility of the Illumina sequencing technology in the transcriptome characterization of a non-model species. It allowed the development of several markers and the identification of promising candidate genes for future studies on population and functional genomics in H. midae and in other abalone species.

  3. Comparative Transcriptomes and EVO-DEVO Studies Depending on Next Generation Sequencing

    Directory of Open Access Journals (Sweden)

    Tiancheng Liu

    2015-01-01

    Full Text Available High-throughput technology has driven progress in omics studies, including genomics and transcriptomics. We have reviewed the improvements in comparative omic studies that are attributed to the high-throughput measurement of next generation sequencing technology. Comparative genomics has been successfully applied to evolution analysis, while comparative transcriptomics is adopted to compare expression profiles from two subjects by differential expression or differential coexpression, which enables its application in evolutionary developmental biology (EVO-DEVO) studies. EVO-DEVO studies focus on the evolutionary pressure affecting the morphogenesis of development, and previous works have been conducted to illustrate the most conserved stages during embryonic development. Older measurements in these studies are based on morphological similarity from a macro view, whereas new technology enables the micro-level detection of similarity in molecular mechanism. Evolutionary models of embryo development, which include the “funnel-like” model and the “hourglass” model, have been evaluated by combining these new comparative transcriptomic methods with prior comparative genomic information. Although the technology has promoted EVO-DEVO studies into a new era, technological and material limitations still exist and further investigations require more subtle study design and procedure.

  4. Deep Sequencing of Porphyromonas gingivalis and comparative transcriptome analysis of a LuxS mutant

    Directory of Open Access Journals (Sweden)

    Takanoi eHirano

    2012-06-01

    Full Text Available Porphyromonas gingivalis is a major etiological agent of chronic and aggressive forms of periodontal disease. The organism is an asaccharolytic anaerobe and is a constituent of mixed species biofilms in a variety of microenvironments in the oral cavity. P. gingivalis expresses a range of virulence factors over which it exerts tight control. High-throughput sequencing technologies provide the opportunity to relate functional genomics to basic biology. In this study we report qualitative and quantitative RNA-Seq analysis of the transcriptome of P. gingivalis. We have also applied RNA-Seq to the transcriptome of a ΔluxS mutant of P. gingivalis deficient in AI-2-mediated bacterial communication. The transcriptome analysis confirmed the expression of all predicted ORFs for strain ATCC 33277, including 854 hypothetical proteins, and allowed the identification of hitherto unknown transcriptional units. Twelve noncoding RNAs were identified, including 11 small RNAs and one cobalamine riboswitch. Fifty-seven genes were differentially regulated in the LuxS mutant. Addition of exogenous synthetic 4,5-dihydroxy-2,3-pentanedione (DPD, AI-2 precursor) to the ΔluxS mutant culture complemented expression of a subset of genes, indicating that LuxS is involved in both AI-2 signaling and non-signaling dependent systems in P. gingivalis. This work provides an important dataset for future study of P. gingivalis pathophysiology and further defines the LuxS regulon in this oral pathogen.

  5. Deep transcriptome sequencing provides new insights into the structural and functional organization of the wheat genome.

    Science.gov (United States)

    Pingault, Lise; Choulet, Frédéric; Alberti, Adriana; Glover, Natasha; Wincker, Patrick; Feuillet, Catherine; Paux, Etienne

    2015-02-10

    Because of its size, allohexaploid nature, and high repeat content, the bread wheat genome is a good model to study the impact of the genome structure on gene organization, function, and regulation. However, because of the lack of a reference genome sequence, such studies have long been hampered and our knowledge of the wheat gene space is still limited. The access to the reference sequence of the wheat chromosome 3B provided us with an opportunity to study the wheat transcriptome and its relationships to genome and gene structure at a level that has never been reached before. By combining this sequence with RNA-seq data, we construct a fine transcriptome map of the chromosome 3B. More than 8,800 transcription sites are identified, that are distributed throughout the entire chromosome. Expression level, expression breadth, alternative splicing as well as several structural features of genes, including transcript length, number of exons, and cumulative intron length are investigated. Our analysis reveals a non-monotonic relationship between gene expression and structure and leads to the hypothesis that gene structure is determined by its function, whereas gene expression is subject to energetic cost. Moreover, we observe a recombination-based partitioning at the gene structure and function level. Our analysis provides new insights into the relationships between gene and genome structure and function. It reveals mechanisms conserved with other plant species as well as superimposed evolutionary forces that shaped the wheat gene space, likely participating in wheat adaptation.

  6. Transcriptome sequencing and whole genome expression profiling of chrysanthemum under dehydration stress

    Science.gov (United States)

    2013-01-01

    Background Chrysanthemum is one of the most important ornamental crops in the world and drought stress seriously limits its production and distribution. In order to generate a functional genomics resource and obtain a deeper understanding of the molecular mechanisms regarding chrysanthemum responses to dehydration stress, we performed large-scale transcriptome sequencing of chrysanthemum plants under dehydration stress using the Illumina sequencing technology. Results Two cDNA libraries constructed from mRNAs of control and dehydration-treated seedlings were sequenced by Illumina technology. A total of more than 100 million reads were generated and de novo assembled into 98,180 unique transcripts which were further extensively annotated by comparing their sequences to different protein databases. Biochemical pathways were predicted from these transcript sequences. Furthermore, we performed gene expression profiling analysis upon dehydration treatment in chrysanthemum and identified 8,558 dehydration-responsive unique transcripts, including 307 transcription factors and 229 protein kinases and many well-known stress responsive genes. Gene ontology (GO) term enrichment and biochemical pathway analyses showed that dehydration stress caused changes in hormone response, secondary and amino acid metabolism, and light and photoperiod response. These findings suggest that drought tolerance of chrysanthemum plants may be related to the regulation of hormone biosynthesis and signaling, reduction of oxidative damage, stabilization of cell proteins and structures, and maintenance of energy and carbon supply. Conclusions Our transcriptome sequences can provide a valuable resource for chrysanthemum breeding and research and novel insights into chrysanthemum responses to dehydration stress and offer candidate genes or markers that can be used to guide future studies attempting to breed drought tolerant chrysanthemum cultivars. PMID:24074255

  7. Integrated analysis of whole-exome sequencing and transcriptome profiling in males with autism spectrum disorders.

    Science.gov (United States)

    Codina-Solà, Marta; Rodríguez-Santiago, Benjamín; Homs, Aïda; Santoyo, Javier; Rigau, Maria; Aznar-Laín, Gemma; Del Campo, Miguel; Gener, Blanca; Gabau, Elisabeth; Botella, María Pilar; Gutiérrez-Arumí, Armand; Antiñolo, Guillermo; Pérez-Jurado, Luis Alberto; Cuscó, Ivon

    2015-01-01

    Autism spectrum disorders (ASD) are a group of neurodevelopmental disorders with high heritability. Recent findings support a highly heterogeneous and complex genetic etiology including rare de novo and inherited mutations or chromosomal rearrangements as well as double or multiple hits. We performed whole-exome sequencing (WES) and blood cell transcriptome by RNAseq in a subset of male patients with idiopathic ASD (n = 36) in order to identify causative genes, transcriptomic alterations, and susceptibility variants. We detected likely monogenic causes in seven cases: five de novo (SCN2A, MED13L, KCNV1, CUL3, and PTEN) and two inherited X-linked variants (MAOA and CDKL5). Transcriptomic analyses allowed the identification of intronic causative mutations missed by the usual filtering of WES and revealed functional consequences of some rare mutations. These included aberrant transcripts (PTEN, POLR3C), deregulated expression in 1.7% of mutated genes (that is, SEMA6B, MECP2, ANK3, CREBBP), allele-specific expression (FUS, MTOR, TAF1C), and non-sense-mediated decay (RIT1, ALG9). The analysis of rare inherited variants showed enrichment in relevant pathways such as the PI3K-Akt signaling and the axon guidance. Integrative analysis of WES and blood RNAseq data has proven to be an efficient strategy to identify likely monogenic forms of ASD (19% in our cohort), as well as additional rare inherited mutations that can contribute to ASD risk in a multifactorial manner. Blood transcriptomic data, besides validating 88% of expressed variants, allowed the identification of missed intronic mutations and revealed functional correlations of genetic variants, including changes in splicing, expression levels, and allelic expression.

  8. Characterization of the transcriptome, nucleotide sequence polymorphism, and natural selection in the desert adapted mouse Peromyscus eremicus

    Directory of Open Access Journals (Sweden)

    Matthew D. MacManes

    2014-10-01

    Full Text Available As a direct result of intense heat and aridity, deserts are thought to be among the most harsh of environments, particularly for their mammalian inhabitants. Given that osmoregulation can be challenging for these animals, with failure resulting in death, strong selection should be observed on genes related to the maintenance of water and solute balance. One such animal, Peromyscus eremicus, is native to the desert regions of the southwest United States and may live its entire life without oral fluid intake. As a first step toward understanding the genetics that underlie this phenotype, we present a characterization of the P. eremicus transcriptome. We assay four tissues (kidney, liver, brain, testes) from a single individual and supplement this with population level renal transcriptome sequencing from 15 additional animals. We identified a set of transcripts undergoing both purifying and balancing selection based on estimates of Tajima’s D. In addition, we used the branch-site test to identify a transcript—Slc2a9, likely related to desert osmoregulation—undergoing enhanced selection in P. eremicus relative to a set of related non-desert rodents.
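
    Tajima's D, the statistic this record relies on, compares two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by a harmonic sum. The sketch below follows the standard published formula; it is not the authors' code, the function names are hypothetical, and the input is assumed to be gap-free aligned sequences of equal length.

```python
from itertools import combinations
from math import sqrt


def tajimas_d_from_stats(n, seg_sites, pi):
    """Tajima's D from sample size n, segregating sites S and mean
    pairwise differences pi. Returns None when there is no variation."""
    if n < 4:
        raise ValueError("the variance term is zero for fewer than 4 sequences")
    if seg_sites == 0:
        return None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


def tajimas_d(aligned_seqs):
    """Tajima's D computed directly from aligned sequences."""
    n = len(aligned_seqs)
    # A site is segregating when more than one base is observed in its column.
    seg_sites = sum(1 for column in zip(*aligned_seqs) if len(set(column)) > 1)
    pairs = list(combinations(aligned_seqs, 2))
    pi = sum(sum(x != y for x, y in zip(s, t)) for s, t in pairs) / len(pairs)
    return tajimas_d_from_stats(n, seg_sites, pi)


if __name__ == "__main__":
    # Toy alignment in which every variant is a singleton (excess of rare alleles).
    print(tajimas_d(["AAAA", "AAAT", "AATA", "ATAA"]))
```

    Negative values indicate an excess of rare variants, as expected under purifying selection or population expansion, while positive values indicate an excess of intermediate-frequency variants, as expected under balancing selection.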

  9. Prevalence of single nucleotide polymorphism among 27 diverse alfalfa genotypes as assessed by transcriptome sequencing

    Directory of Open Access Journals (Sweden)

    Li Xuehui

    2012-10-01

    Full Text Available Abstract Background Alfalfa, a perennial, outcrossing species, is a widely planted forage legume producing highly nutritious biomass. Currently, improvement of cultivated alfalfa mainly relies on recurrent phenotypic selection. Marker assisted breeding strategies can enhance alfalfa improvement efforts, particularly if many genome-wide markers are available. Transcriptome sequencing enables efficient high-throughput discovery of single nucleotide polymorphism (SNP) markers for a complex polyploid species. Result The transcriptomes of 27 alfalfa genotypes, including elite breeding genotypes, parents of mapping populations, and unimproved wild genotypes, were sequenced using an Illumina Genome Analyzer IIx. De novo assembly of quality-filtered 72-bp reads generated 25,183 contigs with a total length of 26.8 Mbp and an average length of 1,065 bp, with an average read depth of 55.9-fold for each genotype. Overall, 21,954 (87.2%) of the 25,183 contigs represented 14,878 unique protein accessions. Gene ontology (GO) analysis suggested that a broad diversity of genes was represented in the resulting sequences. The realignment of individual reads to the contigs enabled the detection of 872,384 SNPs and 31,760 InDels. High resolution melting (HRM) analysis was used to validate 91% of 192 putative SNPs identified by sequencing. Both allelic variants at about 95% of SNP sites identified among five wild, unimproved genotypes are still present in cultivated alfalfa, and all four US breeding programs also contain a high proportion of these SNPs. Thus, little evidence exists among this dataset for loss of significant DNA sequence diversity from either domestication or breeding of alfalfa. Structure analysis indicated that individuals from the subspecies falcata, the diploid subspecies caerulea, and the tetraploid subspecies sativa (cultivated tetraploid alfalfa) were clearly separated. Conclusion We used transcriptome sequencing to discover large numbers of SNPs

  10. High-resolution analysis of the 5'-end transcriptome using a next generation DNA sequencer.

    Directory of Open Access Journals (Sweden)

    Shin-ichi Hashimoto

    Full Text Available Massively parallel, tag-based sequencing systems, such as the SOLiD system, hold the promise of revolutionizing the study of whole genome gene expression due to the number of data points that can be generated in a simple and cost-effective manner. We describe the development of a 5'-end transcriptome workflow for the SOLiD system and demonstrate the advantages in sensitivity and dynamic range offered by this tag-based application over traditional approaches for the study of whole genome gene expression. 5'-end transcriptome analysis was used to study whole genome gene expression within a colon cancer cell line, HT-29, treated with the DNA methyltransferase inhibitor, 5-aza-2'-deoxycytidine (5Aza). More than 20 million 25-base 5'-end tags were obtained from untreated and 5Aza-treated cells and matched to sequences within the human genome. Seventy-three percent of the mapped unique tags were associated with RefSeq cDNA sequences, corresponding to approximately 14,000 different protein-coding genes in this single cell type. The level of expression of these genes ranged from 0.02 to 4,704 transcripts per cell. The sensitivity of a single sequence run of the SOLiD platform was 100-1,000 fold greater than that observed from 5'-end SAGE data generated from the analysis of 70,000 tags obtained by Sanger sequencing. The high-resolution 5'-end gene expression profiling presented in this study will not only provide novel insight into the transcriptional machinery but should also serve as a basis for a better understanding of cell biology.

  11. Sequence protein identification by randomized sequence database and transcriptome mass spectrometry (SPIDER-TMS): from manual to automatic application of a 'de novo sequencing' approach.

    Science.gov (United States)

    Pascale, Raffaella; Grossi, Gerarda; Cruciani, Gabriele; Mecca, Giansalvatore; Santoro, Donatello; Sarli Calace, Renzo; Falabella, Patrizia; Bianco, Giuliana

    Sequence protein identification by a randomized sequence database and transcriptome mass spectrometry software package has been developed at the University of Basilicata in Potenza (Italy) and designed to facilitate the determination of the amino acid sequence of a peptide as well as an unequivocal identification of proteins in a high-throughput manner with enormous advantages of time, economical resource and expertise. The software package is a valid tool for the automation of a de novo sequencing approach, overcoming the main limits and a versatile platform useful in the proteomic field for an unequivocal identification of proteins, starting from tandem mass spectrometry data. The strength of this software is that it is a user-friendly and non-statistical approach, so protein identification can be considered unambiguous.

  12. De novo transcriptome sequencing and analysis of the cereal cyst nematode, Heterodera avenae.

    Directory of Open Access Journals (Sweden)

    Mukesh Kumar

    Full Text Available The cereal cyst nematode (CCN, Heterodera avenae) is a major pest of wheat (Triticum spp) that reduces crop yields in many countries. Cyst nematodes are obligate sedentary endoparasites that reproduce by amphimixis. Here, we report the first transcriptome analysis of two stages of H. avenae. After sequencing extracted RNA from pre parasitic infective juvenile and adult stages of the life cycle, 131 million Illumina high quality paired end reads were obtained which generated 27,765 contigs with N50 of 1,028 base pairs, of which 10,452 were annotated. Comparative analyses were undertaken to evaluate H. avenae sequences with those of other plant, animal and free living nematodes to identify differences in expressed genes. There were 4,431 transcripts common to H. avenae and the free living nematode Caenorhabditis elegans, and 9,462 in common with more closely related potato cyst nematode, Globodera pallida. Annotation of H. avenae carbohydrate active enzymes (CAZy) revealed fewer glycoside hydrolases (GHs) but more glycosyl transferases (GTs) and carbohydrate esterases (CEs) when compared to M. incognita. 1,280 transcripts were found to have secretory signature, presence of signal peptide and absence of transmembrane. In a comparison of genes expressed in the pre-parasitic juvenile and feeding female stages, expression levels of 30 genes with high RPKM (reads per kilobase per million mapped reads) value, were analysed by qRT-PCR which confirmed the observed differences in their expression levels. In addition, we have also developed a user-friendly resource, Heterodera transcriptome database (HATdb) for public access of the data generated in this study. The new data provided on the transcriptome of H. avenae adds to the genetic resources available to study plant parasitic nematodes and provides an opportunity to seek new effectors that are specifically involved in the H. avenae-cereal host interaction.
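
    The N50 figure reported for this assembly is the contig length at which the longest contigs, taken in decreasing order, first cover half of the total assembled bases. A minimal sketch of that calculation follows; the function name and the example lengths are hypothetical and unrelated to the H. avenae data.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # First contig at which the cumulative length reaches half the assembly.
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy assembly: total 54 bp, so contigs of 10, 9 and 8 bp reach the halfway mark.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```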

  13. Analyzing AbrB-Knockout Effects through Genome and Transcriptome Sequencing of Bacillus licheniformis DW2

    Science.gov (United States)

    Shu, Cheng-Cheng; Wang, Dong; Guo, Jing; Song, Jia-Ming; Chen, Shou-Wen; Chen, Ling-Ling; Gao, Jun-Xiang

    2018-01-01

    As an industrial bacterium, Bacillus licheniformis DW2 produces bacitracin which is an important antibiotic for many pathogenic microorganisms. Our previous study showed AbrB-knockout could significantly increase the production of bacitracin. Accordingly, it was meaningful to understand its genome features, expression differences between wild and AbrB-knockout (ΔAbrB) strains, and the regulation of bacitracin biosynthesis. Here, we sequenced, de novo assembled and annotated its genome, and also sequenced the transcriptomes in three growth phases. The genome of DW2 contained a DNA molecule of 4,468,952 bp with 45.93% GC content and 4,717 protein coding genes. The transcriptome reads were mapped to the assembled genome, yielding 4,102∼4,536 expressed genes from different samples. We investigated transcription changes in B. licheniformis DW2 and showed that ΔAbrB caused up-regulation and down-regulation of hundreds of genes in different growth phases. We identified a complete bacitracin synthetase gene cluster, including the location and length of bacABC, bcrABC, and bacT, as well as their arrangement. The gene cluster bcrABC was significantly up-regulated in the ΔAbrB strain, which supported the hypothesis in a previous study that bcrABC transports bacitracin out of the cell to avoid self-intoxication, and was consistent with the previous experimental result that ΔAbrB could yield more bacitracin. This study provided a high quality reference genome for B. licheniformis DW2, and the transcriptome data, depicting global alterations across two strains and three phases, offered an understanding of AbrB regulation and bacitracin biosynthesis through gene expression. PMID:29599755

  14. De novo transcriptome sequencing and analysis of the cereal cyst nematode, Heterodera avenae.

    Science.gov (United States)

    Kumar, Mukesh; Gantasala, Nagavara Prasad; Roychowdhury, Tanmoy; Thakur, Prasoon Kumar; Banakar, Prakash; Shukla, Rohit N; Jones, Michael G K; Rao, Uma

    2014-01-01

    The cereal cyst nematode (CCN, Heterodera avenae) is a major pest of wheat (Triticum spp) that reduces crop yields in many countries. Cyst nematodes are obligate sedentary endoparasites that reproduce by amphimixis. Here, we report the first transcriptome analysis of two stages of H. avenae. After sequencing extracted RNA from pre-parasitic infective juvenile and adult stages of the life cycle, 131 million Illumina high quality paired end reads were obtained which generated 27,765 contigs with N50 of 1,028 base pairs, of which 10,452 were annotated. Comparative analyses were undertaken to evaluate H. avenae sequences with those of other plant, animal and free living nematodes to identify differences in expressed genes. There were 4,431 transcripts common to H. avenae and the free living nematode Caenorhabditis elegans, and 9,462 in common with the more closely related potato cyst nematode, Globodera pallida. Annotation of H. avenae carbohydrate active enzymes (CAZy) revealed fewer glycoside hydrolases (GHs) but more glycosyl transferases (GTs) and carbohydrate esterases (CEs) when compared to M. incognita. 1,280 transcripts were found to have a secretory signature: presence of a signal peptide and absence of a transmembrane domain. In a comparison of genes expressed in the pre-parasitic juvenile and feeding female stages, expression levels of 30 genes with high RPKM (reads per kilobase per million mapped reads) values were analysed by qRT-PCR, which confirmed the observed differences in their expression levels. In addition, we have also developed a user-friendly resource, Heterodera transcriptome database (HATdb), for public access of the data generated in this study. The new data provided on the transcriptome of H. avenae adds to the genetic resources available to study plant parasitic nematodes and provides an opportunity to seek new effectors that are specifically involved in the H. avenae-cereal host interaction.
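
    The abstract above ranks genes by RPKM, which is conventionally defined as reads per kilobase of transcript per million mapped reads. The sketch below shows that standard formula only; the function name and the example numbers are hypothetical and do not come from the H. avenae dataset.

```python
def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = read_count * 1e9 / (transcript_length_bp * total_mapped_reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Hypothetical values: 500 reads on a 2 kb transcript, 10 million mapped reads.
    print(rpkm(500, 2000, 10_000_000))
```

    Dividing by transcript length and by library size is what makes values comparable across genes and across samples of different sequencing depth.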

  15. Brain transcriptome sequencing and assembly of three songbird model systems for the study of social behavior

    Directory of Open Access Journals (Sweden)

    Christopher N. Balakrishnan

    2014-05-01

    Emberizid sparrows (Emberizidae) have played a prominent role in the study of avian vocal communication and social behavior. We present here brain transcriptomes for three emberizid model systems, song sparrow Melospiza melodia, white-throated sparrow Zonotrichia albicollis, and Gambel’s white-crowned sparrow Zonotrichia leucophrys gambelii. Each of the assemblies covered, fully or in part, over 89% of the previously annotated protein coding genes in the zebra finch Taeniopygia guttata, with 16,846, 15,805, and 16,646 unique BLAST hits in song, white-throated and white-crowned sparrows, respectively. As in previous studies, we find tissue of origin (auditory forebrain versus hypothalamus and whole brain) to be an important determinant of overall expression profile. We also demonstrate the successful isolation of RNA and RNA-sequencing from post-mortem samples from building strikes and suggest that such an approach could be useful when traditional sampling opportunities are limited. These transcriptomes will be an important resource for the study of social behavior in birds and for data driven annotation of forthcoming whole genome sequences for these and other bird species.

  16. Improving transcriptome assembly through error correction of high-throughput sequence reads

    Directory of Open Access Journals (Sweden)

    Matthew D. MacManes

    2013-07-01

    The study of functional genomics, particularly in non-model organisms, has been dramatically improved over the last few years by the use of transcriptomes and RNAseq. While these studies are potentially extremely powerful, a computationally intensive procedure, the de novo construction of a reference transcriptome, must be completed as a prerequisite to further analyses. An accurate reference is critical because all downstream steps, including estimating transcript abundance, depend on it. Though a substantial amount of research has been done on assembly, only recently have the pre-assembly procedures been studied in detail. Specifically, several stand-alone error correction modules have been reported on and, while they have been shown to be effective in reducing errors at the level of sequencing reads, how error correction impacts assembly accuracy is largely unknown. Here, we show via use of a simulated and an empirical dataset that applying error correction to sequencing reads has significant positive effects on assembly accuracy, and should be applied to all datasets. A complete collection of commands which will allow for the production of Reptile corrected reads is available at https://github.com/macmanes/error_correction/tree/master/scripts and as File S1.

  17. Sequencing and analysis of the gastrula transcriptome of the brittle star Ophiocoma wendtii

    Directory of Open Access Journals (Sweden)

    Vaughn Roy

    2012-09-01

    Background The gastrula stage represents the point in development at which the three primary germ layers diverge. At this point the gene regulatory networks that specify the germ layers are established and the genes that define the differentiated states of the tissues have begun to be activated. These networks have been well-characterized in sea urchins, but not in other echinoderms. Embryos of the brittle star Ophiocoma wendtii share a number of developmental features with sea urchin embryos, including the ingression of mesenchyme cells that give rise to an embryonic skeleton. Notable differences are that no micromeres are formed during cleavage divisions and no pigment cells are formed during development to the pluteus larval stage. More subtle changes in timing of developmental events also occur. To explore the molecular basis for the similarities and differences between these two echinoderms, we have sequenced and characterized the gastrula transcriptome of O. wendtii. Methods Development of Ophiocoma wendtii embryos was characterized and RNA was isolated from the gastrula stage. A transcriptome database was generated from this RNA and was analyzed using a variety of methods to identify transcripts expressed and to compare those transcripts to those expressed at the gastrula stage in other organisms. Results Using existing databases, we identified brittle star transcripts that correspond to 3,385 genes, including 1,863 genes shared with the sea urchin Strongylocentrotus purpuratus gastrula transcriptome. We characterized the functional classes of genes present in the transcriptome and compared them to those found in this sea urchin. We then examined those members of the germ-layer specific gene regulatory networks (GRNs) of S. purpuratus that are expressed in the O. wendtii gastrula. Our results indicate that there is a shared ‘genetic toolkit’ central to the echinoderm gastrula, a key stage in embryonic development, though

  18. Whole-genome and Transcriptome Sequencing of Prostate Cancer Identify New Genetic Alterations Driving Disease Progression

    DEFF Research Database (Denmark)

    Ren, Shancheng; Wei, Gong-Hong; Liu, Dongbing

    2018-01-01

    BACKGROUND: Global disparities in prostate cancer (PCa) incidence highlight the urgent need to identify genomic abnormalities in prostate tumors in different ethnic populations including Asian men. OBJECTIVE: To systematically explore the genomic complexity and define disease-driven genetic alterations in PCa. DESIGN, SETTING, AND PARTICIPANTS: The study sequenced whole-genome and transcriptome of tumor-benign paired tissues from 65 treatment-naive Chinese PCa patients. Subsequent targeted deep sequencing of 293 PCa-relevant genes was performed in another cohort of 145 prostate tumors. OUTCOME... ...-scale and comprehensive genomic data of prostate cancer from Asian population. Identification of these genetic alterations may help advance prostate cancer diagnosis, prognosis, and treatment.

  19. Deep sequencing-based transcriptome analysis of Plutella xylostella larvae parasitized by Diadegma semiclausum

    Science.gov (United States)

    2011-01-01

    Background Parasitoid insects manipulate their hosts' physiology by injecting various factors into their host upon parasitization. Transcriptomic methods provide a powerful approach to study insect host-parasitoid interactions at the molecular level. In order to investigate the effects of parasitization by an ichneumonid wasp (Diadegma semiclausum) on the host (Plutella xylostella), the larval transcriptome profile was analyzed using a short-read deep sequencing method (Illumina). Symbiotic polydnaviruses (PDVs) associated with ichneumonid parasitoids, known as ichnoviruses, play significant roles in host immune suppression and developmental regulation. In the current study, D. semiclausum ichnovirus (DsIV) genes expressed in P. xylostella were identified and their sequences compared with other reported PDVs. Five of these genes encode proteins of unknown identity that have not previously been reported. Results De novo assembly of cDNA sequence data generated 172,660 contigs between 100 and 10000 bp in length, with 35% > 200 bp in length. Parasitization had significant impacts on expression levels of 928 identified insect host transcripts. Gene ontology data illustrated that the majority of the differentially expressed genes are involved in binding, catalytic activity, and metabolic and cellular processes. In addition, the results show that transcription levels of antimicrobial peptides, such as gloverin, cecropin E and lysozyme, were up-regulated after parasitism. Expression of ichnovirus genes was detected in parasitized larvae with 19 unique sequences identified from five PDV gene families including vankyrin, viral innexin, repeat elements, a cysteine-rich motif, and polar residue rich protein. Vankyrin 1 and repeat element 1 genes showed the highest transcription levels among the DsIV genes. Conclusion This study provides detailed information on differential expression of P. xylostella larval genes following parasitization, DsIV genes expressed in the

  20. Gene discovery in the threatened elkhorn coral: 454 sequencing of the Acropora palmata transcriptome.

    Directory of Open Access Journals (Sweden)

    Nicholas R Polato

    BACKGROUND: Cnidarians, including corals and anemones, offer unique insights into metazoan evolution because they harbor genetic similarities with vertebrates beyond that found in model invertebrates and retain genes known only from non-metazoans. Cataloging genes expressed in Acropora palmata, a foundation-species of reefs in the Caribbean and western Atlantic, will advance our understanding of the genetic basis of ecologically important traits in corals and comes at a time when sequencing efforts in other cnidarians allow for multi-species comparisons. RESULTS: A cDNA library from a sample enriched for symbiont free larval tissue was sequenced on the 454 GS-FLX platform. Over 960,000 reads were obtained and assembled into 42,630 contigs. Annotation data was acquired for 57% of the assembled sequences. Analysis of the assembled sequences indicated that 83-100% of all A. palmata transcripts were tagged, and provided a rough estimate of the total number of genes expressed in our samples (~18,000-20,000). The coral annotation data contained many of the same molecular components as in the Bilateria, particularly in pathways associated with oxidative stress and DNA damage repair, and provided evidence that homologs of p53, a key player in DNA repair pathways, have experienced selection along the branch separating Cnidaria and Bilateria. Transcriptome wide screens of paralog groups and transition/transversion ratios highlighted genes including: green fluorescent proteins, carbonic anhydrase, and oxidative stress proteins; and functional groups involved in protein and nucleic acid metabolism, and the formation of structural molecules. These results provide a starting point for study of adaptive evolution in corals. CONCLUSIONS: Currently available transcriptome data now make comparative studies of the mechanisms underlying coral's evolutionary success possible. Here we identified candidate genes that enable corals to maintain genomic integrity despite

  1. Gene discovery in the threatened elkhorn coral: 454 sequencing of the Acropora palmata transcriptome.

    Science.gov (United States)

    Polato, Nicholas R; Vera, J Cristobal; Baums, Iliana B

    2011-01-01

    Cnidarians, including corals and anemones, offer unique insights into metazoan evolution because they harbor genetic similarities with vertebrates beyond that found in model invertebrates and retain genes known only from non-metazoans. Cataloging genes expressed in Acropora palmata, a foundation-species of reefs in the Caribbean and western Atlantic, will advance our understanding of the genetic basis of ecologically important traits in corals and comes at a time when sequencing efforts in other cnidarians allow for multi-species comparisons. A cDNA library from a sample enriched for symbiont free larval tissue was sequenced on the 454 GS-FLX platform. Over 960,000 reads were obtained and assembled into 42,630 contigs. Annotation data was acquired for 57% of the assembled sequences. Analysis of the assembled sequences indicated that 83-100% of all A. palmata transcripts were tagged, and provided a rough estimate of the total number of genes expressed in our samples (~18,000-20,000). The coral annotation data contained many of the same molecular components as in the Bilateria, particularly in pathways associated with oxidative stress and DNA damage repair, and provided evidence that homologs of p53, a key player in DNA repair pathways, have experienced selection along the branch separating Cnidaria and Bilateria. Transcriptome wide screens of paralog groups and transition/transversion ratios highlighted genes including: green fluorescent proteins, carbonic anhydrase, and oxidative stress proteins; and functional groups involved in protein and nucleic acid metabolism, and the formation of structural molecules. These results provide a starting point for study of adaptive evolution in corals. Currently available transcriptome data now make comparative studies of the mechanisms underlying coral's evolutionary success possible.
Here we identified candidate genes that enable corals to maintain genomic integrity despite considerable exposure to genotoxic stress over long life
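
    The record above mentions screens based on transition/transversion ratios. As a rough illustration of the quantity involved, the sketch below counts transitions (purine to purine or pyrimidine to pyrimidine) and transversions between two aligned sequences. The function name, the handling of gaps, and the toy sequences are assumptions for this example; the study's actual screening procedure is not described in the abstract.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_tv_counts(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Count transitions and transversions between two aligned sequences.

    Positions with identical bases, gaps, or ambiguous symbols are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1   # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    return transitions, transversions


if __name__ == "__main__":
    ts, tv = ts_tv_counts("ACGTAC", "GCGCAA")  # toy alignment
    print(ts, tv, ts / tv if tv else float("inf"))
```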

  2. The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database

    Energy Technology Data Exchange (ETDEWEB)

    Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika; Tanaka, Yoshihiro; Teranishi, Kristen S.; Sunagawa, Shinichi; Wong, Mike; Stillman, Jonathon H.

    2010-01-27

    Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~30K unique sequences (UniSeqs) representing ~19K clusters were generated from ~98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66% of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is as diverse as the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. 
Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in

  3. Fast skeletal muscle transcriptome of the Gilthead sea bream (Sparus aurata) determined by next generation sequencing

    Directory of Open Access Journals (Sweden)

    Garcia de la serrana Daniel

    2012-05-01

    Background The gilthead sea bream (Sparus aurata L.) occurs around the Mediterranean and along Eastern Atlantic coasts from Great Britain to Senegal. It is tolerant of a wide range of temperatures and salinities and is often found in brackish coastal lagoons and estuarine areas, particularly early in its life cycle. Gilthead sea bream are extensively cultivated in the Mediterranean with an annual production of 125,000 metric tonnes. Here we present a de novo assembly of the fast skeletal muscle transcriptome of gilthead sea bream using 454 reads and identify gene paralogues, splice variants and microsatellite repeats. An annotated transcriptome of the skeletal muscle will facilitate understanding of the genetic and molecular basis of traits linked to production in this economically important species. Results Around 2.7 million reads of mRNA sequence data were generated from the fast myotomal muscle of adult fish (~2 kg) and juvenile fish (~0.09 kg) that had been either fed to satiation, fasted for 3-5d or transferred to low (11°C) or high (33°C) temperatures for 3-5d. Newbler v2.5 assembly resulted in 43,461 isotigs >100 bp. The number of sequences annotated by searching protein and gene ontology databases was 10,465. The average coverage of the annotated isotigs was x40 containing 5655 unique gene IDs and 785 full-length cDNAs coding for proteins containing 58–1536 amino acids. The v2.5 assembly was found to be of good quality based on validation using 200 full-length cDNAs from GenBank. Annotated isotigs from the reference transcriptome were attributable to 344 KEGG pathway maps. We identified 26 gene paralogues (20 of them teleost-specific) and 43 splice variants, of which 12 had functional domains missing that were likely to affect their biological function. Many key transcription factors, signaling molecules and structural proteins necessary for myogenesis and muscle growth have been identified. Physiological status affected the

  4. An empirical strategy to detect bacterial transcript structure from directional RNA-seq transcriptome data.

    Science.gov (United States)

    Wang, Yejun; MacKenzie, Keith D; White, Aaron P

    2015-05-07

    As sequencing costs are being lowered continuously, RNA-seq has gradually been adopted as the first choice for comparative transcriptome studies with bacteria. Unlike microarrays, RNA-seq can directly detect cDNA derived from mRNA transcripts at a single nucleotide resolution. Not only does this allow researchers to determine the absolute expression level of genes, but it also conveys information about transcript structure. Few automatic software tools have yet been established to investigate large-scale RNA-seq data for bacterial transcript structure analysis. In this study, 54 directional RNA-seq libraries from Salmonella serovar Typhimurium (S. Typhimurium) 14028s were examined for potential relationships between read mapping patterns and transcript structure. We developed an empirical method, combined with statistical tests, to automatically detect key transcript features, including transcriptional start sites (TSSs), transcriptional termination sites (TTSs) and operon organization. Using our method, we obtained 2,764 TSSs and 1,467 TTSs for 1,331 and 844 different genes, respectively. Identification of TSSs facilitated further discrimination of 215 putative sigma 38 regulons and 863 potential sigma 70 regulons. Combining the TSSs and TTSs with intergenic distance and co-expression information, we comprehensively annotated the operon organization in S. Typhimurium 14028s. Our results show that directional RNA-seq can be used to detect transcriptional borders at an acceptable resolution of ±10-20 nucleotides. Technical limitations of the RNA-seq procedure may prevent single nucleotide resolution. The automatic transcript border detection methods, statistical models and operon organization pipeline that we have described could be widely applied to RNA-seq studies in other bacteria. Furthermore, the TSSs, TTSs, operons, promoters and untranslated regions that we have defined for S. Typhimurium 14028s may constitute valuable resources that can be used for

  5. An abundance of ubiquitously expressed genes revealed by tissue transcriptome sequence data.

    Directory of Open Access Journals (Sweden)

    Daniel Ramsköld

    2009-12-01

    The parts of the genome transcribed by a cell or tissue reflect the biological processes and functions it carries out. We characterized the features of mammalian tissue transcriptomes at the gene level through analysis of RNA deep sequencing (RNA-Seq) data across human and mouse tissues and cell lines. We observed that roughly 8,000 protein-coding genes were ubiquitously expressed, contributing to around 75% of all mRNAs by message copy number in most tissues. These mRNAs encoded proteins that were often intracellular, and tended to be involved in metabolism, transcription, RNA processing or translation. In contrast, genes for secreted or plasma membrane proteins were generally expressed in only a subset of tissues. The distribution of expression levels was broad but fairly continuous: no support was found for the concept of distinct expression classes of genes. Expression estimates that included reads mapping to coding exons only correlated better with qRT-PCR data than estimates which also included 3' untranslated regions (UTRs). Muscle and liver had the least complex transcriptomes, in that they expressed predominantly ubiquitous genes and a large fraction of the transcripts came from a few highly expressed genes, whereas brain, kidney and testis expressed more complex transcriptomes with the vast majority of genes expressed and relatively small contributions from the most expressed genes. mRNAs expressed in brain had unusually long 3'UTRs, and mean 3'UTR length was higher for genes involved in development, morphogenesis and signal transduction, suggesting added complexity of UTR-based regulation for these genes. Our results support a model in which variable exterior components feed into a large, densely connected core composed of ubiquitously expressed intracellular proteins.

  6. De novo transcriptome sequencing and analysis of the juvenile and adult stages of Fasciola gigantica.

    Science.gov (United States)

    Zhang, Xiao-Xuan; Cong, Wei; Elsheikha, Hany M; Liu, Guo-Hua; Ma, Jian-Gang; Huang, Wei-Yi; Zhao, Quan; Zhu, Xing-Quan

    2017-07-01

    Fasciola gigantica is regarded as the major liver fluke causing fasciolosis in livestock in tropical countries. Despite the significant economic and public health impacts of F. gigantica, there are few studies on the pathogenesis of this parasite, and our understanding is further limited by the lack of genome and transcriptome information. In this study, de novo Illumina RNA sequencing (RNA-seq) was performed to obtain a comprehensive transcriptome profile of the juvenile (42 days post infection) and adult stages of F. gigantica. A total of 49,720 unigenes were produced from juvenile and adult stages of F. gigantica, with an average length of 1286 nucleotides (nt) and N50 of 2076 nt. A total of 27,862 (56.03%) unigenes were annotated by BLAST similarity searches against the NCBI non-redundant protein database. Because F. gigantica needs to feed on and/or digest host tissues, some proteases (including cysteine proteases and aspartic proteases), which play a role in the degradation of host tissues (protein), received particular attention in the present study. A total of 6511 distinct genes were found differentially expressed between juveniles and adults, of which 3993 genes were up-regulated and 2518 genes were down-regulated in adults versus juveniles. Moreover, stage-specific differentially expressed genes were identified in juvenile (17,009) and adult (6517) F. gigantica. The significantly divergent pathways of differentially expressed genes included cAMP signaling pathway (226; 4.12%), proteoglycans in cancer (256; 4.67%) and focal adhesion (199; 3.63%). The transcription pattern also revealed two egg-laying-associated pathways: cGMP-PKG signaling pathway and TGF-β signaling pathway. This study provides the first comparative transcriptomic data concerning juvenile and adult stages of F. gigantica that will be of great value for future research efforts into understanding parasite pathogenesis and developing vaccines against this important parasite
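
    Several records in this list report an N50 for their assemblies (2076 nt here). N50 is the length L such that sequences of length L or longer together contain at least half of the total assembled bases. The following is a minimal sketch of that standard definition; the function name and the toy lengths are illustrative and unrelated to the F. gigantica data.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of contig or unigene lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the cumulative sum reaches half
        # of the total assembly size is the N50.
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths; kept for completeness


if __name__ == "__main__":
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # toy lengths
```

    Because N50 weights long sequences heavily, it is usually read alongside the mean length and the number of sequences rather than on its own.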

  7. Deep sequencing-based analysis of the Cymbidium ensifolium floral transcriptome.

    Directory of Open Access Journals (Sweden)

    Xiaobai Li

    Cymbidium ensifolium is a Chinese Cymbidium with an elegant shape, beautiful appearance, and a fragrant aroma. C. ensifolium has a long history of cultivation in China and it has excellent commercial value as a potted plant and cut flower. The development of C. ensifolium genomic resources has been delayed because of its large genome size. Taking advantage of technical and cost improvements in RNA-Seq, we extracted total mRNA from flower buds and mature flowers and obtained a total of 9.52 Gb of filtered nucleotides comprising 98,819,349 filtered reads. The filtered reads were assembled into 101,423 isotigs, representing 51,696 genes. Of the 101,423 isotigs, 41,873 were putative homologs of annotated sequences in the public databases, of which 158 were associated with floral development and 119 were associated with flowering. The isotigs were categorized according to their putative functions. In total, 10,212 of the isotigs were assigned into 25 eukaryotic orthologous groups (KOGs), 41,690 into 58 gene ontology (GO) terms, 9,830 into 126 Arabidopsis Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and 9,539 into 123 rice pathways. Comparison of the isotigs with those of the two related orchid species P. equestris and C. sinense showed that 17,906 isotigs are unique to C. ensifolium. In addition, a total of 7,936 SSRs and 16,676 putative SNPs were identified. To our knowledge, this transcriptome database is the first major genomic resource for C. ensifolium and the most comprehensive transcriptomic resource for genus Cymbidium. These sequences provide valuable information for understanding the molecular mechanisms of floral development and flowering. Sequences predicted to be unique to C. ensifolium would provide more insights into C. ensifolium gene diversity. The numerous SNPs and SSRs identified in the present study will contribute to marker development for C. ensifolium.

  8. De novo assembly, gene annotation and marker development using Illumina paired-end transcriptome sequences in celery (Apium graveolens L.).

    Directory of Open Access Journals (Sweden)

    Nan Fu

    BACKGROUND: Celery is an increasingly popular vegetable species, but limited transcriptome and genomic data hinder research on it. In addition, a lack of celery molecular markers limits the process of molecular genetic breeding. High-throughput transcriptome sequencing is an efficient method to generate a large transcriptome sequence dataset for gene discovery, molecular marker development and marker-assisted selection breeding. PRINCIPAL FINDINGS: Celery transcriptomes from four tissues were sequenced using Illumina paired-end sequencing technology. De novo assembly was performed to generate a collection of 42,280 unigenes (average length of 502.6 bp) that represent the first transcriptome of the species. 78.43% and 48.93% of the unigenes had significant similarity with proteins in the National Center for Biotechnology Information (NCBI) non-redundant protein database (Nr) and Swiss-Prot database respectively, and 10,473 (24.77%) unigenes were assigned to Clusters of Orthologous Groups (COG). 21,126 (49.97%) unigenes harboring Interpro domains were annotated, of which 15,409 (36.45%) were assigned to Gene Ontology (GO) categories. Additionally, 7,478 unigenes were mapped onto 228 pathways using the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG). Large numbers of simple sequence repeats (SSRs) were identified, and then the rates of successful amplification and polymorphism were investigated among 31 celery accessions. CONCLUSIONS: This study demonstrates the feasibility of generating a large scale of sequence information by Illumina paired-end sequencing and efficient assembling. Our results provide a valuable resource for celery research. The developed molecular markers are the foundation of further genetic linkage analysis and gene localization, and they will be essential to accelerate the process of breeding.

  9. Transcriptome analysis of the model protozoan, Tetrahymena thermophila, using Deep RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Jie Xiong

    BACKGROUND: The ciliated protozoan Tetrahymena thermophila is a well-studied single-celled eukaryote model organism for cellular and molecular biology. However, the lack of extensive T. thermophila cDNA libraries or a large expressed sequence tag (EST) database limited the quality of the original genome annotation. METHODOLOGY/PRINCIPAL FINDINGS: This RNA-seq study describes the first deep sequencing analysis of the T. thermophila transcriptome during the three major stages of the life cycle: growth, starvation and conjugation. Uniquely mapped reads covered more than 96% of the 24,725 predicted gene models in the somatic genome. More than 1,000 new transcribed regions were identified. The great dynamic range of RNA-seq allowed detection of a nearly six order-of-magnitude range of measurable gene expression orchestrated by this cell. RNA-seq also allowed the first prediction of transcript untranslated regions (UTRs) and an updated (larger) size estimate of the T. thermophila transcriptome: 57 Mb, or about 55% of the somatic genome. Our study identified nearly 1,500 alternative splicing (AS) events distributed over 5.2% of T. thermophila genes. This percentage represents a two order-of-magnitude increase over previous EST-based estimates in Tetrahymena. Evidence of stage-specific regulation of alternative splicing was also obtained. Finally, our study allowed us to completely confirm about 26.8% of the genes originally predicted by the gene finder, to correct coding sequence boundaries and intron-exon junctions for about a third, and to reassign microarray probes and correct earlier microarray data. CONCLUSIONS/SIGNIFICANCE: RNA-seq data significantly improve the genome annotation and provide a fully comprehensive view of the global transcriptome of T. thermophila. To our knowledge, 5.2% of T. thermophila genes with AS is the highest percentage of genes showing AS reported in a unicellular eukaryote. 
Tetrahymena thus becomes an excellent unicellular

  10. Deep sequencing of the Camellia chekiangoleosa transcriptome revealed candidate genes for anthocyanin biosynthesis.

    Science.gov (United States)

    Wang, Zhong-Wei; Jiang, Cong; Wen, Qiang; Wang, Na; Tao, Yuan-Yuan; Xu, Li-An

    2014-03-15

    Camellia chekiangoleosa is an important species of the genus Camellia. It provides high-quality edible oil and has great ornamental value. Its flowers are large and red and bloom between February and March. Flower pigmentation is closely related to the accumulation of anthocyanin. Although anthocyanin biosynthesis has been studied extensively in herbaceous plants, little molecular information on the anthocyanin biosynthesis pathway of C. chekiangoleosa is known yet. In the present study, a cDNA library was constructed to obtain detailed and general data from the flowers of C. chekiangoleosa. To explore the transcriptome of C. chekiangoleosa and investigate genes involved in anthocyanin biosynthesis, a 454 GS FLX Titanium platform was used to generate an EST dataset. About 46,279 sequences were obtained, and 24,593 (53.1%) were annotated. Using a BLAST search against AGRIS, 1740 unigenes were found to be homologous to 599 Arabidopsis transcription factor genes. Based on the transcriptome dataset, nine anthocyanin biosynthesis pathway genes (PAL, CHS1, CHS2, CHS3, CHI, F3H, DFR, ANS, and UFGT) were identified and cloned. The spatio-temporal expression patterns of these genes were also analyzed using quantitative real-time polymerase chain reaction. The study results not only enrich the gene resource but also provide valuable information for further studies concerning anthocyanin biosynthesis. Copyright © 2014 Elsevier B.V. All rights reserved.

  11. Characterizing the Genetic Basis for Nicotine Induced Cancer Development: A Transcriptome Sequencing Study.

    Directory of Open Access Journals (Sweden)

    Jasmin H Bavarva

    Nicotine is a known risk factor for cancer development and has been shown to alter gene expression in cells and tissue upon exposure. We used Illumina® Next Generation Sequencing (NGS) technology to gain unbiased biological insight into the transcriptome of normal epithelial cells (MCF-10A) in response to nicotine exposure. We generated expression data from 54,699 transcripts using triplicates of control and nicotine stressed cells. As a result, we identified 138 differentially expressed transcripts, including 39 uncharacterized genes. Additionally, 173 transcripts that are primarily associated with DNA replication, recombination, and repair showed evidence for alternative splicing. We discovered the greatest nicotine stress response by HPCAL4 (up-regulated by 4.71 fold) and NPAS3 (down-regulated by -2.73 fold); both are genes that have not been previously implicated in nicotine exposure but are linked to cancer. We also discovered significant down-regulation (-2.3 fold) and alternative splicing of NEAT1 (lncRNA) that may have an important, yet undiscovered regulatory role. Gene ontology analysis revealed nicotine exposure influenced genes involved in cellular and metabolic processes. This study reveals previously unknown consequences of nicotine stress on the transcriptome of normal breast epithelial cells and provides insight into the underlying biological influence of nicotine on normal cells, marking the foundation for future studies.

  12. Transcriptome analysis of the Chinese giant salamander (Andrias davidianus) using RNA-sequencing

    Directory of Open Access Journals (Sweden)

    Yong Huang

    2017-12-01

    The Chinese giant salamander (Andrias davidianus) is an economically important animal of academic value. However, the genomic information of this species has been little studied. In our study, the transcripts of A. davidianus were obtained by RNA-seq to conduct a transcriptomic analysis. In total, 132,912 unigenes were generated with an average length of 690 bp and an N50 of 1263 bp by de novo assembly using Trinity software. Using a sequence similarity search against nine public databases (CDD, KOG, NR, NT, PFAM, Swiss-prot, TrEMBL, GO and KEGG), a total of 24,049, 18,406, 36,711, 15,858, 20,500, 27,515, 36,705, 28,879 and 10,958 unigenes were annotated, respectively. Of these, 6323 unigenes were annotated in all databases and 39,672 unigenes were annotated in at least one database. In a BLAST search against KEGG pathways, 10,958 unigenes were annotated and divided into 343 categories according to different pathways. In addition, we also identified 29,790 SSRs. This study provided a valuable resource for understanding transcriptomic information of A. davidianus and laid a foundation for further research on functional gene cloning, genomics, genetic diversity analysis and molecular marker exploitation in A. davidianus.
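
    The abstract above reports an N50 alongside the average unigene length. As a minimal sketch of how N50 is commonly defined (the contig length at which the length-sorted contigs first cover half of the assembled bases), the following may help; the function name and the toy lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig/unigene lengths.

    N50 is taken here as the length L such that contigs of length >= L
    together contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy data, not values from the salamander assembly.
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)
toy_mean = sum(toy_lengths) / len(toy_lengths)
```

    Because long contigs dominate the cumulative sum, N50 is usually larger than the mean length, which fits the pattern reported above (N50 of 1263 bp versus a mean of 690 bp).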

  13. Insight into the transcriptome of Arthrobotrys conoides using high throughput sequencing.

    Science.gov (United States)

    Ramesh, Pandit; Reena, Patel; Amitbikram, Mohapatra; Chaitanya, Joshi; Anju, Kunjadia

    2015-12-01

    Arthrobotrys conoides is a nematode-trapping fungus belonging to Orbiliales, Ascomycota group, and traps prey nematodes by means of an adhesive network. The fungus has the potential to be used as a biocontrol agent against plant parasitic nematodes. In the present study, we characterized the transcriptome of A. conoides using high-throughput sequencing technology and characterized its virulence unigenes. A total of 7,255 cDNA contigs with an average length of 425 bp were generated and 6184 (61.81%) transcripts were functionally annotated and characterized. The majority of unigenes were found analogous to the genes of plant pathogenic fungi. A total of 1749 transcripts were found to be orthologous with eukaryotic proteins of the KOG database. Several carbohydrate active enzymes and peptidases were identified. We also analyzed classically and nonclassically secreted proteins and confirmed them by BLASTP against a fungal secretome database. A total of 916 contigs were analogous to 556 unique proteins of the Pathogen Host Interaction (PHI) database. Further, we identified 91 unigenes homologous to the database of fungal virulence factor (DFVF). A total of 104 putative protein kinase coding transcripts, which are major players in signaling pathways, were identified by BLASTP against the KinBase database. This study provides a comprehensive look at the transcriptome of A. conoides and the identified unigenes might have a role in catching and killing prey nematodes by A. conoides. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. Development of genic SSR markers from transcriptome sequencing of pear buds.

    Science.gov (United States)

    Yue, Xiao-yan; Liu, Guo-qin; Zong, Yu; Teng, Yuan-wen; Cai, Dan-ying

    2014-04-01

    A total of 8375 genic simple sequence repeat (SSR) loci were discovered from a unigene set assembled from 116282 transcriptomic unigenes in this study. Dinucleotide repeat motifs were the most common with a frequency of 65.11%, followed by trinucleotide (32.81%). A total of 4100 primer pairs were designed from the SSR loci. Of these, 343 primer pairs (repeat length ≥15 bp) were synthesized with an M13 tail and tested for stable amplification and polymorphism in four Pyrus accessions. After the preliminary test, 104 polymorphic genic SSR markers were developed; dinucleotide and trinucleotide repeats represented 97.11% (101) of these. Twenty-eight polymorphic genic SSR markers were selected randomly to further validate genetic diversity among 28 Pyrus accessions. These markers displayed a high level of polymorphism. The number of alleles at these SSR loci ranged from 2 to 17, with a mean of 9.43 alleles per locus, and the polymorphism information content (PIC) values ranged from 0.26 to 0.91. The UPGMA (unweighted pair-group method with arithmetic average) cluster analysis grouped the 28 Pyrus accessions into two groups: Oriental pears and Occidental pears, which are congruent to the traditional taxonomy, demonstrating their effectiveness in analyzing Pyrus phylogenetic relationships, enriching rare Pyrus EST-SSR resources, and confirming the potential value of a pear transcriptome database for the development of new SSR markers.
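
    The abstract above reports polymorphism information content (PIC) values per locus. A minimal sketch of the widely used Botstein-style PIC formula, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, is given below; the abstract does not state which formula the authors applied, so treating it as this one is an assumption, and the function name and allele frequencies are hypothetical.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one locus.

    allele_freqs: allele frequencies that sum to 1.
    Assumes the Botstein-style definition:
    1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(2 * (p * p) * (q * q)
                     for p, q in combinations(allele_freqs, 2))
    return 1 - homozygosity - cross_term


# Hypothetical locus with two equally frequent alleles.
two_allele_pic = pic([0.5, 0.5])
```

    Under this definition a locus with a single fixed allele has a PIC of 0, and the value rises as more alleles occur at similar frequencies.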

  15. De novo assembly and characterization of the garlic (Allium sativum) bud transcriptome by Illumina sequencing.

    Science.gov (United States)

    Sun, Xiudong; Zhou, Shumei; Meng, Fanlu; Liu, Shiqi

    2012-10-01

    Garlic is widely used as a spice throughout the world for the culinary value of its flavor and aroma, which are created by the chemical transformation of a series of organic sulfur compounds. To analyze the transcriptome of Allium sativum and discover the genes involved in sulfur metabolism, cDNAs derived from the total RNA of Allium sativum buds were analyzed by Illumina sequencing. Approximately 26.67 million 90 bp paired-end clean reads were obtained in two libraries. A total of 127,933 unigenes were generated by de novo assembly and were compared with the sequences in public databases. Of these, 45,286 unigenes had significant hits to the sequences in the Nr database, 29,514 showed significant similarity to known proteins in the Swiss-Prot database, and 20,706 and 21,952 unigenes had significant similarity to existing sequences in the KEGG and COG databases, respectively. Moreover, genes involved in organic sulfur biosynthesis were identified. These unigene data will provide the foundation for research on gene expression, genomics and functional genomics in Allium sativum. Key message: The obtained unigenes will provide the foundation for research on functional genomics in Allium sativum and its closely related species, and fill the gap in the existing plant EST database.

  16. Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi.

    Science.gov (United States)

    Subramanian, Sankar; Huynen, Leon; Millar, Craig D; Lambert, David M

    2010-12-15

    Kiwi is a highly distinctive, flightless and endangered ratite bird endemic to New Zealand. To understand the patterns of molecular evolution of the nuclear protein-coding genes in brown kiwi (Apteryx australis mantelli) and to determine the timescale of avian history we sequenced a transcriptome obtained from a kiwi embryo using next generation sequencing methods. We then assembled the conserved protein-coding regions using the chicken proteome as a scaffold. Using 1,543 conserved protein coding genes we estimated the neutral evolutionary divergence between the kiwi and chicken to be ~45%, which is approximately equal to the divergence computed for the human-mouse pair using the same set of genes. A large fraction of genes was found to be under high selective constraint, as most of the expressed genes appeared to be involved in developmental gene regulation. Our study suggests a significant relationship between gene expression levels and protein evolution. Using sequences from over 700 nuclear genes we estimated the divergence between the two basal avian groups, Palaeognathae and Neognathae to be 132 million years, which is consistent with previous studies using mitochondrial genes. The results of this investigation revealed patterns of mutation and purifying selection in conserved protein coding regions in birds. Furthermore this study suggests a relatively cost-effective way of obtaining a glimpse into the fundamental molecular evolutionary attributes of a genome, particularly when no closely related genomic sequence is available.

  17. Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi

    Directory of Open Access Journals (Sweden)

    Huynen Leon

    2010-12-01

    Background: Kiwi is a highly distinctive, flightless and endangered ratite bird endemic to New Zealand. To understand the patterns of molecular evolution of the nuclear protein-coding genes in brown kiwi (Apteryx australis mantelli) and to determine the timescale of avian history we sequenced a transcriptome obtained from a kiwi embryo using next generation sequencing methods. We then assembled the conserved protein-coding regions using the chicken proteome as a scaffold. Results: Using 1,543 conserved protein coding genes we estimated the neutral evolutionary divergence between the kiwi and chicken to be ~45%, which is approximately equal to the divergence computed for the human-mouse pair using the same set of genes. A large fraction of genes was found to be under high selective constraint, as most of the expressed genes appeared to be involved in developmental gene regulation. Our study suggests a significant relationship between gene expression levels and protein evolution. Using sequences from over 700 nuclear genes we estimated the divergence between the two basal avian groups, Palaeognathae and Neognathae to be 132 million years, which is consistent with previous studies using mitochondrial genes. Conclusions: The results of this investigation revealed patterns of mutation and purifying selection in conserved protein coding regions in birds. Furthermore this study suggests a relatively cost-effective way of obtaining a glimpse into the fundamental molecular evolutionary attributes of a genome, particularly when no closely related genomic sequence is available.

  18. RNA sequencing atopic dermatitis transcriptome profiling provides insights into novel disease mechanisms with potential therapeutic implications

    DEFF Research Database (Denmark)

    Suárez-Fariñas, Mayte; Ungar, Benjamin; Correa da Rosa, Joel

    2015-01-01

    . These limitations might be lessened with next-generation RNA sequencing (RNA-seq). Objective: We sought to define the lesional AD transcriptome using RNA-seq and compare it using microarrays performed on the same cohort. Methods: RNA-seq and microarrays were performed to identify differentially expressed genes...... RNA-seq showed somewhat better agreement with RT-PCR (intraclass correlation coefficient, 0.57 and 0.70 for microarrays and RNA-seq vs RT-PCR, respectively), bias was not eliminated. Among genes uniquely identified by using RNA-seq were triggering receptor expressed on myeloid cells 1 (TREM-1......) signaling (eg, CCL2, CCL3, and single immunoglobulin domain IL1R1 related [SIGIRR]) and IL-36 isoform genes. TREM-1 is a surface receptor implicated in innate and adaptive immunity that amplifies infection-related inflammation. Conclusions: This is the first report of a lesional AD phenotype using RNA...

  19. A combination of LongSAGE with Solexa sequencing is well suited to explore the depth and the complexity of transcriptome

    Directory of Open Access Journals (Sweden)

    Scoté-Blachon Céline

    2008-09-01

    Background: "Open" transcriptome analysis methods allow the study of gene expression without a priori knowledge of the transcript sequences. As of now, SAGE (Serial Analysis of Gene Expression), LongSAGE and MPSS (Massively Parallel Signature Sequencing) are the most widely used methods for "open" transcriptome analysis. Both LongSAGE and MPSS rely on the isolation of 21 bp tag sequences from each transcript. In contrast to LongSAGE, the high throughput sequencing method used in MPSS enables the rapid sequencing of very large libraries containing several million tags, allowing deep transcriptome analysis. However, a bias in the complexity of the transcriptome representation obtained by MPSS was recently uncovered. Results: In order to make a deep analysis of the mouse hypothalamus transcriptome while avoiding the limitation introduced by MPSS, we combined LongSAGE with the Solexa sequencing technology and obtained a library of more than 11 million tags. We then compared it to a LongSAGE library of mouse hypothalamus sequenced with the Sanger method. Conclusion: We found that Solexa sequencing technology combined with LongSAGE is perfectly suited for deep transcriptome analysis. In contrast to MPSS, it gives a complex representation of the transcriptome as reliable as a LongSAGE library sequenced by the Sanger method.

  20. Transcriptome sequencing of the naked mole rat (Heterocephalus glaber) and identification of hypoxia tolerance genes

    Directory of Open Access Journals (Sweden)

    Bang Xiao

    2017-12-01

    The naked mole rat (NMR; Heterocephalus glaber) is a small rodent species found in regions of Kenya, Ethiopia and Somalia. It has a high tolerance for hypoxia and is thus considered one of the most important natural models for studying hypoxia tolerance mechanisms. The various mechanisms underlying the NMR's hypoxia tolerance are beginning to be understood at different levels of organization, and next-generation sequencing methods promise to expand this understanding to the level of gene expression. In this study, we examined the sequence and transcript abundance data of the muscle transcriptome of NMRs exposed to hypoxia using the Illumina HiSeq 2500 system to clarify the possible genomic adaptive responses to the hypoxic underground surroundings. The RNA-seq raw FastQ data were mapped against the NMR genome. We identified 2337 differentially expressed genes (DEGs) by comparison of the hypoxic and control groups. Functional annotation of the DEGs by gene ontology (GO) analysis revealed enrichment of hypoxia stress-related GO categories, including ‘biological regulation’, ‘cellular process’, ‘ion transport’ and ‘cell-cell signaling’. Enrichment of DEGs in signaling pathways was analyzed against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database to identify possible interactions between DEGs. The results revealed significant enrichment of DEGs in focal adhesion, the mitogen-activated protein kinase (MAPK) signaling pathway and the glycine, serine and threonine metabolism pathway. Furthermore, inhibition of DEG (STMN1, MAPK8IP1 and MAPK10) expression induced apoptosis and arrested cell growth in NMR fibroblasts following hypoxia. Thus, this global transcriptome analysis of NMRs can provide an important genetic resource for the study of hypoxia tolerance in mammals. Furthermore, the identified DEGs may provide important molecular targets for biomedical research into therapeutic strategies for stroke and cardiovascular diseases.

  1. Transcriptome Sequencing and Analysis for Culm Elongation of the World's Largest Bamboo (Dendrocalamus sinicus).

    Directory of Open Access Journals (Sweden)

    Kai Cui

    Dendrocalamus sinicus is the world's largest bamboo species with strong woody culms, and known for its fast-growing culms. As an economic bamboo species, it was popularized for multi-functional applications including furniture, construction, and industrial paper pulp. To comprehensively elucidate the molecular processes involved in its culm elongation, Illumina paired-end sequencing was conducted. About 65.08 million high-quality reads were produced, and assembled into 81,744 unigenes with an average length of 723 bp. A total of 64,338 (79%) unigenes were annotated for their functions, of which, 56,587 were annotated in the NCBI non-redundant protein database and 35,262 were annotated in the Swiss-Prot database. Also, 42,508 and 21,009 annotated unigenes were allocated to gene ontology (GO) categories and clusters of orthologous groups (COG), respectively. By searching against the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG), 33,920 unigenes were assigned to 128 KEGG pathways. Meanwhile, 8,553 simple sequence repeats (SSRs) and 81,534 single-nucleotide polymorphism (SNPs) were identified, respectively. Additionally, 388 transcripts encoding lignin biosynthesis were detected, among which, 27 transcripts encoding Shikimate O-hydroxycinnamoyltransferase (HCT) specifically expressed in D. sinicus when compared to other bamboo species and rice. The phylogenetic relationship between D. sinicus and other plants was analyzed, suggesting functional diversity of HCT unigenes in D. sinicus. We conjectured that HCT might lead to the high lignin content and giant culm. Given that the leaves are not yet formed and culm is covered with sheaths during culm elongation, the existence of photosynthesis of bamboo culm is usually neglected.
Surprisingly, 109 transcripts encoding photosynthesis were identified, including photosystem I and II, cytochrome b6/f complex, photosynthetic electron transport and F-type ATPase, and 24 transcripts were characterized

  2. Transcriptome Sequencing and Analysis for Culm Elongation of the World's Largest Bamboo (Dendrocalamus sinicus).

    Science.gov (United States)

    Cui, Kai; Wang, Haiying; Liao, Shengxi; Tang, Qi; Li, Li; Cui, Yongzhong; He, Yuan

    2016-01-01

    Dendrocalamus sinicus is the world's largest bamboo species with strong woody culms, and known for its fast-growing culms. As an economic bamboo species, it was popularized for multi-functional applications including furniture, construction, and industrial paper pulp. To comprehensively elucidate the molecular processes involved in its culm elongation, Illumina paired-end sequencing was conducted. About 65.08 million high-quality reads were produced, and assembled into 81,744 unigenes with an average length of 723 bp. A total of 64,338 (79%) unigenes were annotated for their functions, of which, 56,587 were annotated in the NCBI non-redundant protein database and 35,262 were annotated in the Swiss-Prot database. Also, 42,508 and 21,009 annotated unigenes were allocated to gene ontology (GO) categories and clusters of orthologous groups (COG), respectively. By searching against the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG), 33,920 unigenes were assigned to 128 KEGG pathways. Meanwhile, 8,553 simple sequence repeats (SSRs) and 81,534 single-nucleotide polymorphism (SNPs) were identified, respectively. Additionally, 388 transcripts encoding lignin biosynthesis were detected, among which, 27 transcripts encoding Shikimate O-hydroxycinnamoyltransferase (HCT) specifically expressed in D. sinicus when compared to other bamboo species and rice. The phylogenetic relationship between D. sinicus and other plants was analyzed, suggesting functional diversity of HCT unigenes in D. sinicus. We conjectured that HCT might lead to the high lignin content and giant culm. Given that the leaves are not yet formed and culm is covered with sheaths during culm elongation, the existence of photosynthesis of bamboo culm is usually neglected. 
Surprisingly, 109 transcripts encoding photosynthesis were identified, including photosystem I and II, cytochrome b6/f complex, photosynthetic electron transport and F-type ATPase, and 24 transcripts were characterized as antenna

  3. Small RNA and transcriptome deep sequencing proffers insight into floral gene regulation in Rosa cultivars

    Directory of Open Access Journals (Sweden)

    Kim Jungeun

    2012-11-01

    Background: Roses (Rosa sp.), which belong to the family Rosaceae, are the most economically important ornamental plants, making up 30% of the floriculture market. However, despite the high demand for roses, rose breeding programs have limited molecular resources, which could greatly enhance and speed breeding efforts. A better understanding of important genes that contribute to floral development and desired phenotypes will lead to improved rose cultivars. For this study, we analyzed rose miRNAs and the rose flower transcriptome in order to generate a database to expand current knowledge regarding regulation of important floral characteristics. A rose genetic database will enable comprehensive analysis of gene expression and regulation via miRNA among different Rosa cultivars. Results: We produced more than 0.5 million reads from expressed sequences, totalling more than 110 million bp. From these, we generated 35,657, 31,434, 34,725, and 39,722 flower unigenes from Rosa hybrid: ‘Vital’, ‘Maroussia’, and ‘Sympathy’ and Rosa rugosa Thunb., respectively. The unigenes were assigned functional annotations, domains, metabolic pathways, Gene Ontology (GO) terms, Plant Ontology (PO) terms, and MIPS Functional Catalogue (FunCat) terms. Rose flower transcripts were compared with genes from whole genome sequences of Rosaceae members (apple, strawberry, and peach) and grape. We also produced approximately 40 million small RNA reads from flower tissue for Rosa, representing 267 unique miRNA tags. Among the identified miRNAs, 25 were novel and 242 were conserved miRNAs. Statistical analyses of miRNA profiles revealed both shared and species-specific miRNAs, which presumably affect flower development and phenotypes. Conclusions: In this study, we constructed a Rose miRNA and transcriptome database, and we analyzed the miRNAs and transcriptome generated from the flower tissues of four Rosa cultivars. The database provides a

  4. Next-Generation Sequencing of the Chrysanthemum nankingense (Asteraceae) Transcriptome Permits Large-Scale Unigene Assembly and SSR Marker Discovery

    Science.gov (United States)

    Wang, Haibin; Jiang, Jiafu; Chen, Sumei; Qi, Xiangyu; Peng, Hui; Li, Pirui; Song, Aiping; Guan, Zhiyong; Fang, Weimin; Liao, Yuan; Chen, Fadi

    2013-01-01

    Background Simple sequence repeats (SSRs) are ubiquitous in eukaryotic genomes. Chrysanthemum is one of the largest genera in the Asteraceae family. Only few Chrysanthemum expressed sequence tag (EST) sequences have been acquired to date, so the number of available EST-SSR markers is very low. Methodology/Principal Findings Illumina paired-end sequencing technology produced over 53 million sequencing reads from C. nankingense mRNA. The subsequent de novo assembly yielded 70,895 unigenes, of which 45,789 (64.59%) unigenes showed similarity to the sequences in NCBI database. Out of 45,789 sequences, 107 have hits to the Chrysanthemum Nr protein database; 679 and 277 sequences have hits to the database of Helianthus and Lactuca species, respectively. MISA software identified a large number of putative EST-SSRs, allowing 1,788 primer pairs to be designed from the de novo transcriptome sequence and a further 363 from archival EST sequence. Among 100 primer pairs randomly chosen, 81 markers have amplicons and 20 are polymorphic for genotypes analysis in Chrysanthemum. The results showed that most (but not all) of the assays were transferable across species and that they exposed a significant amount of allelic diversity. Conclusions/Significance SSR markers acquired by transcriptome sequencing are potentially useful for marker-assisted breeding and genetic analysis in the genus Chrysanthemum and its related genera. PMID:23626799

  5. Transcriptome Sequencing Revealed Significant Alteration of Cortical Promoter Usage and Splicing in Schizophrenia

    Science.gov (United States)

    Wu, Jing Qin; Wang, Xi; Beveridge, Natalie J.; Tooney, Paul A.; Scott, Rodney J.; Carr, Vaughan J.; Cairns, Murray J.

    2012-01-01

    Background While hybridization based analysis of the cortical transcriptome has provided important insight into the neuropathology of schizophrenia, it represents a restricted view of disease-associated gene activity based on predetermined probes. By contrast, sequencing technology can provide un-biased analysis of transcription at nucleotide resolution. Here we use this approach to investigate schizophrenia-associated cortical gene expression. Methodology/Principal Findings The data was generated from 76 bp reads of RNA-Seq, aligned to the reference genome and assembled into transcripts for quantification of exons, splice variants and alternative promoters in postmortem superior temporal gyrus (STG/BA22) from 9 male subjects with schizophrenia and 9 matched non-psychiatric controls. Differentially expressed genes were then subjected to further sequence and functional group analysis. The output, amounting to more than 38 Gb of sequence, revealed significant alteration of gene expression including many previously shown to be associated with schizophrenia. Gene ontology enrichment analysis followed by functional map construction identified three functional clusters highly relevant to schizophrenia including neurotransmission related functions, synaptic vesicle trafficking, and neural development. Significantly, more than 2000 genes displayed schizophrenia-associated alternative promoter usage and more than 1000 genes showed differential splicing (FDRschizophrenia-associated transcriptional diversity within the STG, and revealed variants with important implications for the complex pathophysiology of schizophrenia. PMID:22558445

  6. [EST-SSR identification, markers development of Ligusticum chuanxiong based on Ligusticum chuanxiong transcriptome sequences].

    Science.gov (United States)

    Yuan, Can; Peng, Fang; Yang, Ze-Mao; Zhong, Wen-Juan; Mou, Fang-Sheng; Gong, Yi-Yun; Ji, Pei-Cheng; Pu, De-Qiang; Huang, Hai-Yan; Yang, Xiao; Zhang, Chao

    2017-09-01

    Ligusticum chuanxiong is a well-known traditional Chinese medicine plant. The study of its molecular marker development and germplasm resources is very important. In this study, we obtained 24,422 unigenes by assembling transcriptome sequencing reads of L. chuanxiong root. EST-SSRs were detected and 4,073 SSR loci were identified. EST-SSR distribution and characteristic analysis showed that mono-nucleotide repeats were the main repeat type, accounting for 41.0%. In addition, the sequences containing SSRs were functionally annotated in Gene Ontology (GO) and KEGG pathways and were assigned to 49 GO categories and 242 KEGG pathways; among them, 2,201 sequences were annotated against the Nr database. By validating 235 EST-SSRs, 74 primer pairs were ultimately proved to give high quality amplification. Subsequently, genetic diversity analysis, UPGMA cluster analysis, PCoA analysis and population structure analysis of 34 L. chuanxiong germplasm resources were carried out with the 74 primer pairs. In both the UPGMA tree and the PCoA results, L. chuanxiong resources were clustered into two groups, which are believed to be partially related to their geographical distribution. In this study, EST-SSRs in L. chuanxiong were identified for the first time, and the newly developed molecular markers would contribute significantly to further genetic diversity study, purity detection, gene mapping, and molecular breeding. Copyright© by the Chinese Pharmaceutical Association.
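
    Several records in this listing identify SSR loci in assembled unigenes. The idea of scanning a sequence for tandemly repeated short motifs can be sketched with a regular expression, as below. This is only an illustration: it is not the tool or the thresholds used in any of the studies, the minimum repeat counts (6 for dinucleotides, 5 for trinucleotides) are assumptions, and the function name and test sequence are hypothetical.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Find perfect di- and trinucleotide tandem repeats in a DNA string.

    Returns a sorted list of (start, motif, repeat_count) tuples.
    min_repeats maps motif length -> minimum number of tandem copies;
    the defaults below are illustrative assumptions.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for k, min_rep in min_repeats.items():
        # One motif of length k followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical sequence: an (AT)7 repeat and a (CAG)5 repeat with flanks.
example = "GGC" + "AT" * 7 + "CCT" + "CAG" * 5 + "TT"
example_hits = find_ssrs(example)
```

    Mononucleotide repeats, which the L. chuanxiong abstract reports as the most frequent type, are deliberately skipped in this sketch; handling them would need an additional motif length and threshold.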

  7. Transcriptomic and genetic analysis of direct interspecies electron transfer

    DEFF Research Database (Denmark)

    Shrestha, Pravin Malla; Rotaru, Amelia-Elena; Summers, Zarath M

    2013-01-01

    The possibility that metatranscriptomic analysis could distinguish between direct interspecies electron transfer (DIET) and H2 interspecies transfer (HIT) in anaerobic communities was investigated by comparing gene transcript abundance in cocultures in which Geobacter sulfurreducens....... These results demonstrate that there are unique gene expression patterns that distinguish DIET from HIT and suggest that metatranscriptomics may be a promising route to investigate interspecies electron transfer pathways in more-complex environments....

  8. De novo transcriptome sequencing of the Octopus vulgaris hemocytes using Illumina RNA-Seq technology: response to the infection by the gastrointestinal parasite Aggregata octopiana.

    Science.gov (United States)

    Castellanos-Martínez, Sheila; Arteta, David; Catarino, Susana; Gestal, Camino

    2014-01-01

    Octopus vulgaris is a highly valuable species of great commercial interest and an excellent candidate for aquaculture diversification; however, the octopus' well-being is impaired by pathogens, of which the gastrointestinal coccidian parasite Aggregata octopiana is one of the most important. Knowledge of the molecular mechanisms of the immune response in cephalopods, especially in octopus, is scarce. The transcriptome of the hemocytes of O. vulgaris was de novo sequenced using high-throughput paired-end Illumina technology to identify genes involved in immune defense and to understand the molecular basis of octopus tolerance/resistance to coccidiosis. A bi-directional mRNA library was constructed from hemocytes of two groups of octopus defined by A. octopiana infection status (sick octopus suffering from coccidiosis, and healthy octopus), and the reads were de novo assembled together. The differential expression of transcripts was analysed using the general assembly as a reference for mapping the reads from each condition. After sequencing, a total of 75,571,280 high quality reads were obtained from the sick octopus group and 74,731,646 from the healthy group. The general transcriptome of the O. vulgaris hemocytes was assembled into 254,506 contigs. A total of 48,225 contigs were successfully identified, and 538 transcripts exhibited differential expression between infection groups. The general transcriptome revealed genes involved in pathways like NF-kB, TLR and Complement. Differential expression of TLR-2, PGRP, C1q and PRDX genes due to infection was validated using RT-qPCR. In sick octopuses, only TLR-2 was up-regulated in hemocytes, but all of them were up-regulated in caecum and gills. The de novo transcriptome reported here establishes the first molecular clues to understand how the octopus immune system works and interacts with a highly pathogenic coccidian. The data provided here will contribute to identification of biomarkers for octopus resistance against

  9. Analysis of Pteridium ribosomal RNA sequences by rapid direct sequencing.

    Science.gov (United States)

    Tan, M K

    1991-08-01

    A total of 864 bases from 5 regions interspersed in the 18S and 26S rRNA molecules from various clones of Pteridium covering the general geographical distribution of the genus was analysed using a rapid rRNA sequencing technique. No base difference has been detected amongst the three major lineages, two of which apparently separated before the breakup of the ancient supercontinent, Pangaea. These regions of the rRNA sequences have thus been conserved for at least 160 million years and are here compared with other eukaryotic, especially plant rRNAs.

  10. Prediction of Scylla olivacea (Crustacea; Brachyura) peptide hormones using publicly accessible transcriptome shotgun assembly (TSA) sequences.

    Science.gov (United States)

    Christie, Andrew E

    2016-05-01

    The aquaculture of crabs from the genus Scylla is of increasing economic importance for many Southeast Asian countries. Expansion of Scylla farming has led to increased efforts to understand the physiology and behavior of these crabs, and as such, there are growing molecular resources for them. Here, publicly accessible Scylla olivacea transcriptomic data were mined for putative peptide-encoding transcripts; the proteins deduced from the identified sequences were then used to predict the structures of mature peptide hormones. Forty-nine pre/preprohormone-encoding transcripts were identified, allowing for the prediction of 187 distinct mature peptides. The identified peptides included isoforms of adipokinetic hormone-corazonin-like peptide, allatostatin A, allatostatin B, allatostatin C, bursicon β, CCHamide, corazonin, crustacean cardioactive peptide, crustacean hyperglycemic hormone/molt-inhibiting hormone, diuretic hormone 31, eclosion hormone, FMRFamide-like peptide, HIGSLYRamide, insulin-like peptide, intocin, leucokinin, myosuppressin, neuroparsin, neuropeptide F, orcokinin, pigment dispersing hormone, pyrokinin, red pigment concentrating hormone, RYamide, short neuropeptide F, SIFamide and tachykinin-related peptide, all well-known neuropeptide families. Surprisingly, the tissue used to generate the transcriptome mined here is reported to be testis. Whether or not the testis samples had neural contamination is unknown. However, if the peptides are truly produced by this reproductive organ, it could have far reaching consequences for the study of crustacean endocrinology, particularly in the area of reproductive control. Regardless, this peptidome is the largest thus far predicted for any brachyuran (true crab) species, and will serve as a foundation for future studies of peptidergic control in members of the commercially important genus Scylla. Copyright © 2016 Elsevier Inc. All rights reserved.

  11. Prediction of Toxin Genes from Chinese Yellow Catfish Based on Transcriptomic and Proteomic Sequencing

    Directory of Open Access Journals (Sweden)

    Bing Xie

    2016-04-01

    Fish venom remains a virtually untapped resource. Few fish toxin sequences are available for reference, which makes it difficult to study toxins from venomous fish and to develop efficient, fast methods for identifying toxin genes or proteins. Here, we used the Chinese yellow catfish (Pelteobagrus fulvidraco) as our research object, since it is a representative species of Siluriformes with venom glands embedded in the pectoral and dorsal fins. In this study, we set up an in-house toxin database and a novel toxin-discovery protocol to identify toxin genes precisely by combining transcriptomic and proteomic sequencing. Finally, we obtained 15 putative toxin proteins distributed in five groups, namely Veficolin, Ink toxin, Adamalysin, Za2G and CRISP toxin. It seems that we have developed a novel bioinformatics method through which toxin proteins can be identified with high confidence. Meanwhile, these toxins can also be useful for comparative studies in other fish and for the development of potential drugs.

  12. Transcriptome sequencing in pediatric acute lymphoblastic leukemia identifies fusion genes associated with distinct DNA methylation profiles

    Directory of Open Access Journals (Sweden)

    Yanara Marincevic-Zuniga

    2017-08-01

    Background: Structural chromosomal rearrangements that lead to expressed fusion genes are a hallmark of acute lymphoblastic leukemia (ALL). In this study, we performed transcriptome sequencing of 134 primary ALL patient samples to comprehensively detect fusion transcripts. Methods: We combined fusion gene detection with genome-wide DNA methylation analysis, gene expression profiling, and targeted sequencing to determine molecular signatures of emerging ALL subtypes. Results: We identified 64 unique fusion events distributed among 80 individual patients, of which over 50% have not previously been reported in ALL. Although the majority of the fusion genes were found only in a single patient, we identified several recurrent fusion gene families defined by promiscuous fusion gene partners, such as ETV6, RUNX1, PAX5, and ZNF384, or recurrent fusion genes, such as DUX4-IGH. Our data show that patients harboring these fusion genes displayed characteristic genome-wide DNA methylation and gene expression signatures in addition to distinct patterns in single nucleotide variants and recurrent copy number alterations. Conclusion: Our study delineates the fusion gene landscape in pediatric ALL, including both known and novel fusion genes, and highlights fusion gene families with shared molecular etiologies, which may provide additional information for prognosis and therapeutic options in the future.

  13. Development of novel genic microsatellite markers from transcriptome sequencing in sugar maple (Acer saccharum Marsh.).

    Science.gov (United States)

    Harmon, Monica; Lane, Thomas; Staton, Margaret; Coggeshall, Mark V; Best, Teodora; Chen, Chien-Chih; Liang, Haiying; Zembower, Nicole; Drautz-Moses, Daniela I; Hwee, Yap Zhei; Schuster, Stephan C; Schlarbaum, Scott E; Carlson, John E; Gailing, Oliver

    2017-08-08

    Sugar maple (Acer saccharum Marsh.) is a hardwood tree species native to northeastern North America and economically valued for its wood and sap. Yet, few molecular genetic resources have been developed for this species to date. Microsatellite markers have been a useful tool in population genetics, e.g., to monitor genetic variation and to analyze gene flow patterns. The objective of this study is to develop a reference transcriptome and microsatellite markers in sugar maple. A set of 117,861 putative unique transcripts were assembled using 29.2 Gb of RNA sequencing data derived from different tissues and stress treatments. From this set of sequences a total of 1068 microsatellite motifs were identified. Out of 58 genic microsatellite markers tested on a population of 47 sugar maple trees in upper Michigan, 22 amplified well, of which 16 were polymorphic and 6 were monomorphic. Values for expected heterozygosity varied from 0.224 to 0.726 for individual loci. Of the 16 polymorphic markers, 15 exhibited transferability to other Acer L. species. Genic microsatellite markers can be applied to analyze genetic variation in potentially adaptive genes relative to genomic reference markers as a basis for the management of sugar maple genetic resources in the face of climate change.
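
    As an illustrative aside to the record above: expected heterozygosity at a single locus is commonly computed from allele frequencies as He = 1 - sum(p_i^2). The sketch below assumes this basic, uncorrected form; the abstract does not state which estimator the authors used, and the function name and allele sizes are hypothetical, chosen only for illustration.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Return He = 1 - sum(p_i^2) for the allele calls observed at one locus.

    This is the basic (uncorrected) form; it is an assumption here,
    not necessarily the estimator used in the study.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alleles observed")
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical microsatellite allele sizes (bp) at one locus; not study data.
example_alleles = [152, 152, 156, 156, 156, 160, 152, 156]
he = expected_heterozygosity(example_alleles)
```

    A monomorphic locus (a single allele) gives He = 0 under this formula, which matches the abstract's distinction between polymorphic and monomorphic markers.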

  14. Tumor transcriptome sequencing reveals allelic expression imbalances associated with copy number alterations.

    Directory of Open Access Journals (Sweden)

    Brian B Tuch

    Due to growing throughput and shrinking cost, massively parallel sequencing is rapidly becoming an attractive alternative to microarrays for the genome-wide study of gene expression and copy number alterations in primary tumors. The sequencing of transcripts (RNA-Seq) should offer several advantages over microarray-based methods, including the ability to detect somatic mutations and accurately measure allele-specific expression. To investigate these advantages we have applied a novel, strand-specific RNA-Seq method to tumors and matched normal tissue from three patients with oral squamous cell carcinomas. Additionally, to better understand the genomic determinants of the gene expression changes observed, we have sequenced the tumor and normal genomes of one of these patients. We demonstrate here that our RNA-Seq method accurately measures allelic imbalance and that measurement on the genome-wide scale yields novel insights into cancer etiology. As expected, the set of genes differentially expressed in the tumors is enriched for cell adhesion and differentiation functions, but, unexpectedly, the set of allelically imbalanced genes is also enriched for these same cancer-related functions. By comparing the transcriptomic perturbations observed in one patient to his underlying normal and tumor genomes, we find that allelic imbalance in the tumor is associated with copy number mutations and that copy number mutations are, in turn, strongly associated with changes in transcript abundance. These results support a model in which allele-specific deletions and duplications drive allele-specific changes in gene expression in the developing tumor.

  15. Transcriptome sequencing revealed significant alteration of cortical promoter usage and splicing in schizophrenia.

    Directory of Open Access Journals (Sweden)

    Jing Qin Wu

    While hybridization based analysis of the cortical transcriptome has provided important insight into the neuropathology of schizophrenia, it represents a restricted view of disease-associated gene activity based on predetermined probes. By contrast, sequencing technology can provide un-biased analysis of transcription at nucleotide resolution. Here we use this approach to investigate schizophrenia-associated cortical gene expression. The data was generated from 76 bp reads of RNA-Seq, aligned to the reference genome and assembled into transcripts for quantification of exons, splice variants and alternative promoters in postmortem superior temporal gyrus (STG/BA22) from 9 male subjects with schizophrenia and 9 matched non-psychiatric controls. Differentially expressed genes were then subjected to further sequence and functional group analysis. The output, amounting to more than 38 Gb of sequence, revealed significant alteration of gene expression including many previously shown to be associated with schizophrenia. Gene ontology enrichment analysis followed by functional map construction identified three functional clusters highly relevant to schizophrenia including neurotransmission related functions, synaptic vesicle trafficking, and neural development. Significantly, more than 2000 genes displayed schizophrenia-associated alternative promoter usage and more than 1000 genes showed differential splicing (FDR<0.05). Both types of transcriptional isoforms were exemplified by reads aligned to the neurodevelopmentally significant doublecortin-like kinase 1 (DCLK1) gene. This study provided the first deep and un-biased analysis of schizophrenia-associated transcriptional diversity within the STG, and revealed variants with important implications for the complex pathophysiology of schizophrenia.

  16. Comprehensive transcriptome assembly of Chickpea (Cicer arietinum L.) using Sanger and next generation sequencing platforms: development and applications.

    Directory of Open Access Journals (Sweden)

    Himabindu Kudapa

    A comprehensive transcriptome assembly of chickpea has been developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads and 139,214 Sanger expressed sequence tags (ESTs) from >17 genotypes. This hybrid transcriptome assembly, referred to as Cicer arietinum Transcriptome Assembly version 2 (CaTA v2, available at http://data.comparative-legumes.org/transcriptomes/cicar/lista_cicar-201201), comprises 46,369 transcript assembly contigs (TACs) and has an N50 length of 1,726 bp and a maximum contig size of 15,644 bp. Putative functions were determined for 32,869 (70.8%) of the TACs and gene ontology assignments were determined for 21,471 (46.3%). The new transcriptome assembly was compared with the previously available chickpea transcriptome assemblies as well as to the chickpea genome. Comparative analysis of CaTA v2 against transcriptomes of three legumes (Medicago, soybean and common bean) resulted in 27,771 TACs common to all three legumes, indicating strong conservation of genes across legumes. CaTA v2 was also used for identification of simple sequence repeats (SSRs) and intron spanning regions (ISRs) for developing molecular markers. ISRs were identified by aligning TACs to the Medicago genome, and their putative mapping positions at chromosomal level were identified using the transcript map of chickpea. Primer pairs were designed for 4,990 ISRs, each representing a single contig for which predicted positions are inferred and distributed across eight linkage groups. A subset of randomly selected ISRs representing all eight chickpea linkage groups were validated on five chickpea genotypes and showed 20% polymorphism with average polymorphic information content (PIC) of 0.27. In summary, the hybrid transcriptome assembly developed and novel markers identified can be used for a variety of applications such as gene discovery, marker-trait association, diversity analysis etc., to advance genetics research and breeding
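
    For readers unfamiliar with the N50 statistic quoted in the record above, the following minimal sketch shows one common way to compute it: sort contig lengths from longest to shortest and report the length at which the running total first reaches half of all assembled bases. The function name and toy lengths are hypothetical; the record does not say which tool the authors used to obtain the 1,726 bp figure.

```python
def n50(contig_lengths):
    """Return the N50 of a set of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (bp), for illustration only.
toy_contigs = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_contigs)
```

    In the toy case the total is 20 bases; the two longest contigs (6 and 5) already cover 11 of them, so the N50 is 5.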

  17. Transcriptome sequencing and annotation of the microalgae Dunaliella tertiolecta: Pathway description and gene discovery for production of next-generation biofuels

    Directory of Open Access Journals (Sweden)

    Bibby Kyle

    2011-03-01

    Background: Biodiesel or ethanol derived from lipids or starch produced by microalgae may overcome many of the sustainability challenges previously ascribed to petroleum-based fuels and first generation plant-based biofuels. The paucity of microalgae genome sequences, however, limits gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for the non-model microalgae species, Dunaliella tertiolecta, and identify pathways and genes of importance related to biofuel production. Results: Next generation DNA pyrosequencing technology applied to D. tertiolecta transcripts produced 1,363,336 high quality reads with an average length of 400 bases. Following quality and size trimming, ~45% of the high quality reads were assembled into 33,307 isotigs with a 31-fold coverage and 376,482 singletons. Assembled sequences and singletons were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology (KO) identifiers. These analyses identified the majority of lipid and starch biosynthesis and catabolism pathways in D. tertiolecta. Conclusions: The construction of metabolic pathways involved in the biosynthesis and catabolism of fatty acids, triacylglycerols, and starch in D. tertiolecta as well as the assembled transcriptome provide a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock.

  18. De novo transcriptome sequencing of two cultivated jute species under salinity stress.

    Directory of Open Access Journals (Sweden)

    Zemao Yang

    Soil salinity, a major environmental stress, reduces agricultural productivity by restricting plant development and growth. Jute (Corchorus spp.), a commercially important bast fiber crop, includes two commercially cultivated species, Corchorus capsularis and Corchorus olitorius. We conducted high-throughput transcriptome sequencing of 24 C. capsularis and C. olitorius samples under salt stress and found 127 common differentially expressed genes (DEGs); additionally, 4489 and 492 common DEGs were identified in the root and leaf tissues, respectively, of both Corchorus species. Further, 32, 196, and 11 common differentially expressed transcription factors (DTFs) were detected in the leaf, root, or both tissues, respectively. Several Gene Ontology (GO) terms were enriched in NY and YY. A Kyoto Encyclopedia of Genes and Genomes analysis revealed numerous DEGs in both species. Abscisic acid and cytokinin signal pathways enriched respectively about 20 DEGs in leaves and roots of both NY and YY. The Ca2+, mitogen-activated protein kinase signaling and oxidative phosphorylation pathways were also found to be related to the plant response to salt stress, as evidenced by the DEGs in the roots of both species. These results provide insight into salt stress response mechanisms in plants as well as a basis for future breeding of salt-tolerant cultivars.

  19. Transcriptome Sequencing in a Tibetan Barley Landrace with High Resistance to Powdery Mildew

    Directory of Open Access Journals (Sweden)

    Xing-Quan Zeng

    2014-01-01

    Hulless barley is an important cereal crop worldwide, especially in Tibet, China. However, this crop is usually susceptible to powdery mildew caused by Blumeria graminis f. sp. hordei. In this study, we aimed to understand the functions and pathways of genes involved in disease resistance by transcriptome sequencing of a Tibetan barley landrace with high resistance to powdery mildew. A total of 831 significantly differentially expressed genes were found in the infected seedlings, covering 19 functions. The terms “cell,” “cell part,” and “extracellular region” in the cellular component category, “binding” and “catalytic” in the molecular function category, and “metabolic process” and “cellular process” in the biological process category together suggested that these functions may be involved in the resistance of hulless barley to powdery mildew. In addition, 330 KEGG pathways were found using BLASTx with an E-value cut-off of <10^-5. Among them, three pathways, namely, “photosynthesis,” “plant-pathogen interaction,” and “photosynthesis-antenna proteins” had significant matches in the database. Significant expressions of the three pathways were detected at 24 h, 48 h, and 96 h after infection, respectively. These results indicated a complex process of barley response to powdery mildew infection.

  20. Retinal transcriptome sequencing sheds light on the adaptation to nocturnal and diurnal lifestyles in raptors.

    Science.gov (United States)

    Wu, Yonghua; Hadly, Elizabeth A; Teng, Wenjia; Hao, Yuyang; Liang, Wei; Liu, Yu; Wang, Haitao

    2016-09-20

    Owls (Strigiformes) represent a fascinating group of birds that are the ecological night-time counterparts to diurnal raptors (Accipitriformes). The nocturnality of owls, unusual within birds, has favored an exceptional visual system that is highly tuned for hunting at night, yet the molecular basis for this adaptation is lacking. Here, using a comparative evolutionary analysis of 120 vision genes obtained by retinal transcriptome sequencing, we found strong positive selection for low-light vision genes in owls, which contributes to their remarkable nocturnal vision. Not surprisingly, we detected gene loss of the violet/ultraviolet-sensitive opsin (SWS1) in all owls we studied, but two other color vision genes, the red-sensitive LWS and the blue-sensitive SWS2, were found to be under strong positive selection, which may be linked to the spectral tunings of these genes toward maximizing photon absorption in crepuscular conditions. We also detected the only other positively selected genes associated with motion detection in falcons and positively selected genes associated with bright-light vision and eye protection in other diurnal raptors (Accipitriformes). Our results suggest the adaptive evolution of vision genes reflects differentiated activity time and distinct hunting behaviors.

  1. De Novo Sequencing and Assembly Analysis of Transcriptome in Pinus bungeana Zucc. ex Endl.

    Directory of Open Access Journals (Sweden)

    Qifei Cai

    2018-03-01

    To enrich the molecular data of Pinus bungeana Zucc. ex Endl. and study the regulating factors of different morphology controlled by apical dominance, de novo transcriptome assembly and annotation were performed in this study for two varieties of Pinus bungeana Zucc. ex Endl. that differ markedly in morphology. More than 147 million reads were produced, which were assembled into 88,092 unigenes. Based on a similarity search, 11,692 unigenes showed significant similarity to proteins from Picea sitchensis (Bong.) Carr. From this collection of unigenes, a large number of molecular markers were identified, including 2829 simple sequence repeats (SSRs). A total of 158 unigenes were differentially expressed between the two varieties, including 98 up-regulated and 60 down-regulated unigenes. Furthermore, among the differentially expressed genes (DEGs), five genes which may impact plant morphology were further validated by reverse transcription quantitative polymerase chain reaction (RT-qPCR). The five genes, related to cytokinin oxidase/dehydrogenase (CKX), two-component response regulator ARR-A family (ARR-A), plant hormone signal transduction (AHP), and MADS-box transcription factors, have a close relationship with apical dominance. This new dataset will be a useful resource for future genetic and genomic studies in Pinus bungeana Zucc. ex Endl.

  2. Transcriptome sequencing and metabolite analysis reveals the role of delphinidin metabolism in flower colour in grape hyacinth

    OpenAIRE

    Lou, Qian; Liu, Yali; Qi, Yinyan; Jiao, Shuzhen; Tian, Feifei; Jiang, Ling; Wang, Yuejin

    2014-01-01

    Grape hyacinth (Muscari) is an important ornamental bulbous plant with an extraordinary blue colour. Muscari armeniacum, whose flowers can be naturally white, provides an opportunity to unravel the complex metabolic networks underlying certain biochemical traits, especially colour. A blue flower cDNA library of M. armeniacum and a white flower library of M. armeniacum f. album were used for transcriptome sequencing. A total of 89 926 uni-transcripts were isolated, 143 of which could be identi...

  3. Detection of G-Quadruplex Structures Formed by G-Rich Sequences from Rice Genome and Transcriptome Using Combined Probes.

    Science.gov (United States)

    Chang, Tianjun; Li, Weiguo; Ding, Zhan; Cheng, Shaofei; Liang, Kun; Liu, Xiangjun; Bing, Tao; Shangguan, Dihua

    2017-08-01

    Putative G-quadruplex (G4) forming sequences (PQS) are highly prevalent in the genome and transcriptome of various organisms and are considered as potential regulation elements in many biological processes by forming G4 structures. The formation of G4 structures highly depends on the sequences and the environment. In most cases, it is difficult to predict G4 formation by PQS, especially PQS containing G2 tracts. Therefore, the experimental identification of G4 formation is essential in the study of G4-related biological functions. Herein, we report a rapid and simple method for the detection of G4 structures by using a pair of complementary reporters, hemin and BMSP. This method was applied to detect G4 structures formed by PQS (DNA and RNA) searched in the genome and transcriptome of Oryza sativa. Unlike most of the reported G4 probes that only recognize part of G4 structures, the proposed method based on combined probes positively responded to almost all G4 conformations, including parallel, antiparallel, and mixed/hybrid G4, but did not respond to non-G4 sequences. This method shows potential for high-throughput identification of G4 structures in genome and transcriptome. Furthermore, BMSP was observed to drive some PQS to form more stable G4 structures or induce the G4 formation of some PQS that cannot form G4 in normal physiological conditions, which may provide a powerful molecular tool for gene regulation.

  5. De Novo Sequencing and Analysis of Lemongrass Transcriptome Provide First Insights into the Essential Oil Biosynthesis of Aromatic Grasses

    Science.gov (United States)

    Meena, Seema; Kumar, Sarma R.; Venkata Rao, D. K.; Dwivedi, Varun; Shilpashree, H. B.; Rastogi, Shubhra; Shasany, Ajit K.; Nagegowda, Dinesh A.

    2016-01-01

    Aromatic grasses of the genus Cymbopogon (Poaceae family) represent a unique group of plants that produce diverse compositions of monoterpene-rich essential oils, which have great value in flavor, fragrance, cosmetic, and aromatherapy industries. Despite the commercial importance of these natural aromatic oils, their biosynthesis at the molecular level remains unexplored. As the first step toward understanding the essential oil biosynthesis, we performed de novo transcriptome assembly and analysis of C. flexuosus (lemongrass) by employing Illumina sequencing. Mining of transcriptome data and subsequent phylogenetic analysis led to identification of terpene synthases, pyrophosphatases, alcohol dehydrogenases, aldo-keto reductases, carotenoid cleavage dioxygenases, alcohol acetyltransferases, and aldehyde dehydrogenases, which are potentially involved in essential oil biosynthesis. Comparative essential oil profiling and mRNA expression analysis in three Cymbopogon species (C. flexuosus, aldehyde type; C. martinii, alcohol type; and C. winterianus, intermediate type) with varying essential oil composition indicated the involvement of identified candidate genes in the formation of alcohols, aldehydes, and acetates. Molecular modeling and docking further supported the role of identified protein sequences in aroma formation in Cymbopogon. Also, simple sequence repeats were found in the transcriptome with many linked to terpene pathway genes including the genes potentially involved in aroma biosynthesis. This work provides the first insights into the essential oil biosynthesis of aromatic grasses, and the identified candidate genes and markers can be a great resource for biotechnological and molecular breeding approaches to modulate the essential oil composition. PMID:27516768

  6. A transcriptome atlas of rabbit revealed by PacBio single-molecule long-read sequencing.

    Science.gov (United States)

    Chen, Shi-Yi; Deng, Feilong; Jia, Xianbo; Li, Cao; Lai, Song-Jia

    2017-08-09

    It is widely acknowledged that transcriptional diversity largely contributes to biological regulation in eukaryotes. Since the advent of second-generation sequencing technologies, a large number of RNA sequencing studies have considerably improved our understanding of transcriptome complexity. However, obtaining full-length transcripts remains a major challenge because of difficulties in short read-based assembly. In the present study we employ PacBio single-molecule long-read sequencing technology for whole-transcriptome profiling in rabbit (Oryctolagus cuniculus). In total, we obtain 36,186 high-confidence transcripts from 14,474 genic loci, among which more than 23% of genic loci and 66% of isoforms have not yet been annotated within the current reference genome. Furthermore, about 17% of transcripts are computationally revealed to be non-coding RNAs. Up to 24,797 alternative splicing (AS) and 11,184 alternative polyadenylation (APA) events, respectively, are detected within this de novo constructed transcriptome. The results provide a comprehensive set of reference transcripts and hence contribute to the improved annotation of the rabbit genome.

  7. In-Depth Transcriptome Sequencing of Mexican Lime Trees Infected with Candidatus Phytoplasma aurantifolia.

    Science.gov (United States)

    Mardi, Mohsen; Karimi Farsad, Laleh; Gharechahi, Javad; Salekdeh, Ghasem Hosseini

    2015-01-01

    Witches' broom disease of acid lime greatly affects the production of Mexican lime in Iran. It is caused by a phytoplasma (Candidatus Phytoplasma aurantifolia). However, the molecular mechanisms that underlie phytoplasma pathogenicity and the mode of interactions with host plants are largely unknown. Here, high-throughput transcriptome sequencing was conducted to explore gene expression signatures associated with phytoplasma infection in Mexican lime trees. We assembled 78,185 unique transcript sequences (unigenes) with an average length of 530 nt. Of these, 41,805 (53.4%) were annotated against the NCBI non-redundant (nr) protein database using a BLASTx search (e-value ≤ 1e-5). When the abundances of unigenes in healthy and infected plants were compared, 2,805 transcripts showed significant differences (false discovery rate ≤ 0.001 and log2 ratio ≥ 1.5). These differentially expressed genes (DEGs) were significantly enriched in 43 KEGG metabolic and regulatory pathways. The up-regulated DEGs were mainly categorized into pathways with possible implication in plant-pathogen interaction, including cell wall biogenesis and degradation, sucrose metabolism, secondary metabolism, hormone biosynthesis and signalling, amino acid and lipid metabolism, while down-regulated DEGs were predominantly enriched in ubiquitin proteolysis and oxidative phosphorylation pathways. Our analysis provides novel insight into the molecular pathways that are deregulated during the host-pathogen interaction in Mexican lime trees infected by phytoplasma. The findings can be valuable for unravelling the molecular mechanisms of plant-phytoplasma interactions and can pave the way for engineering lime trees with resistance to witches' broom disease.
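
    The thresholds quoted in the record above (false discovery rate ≤ 0.001 and log2 ratio ≥ 1.5) can be illustrated with a small filter. This is only a sketch under stated assumptions: the record type, field names, and toy values are hypothetical, and the log2 threshold is applied to the absolute value because the abstract reports both up- and down-regulated DEGs; the authors' actual pipeline is not described in this record.

```python
from dataclasses import dataclass


@dataclass
class UnigeneResult:
    unigene_id: str
    log2_ratio: float  # log2(infected / healthy) abundance ratio
    fdr: float         # multiple-testing-adjusted significance value


def select_degs(results, max_fdr=0.001, min_abs_log2=1.5):
    """Split results into up- and down-regulated DEGs.

    Assumption: the log2 threshold applies to the absolute value of the ratio.
    """
    significant = [r for r in results if r.fdr <= max_fdr]
    up = [r for r in significant if r.log2_ratio >= min_abs_log2]
    down = [r for r in significant if r.log2_ratio <= -min_abs_log2]
    return up, down


# Toy values for illustration only; not data from the study.
toy_results = [
    UnigeneResult("unigene_a", 2.3, 0.0002),   # passes, up-regulated
    UnigeneResult("unigene_b", -1.8, 0.0009),  # passes, down-regulated
    UnigeneResult("unigene_c", 3.0, 0.02),     # fails the FDR cut-off
    UnigeneResult("unigene_d", 0.9, 0.0001),   # fails the fold-change cut-off
]
up_degs, down_degs = select_degs(toy_results)
```

    Keeping the two lists separate mirrors how the abstract reports enrichment results for up-regulated and down-regulated DEGs independently.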

  8. Peripheral blood transcriptome sequencing reveals rejection-relevant genes in long-term heart transplantation.

    Science.gov (United States)

    Chen, Yan; Zhang, Haibo; Xiao, Xue; Jia, Yixin; Wu, Weili; Liu, Licheng; Jiang, Jun; Zhu, Baoli; Meng, Xu; Chen, Weijun

    2013-10-03

    Peripheral blood-based gene expression patterns have been investigated as biomarkers to monitor the immune system and rule out rejection after heart transplantation. Recent advances in high-throughput deep sequencing (HTS) technologies provide new leads in transcriptome analysis. By performing Solexa/Illumina's digital gene expression (DGE) profiling, we analyzed gene expression profiles of PBMCs from 6 quiescent (grade 0) and 6 rejection (grade 2R&3R) heart transplant recipients at more than 6 months after transplantation. Subsequently, quantitative real-time polymerase chain reaction (qRT-PCR) was carried out in an independent validation cohort of 47 individuals from three rejection groups (ISHLT grades 0, 1R, and 2R&3R). Through DGE sequencing and qPCR validation, 10 genes were identified as informative genes for detection of cardiac transplant rejection. A further clustering analysis showed that the 10 genes were not only effective for distinguishing patients with acute cardiac allograft rejection, but also informative for discriminating patients with renal allograft rejection based on both blood and biopsy samples. Moreover, PPI network analysis revealed that the 10 genes were connected to each other within a short interaction distance. We proposed a 10-gene signature for heart transplant patients at high risk of developing severe rejection, which was found to be effective in other organ transplants as well. Moreover, we supposed that these genes function systematically as biomarkers in long-term allograft rejection. Further validation in a broad transplant population would be required before the non-invasive biomarkers can be generally utilized to predict the risk of transplant rejection. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  9. Genome and transcriptome sequencing characterises the gene space of Macadamia integrifolia (Proteaceae).

    Science.gov (United States)

    Nock, Catherine J; Baten, Abdul; Barkla, Bronwyn J; Furtado, Agnelo; Henry, Robert J; King, Graham J

    2016-11-17

    The large Gondwanan plant family Proteaceae is an early-diverging eudicot lineage renowned for its morphological, taxonomic and ecological diversity. Macadamia is the most economically important Proteaceae crop and represents an ancient rainforest-restricted lineage. The family is a focus for studies of adaptive radiation due to remarkable species diversification in Mediterranean-climate biodiversity hotspots, and numerous evolutionary transitions between biomes. Despite a long history of research, comparative analyses in the Proteaceae and macadamia breeding programs are restricted by a paucity of genetic information. To address this, we sequenced the genome and transcriptome of the widely grown Macadamia integrifolia cultivar 741. Over 95 gigabases of DNA and RNA-seq sequence data were de novo assembled and annotated. The draft assembly has a total length of 518 Mb and spans approximately 79% of the estimated genome size. Following annotation, 35,337 protein-coding genes were predicted, of which over 90% were expressed in at least one of the leaf, shoot or flower tissues examined. Gene family comparisons with five other eudicot species revealed 13,689 clusters containing macadamia genes and 1005 macadamia-specific clusters, and provided evidence for lineage-specific expansion of gene families involved in pathogen recognition, plant defense and monoterpene synthesis. Cyanogenesis is an important defense strategy in the Proteaceae, and a detailed analysis of macadamia gene homologues potentially involved in cyanogenic glycoside biosynthesis revealed several highly expressed candidate genes. The gene space of macadamia provides a foundation for comparative genomics, gene discovery and the acceleration of molecular-assisted breeding. This study presents the first available genomic resources for the large basal eudicot family Proteaceae, access to most macadamia genes and opportunities to uncover the genetic basis of traits of importance for adaptation and crop

  10. Transcriptome Analysis of Flower Sex Differentiation in Jatropha curcas L. Using RNA Sequencing.

    Science.gov (United States)

    Xu, Gang; Huang, Jian; Yang, Yong; Yao, Yin-an

    2016-01-01

    Jatropha curcas is thought to be a promising biofuel material, but its yield is restricted by a low ratio of instaminate/staminate flowers (1/10-1/30). Furthermore, valuable information about flower sex differentiation in this plant is scarce. To explore the mechanism of this process in J. curcas, transcriptome profiling of flower development was carried out, and certain genes related with sex differentiation were obtained through digital gene expression analysis of flower buds from different phases of floral development. After Illumina sequencing and clustering, 57,962 unigenes were identified. A total of 47,423 unigenes were annotated, with 85 being related to carpel and stamen differentiation, 126 involved in carpel and stamen development, and 592 functioning in the later development stage for the maturation of staminate or instaminate flowers. Annotation of these genes provided comprehensive information regarding the sex differentiation of flowers, including the signaling system, hormone biosynthesis and regulation, transcription regulation and ubiquitin-mediated proteolysis. A further expression pattern analysis of 15 sex-related genes using quantitative real-time PCR revealed that gibberellin-regulated protein 4-like protein and AMP-activated protein kinase are associated with stamen differentiation, whereas auxin response factor 6-like protein, AGAMOUS-like 20 protein, CLAVATA1, RING-H2 finger protein ATL3J, auxin-induced protein 22D, and r2r3-myb transcription factor contribute to embryo sac development in the instaminate flower. Cytokinin oxidase, Unigene28, auxin repressed-like protein ARP1, gibberellin receptor protein GID1 and auxin-induced protein X10A are involved in both stages mentioned above. In addition to its function in the differentiation and development of the stamens, the gibberellin signaling pathway also functions in embryo sac development for the instaminate flower. 
    The auxin signaling pathway also participates in both stamen development

  11. Discovery of Putative Herbicide Resistance Genes and Its Regulatory Network in Chickpea Using Transcriptome Sequencing

    Directory of Open Access Journals (Sweden)

    Mir A. Iquebal

    2017-06-01

    Background: Chickpea (Cicer arietinum L.) contributes 75% of total pulse production. Being cheaper than animal protein makes it important in the dietary requirements of developing countries. Weeds not only compete with chickpea, resulting in drastic yield reduction, but also create the problem of harboring fungi, bacterial diseases and insect pests. The chemical approach of new herbicide discovery is constrained by limited lead molecule options, statutory regulations and environmental clearance. Through the genetic approach, transgenic herbicide-tolerant crops have given successful results but have led to serious concern over ecological safety; thus a non-transgenic approach such as marker-assisted selection is desirable. Since large variability in herbicide tolerance limits already exists among chickpea varieties, the genes offering herbicide tolerance can be introgressed in variety improvement programmes. Transcriptome studies can discover such key genes associated with herbicide tolerance in chickpea. Results: This is the first transcriptomic study of chickpea, or of any legume crop, using two herbicide susceptible and tolerant genotypes exposed to imidazoline (Imazethapyr). Approximately 90 million paired-end reads generated from four samples were processed and assembled into 30,803 contigs using reference-based assembly. We report 6,310 differentially expressed genes (DEGs), of which 3,037 were regulated by 980 miRNAs, 1,528 transcription factors associated with 897 DEGs, 47 hub proteins, 3,540 putative Simple Sequence Repeat-Functional Domain Markers (SSR-FDM), 13,778 genic Single Nucleotide Polymorphism (SNP) putative markers and 1,174 indels. Twenty randomly selected DEGs were validated using qPCR. Pathway analysis suggested that the xenobiotic degradation-related gene glutathione S-transferase (GST) was up-regulated only in the presence of herbicide. Down-regulation of DNA replication genes and up-regulation of abscisic acid pathway genes were observed. Study further reveals

  12. Characterization of transcriptome dynamics during watermelon fruit development: sequencing, assembly, annotation and gene expression profiles.

    Science.gov (United States)

    Guo, Shaogui; Liu, Jingan; Zheng, Yi; Huang, Mingyun; Zhang, Haiying; Gong, Guoyi; He, Hongju; Ren, Yi; Zhong, Silin; Fei, Zhangjun; Xu, Yong

    2011-09-21

    Cultivated watermelon [Citrullus lanatus (Thunb.) Matsum. & Nakai var. lanatus] is an important agricultural crop world-wide. The fruit of watermelon undergoes distinct stages of development with dramatic changes in its size, color, sweetness, texture and aroma. In order to better understand the genetic and molecular basis of these changes and significantly expand the watermelon transcript catalog, we have selected four critical stages of watermelon fruit development and used Roche/454 next-generation sequencing technology to generate a large expressed sequence tag (EST) dataset and a comprehensive transcriptome profile for watermelon fruit flesh tissues. We performed half a Roche/454 GS-FLX run for each of the four watermelon fruit developmental stages (immature white, white-pink flesh, red flesh and over-ripe) and obtained 577,023 high quality ESTs with an average length of 302.8 bp. De novo assembly of these ESTs together with 11,786 watermelon ESTs collected from GenBank produced 75,068 unigenes with a total length of approximately 31.8 Mb. Overall, 54.9% of the unigenes showed significant similarities to known sequences in the GenBank non-redundant (nr) protein database and around two-thirds of them matched proteins of cucumber, the most closely related species with a sequenced genome. The unigenes were further assigned gene ontology (GO) terms and mapped to biochemical pathways. More than 5,000 SSRs were identified from the EST collection. Furthermore, we carried out digital gene expression analysis of these ESTs and identified 3,023 genes that were differentially expressed during watermelon fruit development and ripening, which provided novel insights into watermelon fruit biology and a comprehensive resource of candidate genes for future functional analysis. We then generated profiles of several interesting metabolites that are important to fruit quality including pigmentation and sweetness. Integrative analysis of metabolite and digital gene expression

  13. Deep sequencing of the Mexican avocado transcriptome, an ancient angiosperm with a high content of fatty acids.

    Science.gov (United States)

    Ibarra-Laclette, Enrique; Méndez-Bravo, Alfonso; Pérez-Torres, Claudia Anahí; Albert, Victor A; Mockaitis, Keithanne; Kilaru, Aruna; López-Gómez, Rodolfo; Cervantes-Luevano, Jacob Israel; Herrera-Estrella, Luis

    2015-08-13

    Avocado (Persea americana) is an economically important tropical fruit considered to be a good source of fatty acids. Despite its importance, the molecular and cellular characterization of biochemical and developmental processes in avocado is limited due to the lack of transcriptome and genomic information. The transcriptomes of seeds, roots, stems, leaves, aerial buds and flowers were determined using different sequencing platforms. Additionally, the transcriptomes of three different stages of fruit ripening (pre-climacteric, climacteric and post-climacteric) were also analyzed. The analysis of the RNA-seq atlas presented here reveals strong differences in gene expression patterns between different organs, especially between root and flower, but also reveals similarities among the gene expression patterns in other organs, such as stem, leaves and aerial buds (vegetative organs) or seed and fruit (storage organs). Important regulators, functional categories, and differentially expressed genes involved in avocado fruit ripening were identified. Additionally, to demonstrate the utility of the avocado gene expression atlas, we investigated the expression patterns of genes implicated in fatty acid metabolism and fruit ripening. A description of transcriptomic changes occurring during fruit ripening was obtained in Mexican avocado, contributing to a dynamic view of the expression patterns of genes involved in fatty acid biosynthesis and the fruit ripening process.

  14. Sequence comparison of prefrontal cortical brain transcriptome from a tame and an aggressive silver fox (Vulpes vulpes)

    Science.gov (United States)

    2011-01-01

    Background: Two strains of the silver fox (Vulpes vulpes), with markedly different behavioral phenotypes, have been developed by long-term selection for behavior. Foxes from the tame strain exhibit friendly behavior towards humans, paralleling the sociability of canine puppies, whereas foxes from the aggressive strain are defensive and exhibit aggression to humans. To understand the genetic differences underlying these behavioral phenotypes, fox-specific genomic resources are needed. Results: cDNA from mRNA from pre-frontal cortex of a tame and an aggressive fox was sequenced using the Roche 454 FLX Titanium platform (> 2.5 million reads & 0.9 Gbase of tame fox sequence; > 3.3 million reads & 1.2 Gbase of aggressive fox sequence). Over 80% of the fox reads were assembled into contigs. Mapping fox reads against the fox transcriptome assembly and the dog genome identified over 30,000 high confidence fox-specific SNPs. Fox transcripts for approximately 14,000 genes were identified using SwissProt and the dog RefSeq databases. An at least 2-fold expression difference between the two samples (p fox transcriptome. Conclusions: Transcriptome sequencing significantly expanded genomic resources available for the fox, a species without a sequenced genome. In a very cost efficient manner this yielded a large number of fox-specific SNP markers for genetic studies and provided significant insights into the gene expression profile of the fox pre-frontal cortex; expression differences between the two fox samples; and a catalogue of potentially important gene-specific sequence variants. This result demonstrates the utility of this approach for developing genomic resources in species with limited genomic information. PMID:21967120

  15. De novo assembly and characterization of the transcriptome of seagrass Zostera marina using Illumina paired-end sequencing.

    Directory of Open Access Journals (Sweden)

    Fanna Kong

    BACKGROUND: The seagrass Zostera marina is a monocotyledonous angiosperm belonging to a polyphyletic group of plants that can live submerged in marine habitats. Zostera marina L. is one of the most common seagrasses and is considered a cornerstone of marine plant molecular ecology research and comparative studies. However, the mechanisms underlying its adaptation to the marine environment still remain poorly understood due to limited transcriptomic and genomic data. PRINCIPAL FINDINGS: Here we explored the transcriptome of Z. marina leaves under different environmental conditions using Illumina paired-end sequencing. Approximately 55 million sequencing reads were obtained, representing 58,457 transcripts that correspond to 24,216 unigenes. A total of 14,389 (59.41%) unigenes were annotated by BLAST searches against the NCBI non-redundant protein database. 45.18% and 46.91% of the unigenes had significant similarity with proteins in the Swiss-Prot database and Pfam database, respectively. Among these, 13,897 unigenes were assigned to 57 Gene Ontology (GO) terms and 4,745 unigenes were identified and mapped to 233 pathways via functional annotation against the Kyoto Encyclopedia of Genes and Genomes pathway database (KEGG). We compared the orthologous gene families of the Z. marina transcriptome to those of Oryza sativa and Pyropia yezoensis, and 11,667 orthologous gene families are specific to Z. marina. Furthermore, we identified the photoreceptors sensing red/far-red light and blue light. Also, we identified a large number of genes that are involved in ion transporters and channels including Na+ efflux, K+ uptake, Cl- channels, and H+ pumping. CONCLUSIONS: Our study contains an extensive sequencing and gene-annotation analysis of Z. marina. This information represents a genetic resource for the discovery of genes related to light sensing and salt tolerance in this species. Our transcriptome can be further utilized in future studies on molecular adaptation to

  16. Principles of Long Noncoding RNA Evolution Derived from Direct Comparison of Transcriptomes in 17 Species

    Directory of Open Access Journals (Sweden)

    Hadas Hezroni

    2015-05-01

    The inability to predict long noncoding RNAs from genomic sequence has impeded the use of comparative genomics for studying their biology. Here, we develop methods that use RNA sequencing (RNA-seq) data to annotate the transcriptomes of 16 vertebrates and the echinoid sea urchin, uncovering thousands of previously unannotated genes, most of which produce long intervening noncoding RNAs (lincRNAs). Although in each species, >70% of lincRNAs cannot be traced to homologs in species that diverged >50 million years ago, thousands of human lincRNAs have homologs with similar expression patterns in other species. These homologs share short, 5′-biased patches of sequence conservation nested in exonic architectures that have been extensively rewired, in part by transposable element exonization. Thus, over a thousand human lincRNAs are likely to have conserved functions in mammals, and hundreds beyond mammals, but those functions require only short patches of specific sequences and can tolerate major changes in gene architecture.

  17. Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

    Energy Technology Data Exchange (ETDEWEB)

    Shi, CY; Yang, H; Wei, CL; Yu, O; Zhang, ZZ; Sun, J; Wan, XC

    2011-01-01

    Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated. Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real
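    The assembly statistics quoted in this record (average unigene length and N50) follow from the list of assembled sequence lengths. The sketch below uses the common definition of N50, namely the length L such that sequences of length L or longer contain at least half of all assembled bases. The function names and the toy lengths are illustrative assumptions and are not taken from the study.

```python
from typing import Sequence


def mean_length(lengths: Sequence[int]) -> float:
    """Arithmetic mean of the assembled sequence lengths."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    return sum(lengths) / len(lengths)


def n50(lengths: Sequence[int]) -> int:
    """Return the N50 of a set of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the
    length of the sequence at which the running total first reaches
    half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy lengths (made up) to show that N50 can exceed the mean when
# many short sequences are present.
toy = [200, 200, 200, 300, 400, 900, 1200]
print(mean_length(toy), n50(toy))
```

    An N50 larger than the mean, as reported here (506 bp versus 355 bp), is typical of assemblies dominated by many short sequences.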

  18. Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

    Directory of Open Access Journals (Sweden)

    Chen Qi

    2011-02-01

    Background: Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Results: Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated. Their expression patterns in different organs of the tea plant were

  19. De Novo Transcriptome Sequencing of Desert Herbaceous Achnatherum splendens (Achnatherum) Seedlings and Identification of Salt Tolerance Genes

    Directory of Open Access Journals (Sweden)

    Jiangtao Liu

    2016-03-01

    Achnatherum splendens is an important forage herb in Northwestern China. It has a high tolerance to salinity and is, thus, considered one of the most important constructive plants in saline and alkaline areas of land in Northwest China. However, the mechanisms of salt stress tolerance in A. splendens remain unknown. Next-generation sequencing (NGS) technologies can be used for global gene expression profiling. In this study, we examined sequence and transcript abundance data for the root/leaf transcriptome of A. splendens obtained using an Illumina HiSeq 2500. Over 35 million clean reads were obtained from the leaf and root libraries. All of the RNA sequencing (RNA-seq) reads were assembled de novo into a total of 126,235 unigenes and 36,511 coding DNA sequences (CDS). We further identified 1663 differentially-expressed genes (DEGs) between the salt stress treatment and control. Functional annotation of the DEGs by gene ontology (GO), using Arabidopsis and rice as references, revealed enrichment of salt stress-related GO categories, including “oxidation reduction”, “transcription factor activity”, and “ion channel transporter”. Thus, this global transcriptome analysis of A. splendens has provided an important genetic resource for the study of salt tolerance in this halophyte. The identified sequences and their putative functional data will facilitate future investigations of the tolerance of Achnatherum species to various types of abiotic stress.

  20. Transcriptome Sequencing, De Novo Assembly and Differential Gene Expression Analysis of the Early Development of Acipenser baeri.

    Directory of Open Access Journals (Sweden)

    Wei Song

    The molecular mechanisms that drive the development of the endangered fossil fish species Acipenser baeri are difficult to study due to the lack of genomic data. Recent advances in sequencing technologies and the falling cost of sequencing offer exclusive opportunities for exploring important molecular mechanisms underlying specific biological processes. This manuscript describes the large-scale sequencing and analysis of mRNA from Acipenser baeri collected at five developmental time points using the Illumina HiSeq 2000 platform. The sequencing reads were de novo assembled and clustered into 278167 unigenes, of which 57346 (20.62%) had 45837 known homologous proteins in Uniprot protein databases, while 11509 proteins matched at least one sequence of the assembled unigenes. The remaining 79.38% of unigenes could represent non-coding unigenes or unigenes specific to A. baeri. A total of 43062 unigenes were annotated into functional categories via Gene Ontology (GO) annotation, whereas 29526 unigenes were associated with 329 pathways by mapping to the KEGG database. Subsequently, 3479 differentially expressed genes were identified across developmental stages and clustered into 50 gene expression profiles. Genes preferentially expressed at each stage were also identified. Through GO and KEGG pathway enrichment analysis, relevant physiological variations during the early development of A. baeri could be better understood. Accordingly, the present study gives insights into the transcriptome profile of the early development of A. baeri, and the information contained in this large-scale transcriptome will provide substantial references for A. baeri developmental biology and promote its aquaculture research.

  1. RNA sequencing analysis to capture the transcriptome landscape during skin ulceration syndrome progression in sea cucumber Apostichopus japonicus.

    Science.gov (United States)

    Yang, Aifu; Zhou, Zunchun; Pan, Yongjia; Jiang, Jingwei; Dong, Ying; Guan, Xiaoyan; Sun, Hongjuan; Gao, Shan; Chen, Zhong

    2016-06-14

    Sea cucumber Apostichopus japonicus is an important economic species in China, which is affected by various diseases; skin ulceration syndrome (SUS) is the most serious. In this study, we characterized the transcriptomes in A. japonicus challenged with Vibrio splendidus to elucidate the changes in gene expression throughout the three stages of SUS progression. RNA sequencing of 21 cDNA libraries from various tissues and developmental stages of SUS-affected A. japonicus yielded 553 million raw reads, of which 542 million high-quality reads were generated by deep-sequencing using the Illumina HiSeq™ 2000 platform. The reference transcriptome comprised a combination of the Illumina reads, 454 sequencing data and Sanger sequences obtained from the public database to generate 93,163 unigenes (average length, 1,052 bp; N50 = 1,575 bp); 33,860 were annotated. Transcriptome comparisons between healthy and SUS-affected A. japonicus revealed greater differences in gene expression profiles in the body walls (BW) than in the intestines (Int), respiratory trees (RT) and coelomocytes (C). Clustering of expression models revealed stable up-regulation as the main pattern occurring in the BW throughout the three stages of SUS progression. Significantly affected pathways were associated with signal transduction, immune system, cellular processes, development and metabolism. Ninety-two differentially expressed genes (DEGs) were divided into four functional categories: attachment/pathogen recognition (17), inflammatory reactions (38), oxidative stress response (7) and apoptosis (30). Using quantitative real-time PCR, twenty representative DEGs were selected to validate the sequencing results. The Pearson's correlation coefficient (R) of the 20 DEGs ranged from 0.811 to 0.999, which confirmed the consistency and accuracy between these two approaches. Dynamic changes in global gene expression occur during SUS progression in A. japonicus. Elucidation of these changes is important

  2. Next generation sequencing based transcriptome analysis of septic-injury responsive genes in the beetle Tribolium castaneum.

    Directory of Open Access Journals (Sweden)

    Boran Altincicek

    Beetles (Coleoptera) are the most diverse animal group on earth and interact with numerous symbiotic or pathogenic microbes in their environments. The red flour beetle Tribolium castaneum is a genetically tractable model beetle species and its whole genome sequence has recently been determined. To advance our understanding of the molecular basis of beetle immunity, here we analyzed the whole transcriptome of T. castaneum by high-throughput next generation sequencing technology. Here, we demonstrate that the Illumina/Solexa sequencing approach of cDNA samples from T. castaneum, including over 9.7 million reads with 72 base pairs (bp) length (approximately 700 million bp sequence information) with about 30× transcriptome coverage, confirms the expression of most predicted genes and enabled subsequent qualitative and quantitative transcriptome analysis. This approach recapitulates our recent quantitative real-time PCR studies of immune-challenged and naïve T. castaneum beetles, validating our approach. Furthermore, this sequencing analysis resulted in the identification of 73 differentially expressed genes upon immune-challenge with statistical significance by comparing expression data to calculated values derived by fitting to generalized linear models. We identified up-regulation of diverse immune-related genes (e.g. Toll receptor, serine proteinases, DOPA decarboxylase and thaumatin) and of numerous genes encoding proteins with yet unknown functions. Of note, septic injury also resulted in the elevated expression of genes encoding heat-shock proteins or cytochrome P450s, supporting the view that there is crosstalk between immune and stress responses in T. castaneum. The present study provides a first comprehensive overview of septic-injury responsive genes in T. castaneum beetles. Identified genes advance our understanding of T. castaneum specific gene expression alteration upon immune-challenge in particular and may help to understand beetle immunity

  3. De novo Transcriptome Sequencing Reveals a Considerable Bias in the Incidence of Simple Sequence Repeats towards the Downstream of ‘Pre-miRNAs’ of Black Pepper

    Science.gov (United States)

    Joy, Nisha; Asha, Srinivasan; Mallika, Vijayan; Soniya, Eppurathu Vasudevan

    2013-01-01

    Next generation sequencing has an advantage on transformational development of species with limited available sequence data as it helps to decode the genome and transcriptome. We carried out de novo sequencing using the Illumina HiSeq™ 2000 to generate the first leaf transcriptome of black pepper (Piper nigrum L.), an important spice variety native to South India and also grown in other tropical regions. Despite the economic and biochemical importance of pepper, a scientifically rigorous study at the molecular level is far from complete due to lack of sufficient sequence information and cytological complexity of its genome. The 55 million raw reads obtained, when assembled using the Trinity program, generated 223,386 contigs and 128,157 unigenes. Reports suggest that the repeat-rich genomic regions give rise to small non-coding functional RNAs. MicroRNAs (miRNAs) are the most abundant type of non-coding regulatory RNAs. In spite of the widespread research on miRNAs, little is known about the hair-pin precursors of miRNAs bearing Simple Sequence Repeats (SSRs). We used the array of transcripts generated for the in silico prediction and detection of ‘43 pre-miRNA candidates bearing different types of SSR motifs’. The analysis identified 3913 different types of SSR motifs with an average of one SSR per 3.04 MB of the transcriptome. About 0.033% of the transcriptome constituted ‘pre-miRNA candidates bearing SSRs’. The abundance, type and distribution of SSR motifs studied across the hair-pin miRNA precursors showed a significant bias in the position of SSRs towards the downstream of predicted ‘pre-miRNA candidates’. The catalogue of transcripts identified, together with the demonstration of reliable existence of SSRs in the miRNA precursors, permits future opportunities for understanding the genetic mechanism of black pepper and likely functions of ‘tandem repeats’ in miRNAs. PMID:23469176
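    In silico SSR detection of the kind described in this abstract can be sketched with a regular-expression scan for perfect tandem repeats. The abstract does not name the tool or the repeat thresholds the authors used, so the minimum repeat counts in `DEFAULT_THRESHOLDS`, the function names, and the test sequence below are assumptions chosen for illustration only.

```python
import re
from typing import Dict, List, Tuple

# Assumed minimum repeat counts per motif length (mono- to hexanucleotide).
# These are illustrative settings, not the thresholds of the study.
DEFAULT_THRESHOLDS: Dict[int, int] = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(unit: str) -> bool:
    """True if the motif is not itself a repeat of a shorter motif."""
    n = len(unit)
    return all(unit != unit[:d] * (n // d) for d in range(1, n) if n % d == 0)


def find_ssrs(seq: str,
              thresholds: Dict[int, int] = DEFAULT_THRESHOLDS
              ) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for perfect SSRs in seq.

    For each motif length k, a back-reference pattern matches a k-mer
    followed by enough immediate copies of itself. Motifs such as "AA"
    are skipped so that a poly-A run is reported once, as a
    mononucleotide repeat.
    """
    seq = seq.upper()
    hits: List[Tuple[int, str, int]] = []
    for k, min_rep in thresholds.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            if is_primitive(unit):
                hits.append((m.start(), unit, len(m.group(0)) // k))
    return sorted(hits)


# Made-up sequence containing one di-, one mono- and one trinucleotide SSR.
demo = "GGC" + "AT" * 7 + "CCGT" + "A" * 12 + "TT" + "CAG" * 5 + "TT"
for start, motif, count in find_ssrs(demo):
    print(start, motif, count)
```

    This sketch reports only perfect repeats and ignores compound or interrupted SSRs; the reading frame of a reported motif also depends on the flanking bases, because the scan takes the leftmost match.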

  4. De novo Sequencing and Analysis of Lemongrass Transcriptome Provides First Insights into the Essential Oil Biosynthesis of Aromatic Grasses

    Directory of Open Access Journals (Sweden)

    Seema Meena

    2016-07-01

    Aromatic grasses of the genus Cymbopogon (Poaceae family) represent a unique group of plants that produce monoterpene-rich essential oils of diverse composition, which have great value in the flavour, fragrance, cosmetic and aromatherapy industries. Despite the commercial importance of these natural aromatic oils, their biosynthesis at the molecular level remains unexplored. As the first step towards understanding essential oil biosynthesis, we performed de novo transcriptome assembly and analysis of C. flexuosus (lemongrass) by employing Illumina sequencing. Mining of transcriptome data and subsequent phylogenetic analysis led to the identification of terpene synthases (TPS), pyrophosphatases (PPase), alcohol dehydrogenases (ADH), aldo-keto reductases (AKR), carotenoid cleavage dioxygenases (CCD), alcohol acetyltransferases (AAT) and aldehyde dehydrogenases (ALDH), which are potentially involved in essential oil biosynthesis. Comparative essential oil profiling and mRNA expression analysis in three Cymbopogon species (C. flexuosus, aldehyde type; C. martinii, alcohol type; and C. winterianus, intermediate type) with varying essential oil composition indicated the involvement of the identified candidate genes in the formation of alcohols, aldehydes and acetates. Molecular modeling and docking further supported the role of the identified enzymes in aroma formation in Cymbopogon. Also, simple sequence repeats (SSRs) were found in the transcriptome, with many linked to terpene pathway genes including the genes potentially involved in aroma biosynthesis. This work provides the first insights into the essential oil biosynthesis of aromatic grasses, and the identified candidate genes and markers can be a great resource for biotechnological and molecular breeding approaches to modulate the essential oil composition.

  5. De novo Assembly and Characterization of Cajanus scarabaeoides (L.) Thouars Transcriptome by Paired-End Sequencing

    Directory of Open Access Journals (Sweden)

    Deepti Nigam

    2017-07-01

    Pigeonpea [Cajanus cajan (L.) Millsp.] is a heat and drought resilient legume crop grown mostly in Asia and Africa. Pigeonpea is affected by various biotic (diseases and insect pests) and abiotic stresses (salinity and water logging), which limit the yield potential of this crop. However, resistance to all these constraints is not readily available in the cultivated genotypes, and some of the wild relatives have been found to withstand these constraints. Thus, the utilization of crop wild relatives (CWR) in pigeonpea breeding has been effective in conferring resistance, quality and breeding efficiency traits to this crop. Bud and leaf tissue of Cajanus scarabaeoides, a wild relative of pigeonpea, were used for transcriptome profiling. Approximately 30 million clean reads filtered from raw reads by removal of adaptors, ambiguous reads and low-quality reads (3.02 gigabase pairs) were generated by Illumina paired-end RNA-seq technology. All of these clean reads were pooled and assembled de novo into 1,17,007 transcripts using Trinity. Finally, a total of 98,664 unigenes were derived with a mean length of 396 bp and an N50 value of 1393. The assembly produced significant mapping results (73.68%) in BLASTN searches of the Glycine max CDS sequence database (Ensembl). Further, the UniProt database of Viridiplantae was used for unigene annotation; 81,799 of 98,664 (82.90%) unigenes were finally annotated with gene descriptions or conserved protein domains. Further, a total of 23,475 SSRs were identified in 27,321 unigenes. These data will provide useful information for mining of functionally important genes and SSR markers for pigeonpea improvement.
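
    The abstract reports both a mean unigene length and an N50 value, which can differ widely because N50 is weighted by sequence length. The following Python sketch shows the usual definition of N50; the function name n50 and the toy length list are assumptions for illustration and are not taken from the study's data.

```python
def n50(lengths):
    """Return N50: the largest length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")

# Toy example (hypothetical lengths): one long sequence dominates the total,
# so N50 is well above the mean length.
toy = [100, 200, 300, 400, 1000]
print("mean:", sum(toy) / len(toy), "N50:", n50(toy))
```

    A few long transcripts among many short ones therefore produce an N50 well above the mean, which is consistent with the pattern of values quoted above.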

  6. Differential genomic arrangements in Caryophyllales through deep transcriptome sequencing of A. hypochondriacus.

    Directory of Open Access Journals (Sweden)

    Meeta Sunil

    The genome duplication event in edible dicots under the orders Rosid and Asterid, common during the Oligocene period, is missing for species under the order Caryophyllales. Despite this, grain amaranths not only survived this period but display many desirable traits missing in species under rosids and asterids. For example, grain amaranths display traits like C4 photosynthesis, high-lysine seeds, high yield, drought resistance, tolerance to infection and resilience to stress. It is, therefore, of interest to look for minor genome rearrangements with potential functional implications that are unique to grain amaranths. Here, by deep sequencing and assembly of 16 transcriptomes (86.8 billion bases), we have interrogated differential genome rearrangement unique to Amaranthus hypochondriacus with potential links to these phenotypes. We have predicted 125,581 non-redundant transcripts, including 44,529 protein coding transcripts identified based on homology to known proteins and 13,529 predicted as novel/amaranth-specific coding transcripts. Of the protein coding de novo assembled transcripts, we have identified 1810 chimeric transcripts. More than 30% and 19% of the gene pairs within the chimeric transcripts are found within the same loci in the genomes of A. hypochondriacus and Beta vulgaris, respectively, and are considered real positives. Interestingly, one of the chimeric transcripts comprises two important genes, namely DHDPS1, a key enzyme implicated in the biosynthesis of lysine, and alpha-glucosidase, an enzyme involved in sucrose catabolism, in close proximity to each other, separated by a distance of 612 bases in the genome of A. hypochondriacus in a convergent configuration. We have experimentally validated that transcripts of these two genes are also overlapping in the 3' UTR with their expression negatively correlated from bud to mature seed, suggesting a potential link between the high seed lysine trait and unique genome organization.

  7. Inference of Interactions in Cyanobacterial-Heterotrophic Co-Cultures via Transcriptome Sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Beliaev, Alex S.; Romine, Margaret F.; Serres, Margaret; Bernstein, Hans C.; Linggi, Bryan E.; Markillie, Lye Meng; Isern, Nancy G.; Chrisler, William B.; Kucek, Leo A.; Hill, Eric A.; Pinchuk, Grigoriy; Bryant, Donald A.; Wiley, H. S.; Fredrickson, Jim K.; Konopka, Allan

    2014-04-29

    We employed deep sequencing technology to identify transcriptional adaptation of the euryhaline unicellular cyanobacterium Synechococcus sp. PCC 7002 and the marine facultative aerobe Shewanella putrefaciens W3-18-1 to growth in a co-culture and infer the effect of carbon flux distributions on photoautotroph-heterotroph interactions. The overall transcriptome response of both organisms to co-cultivation was shaped by their respective physiologies and growth constraints. Carbon limitation resulted in the expansion of metabolic capacities which was manifested through the transcriptional upregulation of transport and catabolic pathways. While growth coupling occurred via lactate oxidation or secretion of photosynthetically fixed carbon, there was evidence of specific metabolic interactions between the two organisms. On one hand, the production and excretion of specific amino acids (methionine and alanine) by the cyanobacterium correlated with the putative downregulation of the corresponding biosynthetic machinery of Shewanella W3-18-1. On the other hand, the broad and consistent decrease of mRNA levels for many Fe-regulated Synechococcus 7002 genes during co-cultivation suggested increased Fe availability as well as more facile and energy-efficient mechanisms for Fe acquisition by the cyanobacterium. Furthermore, evidence pointed at potentially novel interactions between oxygenic photoautotrophs and heterotrophs related to the oxidative stress response as transcriptional patterns suggested that Synechococcus 7002 rather than Shewanella W3-18-1 provided scavenging functions for reactive oxygen species under co-culture conditions. This study provides an initial insight into the complexity of photoautotrophic-heterotrophic interactions and brings new perspectives of their role in the robustness and stability of the association.

  8. Application of the whole-transcriptome shotgun sequencing approach to the study of Philadelphia-positive acute lymphoblastic leukemia

    International Nuclear Information System (INIS)

    Iacobucci, I; Ferrarini, A; Sazzini, M; Giacomelli, E; Lonetti, A; Xumerle, L; Ferrari, A; Papayannidis, C; Malerba, G; Luiselli, D; Boattini, A; Garagnani, P; Vitale, A; Soverini, S; Pane, F; Baccarani, M; Delledonne, M; Martinelli, G

    2012-01-01

    Although the pathogenesis of BCR–ABL1-positive acute lymphoblastic leukemia (ALL) is mainly related to the expression of the BCR–ABL1 fusion transcript, additional cooperating genetic lesions are supposed to be involved in its development and progression. Therefore, in an attempt to investigate the complex landscape of mutations, changes in expression profiles and alternative splicing (AS) events that can be observed in such disease, the leukemia transcriptome of a BCR–ABL1-positive ALL patient at diagnosis and at relapse was sequenced using a whole-transcriptome shotgun sequencing (RNA-Seq) approach. A total of 13.9 and 15.8 million sequence reads were generated from de novo and relapsed samples, respectively, and aligned to the human genome reference sequence. This led to the identification of five validated missense mutations in genes involved in metabolic processes (DPEP1, TMEM46), transport (MVP), cell cycle regulation (ABL1) and catalytic activity (CTSZ), two of which resulted in acquired relapse variants. In all, 6390 and 4671 putative AS events were also detected, as well as expression levels for 18 315 and 18 795 genes, 28% of which were differentially expressed in the two disease phases. These data demonstrate that RNA-Seq is a suitable approach for identifying a wide spectrum of genetic alterations potentially involved in ALL.

  9. De novo assembly, gene annotation, and marker discovery in stored-product pest Liposcelis entomophila (Enderlein) using transcriptome sequences.

    Directory of Open Access Journals (Sweden)

    Dan-Dan Wei

    BACKGROUND: As a major stored-product pest insect, Liposcelis entomophila has developed high levels of resistance to various insecticides in grain storage systems. However, the molecular mechanisms underlying resistance and environmental stress have not been characterized. To date, there is a lack of genomic information for this species. Therefore, studies aimed at profiling the L. entomophila transcriptome would provide a better understanding of the biological functions at the molecular level. METHODOLOGY/PRINCIPAL FINDINGS: We applied Illumina sequencing technology to sequence the transcriptome of L. entomophila. A total of 54,406,328 clean reads were obtained and de novo assembled into 54,220 unigenes, with an average length of 571 bp. Through a similarity search, 33,404 (61.61%) unigenes were matched to known proteins in the NCBI non-redundant (Nr) protein database. These unigenes were further functionally annotated with gene ontology (GO), cluster of orthologous groups of proteins (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. A large number of genes potentially involved in insecticide resistance were manually curated, including 68 putative cytochrome P450 genes, 37 putative glutathione S-transferase (GST) genes, 19 putative carboxyl/cholinesterase (CCE) genes, and 126 other transcripts containing target site sequences or encoding detoxification genes representing eight types of resistance enzymes. Furthermore, to gain insight into the molecular basis of the L. entomophila response to thermal stresses, 25 heat shock protein (Hsp) genes were identified. In addition, 1,100 SSRs and 57,757 SNPs were detected and 231 pairs of SSR primers were designed for investigating genetic diversity in the future. CONCLUSIONS/SIGNIFICANCE: We developed a comprehensive transcriptomic database for L. entomophila. These sequences and putative molecular markers would further promote our understanding of the molecular mechanisms underlying

  10. Sequencing and de novo assembly of the Asian clam (Corbicula fluminea) transcriptome using the Illumina GAIIx method.

    Directory of Open Access Journals (Sweden)

    Huihui Chen

    BACKGROUND: The Asian clam (Corbicula fluminea) is currently one of the most economically important aquatic species in China and has been used as a test organism in many environmental studies. However, the lack of genomic resources, such as a sequenced genome, expressed sequence tags (ESTs) and transcriptome sequences, has hindered research on C. fluminea. Recent advances in large-scale RNA-Seq enable generation of genomic resources in a short time, and provide large expression datasets for functional genomic analysis. METHODOLOGY/PRINCIPAL FINDINGS: We used a next-generation high-throughput DNA sequencing technique with an Illumina GAIIx method to analyze the transcriptome from the whole bodies of C. fluminea. More than 62,250,336 high-quality reads were generated based on the raw data, and 134,684 unigenes with a mean length of 791 bp were assembled using the Velvet and Oases software. All of the assembled unigenes were annotated by running BLASTx and BLASTn similarity searches on the Nt, Nr, Swiss-Prot, COG and KEGG databases. In addition, the Clusters of Orthologous Groups (COGs), Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations were also assigned to each unigene transcript. To provide a preliminary verification of the assembly and annotation results, and to search for potential environmental pollution biomarkers, 15 functional genes (five antioxidase genes, two cytochrome P450 genes, three GABA receptor-related genes and five heat shock protein genes) were cloned and identified. Expressions of the 15 selected genes following fluoxetine exposure confirmed that the genes are indeed linked to environmental stress. CONCLUSIONS/SIGNIFICANCE: The C. fluminea transcriptome advances the underlying molecular understanding of this freshwater clam, provides a basis for further exploration of C. fluminea as an environmental test organism and promotes further studies on other bivalve organisms.

  11. Comparative analysis of transcriptomes in aerial stems and roots of Ephedra sinica based on high-throughput mRNA sequencing

    Directory of Open Access Journals (Sweden)

    Taketo Okada

    2016-12-01

    Ephedra plants are taxonomically classified as gymnosperms, and are medicinally important as the botanical origin of crude drugs and as bioresources that contain pharmacologically active chemicals. Here we show a comparative analysis of the transcriptomes of aerial stems and roots of Ephedra sinica based on high-throughput mRNA sequencing by RNA-Seq. De novo assembly of short cDNA sequence reads generated 23,358, 13,373, and 28,579 contigs longer than 200 bases from aerial stems, roots, or both aerial stems and roots, respectively. The presumed functions encoded by these contig sequences were annotated by BLAST (blastx). Subsequently, these contigs were classified based on gene ontology slims, Enzyme Commission numbers, and the InterPro database. Furthermore, comparative gene expression analysis was performed between aerial stems and roots. These transcriptome analyses revealed differences and similarities between the transcriptomes of aerial stems and roots in E. sinica. Deep transcriptome sequencing of Ephedra should open the door to molecular biological studies based on the entire transcriptome, tissue- or organ-specific transcriptomes, or targeted genes of interest.

  12. Comparative high-throughput transcriptome sequencing and development of SiESTa, the Silene EST annotation database

    Directory of Open Access Journals (Sweden)

    Marais Gabriel AB

    2011-07-01

    Background The genus Silene is widely used as a model system for addressing ecological and evolutionary questions in plants, but advances in using the genus as a model system are impeded by the lack of available resources for studying its genome. Massively parallel sequencing cDNA has recently developed into an efficient method for characterizing the transcriptomes of non-model organisms, generating massive amounts of data that enable the study of multiple species in a comparative framework. The sequences generated provide an excellent resource for identifying expressed genes, characterizing functional variation and developing molecular markers, thereby laying the foundations for future studies on gene sequence and gene expression divergence. Here, we report the results of a comparative transcriptome sequencing study of eight individuals representing four Silene and one Dianthus species as outgroup. All sequences and annotations have been deposited in a newly developed and publicly available database called SiESTa, the Silene EST annotation database. Results A total of 1,041,122 EST reads were generated in two runs on a Roche GS-FLX 454 pyrosequencing platform. EST reads were analyzed separately for all eight individuals sequenced and were assembled into contigs using TGICL. These were annotated with results from BLASTX searches and Gene Ontology (GO) terms, and thousands of single-nucleotide polymorphisms (SNPs) were characterized. Unassembled reads were kept as singletons and together with the contigs contributed to the unigenes characterized in each individual. The high quality of unigenes is evidenced by the proportion (49%) that have significant hits in similarity searches with the A. thaliana proteome. The SiESTa database is accessible at http://www.siesta.ethz.ch. Conclusion The sequence collections established in the present study provide an important genomic resource for four Silene and one Dianthus species and will help to

  13. Comparative high-throughput transcriptome sequencing and development of SiESTa, the Silene EST annotation database

    Science.gov (United States)

    2011-01-01

    Background The genus Silene is widely used as a model system for addressing ecological and evolutionary questions in plants, but advances in using the genus as a model system are impeded by the lack of available resources for studying its genome. Massively parallel sequencing cDNA has recently developed into an efficient method for characterizing the transcriptomes of non-model organisms, generating massive amounts of data that enable the study of multiple species in a comparative framework. The sequences generated provide an excellent resource for identifying expressed genes, characterizing functional variation and developing molecular markers, thereby laying the foundations for future studies on gene sequence and gene expression divergence. Here, we report the results of a comparative transcriptome sequencing study of eight individuals representing four Silene and one Dianthus species as outgroup. All sequences and annotations have been deposited in a newly developed and publicly available database called SiESTa, the Silene EST annotation database. Results A total of 1,041,122 EST reads were generated in two runs on a Roche GS-FLX 454 pyrosequencing platform. EST reads were analyzed separately for all eight individuals sequenced and were assembled into contigs using TGICL. These were annotated with results from BLASTX searches and Gene Ontology (GO) terms, and thousands of single-nucleotide polymorphisms (SNPs) were characterized. Unassembled reads were kept as singletons and together with the contigs contributed to the unigenes characterized in each individual. The high quality of unigenes is evidenced by the proportion (49%) that have significant hits in similarity searches with the A. thaliana proteome. The SiESTa database is accessible at http://www.siesta.ethz.ch. Conclusion The sequence collections established in the present study provide an important genomic resource for four Silene and one Dianthus species and will help to further develop Silene as a

  14. Transcriptome sequencing of Crucihimalaya himalaica (Brassicaceae) reveals how Arabidopsis close relative adapt to the Qinghai-Tibet Plateau

    Science.gov (United States)

    Qiao, Qin; Wang, Qia; Han, Xi; Guan, Yanlong; Sun, Hang; Zhong, Yang; Huang, Jinling; Zhang, Ticao

    2016-02-01

    The extreme environment of the Qinghai-Tibet Plateau (QTP) provides an ideal natural laboratory for studies on adaptive evolution. Few genome/transcriptome based studies have been conducted on how plants adapt to the environments of QTP compared to numerous studies on vertebrates. Crucihimalaya himalaica is a close relative of Arabidopsis with typical QTP distribution, and is hoped to be a new model system to study speciation and ecological adaptation in extreme environments. In this study, we de novo generated a transcriptome sequence of C. himalaica, with a total of 49,438 unigenes. Compared to five relatives, 10,487 orthogroups were shared by all six species, and 4,286 orthogroups contain putative single-copy genes. Further analysis identified 487 extremely significantly positively selected genes (PSGs) in the C. himalaica transcriptome. These PSGs were enriched in functions related to specific adaptation traits, such as response to radiation, DNA repair, nitrogen metabolism, and stabilization of membrane. These functions are responsible for the adaptation of C. himalaica to the high radiation, soil depletion and low temperature environments on QTP. Our findings indicate that C. himalaica has evolved complex strategies for adapting to the extreme environments on QTP and provide novel insights into genetic mechanisms of highland adaptation in plants.

  15. Transcriptome Characterization for Non-Model Endangered Lycaenids, Protantigius superans and Spindasis takanosis, Using Illumina HiSeq 2500 Sequencing

    Directory of Open Access Journals (Sweden)

    Bharat Bhusan Patnaik

    2015-12-01

    The Lycaenidae butterflies, Protantigius superans and Spindasis takanosis, are endangered insects in Korea known for their symbiotic association with ants. However, necessary genomic and transcriptomics data are lacking in these species, limiting conservation efforts. In this study, the P. superans and S. takanosis transcriptomes were deciphered using Illumina HiSeq 2500 sequencing. The P. superans and S. takanosis transcriptome data included a total of 254,340,693 and 245,110,582 clean reads assembled into 159,074 and 170,449 contigs and 107,950 and 121,140 unigenes, respectively. BLASTX hits (E-value of 1.0 × 10^-5) against the known protein databases annotated a total of 46,754 and 51,908 transcripts for P. superans and S. takanosis. Approximately 41.25% and 38.68% of the unigenes for P. superans and S. takanosis found homologous sequences in Protostome DB (PANM-DB). BLAST2GO analysis confirmed 18,611 unigenes representing Gene Ontology (GO) terms and a total of 5259 unigenes assigned to 116 pathways for P. superans. For S. takanosis, a total of 6697 unigenes were assigned to 119 pathways using the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database. Additionally, 382,164 and 390,516 Simple Sequence Repeats (SSRs) were compiled from the unigenes of P. superans and S. takanosis, respectively. This is the first report to record new genes and their utilization for conservation of lycaenid species population and as a reference information for closely related species.

  16. Transcriptome sequencing and differential gene expression analysis in Viola yedoensis Makino (Fam. Violaceae) responsive to cadmium (Cd) pollution

    Energy Technology Data Exchange (ETDEWEB)

    Gao, Jian [Key Laboratory of Biology and Genetic Improvement of Maize in Southwest Region, Ministry of Agriculture, Maize Research Institute of Sichuan Agricultural University, Wenjiang, Sichuan (China); Luo, Mao [Drug Discovery Research Center of Luzhou Medical College, Luzhou, Sichuan (China); Zhu, Ye; He, Ying; Wang, Qin [Department of Pharmacy of Luzhou Medical College, Luzhou, Sichuan (China); Zhang, Chun, E-mail: zc83good@126.com [Department of Pharmacy of Luzhou Medical College, Luzhou, Sichuan (China)

    2015-03-27

    Viola yedoensis Makino is an important Chinese traditional medicine plant adapted to cadmium (Cd) pollution regions. Illumina sequencing technology was used to sequence the transcriptome of V. yedoensis Makino. We sequenced Cd-treated (VIYCd) and untreated (VIYCK) samples of V. yedoensis, and obtained 100,410,834 and 83,587,676 high quality reads, respectively. After de novo assembly and quantitative assessment, 109,800 unigenes were finally generated with an average length of 661 bp. We then obtained functional annotations by aligning unigenes with public protein databases including NR, NT, SwissProt, KEGG and COG. In addition, 892 differentially expressed genes (DEGs) were investigated between the two libraries of untreated (VIYCK) and Cd-treated (VIYCd) plants. Moreover, 15 randomly selected DEGs were further validated with qRT-PCR and the results were highly accordant with the Solexa analysis. This study firstly generated a successful global analysis of the V. yedoensis transcriptome and it will provide for further studies on gene expression, genomics, and functional genomics in Violaceae. - Highlights: • A de novo assembly generated 109,800 unigenes and 54,479 of them were annotated. • 31,285 could be classified into 26 COG categories. • 263 biosynthesis pathways were predicted and classified into five categories. • 892 DEGs were detected and 15 of them were validated by qRT-PCR.

  17. Whole transcriptome analysis using next-generation sequencing of model species Setaria viridis to support C4 photosynthesis research.

    Science.gov (United States)

    Xu, Jiajia; Li, Yuanyuan; Ma, Xiuling; Ding, Jianfeng; Wang, Kai; Wang, Sisi; Tian, Ye; Zhang, Hui; Zhu, Xin-Guang

    2013-09-01

    Setaria viridis is an emerging model species for genetic studies of C4 photosynthesis. Many basic molecular resources need to be developed to support this species. In this paper, we performed a comprehensive transcriptome analysis from multiple developmental stages and tissues of S. viridis using next-generation sequencing technologies. Sequencing of the transcriptome from multiple tissues across three developmental stages (seed germination, vegetative growth, and reproduction) yielded a total of 71 million single-end 100 bp long reads. Reference-based assembly using the Setaria italica genome as a reference generated 42,754 transcripts. De novo assembly generated 60,751 transcripts. In addition, 9,576 and 7,056 potential simple sequence repeats (SSRs) covering the S. viridis genome were identified when using the reference-based assembled transcripts and the de novo assembled transcripts, respectively. The identified transcripts and SSRs provided by this study can be used for both reverse and forward genetic studies based on S. viridis.

  18. Transcriptome sequencing and differential gene expression analysis in Viola yedoensis Makino (Fam. Violaceae) responsive to cadmium (Cd) pollution

    International Nuclear Information System (INIS)

    Gao, Jian; Luo, Mao; Zhu, Ye; He, Ying; Wang, Qin; Zhang, Chun

    2015-01-01

    Viola yedoensis Makino is an important Chinese traditional medicine plant adapted to cadmium (Cd) pollution regions. Illumina sequencing technology was used to sequence the transcriptome of V. yedoensis Makino. We sequenced Cd-treated (VIYCd) and untreated (VIYCK) samples of V. yedoensis, and obtained 100,410,834 and 83,587,676 high quality reads, respectively. After de novo assembly and quantitative assessment, 109,800 unigenes were finally generated with an average length of 661 bp. We then obtained functional annotations by aligning unigenes with public protein databases including NR, NT, SwissProt, KEGG and COG. In addition, 892 differentially expressed genes (DEGs) were investigated between the two libraries of untreated (VIYCK) and Cd-treated (VIYCd) plants. Moreover, 15 randomly selected DEGs were further validated with qRT-PCR and the results were highly accordant with the Solexa analysis. This study firstly generated a successful global analysis of the V. yedoensis transcriptome and it will provide for further studies on gene expression, genomics, and functional genomics in Violaceae. - Highlights: • A de novo assembly generated 109,800 unigenes and 54,479 of them were annotated. • 31,285 could be classified into 26 COG categories. • 263 biosynthesis pathways were predicted and classified into five categories. • 892 DEGs were detected and 15 of them were validated by qRT-PCR.

  19. De novo sequencing and analysis of the transcriptome during the browning of fresh-cut Luffa cylindrica 'Fusi-3' fruits

    Science.gov (United States)

    Chen, Mindong; Wang, Bin; Zhang, Qianrong; Xue, Zhuzheng

    2017-01-01

    Fresh-cut luffa (Luffa cylindrica) fruits commonly undergo browning. However, little is known about the molecular mechanisms regulating this process. We used the RNA-seq technique to analyze the transcriptomic changes occurring during the browning of fresh-cut fruits from luffa cultivar ‘Fusi-3’. Over 90 million high-quality reads were assembled into 58,073 Unigenes, and 60.86% of these were annotated based on sequences in four public databases. We detected 35,282 Unigenes with significant hits to sequences in the NCBInr database, and 24,427 Unigenes encoded proteins with sequences that were similar to those of known proteins in the Swiss-Prot database. Additionally, 20,546 and 13,021 Unigenes were similar to existing sequences in the Eukaryotic Orthologous Groups of proteins and Kyoto Encyclopedia of Genes and Genomes databases, respectively. Furthermore, 27,301 Unigenes were differentially expressed during the browning of fresh-cut luffa fruits (i.e., after 1–6 h). Moreover, 11 genes from five gene families (i.e., PPO, PAL, POD, CAT, and SOD) identified as potentially associated with enzymatic browning as well as four WRKY transcription factors were observed to be differentially regulated in fresh-cut luffa fruits. With the assistance of rapid amplification of cDNA ends technology, we obtained the full-length sequences of the 15 Unigenes. We also confirmed these Unigenes were expressed by quantitative real-time polymerase chain reaction analysis. This study provides a comprehensive transcriptome sequence resource, and may facilitate further studies aimed at identifying genes affecting luffa fruit browning for the exploitation of the underlying mechanism. PMID:29145430

  20. De novo sequencing and analysis of the transcriptome during the browning of fresh-cut Luffa cylindrica 'Fusi-3' fruits.

    Directory of Open Access Journals (Sweden)

    Haisheng Zhu

    Fresh-cut luffa (Luffa cylindrica) fruits commonly undergo browning. However, little is known about the molecular mechanisms regulating this process. We used the RNA-seq technique to analyze the transcriptomic changes occurring during the browning of fresh-cut fruits from luffa cultivar 'Fusi-3'. Over 90 million high-quality reads were assembled into 58,073 Unigenes, and 60.86% of these were annotated based on sequences in four public databases. We detected 35,282 Unigenes with significant hits to sequences in the NCBInr database, and 24,427 Unigenes encoded proteins with sequences that were similar to those of known proteins in the Swiss-Prot database. Additionally, 20,546 and 13,021 Unigenes were similar to existing sequences in the Eukaryotic Orthologous Groups of proteins and Kyoto Encyclopedia of Genes and Genomes databases, respectively. Furthermore, 27,301 Unigenes were differentially expressed during the browning of fresh-cut luffa fruits (i.e., after 1-6 h). Moreover, 11 genes from five gene families (i.e., PPO, PAL, POD, CAT, and SOD) identified as potentially associated with enzymatic browning as well as four WRKY transcription factors were observed to be differentially regulated in fresh-cut luffa fruits. With the assistance of rapid amplification of cDNA ends technology, we obtained the full-length sequences of the 15 Unigenes. We also confirmed these Unigenes were expressed by quantitative real-time polymerase chain reaction analysis. This study provides a comprehensive transcriptome sequence resource, and may facilitate further studies aimed at identifying genes affecting luffa fruit browning for the exploitation of the underlying mechanism.

  1. Sequence comparison of prefrontal cortical brain transcriptome from a tame and an aggressive silver fox (Vulpes vulpes)

    Directory of Open Access Journals (Sweden)

    Sun Qi

    2011-10-01

    Background: Two strains of the silver fox (Vulpes vulpes), with markedly different behavioral phenotypes, have been developed by long-term selection for behavior. Foxes from the tame strain exhibit friendly behavior towards humans, paralleling the sociability of canine puppies, whereas foxes from the aggressive strain are defensive and exhibit aggression to humans. To understand the genetic differences underlying these behavioral phenotypes, fox-specific genomic resources are needed. Results: cDNA from mRNA from pre-frontal cortex of a tame and an aggressive fox was sequenced using the Roche 454 FLX Titanium platform (>2.5 million reads and 0.9 Gbase of tame fox sequence; >3.3 million reads and 1.2 Gbase of aggressive fox sequence). Over 80% of the fox reads were assembled into contigs. Mapping fox reads against the fox transcriptome assembly and the dog genome identified over 30,000 high confidence fox-specific SNPs. Fox transcripts for approximately 14,000 genes were identified using SwissProt and the dog RefSeq databases. An at least 2-fold expression difference between the two samples (p ... Conclusions: Transcriptome sequencing significantly expanded genomic resources available for the fox, a species without a sequenced genome. In a very cost-efficient manner this yielded a large number of fox-specific SNP markers for genetic studies and provided significant insights into the gene expression profile of the fox pre-frontal cortex, expression differences between the two fox samples, and a catalogue of potentially important gene-specific sequence variants. This result demonstrates the utility of this approach for developing genomic resources in species with limited genomic information.

  2. Identification of lignin genes and regulatory sequences involved in secondary cell wall formation in Acacia auriculiformis and Acacia mangium via de novo transcriptome sequencing

    Directory of Open Access Journals (Sweden)

    Cannon Charles H

    2011-07-01

    Background: Acacia auriculiformis × Acacia mangium hybrids are commercially important trees for the timber and pulp industry in Southeast Asia. Increasing pulp yield while reducing pulping costs are major objectives of tree breeding programs. The general monolignol biosynthesis and secondary cell wall formation pathways are well-characterized but genes in these pathways are poorly characterized in Acacia hybrids. RNA-seq on short-read platforms is a rapid approach for obtaining comprehensive transcriptomic data and to discover informative sequence variants. Results: We sequenced transcriptomes of A. auriculiformis and A. mangium from non-normalized cDNA libraries synthesized from pooled young stem and inner bark tissues using paired-end libraries and a single lane of an Illumina GAII machine. De novo assembly produced a total of 42,217 and 35,759 contigs with an average length of 496 bp and 498 bp for A. auriculiformis and A. mangium, respectively. The assemblies of A. auriculiformis and A. mangium had a total length of 21,022,649 bp and 17,838,260 bp, respectively, with the largest contig 15,262 bp long. We detected all ten monolignol biosynthetic genes using Blastx and further analysis revealed 18 lignin isoforms for each species. We also identified five contigs homologous to R2R3-MYB proteins in other plant species that are involved in transcriptional regulation of secondary cell wall formation and lignin deposition. We searched the contigs against public microRNA database and predicted the stem-loop structures of six highly conserved microRNA families (miR319, miR396, miR160, miR172, miR162 and miR168) and one legume-specific family (miR2086). Three microRNA target genes were predicted to be involved in wood formation and flavonoid biosynthesis. By using the assemblies as a reference, we discovered 16,648 and 9,335 high quality putative Single Nucleotide Polymorphisms (SNPs) in the transcriptomes of A. auriculiformis and A. mangium...

  3. Transcriptome Sequencing and Differential Gene Expression Analysis of Delayed Gland Morphogenesis in Gossypium australe during Seed Germination

    Science.gov (United States)

    Tao, Tao; Zhao, Liang; Lv, Yuanda; Chen, Jiedan; Hu, Yan; Zhang, Tianzhen; Zhou, Baoliang

    2013-01-01

    The genus Gossypium is a globally important crop that is used to produce textiles, oil and protein. However, gossypol, which is found in cultivated cottonseed, is toxic to humans and non-ruminant animals. Efforts have been made to breed improved cultivated cotton with lower gossypol content. The delayed gland morphogenesis trait possessed by some Australian wild cotton species may enable the widespread, direct usage of cottonseed. However, the mechanisms underlying delayed gland morphogenesis are still unknown. Here, we sequenced the first Australian wild cotton species (Gossypium australe) and a diploid cotton species (Gossypium arboreum) using the Illumina Hiseq 2000 RNA-seq platform to help elucidate the mechanisms underlying gossypol synthesis and gland development. Paired-end Illumina short reads were de novo assembled into 226,184, 213,257 and 275,434 transcripts, clustering into 61,048, 47,908 and 72,985 individual clusters with N50 lengths of 1,710 bp, 1,544 bp and 1,743 bp, respectively. The clustered Unigenes were searched against three public protein databases (TrEMBL, SwissProt and RefSeq) and the nucleotide and protein sequences of Gossypium raimondii using BLASTx and BLASTn. A total of 21,987, 17,209 and 25,325 Unigenes were annotated. Of these, 18,766 (85.4%), 14,552 (84.6%) and 21,374 (84.4%) Unigenes could be assigned to GO-term classifications. We identified and analyzed 13,884 differentially expressed Unigenes by clustering and functional enrichment. Terpenoid-related biosynthesis pathways showed differentially regulated expression patterns between the two cotton species. Phylogenetic analysis of the terpene synthases family was also carried out to clarify the classifications of TPSs. RNA-seq data from two distinct cotton species provide comprehensive transcriptome annotation resources and global gene expression profiles during seed germination and gland and gossypol formation. These data may be used to further elucidate various mechanisms and...
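
The N50 lengths quoted in the record above summarize assembly contiguity. As a rough illustration only, and not the authors' pipeline, the sketch below computes N50 from a list of sequence lengths: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. The function name `n50` and the toy lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a non-empty collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembled bases is the N50.
        if running >= half_total:
            return length


# Toy example: total = 54, half = 27; 10 + 9 + 8 = 27, so N50 is 8.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

N50 is weighted by length, so a few long transcripts can raise it even when most sequences are short; it is usually read alongside the total number of sequences and the mean length.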

  4. The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome

    DEFF Research Database (Denmark)

    Camargo, A A; Samaia, H P; Dias-Neto, E

    2001-01-01

    ,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most...

  5. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    Directory of Open Access Journals (Sweden)

    Marta Brozynska

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  6. De novo assembly and characterization of the spleen transcriptome of common carp (Cyprinus carpio) using Illumina paired-end sequencing.

    Science.gov (United States)

    Li, Guoxi; Zhao, Yinli; Liu, Zhonghu; Gao, Chunsheng; Yan, Fengbin; Liu, Bianzhi; Feng, Jianxin

    2015-06-01

    Common carp (Cyprinus carpio) is one of the most important aquacultured species of the family Cyprinidae, and breeding this species for disease resistance is becoming increasingly important. However, studies of the immunogenetics of disease resistance in the common carp at the genome or transcriptome level are lacking. In this study, 60,316,906 and 75,200,328 paired-end clean reads were obtained from two cDNA libraries of the common carp spleen by Illumina paired-end sequencing technology. In total, 130,293 unique transcript fragments (unigenes) were assembled, with an average length of 1400.57 bp. Approximately 105,612 (81.06%) unigenes could be annotated according to their homology with matches in the Nr, Nt, Swiss-Prot, COG, GO, or KEGG databases, and they were found to represent 46,747 non-redundant genes. Comparative analysis showed that 59.82% of the unigenes have significant similarity to zebrafish Refseq proteins. Gene expression comparison revealed that 10,432 and 6889 annotated unigenes were, respectively, up- and down-regulated with at least twofold changes between two developmental stages of the common carp spleen. Gene ontology and KEGG analysis were performed to classify all unigenes into functional categories for understanding gene functions and regulation pathways. In addition, 46,847 simple sequence repeats (SSRs) were detected from 35,618 unigenes, and a large number of single nucleotide polymorphism (SNP) and insertion/deletion (INDEL) sites were identified in the spleen transcriptome of common carp. This study has characterized the spleen transcriptome of the common carp for the first time, providing a valuable resource for a better understanding of the common carp immune system and defense mechanisms. This knowledge will also facilitate future functional studies on common carp immunogenetics that may eventually be applied in breeding programs. Copyright © 2015 Elsevier Ltd. All rights reserved.
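
The "at least twofold" criterion in the record above can be illustrated with a short sketch. This is a simplified assumption-laden example, not the method used in the study: it compares two normalized expression values through a log2 ratio, adds a pseudocount to avoid division by zero, and ignores the statistical testing that a real differential-expression analysis would include. The function name `classify_fold_change`, the `pseudocount` parameter, and the example values are all hypothetical.

```python
import math


def classify_fold_change(expr_a, expr_b, min_fold=2.0, pseudocount=1.0):
    """Label a gene 'up', 'down', or 'unchanged' in sample B relative to A.

    expr_a and expr_b are non-negative normalized expression values.
    The pseudocount (an assumption of this sketch) keeps the ratio finite
    when a gene is undetected in one sample.
    """
    log2_fc = math.log2((expr_b + pseudocount) / (expr_a + pseudocount))
    threshold = math.log2(min_fold)  # 1.0 for a twofold cutoff
    if log2_fc >= threshold:
        return "up"
    if log2_fc <= -threshold:
        return "down"
    return "unchanged"


print(classify_fold_change(9, 39))   # ratio 40/10 = 4
print(classify_fold_change(39, 9))   # ratio 10/40 = 0.25
print(classify_fold_change(10, 12))  # ratio 13/11, below the cutoff
```

Working on the log2 scale makes the cutoff symmetric: a twofold increase and a twofold decrease sit at +1 and -1, respectively.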

  7. Comparative analyses of six solanaceous transcriptomes reveal a high degree of sequence conservation and species-specific transcripts

    Directory of Open Access Journals (Sweden)

    Ouyang Shu

    2005-09-01

    Background: The Solanaceae is a family of closely related species with diverse phenotypes that have been exploited for agronomic purposes. Previous studies involving a small number of genes suggested sequence conservation across the Solanaceae. The availability of large collections of Expressed Sequence Tags (ESTs) for the Solanaceae now provides the opportunity to assess sequence conservation and divergence on a genomic scale. Results: All available ESTs and Expressed Transcripts (ETs), 449,224 sequences for six Solanaceae species (potato, tomato, pepper, petunia, tobacco and Nicotiana benthamiana), were clustered and assembled into gene indices. Examination of gene ontologies revealed that the transcripts within the gene indices encode a similar suite of biological processes. Although the ESTs and ETs were derived from a variety of tissues, 55–81% of the sequences had significant similarity at the nucleotide level with sequences among the six species. Putative orthologs could be identified for 28–58% of the sequences. This high degree of sequence conservation was supported by expression profiling using heterologous hybridizations to potato cDNA arrays that showed similar expression patterns in mature leaves for all six solanaceous species. 16–19% of the transcripts within the six Solanaceae gene indices did not have matches among Solanaceae, Arabidopsis, rice or 21 other plant gene indices. Conclusion: Results from this genome scale analysis confirmed a high level of sequence conservation at the nucleotide level of the coding sequence among Solanaceae. Additionally, the results indicated that part of the Solanaceae transcriptome is likely to be unique for each species.

  8. Transcriptome characterization and sequencing-based identification of salt-responsive genes in Millettia pinnata, a semi-mangrove plant.

    Science.gov (United States)

    Huang, Jianzi; Lu, Xiang; Yan, Hao; Chen, Shouyi; Zhang, Wanke; Huang, Rongfeng; Zheng, Yizhi

    2012-04-01

    Semi-mangroves form a group of transitional species between glycophytes and halophytes, and hold unique potential for learning molecular mechanisms underlying plant salt tolerance. Millettia pinnata is a semi-mangrove plant that can survive a wide range of saline conditions in the absence of specialized morphological and physiological traits. By employing the Illumina sequencing platform, we generated ~192 million short reads from four cDNA libraries of M. pinnata and processed them into 108,598 unisequences with a high depth of coverage. The mean length and total length of these unisequences were 606 bp and 65.8 Mb, respectively. A total of 54,596 (50.3%) unisequences were assigned Nr annotations. Functional classification revealed the involvement of unisequences in various biological processes related to metabolism and environmental adaptation. We identified 23,815 candidate salt-responsive genes with significantly differential expression under seawater and freshwater treatments. Based on the reverse transcription-polymerase chain reaction (RT-PCR) and real-time PCR analyses, we verified the changes in expression levels for a number of candidate genes. The functional enrichment analyses for the candidate genes showed tissue-specific patterns of transcriptome remodelling upon salt stress in the roots and the leaves. The transcriptome of M. pinnata will provide valuable gene resources for future application in crop improvement. In addition, this study sets a good example for large-scale identification of salt-responsive genes in non-model organisms using the sequencing-based approach.

  9. De Novo Transcriptome Sequencing of Olea europaea L. to Identify Genes Involved in the Development of the Pollen Tube.

    Science.gov (United States)

    Iaria, Domenico; Chiappetta, Adriana; Muzzalupo, Innocenzo

    2016-01-01

    In olive (Olea europaea L.), the processes controlling self-incompatibility are still unclear and the molecular basis underlying this process is still not fully characterized. In order to determine compatibility relationships, using next-generation sequencing techniques and a de novo transcriptome assembly strategy, we show that pollen tubes from different olive plants, grown in vitro in a medium containing its own pistil and in combination pollen/pistil from self-sterile and self-fertile cultivars, have a distinct gene expression profile and many of the differentially expressed sequences between the samples fall within gene families involved in the development of the pollen tube, such as lipase, carboxylesterase, pectinesterase, pectin methylesterase, and callose synthase. Moreover, different genes involved in signal transduction, transcription, and growth are overrepresented. The analysis also allowed us to identify members of the actin, actin depolymerization factor, and fibrin gene families and a member of the Ca(2+)-binding gene family related to the development and polarization of the pollen apical tip. The whole transcriptomic analysis, through the identification of the differentially expressed transcripts set and an extended functional annotation analysis, will lead to a better understanding of the mechanisms of pollen germination and pollen tube growth in the olive.

  10. Transcriptomic-Wide Discovery of Direct and Indirect HuR RNA Targets in Activated CD4+ T Cells.

    Directory of Open Access Journals (Sweden)

    Patsharaporn Techasintana

    Due to poor correlation between steady state mRNA levels and protein product, purely transcriptomic profiling methods may miss genes posttranscriptionally regulated by RNA binding proteins (RBPs) and microRNAs (miRNAs). RNA immunoprecipitation (RIP) methods developed to identify in vivo targets of RBPs have greatly elucidated those mRNAs which may be regulated via transcript stability and translation. The RBP HuR (ELAVL1) and family members are major stabilizers of mRNA. Many labs have identified HuR mRNA targets; however, many of these analyses have been performed in cell lines and oftentimes are not independent biological replicates. Little is known about how HuR target mRNAs behave in conditional knock-out models. In the present work, we performed HuR RIP-Seq and RNA-Seq to investigate HuR direct and indirect targets using a novel conditional knock-out model of HuR genetic ablation during CD4+ T activation and Th2 differentiation. Using independent biological replicates, we generated a high coverage RIP-Seq data set (>160 million reads) that was analyzed using bioinformatics methods specifically designed to find direct mRNA targets in RIP-Seq data. Simultaneously, another set of independent biological replicates were sequenced by RNA-Seq (>425 million reads) to identify indirect HuR targets. These direct and indirect targets were combined to determine canonical pathways in CD4+ T cell activation and differentiation for which HuR plays an important role. We show that HuR may regulate genes in multiple canonical pathways involved in T cell activation especially the CD28 family signaling pathway. These data provide insights into potential HuR-regulated genes during T cell activation and immune mechanisms.

  11. Prediction of the neuropeptidomes of members of the Astacidea (Crustacea, Decapoda) using publicly accessible transcriptome shotgun assembly (TSA) sequence data.

    Science.gov (United States)

    Christie, Andrew E; Chi, Megan

    2015-12-01

    The decapod infraorder Astacidea is comprised of clawed lobsters and freshwater crayfish. Due to their economic importance and their use as models for investigating neurochemical signaling, much work has focused on elucidating their neurochemistry, particularly their peptidergic systems. Interestingly, no astacidean has been the subject of large-scale peptidomic analysis via in silico transcriptome mining, despite growing transcriptomic resources for members of this taxon. Here, the publicly accessible astacidean transcriptome shotgun assembly data were mined for putative peptide-encoding transcripts; these sequences were used to predict the structures of mature neuropeptides. One hundred seventy-six distinct peptides were predicted for Procambarus clarkii, including isoforms of adipokinetic hormone-corazonin-like peptide (ACP), allatostatin A (AST-A), allatostatin B, allatostatin C (AST-C), bursicon α, bursicon β, CCHamide, crustacean hyperglycemic hormone (CHH)/ion transport peptide (ITP), diuretic hormone 31 (DH31), eclosion hormone (EH), FMRFamide-like peptide, GSEFLamide, intocin, leucokinin, neuroparsin, neuropeptide F, pigment dispersing hormone, pyrokinin, RYamide, short neuropeptide F (sNPF), SIFamide, sulfakinin and tachykinin-related peptide (TRP). Forty-six distinct peptides, including isoforms of AST-A, AST-C, bursicon α, CCHamide, CHH/ITP, DH31, EH, intocin, myosuppressin, neuroparsin, red pigment concentrating hormone, sNPF and TRP, were predicted for Pontastacus leptodactylus, with a bursicon β and a neuroparsin predicted for Cherax quadricarinatus. The identification of ACP is the first from a decapod, while the predictions of CCHamide, EH, GSEFLamide, intocin, neuroparsin and RYamide are firsts for the Astacidea. Collectively, these data greatly expand the catalog of known astacidean neuropeptides and provide a foundation for functional studies of peptidergic signaling in members of this decapod infraorder. Copyright © 2015 Elsevier Inc

  12. De novo transcriptome sequencing of Isaria cateniannulata and comparative analysis of gene expression in response to heat and cold stresses.

    Directory of Open Access Journals (Sweden)

    Dingfeng Wang

    Isaria cateniannulata is a very important and virulent entomopathogenic fungus that infects many insect pest species. Although I. cateniannulata is commonly exposed to extreme environmental temperature conditions, little is known about its molecular response mechanism to temperature stress. Here, we sequenced and de novo assembled the transcriptome of I. cateniannulata in response to high and low temperature stresses using Illumina RNA-Seq technology. Our assembly encompassed 17,514 unigenes (mean length = 1,197 bp), of which 11,445 unigenes (65.34%) showed significant similarities to known sequences in the NCBI non-redundant protein sequences (Nr) database. Using digital gene expression analysis, 4,483 differentially expressed genes (DEGs) were identified after heat treatment, including 2,905 up-regulated genes and 1,578 down-regulated genes. Under cold stress, 1,927 DEGs were identified, including 1,245 up-regulated genes and 682 down-regulated genes. The expression patterns of 18 randomly selected candidate DEGs resulting from quantitative real-time PCR (qRT-PCR) were consistent with their transcriptome analysis results. Although DEGs were involved in many pathways, we focused on the genes that were involved in endocytosis: under heat stress, the pathway of clathrin-dependent endocytosis (CDE) was active; however, under low-temperature stress, the pathway of clathrin-independent endocytosis (CIE) was active. In addition, four categories of DEGs acting as temperature sensors were observed, including cell-wall-major-components-metabolism-related (CWMCMR) genes, heat shock protein (Hsp) genes, intracellular-compatible-solutes-metabolism-related (ICSMR) genes and glutathione S-transferase (GST). These results enhance our understanding of the molecular mechanisms of I. cateniannulata in response to temperature stresses and provide a valuable resource for future investigations.

  13. Genes involved in sex pheromone biosynthesis of Ephestia cautella, an important food storage pest, are determined by transcriptome sequencing

    KAUST Repository

    Antony, Binu

    2015-07-18

    Background Insects use pheromones, chemical signals that underlie all animal behaviors, for communication and for attracting mates. Synthetic pheromones are widely used in pest control strategies because they are environmentally safe. The production of insect pheromones in transgenic plants, which could be more economical and effective in producing isomerically pure compounds, has recently been successfully demonstrated. This research requires information regarding the pheromone biosynthetic pathways and the characterization of pheromone biosynthetic enzymes (PBEs). We used Illumina sequencing to characterize the pheromone gland (PG) transcriptome of the Pyralid moth, Ephestia cautella, a destructive storage pest, to reveal putative candidate genes involved in pheromone biosynthesis, release, transport and degradation. Results We isolated the E. cautella pheromone compound as (Z,E)-9,12-tetradecadienyl acetate, and the major pheromone precursors 16:acyl, 14:acyl, E14-16:acyl, E12-14:acyl and Z9,E12-14:acyl. Based on the abundance of precursors, two possible pheromone biosynthetic pathways are proposed. Both pathways initiate from C16:acyl-CoA, with one involving ∆14 and ∆9 desaturation to generate Z9,E12-14:acyl, and the other involving the chain shortening of C16:acyl-CoA to C14:acyl-CoA, followed by ∆12 and ∆9 desaturation to generate Z9,E12-14:acyl-CoA. Then, a final reduction and acetylation generates Z9,E12-14:OAc. Illumina sequencing yielded 83,792 transcripts, and we obtained a PG transcriptome of ~49.5 Mb. A total of 191 PBE transcripts, which included pheromone biosynthesis activating neuropeptides, fatty acid transport proteins, acetyl-CoA carboxylases, fatty acid synthases, desaturases, β-oxidation enzymes, fatty acyl-CoA reductases (FARs) and fatty acetyltransferases (FATs), were selected from the dataset. A comparison of the E. cautella transcriptome data with three other Lepidoptera PG datasets revealed that 45 % of the sequences were shared

  14. Directed PCR-free engineering of highly repetitive DNA sequences

    Directory of Open Access Journals (Sweden)

    Preissler Steffen

    2011-09-01

    Background: Highly repetitive nucleotide sequences are commonly found in nature, e.g. in telomeres, microsatellite DNA, polyadenine (poly(A)) tails of eukaryotic messenger RNA as well as in several inherited human disorders linked to trinucleotide repeat expansions in the genome. Therefore, studying repetitive sequences is of biological, biotechnological and medical relevance. However, cloning of such repetitive DNA sequences is challenging because specific PCR-based amplification is hampered by the lack of unique primer binding sites resulting in unspecific products. Results: For the PCR-free generation of repetitive DNA sequences we used antiparallel oligonucleotides flanked by restriction sites of Type IIS endonucleases. The arrangement of recognition sites allowed for stepwise and seamless elongation of repetitive sequences. This facilitated the assembly of repetitive DNA segments and open reading frames encoding polypeptides with periodic amino acid sequences of any desired length. By this strategy we cloned a series of polyglutamine encoding sequences as well as highly repetitive polyadenine tracts. Such repetitive sequences can be used for diverse biotechnological applications. As an example, the polyglutamine sequences were expressed as His6-SUMO fusion proteins in Escherichia coli cells to study their aggregation behavior in vitro. The His6-SUMO moiety enabled affinity purification of the polyglutamine proteins, increased their solubility, and allowed controlled induction of the aggregation process. We successfully purified the fusion proteins and provide an example for their applicability in filter retardation assays. Conclusion: Our seamless cloning strategy is PCR-free and allows the directed and efficient generation of highly repetitive DNA sequences of defined lengths by simple standard cloning procedures.

  15. Deep sequencing-based transcriptome analysis of chicken spleen in response to avian pathogenic Escherichia coli (APEC) infection.

    Directory of Open Access Journals (Sweden)

    Qinghua Nie

    Avian pathogenic Escherichia coli (APEC) leads to economic losses in poultry production and is also a threat to human health. The goal of this study was to characterize the chicken spleen transcriptome and to identify candidate genes for response and resistance to APEC infection using Solexa sequencing. We obtained 14422935, 14104324, and 14954692 Solexa read pairs for non-challenged (NC), challenged-mild pathology (MD), and challenged-severe pathology (SV), respectively. A total of 148197 contigs and 98461 unigenes were assembled, of which 134949 contigs and 91890 unigenes match the chicken genome. In total, 12272 annotated unigenes take part in biological processes (11664), cellular components (11927), and molecular functions (11963). Summing three specific contrasts, 13650 significantly differentially expressed unigenes were found in NC vs. MD (6844), NC vs. SV (7764), and MD vs. SV (2320). Some unigenes (e.g. CD148, CD45 and LCK) were involved in crucial pathways, such as the T cell receptor (TCR) signaling pathway and microbial metabolism in diverse environments. This study facilitates understanding of the genetic architecture of the chicken spleen transcriptome, and has identified candidate genes for host response to APEC infection.

  16. RNA Sequencing Analysis Reveals Transcriptomic Variations in Tobacco (Nicotiana tabacum) Leaves Affected by Climate, Soil, and Tillage Factors

    Directory of Open Access Journals (Sweden)

    Bo Lei

    2014-04-01

    The growth and development of plants are sensitive to their surroundings. Although numerous studies have analyzed plant transcriptomic variation, few have quantified the effect of combinations of factors or identified factor-specific effects. In this study, we performed RNA sequencing (RNA-seq) analysis on tobacco leaves derived from 10 treatment combinations of three groups of ecological factors, i.e., climate factors (CFs), soil factors (SFs), and tillage factors (TFs). We detected 4980, 2916, and 1605 differentially expressed genes (DEGs) that were affected by CFs, SFs, and TFs, which included 2703, 768, and 507 specific and 703 common DEGs (simultaneously regulated by CFs, SFs, and TFs), respectively. GO and KEGG enrichment analyses showed that genes involved in abiotic stress responses and secondary metabolic pathways were overrepresented in the common and CF-specific DEGs. In addition, we noted enrichment in CF-specific DEGs related to the circadian rhythm, SF-specific DEGs involved in mineral nutrient absorption and transport, and SF- and TF-specific DEGs associated with photosynthesis. Based on these results, we propose a model that explains how plants adapt to various ecological factors at the transcriptomic level. Additionally, the identified DEGs lay the foundation for future investigations of stress resistance, circadian rhythm and photosynthesis in tobacco.

  17. Characterization and Development of EST-SSRs by Deep Transcriptome Sequencing in Chinese Cabbage (Brassica rapa L. ssp. pekinensis)

    Directory of Open Access Journals (Sweden)

    Qian Ding

    2015-01-01

    Simple sequence repeats (SSRs) are among the most important markers for population analysis and have been widely used in plant genetic mapping and molecular breeding. Expressed sequence tag-SSR (EST-SSR) markers, located in the coding regions, are potentially more efficient for QTL mapping, gene targeting, and marker-assisted breeding. In this study, we investigated 51,694 nonredundant unigenes, assembled from clean reads from deep transcriptome sequencing with a Solexa/Illumina platform, for identification and development of EST-SSRs in Chinese cabbage. In total, 10,420 EST-SSRs with over 12 bp were identified and characterized, among which 2744 EST-SSRs are new and 2317 are known ones showing polymorphism with previously reported SSRs. A total of 7877 PCR primer pairs for 1561 EST-SSR loci were designed, and primer pairs for twenty-four EST-SSRs were selected for primer evaluation. In nineteen EST-SSR loci (79.2%), amplicons were successfully generated with high quality. Seventeen (89.5%) showed polymorphism in twenty-four cultivars of Chinese cabbage. The polymorphic alleles of each polymorphic locus were sequenced, and the results showed that most polymorphisms were due to variations of SSR repeat motifs. The EST-SSRs identified and characterized in this study have important implications for developing new tools for genetics and molecular breeding in Chinese cabbage.
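
The SSR identification step described in the record above can be sketched in a few lines. The example below is only an illustration under stated assumptions and is not the tool used in the study: it scans a nucleotide sequence for perfect tandem repeats of motifs 1 to 6 bp long, and it reads the "over 12 bp" criterion as a minimum total repeat length of 12 bp. The function names `is_primitive` and `find_ssrs`, the parameter defaults, and the test sequence are hypothetical.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n)
    )


def find_ssrs(seq, min_total_bp=12, max_motif_len=6):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif_len + 1):
        # Smallest repeat count (at least 2) that reaches the minimum length.
        min_repeats = max(2, -(-min_total_bp // k))  # ceiling division
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs such as 'AA' or 'ATAT'; the shorter unit reports them.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


example = "GGC" + "AT" * 6 + "CCG" + "A" * 12 + "T"
print(find_ssrs(example))  # one (AT)6 repeat and one (A)12 repeat
```

The sketch handles only perfect repeats; imperfect or compound SSRs, which dedicated SSR-mining tools usually report as well, are outside its scope.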

  18. Genome and Transcriptome of Clostridium phytofermentans, Catalyst for the Direct Conversion of Plant Feedstocks to Fuels.

    Directory of Open Access Journals (Sweden)

    Elsa Petit

    Clostridium phytofermentans was isolated from forest soil and is distinguished by its capacity to directly ferment plant cell wall polysaccharides into ethanol as the primary product, suggesting that it possesses unusual catabolic pathways. The objective of the present study was to understand the molecular mechanisms of biomass conversion to ethanol in a single organism, Clostridium phytofermentans, by analyzing its complete genome and transcriptome during growth on plant carbohydrates. The saccharolytic versatility of C. phytofermentans is reflected in a diversity of genes encoding ATP-binding cassette sugar transporters and glycoside hydrolases, many of which may have been acquired through horizontal gene transfer. These genes are frequently organized as operons that may be controlled individually by the many transcriptional regulators identified in the genome. Preferential ethanol production may be due to high levels of expression of multiple ethanol dehydrogenases and additional pathways maximizing ethanol yield. The genome also encodes three different proteinaceous bacterial microcompartments with the capacity to compartmentalize pathways that divert fermentation intermediates to various products. These characteristics make C. phytofermentans an attractive resource for improving the efficiency and speed of biomass conversion to biofuels.

  19. Direct chromosome-length haplotyping by single-cell sequencing

    NARCIS (Netherlands)

    Porubský, David; Sanders, Ashley D; van Wietmarschen, Niek; Falconer, Ester; Hills, Mark; Spierings, Diana C J; Bevova, Marianna R; Guryev, Victor; Lansdorp, Peter Michael

    Haplotypes are fundamental to fully characterize the diploid genome of an individual, yet methods to directly chart the unique genetic makeup of each parental chromosome are lacking. Here we introduce single-cell DNA template strand sequencing (Strand-seq) as a novel approach to phasing diploid

  20. Improvement of Lactobacillus plantarum aerobic growth as directed by comprehensive transcriptome analysis

    NARCIS (Netherlands)

    Stevens, Marc J. A.; Wiersma, Anne; de Vos, Willem M.; Kuipers, Oscar P.; Smid, Eddy J.; Molenaar, Douwe; Kleerebezem, Michiel

    An aerobic Lactobacillus plantarum culture displayed growth stagnation during early growth. Transcriptome analysis revealed that resumption of growth after stagnation correlated with activation of CO(2)-producing pathways, suggesting that a limiting CO(2) concentration induced the stagnation.

  1. Population genetics, phylogenomics and hybrid speciation of Juglans in China determined from whole chloroplast genomes, transcriptomes, and genotyping-by-sequencing (GBS)

    Science.gov (United States)

    Peng Zhao; Hui-Juan Zhou; Daniel Potter; Yi-Heng Hu; Xiao-Jia Feng; Meng Dang; Li Feng; Saman Zulfiqar; Wen-Zhe Liu; Gui-Fang Zhao; Keith Woeste

    2018-01-01

    Genomic data are a powerful tool for elucidating the processes involved in the evolution and divergence of species. The speciation and phylogenetic relationships among Chinese Juglans remain unclear. Here, we used results from phylogenomic and population genetic analyses, transcriptomics, Genotyping-By-Sequencing (GBS), and whole chloroplast...

  2. High-throughput sequencing and analysis of the gill tissue transcriptome from the deep-sea hydrothermal vent mussel Bathymodiolus azoricus

    Directory of Open Access Journals (Sweden)

    Gomes Paula

    2010-10-01

    PCR experiments. Conclusions We have established the first tissue transcriptional analysis of a deep-sea hydrothermal vent animal and generated a searchable catalog of genes that provides a direct method of identifying and retrieving vast numbers of novel coding sequences, which can be applied in gene expression profiling experiments from a non-conventional model organism. This provides the most comprehensive sequence resource for identifying novel genes currently available for a deep-sea vent organism, in particular genes putatively involved in immune and inflammatory reactions in vent mussels. The characterization of the B. azoricus transcriptome will facilitate research into biological processes underlying physiological adaptations to hydrothermal vent environments and will provide a basis for expanding our understanding of genes putatively involved in adaptation processes during post-capture long-term acclimatization experiments, at "sea-level" conditions, using B. azoricus as a model organism.

  3. Formation of Staphylococcus aureus Biofilm in the Presence of Sublethal Concentrations of Disinfectants Studied via a Transcriptomic Analysis Using Transcriptome Sequencing (RNA-seq)

    Science.gov (United States)

    Oppelt, J.; Cincarova, L.

    2017-01-01

    ABSTRACT Staphylococcus aureus is a common biofilm-forming pathogen. Low doses of disinfectants have previously been reported to promote biofilm formation and to increase virulence. The aim of this study was to use transcriptome sequencing (RNA-seq) analysis to investigate global transcriptional changes in S. aureus in response to sublethal concentrations of the commonly used food industry disinfectants ethanol (EtOH) and chloramine T (ChT) and their combination (EtOH_ChT) in order to better understand the effects of these agents on biofilm formation. Treatment with EtOH and EtOH_ChT resulted in more significantly altered expression profiles than treatment with ChT. Our results revealed that EtOH and EtOH_ChT treatments enhanced the expression of genes responsible for regulation of gene expression (sigB), cell surface factors (clfAB), adhesins (sdrDE), and capsular polysaccharides (cap8EFGL), resulting in more intact biofilm. In addition, in this study we were able to identify the pathways involved in the adaptation of S. aureus to the stress of ChT treatment. Further, EtOH suppressed the effect of ChT on gene expression when these agents were used together at sublethal concentrations. These data show that in the presence of sublethal concentrations of tested disinfectants, S. aureus cells trigger protective mechanisms and try to cope with them. IMPORTANCE So far, the effect of disinfectants has not been satisfactorily explained. The presented data will allow a better understanding of the mode of disinfectant action with regard to biofilm formation and the ability of bacteria to survive the treatment. Such an understanding could contribute to the effort to eliminate possible sources of bacteria, making disinfectant application as efficient as possible. Biofilm formation plays a significant role in the spread and pathogenesis of bacterial species. PMID:29030437

  4. Design of a 9K illumina BeadChip for polar bears (Ursus maritimus) from RAD and transcriptome sequencing.

    Science.gov (United States)

    Malenfant, René M; Coltman, David W; Davis, Corey S

    2015-05-01

    Single-nucleotide polymorphisms (SNPs) offer numerous advantages over anonymous markers such as microsatellites, including improved estimation of population parameters, finer-scale resolution of population structure and more precise genomic dissection of quantitative traits. However, many SNPs are needed to equal the resolution of a single microsatellite, and reliable large-scale genotyping of SNPs remains a challenge in nonmodel species. Here, we document the creation of a 9K Illumina Infinium BeadChip for polar bears (Ursus maritimus), which will be used to investigate: (i) the fine-scale population structure among Canadian polar bears and (ii) the genomic architecture of phenotypic traits in the Western Hudson Bay subpopulation. To this end, we used restriction-site associated DNA (RAD) sequencing from 38 bears across their circumpolar range, as well as blood/fat transcriptome sequencing of 10 individuals from Western Hudson Bay. Six-thousand RAD SNPs and 3000 transcriptomic SNPs were selected for the chip, based primarily on genomic spacing and gene function respectively. Of the 9000 SNPs ordered from Illumina, 8042 were successfully printed, and - after genotyping 1450 polar bears - 5441 of these SNPs were found to be well clustered and polymorphic. Using this array, we show rapid linkage disequilibrium decay among polar bears, we demonstrate that in a subsample of 78 individuals, our SNPs detect known genetic structure more clearly than 24 microsatellites genotyped for the same individuals and that these results are not driven by the SNP ascertainment scheme. Here, we present one of the first large-scale genotyping resources designed for a threatened species. © 2014 John Wiley & Sons Ltd.

  5. De Novo Transcriptome Sequencing in Passiflora edulis Sims to Identify Genes and Signaling Pathways Involved in Cold Tolerance

    Directory of Open Access Journals (Sweden)

    Sian Liu

    2017-11-01

    The passion fruit (Passiflora edulis Sims), also known as the purple granadilla, is widely cultivated as the new darling of the fruit market throughout southern China. This exotic and perennial climber is adapted to warm and humid climates, and thus is generally intolerant of cold. There is limited information about gene regulation and signaling pathways related to the cold stress response in this species. In this study, two transcriptome libraries (KEDU_AP vs. GX_AP) were constructed from the aerial parts of cold-tolerant and cold-susceptible varieties of P. edulis, respectively. Overall, 126,284,018 clean reads were obtained, and 86,880 unigenes with a mean size of 1449 bp were assembled. Of these, there were 64,067 (73.74%) unigenes with significant similarity to publicly available plant protein sequences. Expression profiles were generated, and 3045 genes were found to be significantly differentially expressed between the KEDU_AP and GX_AP libraries, including 1075 (35.3%) up-regulated and 1970 (64.7%) down-regulated. These included 36 genes in enriched pathways of plant hormone signal transduction, and 56 genes encoding putative transcription factors. Six genes involved in the ICE1–CBF–COR pathway were induced in the cold-tolerant variety, and their expression levels were further verified using quantitative real-time PCR. This report is the first to identify genes and signaling pathways involved in cold tolerance using high-throughput transcriptome sequencing in P. edulis. These findings may provide useful insights into the molecular mechanisms regulating cold tolerance and genetic breeding in Passiflora spp.

  6. 454 Transcriptome sequencing suggests a role for two-component signalling in cellularization and differentiation of barley endosperm transfer cells.

    Science.gov (United States)

    Thiel, Johannes; Hollmann, Julien; Rutten, Twan; Weber, Hans; Scholz, Uwe; Weschke, Winfriede

    2012-01-01

    Cell specification and differentiation in the endosperm of cereals starts at the maternal-filial boundary and generates the endosperm transfer cells (ETCs). Besides the importance in assimilate transfer, ETCs are proposed to play an essential role in the regulation of endosperm differentiation by affecting development of proximate endosperm tissues. We attempted to identify signalling elements involved in early endosperm differentiation by using a combination of laser-assisted microdissection and 454 transcriptome sequencing. 454 sequencing of the differentiating ETC region from the syncytial state until functionality in transfer processes captured a high proportion of novel transcripts which are not available in existing barley EST databases. Intriguingly, the ETC-transcriptome showed a high abundance of elements of the two-component signalling (TCS) system suggesting an outstanding role in ETC differentiation. All components and subfamilies of the TCS, including distinct kinds of membrane-bound receptors, have been identified to be expressed in ETCs. The TCS system represents an ancient signal transduction system firstly discovered in bacteria and has previously been shown to be co-opted by eukaryotes, like fungi and plants, whereas in animals and humans this signalling route does not exist. Transcript profiling of TCS elements by qRT-PCR suggested pivotal roles for specific phosphorelays activated in a coordinated time flow during ETC cellularization and differentiation. ETC-specificity of transcriptionally activated TCS phosphorelays was assessed for early differentiation and cellularization contrasting to an extension of expression to other grain tissues at the beginning of ETC maturation. Features of candidate genes of distinct phosphorelays and transcriptional activation of genes putatively implicated in hormone signalling pathways hint at a crosstalk of hormonal influences, putatively ABA and ethylene, and TCS signalling. Our findings suggest an integral

  7. Deep developmental transcriptome sequencing uncovers numerous new genes and enhances gene annotation in the sponge Amphimedon queenslandica.

    Science.gov (United States)

    Fernandez-Valverde, Selene L; Calcino, Andrew D; Degnan, Bernard M

    2015-05-15

    The demosponge Amphimedon queenslandica is amongst the few early-branching metazoans with an assembled and annotated draft genome, making it an important species in the study of the origin and early evolution of animals. Current gene models in this species are largely based on in silico predictions and low coverage expressed sequence tag (EST) evidence. Amphimedon queenslandica protein-coding gene models are improved using deep RNA-Seq data from four developmental stages and CEL-Seq data from 82 developmental samples. Over 86% of previously predicted genes are retained in the new gene models, although 24% have additional exons; there is also a marked increase in the total number of annotated 3' and 5' untranslated regions (UTRs). Importantly, these new developmental transcriptome data reveal numerous previously unannotated protein-coding genes in the Amphimedon genome, increasing the total gene number by 25%, from 30,060 to 40,122. In general, Amphimedon genes have introns that are markedly smaller than those in other animals and most of the alternatively spliced genes in Amphimedon undergo intron-retention; exon-skipping is the least common mode of alternative splicing. Finally, in addition to canonical polyadenylation signal sequences, Amphimedon genes are enriched in a number of unique AT-rich motifs in their 3' UTRs. The inclusion of developmental transcriptome data has substantially improved the structure and composition of protein-coding gene models in Amphimedon queenslandica, providing a more accurate and comprehensive set of genes for functional and comparative studies. These improvements reveal the Amphimedon genome is comprised of a remarkably high number of tightly packed genes. These genes have small introns and there is pervasive intron retention amongst alternatively spliced transcripts. These aspects of the sponge genome are more similar to unicellular opisthokont genomes than to other animal genomes.

  8. Identification and characterization of large DNA deletions affecting oil quality traits in soybean seeds through transcriptome sequencing analysis.

    Science.gov (United States)

    Goettel, Wolfgang; Ramirez, Martha; Upchurch, Robert G; An, Yong-Qiang Charles

    2016-08-01

    Identification and characterization of a 254-kb genomic deletion on a duplicated chromosome segment that resulted in a low level of palmitic acid in soybean seeds using transcriptome sequencing. A large number of soybean genotypes varying in seed oil composition and content have been identified. Understanding the molecular mechanisms underlying these variations is important for breeders to effectively utilize them as a genetic resource. Through design and application of a bioinformatics approach, we identified nine co-regulated gene clusters by comparing seed transcriptomes of nine soybean genotypes varying in oil composition and content. We demonstrated that four gene clusters in the genotypes M23, Jack and N0304-303-3 coincided with large-scale genome rearrangements. The co-regulated gene clusters in M23 and Jack mapped to a previously described 164-kb deletion and a copy number amplification of the Rhg1 locus, respectively. The coordinately down-regulated gene clusters in N0304-303-3 were caused by a 254-kb deletion containing 19 genes including a fatty acyl-ACP thioesterase B gene (FATB1a). This deletion was associated with reduced palmitic acid content in seeds and was the molecular cause of a previously reported nonfunctional FATB1a allele, fap nc. The M23 and N0304-303-3 deletions were located in duplicated genome segments retained from the Glycine-specific whole genome duplication that occurred 13 million years ago. The homoeologous genes in these duplicated regions shared a strong similarity in both their encoded protein sequences and transcript accumulation levels, suggesting that they may have conserved and important functions in seeds. The functional conservation of homoeologous genes may result in genetic redundancy and gene dosage effects for their associated seed traits, explaining why the large deletion did not cause lethal effects or completely eliminate palmitic acid in N0304-303-3.

  9. Generation and analysis of blueberry transcriptome sequences from leaves, developing fruit, and flower buds from cold acclimation through deacclimation

    Directory of Open Access Journals (Sweden)

    Rowland Lisa J

    2012-04-01

    Background There has been increased consumption of blueberries in recent years fueled in part because of their many recognized health benefits. Blueberry fruit is very high in anthocyanins, which have been linked to improved night vision, prevention of macular degeneration, anti-cancer activity, and reduced risk of heart disease. Very few genomic resources have been available for blueberry, however. Further development of genomic resources like expressed sequence tags (ESTs), molecular markers, and genetic linkage maps could lead to more rapid genetic improvement. Marker-assisted selection could be used to combine traits for climatic adaptation with fruit and nutritional quality traits. Results Efforts to sequence the transcriptome of the commercial highbush blueberry (Vaccinium corymbosum) cultivar Bluecrop and use the sequences to identify genes associated with cold acclimation and fruit development and develop SSR markers for mapping studies are presented here. Transcriptome sequences were generated from blueberry fruit at different stages of development, flower buds at different stages of cold acclimation, and leaves by next-generation Roche 454 sequencing. Over 600,000 reads were assembled into approximately 15,000 contigs and 124,000 singletons. The assembled sequences were annotated and functionally mapped to Gene Ontology (GO) terms. Frequency of the most abundant sequences in each of the libraries was compared across all libraries to identify genes that are potentially differentially expressed during cold acclimation and fruit development. Real-time PCR was performed to confirm their differential expression patterns. Overall, 14 out of 17 of the genes examined had differential expression patterns similar to what was predicted from their reads alone. The assembled sequences were also mined for SSRs. From these sequences, 15,886 blueberry EST-SSR loci were identified. Primers were designed from 7,705 of the SSR-containing sequences

  10. Generation and analysis of blueberry transcriptome sequences from leaves, developing fruit, and flower buds from cold acclimation through deacclimation

    Science.gov (United States)

    2012-01-01

    Background There has been increased consumption of blueberries in recent years fueled in part because of their many recognized health benefits. Blueberry fruit is very high in anthocyanins, which have been linked to improved night vision, prevention of macular degeneration, anti-cancer activity, and reduced risk of heart disease. Very few genomic resources have been available for blueberry, however. Further development of genomic resources like expressed sequence tags (ESTs), molecular markers, and genetic linkage maps could lead to more rapid genetic improvement. Marker-assisted selection could be used to combine traits for climatic adaptation with fruit and nutritional quality traits. Results Efforts to sequence the transcriptome of the commercial highbush blueberry (Vaccinium corymbosum) cultivar Bluecrop and use the sequences to identify genes associated with cold acclimation and fruit development and develop SSR markers for mapping studies are presented here. Transcriptome sequences were generated from blueberry fruit at different stages of development, flower buds at different stages of cold acclimation, and leaves by next-generation Roche 454 sequencing. Over 600,000 reads were assembled into approximately 15,000 contigs and 124,000 singletons. The assembled sequences were annotated and functionally mapped to Gene Ontology (GO) terms. Frequency of the most abundant sequences in each of the libraries was compared across all libraries to identify genes that are potentially differentially expressed during cold acclimation and fruit development. Real-time PCR was performed to confirm their differential expression patterns. Overall, 14 out of 17 of the genes examined had differential expression patterns similar to what was predicted from their reads alone. The assembled sequences were also mined for SSRs. From these sequences, 15,886 blueberry EST-SSR loci were identified. Primers were designed from 7,705 of the SSR-containing sequences with adequate flanking

  11. Generation and analysis of blueberry transcriptome sequences from leaves, developing fruit, and flower buds from cold acclimation through deacclimation.

    Science.gov (United States)

    Rowland, Lisa J; Alkharouf, Nadim; Darwish, Omar; Ogden, Elizabeth L; Polashock, James J; Bassil, Nahla V; Main, Dorrie

    2012-04-02

    There has been increased consumption of blueberries in recent years fueled in part because of their many recognized health benefits. Blueberry fruit is very high in anthocyanins, which have been linked to improved night vision, prevention of macular degeneration, anti-cancer activity, and reduced risk of heart disease. Very few genomic resources have been available for blueberry, however. Further development of genomic resources like expressed sequence tags (ESTs), molecular markers, and genetic linkage maps could lead to more rapid genetic improvement. Marker-assisted selection could be used to combine traits for climatic adaptation with fruit and nutritional quality traits. Efforts to sequence the transcriptome of the commercial highbush blueberry (Vaccinium corymbosum) cultivar Bluecrop and use the sequences to identify genes associated with cold acclimation and fruit development and develop SSR markers for mapping studies are presented here. Transcriptome sequences were generated from blueberry fruit at different stages of development, flower buds at different stages of cold acclimation, and leaves by next-generation Roche 454 sequencing. Over 600,000 reads were assembled into approximately 15,000 contigs and 124,000 singletons. The assembled sequences were annotated and functionally mapped to Gene Ontology (GO) terms. Frequency of the most abundant sequences in each of the libraries was compared across all libraries to identify genes that are potentially differentially expressed during cold acclimation and fruit development. Real-time PCR was performed to confirm their differential expression patterns. Overall, 14 out of 17 of the genes examined had differential expression patterns similar to what was predicted from their reads alone. The assembled sequences were also mined for SSRs. From these sequences, 15,886 blueberry EST-SSR loci were identified. Primers were designed from 7,705 of the SSR-containing sequences with adequate flanking sequence. One hundred

  12. Transcriptome sequencing of the blind subterranean mole rat, Spalax galili: Utility and potential for the discovery of novel evolutionary patterns

    KAUST Repository

    Malik, Assaf

    2011-08-12

    The blind subterranean mole rat (Spalax ehrenbergi superspecies) is a model animal for survival under extreme environments due to its ability to live in underground habitats under severe hypoxic stress and darkness. Here we report the transcriptome sequencing of Spalax galili, a chromosomal type of S. ehrenbergi. cDNA pools from muscle and brain tissues isolated from animals exposed to hypoxic and normoxic conditions were sequenced using Sanger, GS FLX, and GS FLX Titanium technologies. Assembly of the sequences yielded over 51,000 isotigs with homology to ~12,000 mouse, rat or human genes. Based on these results, it was possible to detect large numbers of splice variants, SNPs, and novel transcribed regions. In addition, multiple differential expression patterns were detected between tissues and treatments. The results presented here will serve as a valuable resource for future studies aimed at identifying genes and gene regions evolved during the adaptive radiation associated with underground life of the blind mole rat. 2011 Malik et al.

  13. Genome Sequence and Transcriptome Analyses of Chrysochromulina tobin: Metabolic Tools for Enhanced Algal Fitness in the Prominent Order Prymnesiales (Haptophyceae).

    Directory of Open Access Journals (Sweden)

    Blake T Hovde

    Haptophytes are recognized as seminal players in aquatic ecosystem function. These algae are important in global carbon sequestration, form destructive harmful blooms, and given their rich fatty acid content, serve as a highly nutritive food source to a broad range of eco-cohorts. Haptophyte dominance in both fresh and marine waters is supported by the mixotrophic nature of many taxa. Despite their importance, the nuclear genome sequence of only one haptophyte, Emiliania huxleyi (Isochrysidales), is available. Here we report the draft genome sequence of Chrysochromulina tobin (Prymnesiales), and transcriptome data collected at seven time points over a 24-hour light/dark cycle. The nuclear genome of C. tobin is small (59 Mb), compact (∼40% of the genome is protein coding) and encodes approximately 16,777 genes. Genes important to fatty acid synthesis, modification, and catabolism show distinct patterns of expression when monitored over the circadian photoperiod. The C. tobin genome harbors the first hybrid polyketide synthase/non-ribosomal peptide synthase gene complex reported for an algal species, and encodes potential anti-microbial peptides and proteins involved in multidrug and toxic compound extrusion. A new haptophyte xanthorhodopsin was also identified, together with two "red" RuBisCO activases that are shared across many algal lineages. The Chrysochromulina tobin genome sequence provides new information on the evolutionary history, ecology and economic importance of haptophytes.

  14. Transcriptome sequencing and metabolite analysis reveals the role of delphinidin metabolism in flower colour in grape hyacinth.

    Science.gov (United States)

    Lou, Qian; Liu, Yali; Qi, Yinyan; Jiao, Shuzhen; Tian, Feifei; Jiang, Ling; Wang, Yuejin

    2014-07-01

    Grape hyacinth (Muscari) is an important ornamental bulbous plant with an extraordinary blue colour. Muscari armeniacum, whose flowers can be naturally white, provides an opportunity to unravel the complex metabolic networks underlying certain biochemical traits, especially colour. A blue flower cDNA library of M. armeniacum and a white flower library of M. armeniacum f. album were used for transcriptome sequencing. A total of 89 926 uni-transcripts were isolated, 143 of which could be identified as putative homologues of colour-related genes in other species. Based on a comprehensive analysis relating colour compounds to gene expression profiles, the mechanism of colour biosynthesis was studied in M. armeniacum. Furthermore, a new hypothesis explaining the lack of colour phenotype of the grape hyacinth flower is proposed. Alteration of the substrate competition between flavonol synthase (FLS) and dihydroflavonol 4-reductase (DFR) may lead to elimination of blue pigmentation while the multishunt from the limited flux in the cyanidin (Cy) synthesis pathway seems to be the most likely reason for the colour change in the white flowers of M. armeniacum. Moreover, mass sequence data obtained by the deep sequencing of M. armeniacum and its white variant provided a platform for future function and molecular biological research on M. armeniacum. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  15. De novo transcriptome sequencing and comparative analysis of differentially expressed genes in Dryopteris fragrans under temperature stress

    International Nuclear Information System (INIS)

    Wang, W.Z.; Tong, W.S.; Gao, R.

    2016-01-01

    Dryopteris fragrans is a fern species that contains flavonoid compounds of medicinal value. This study examines how temperature stress affects flavonoid synthesis in D. fragrans tissue culture seedlings at low (4 °C), high (35 °C) and moderate (25 °C) temperatures. Using Illumina HiSeq 2000 sequencing, 80.9 million raw sequence reads were de novo assembled into 66,716 non-redundant unigenes. A total of 38,486 unigenes (57.7%) were functionally annotated, and 13,973 and 29,598 unigenes were assigned to gene ontology (GO) and clusters of orthologous groups (COG) categories, respectively. In addition, 18,989 sequences mapped to 118 pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database, and 204 genes were involved in flavonoid biosynthesis, regulation and transport. A total of 25,292 and 16,817 unigenes exhibited marked differential expression in response to temperature shifts from 25 °C to 4 °C and from 25 °C to 35 °C, respectively. The 4CL and CHS genes involved in flavonoid biosynthesis were tested, and the results suggested that they were responsible for biosynthesis of flavonoids. This study provides the first published data describing the D. fragrans transcriptome and should accelerate understanding of flavonoid biosynthesis, regulation and transport mechanisms. Since most unigenes described here were successfully annotated, these results should facilitate future functional genomic research on D. fragrans. (author)

  16. Transcriptome sequencing and de novo assembly in arecanut, Areca catechu L elucidates the secondary metabolite pathway genes

    Directory of Open Access Journals (Sweden)

    Ramaswamy Manimekalai

    2018-03-01

    Areca catechu L. belongs to the Arecaceae family, which comprises many economically important palms. The palm is a source of alkaloids and carotenoids. The lack of ample genetic information in public databases has been a constraint for the genetic improvement of arecanut. To gain molecular insight into the palm, high-throughput RNA sequencing and de novo assembly of the arecanut leaf transcriptome was undertaken in the present study. A total of 56,321,907 paired-end reads of 101 bp length, consisting of 11.343 Gb nucleotides, were generated. De novo assembly resulted in 48,783 good-quality transcripts, of which 67% could be annotated against the NCBI non-redundant database. The Gene Ontology (GO) analysis with the UniProt database identified 9222 biological process, 11268 molecular function and 7574 cellular component GO terms. Large-scale expression profiling through Fragments per Kilobase per Million mapped reads (FPKM) showed major genes involved in different metabolic pathways of the plant. Metabolic pathway analysis of the assembled transcripts identified 124 plant-related pathways. The transcripts related to carotenoid and alkaloid biosynthetic pathways had higher read counts and FPKM values, suggesting higher expression of these genes. The arecanut transcript sequences generated in the study showed high similarity with coconut, oil palm and date palm sequences retrieved from public domains. We also identified 6853 genic SSR regions in arecanut. Possible primers were designed for SSR detection, and this would simplify future efforts in genetic characterization of arecanut.

  17. All 37 Mitochondrial Genes of Aphid Aphis craccivora Obtained from Transcriptome Sequencing: Implications for the Evolution of Aphids.

    Directory of Open Access Journals (Sweden)

    Nan Song

    The availability of mitochondrial genome data for Aphididae, one of the economically important insect pest families, in public databases is limited. The advent of next-generation sequencing technology provides the potential to generate mitochondrial genome data for many species in a timely and cost-effective manner. In this report, we used transcriptome sequencing technology to determine all 37 mitochondrial genes of the cowpea aphid, Aphis craccivora. This method avoids the necessity of finding suitable primers for long PCRs or primer-walking amplicons, and proved effective in obtaining the whole set of mitochondrial gene data for insects whose mitochondrial genomes are difficult to sequence by PCR-based strategies. Phylogenetic analyses of aphid mitochondrial genome data show clustering at the tribe level, and strongly support the monophyly of the family Aphididae. Within the monophyletic Aphidini, three samples from Aphis grouped together. In another major clade of Aphididae, Pterocomma pilosum was recovered as a potential sister-group of Cavariella salicicola, as part of Macrosiphini.

  18. Bifunctional cis-Abienol Synthase from Abies balsamea Discovered by Transcriptome Sequencing and Its Implications for Diterpenoid Fragrance Production

    Science.gov (United States)

    Zerbe, Philipp; Chiang, Angela; Yuen, Macaire; Hamberger, Björn; Hamberger, Britta; Draper, Jason A.; Britton, Robert; Bohlmann, Jörg

    2012-01-01

    The labdanoid diterpene alcohol cis-abienol is a major component of the aromatic oleoresin of balsam fir (Abies balsamea) and serves as a valuable bioproduct material for the fragrance industry. Using high-throughput 454 transcriptome sequencing and metabolite profiling of balsam fir bark tissue, we identified candidate diterpene synthase sequences for full-length cDNA cloning and functional characterization. We discovered a bifunctional class I/II cis-abienol synthase (AbCAS), along with the paralogous levopimaradiene/abietadiene synthase and isopimaradiene synthase, all of which are members of the gymnosperm-specific TPS-d subfamily. The AbCAS-catalyzed formation of cis-abienol proceeds via cyclization and hydroxylation at carbon C-8 of a postulated carbocation intermediate in the class II active site, followed by cleavage of the diphosphate group and termination of the reaction sequence without further cyclization in the class I active site. This reaction mechanism is distinct from that of synthases of the isopimaradiene- or levopimaradiene/abietadiene synthase type, which employ deprotonation reactions in the class II active site and secondary cyclizations in the class I active site, leading to tricyclic diterpenes. Comparative homology modeling suggested the active site residues Asp-348, Leu-617, Phe-696, and Gly-723 as potentially important for the specificity of AbCAS. As a class I/II bifunctional enzyme, AbCAS is a promising target for metabolic engineering of cis-abienol production. PMID:22337889

  19. Transcriptome sequence analysis of an ornamental plant, Ananas comosus var. bracteatus, revealed the potential unigenes involved in terpenoid and phenylpropanoid biosynthesis.

    Science.gov (United States)

    Ma, Jun; Kanakala, S; He, Yehua; Zhang, Junli; Zhong, Xiaolan

    2015-01-01

    Ananas comosus var. bracteatus (Red Pineapple) is an important ornamental plant for its colorful leaves and decorative red fruits. Because of its complex genome, it is difficult to understand the molecular mechanisms involved in its growth and development. Thus, high-throughput transcriptome sequencing of Ananas comosus var. bracteatus is necessary to generate large quantities of transcript sequences for gene discovery and functional genomic studies. The Ananas comosus var. bracteatus transcriptome was sequenced by Illumina paired-end sequencing technology. We obtained a total of 23.5 million high-quality sequencing reads, 1,555,808 contigs and 41,052 unigenes. Of the 41,052 unigenes, 23,275 were annotated in the NCBI non-redundant protein database and 23,134 were annotated in the Swiss-Prot database. Of these, 17,748 and 8,505 unigenes were assigned to gene ontology categories and clusters of orthologous groups, respectively. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes Pathway database identified 5,825 unigenes, which were mapped to 117 pathways. The assembly predicted many unigenes that were previously unknown. The annotated unigenes were compared against pineapple, rice, maize, Arabidopsis, and sorghum. Unigenes that did not match any of those five sequence datasets are considered unique to Ananas comosus var. bracteatus. We predicted unigenes encoding enzymes involved in terpenoid and phenylpropanoid biosynthesis. The sequence data provide the most comprehensive transcriptomic resource currently available for Ananas comosus var. bracteatus. To our knowledge, this is the first report on the de novo transcriptome sequencing of Ananas comosus var. bracteatus. The unigenes obtained in this study may help improve future gene expression, genetic and genomics studies in Ananas comosus var. bracteatus.

  20. Transcriptome Sequence Analysis of an Ornamental Plant, Ananas comosus var. bracteatus, Revealed the Potential Unigenes Involved in Terpenoid and Phenylpropanoid Biosynthesis

    OpenAIRE

    Ma, Jun; Kanakala, S.; He, Yehua; Zhang, Junli; Zhong, Xiaolan

    2015-01-01

    Background Ananas comosus var. bracteatus (Red Pineapple) is an important ornamental plant for its colorful leaves and decorative red fruits. Because of its complex genome, it is difficult to understand the molecular mechanisms involved in the growth and development. Thus high-throughput transcriptome sequencing of Ananas comosus var. bracteatus is necessary to generate large quantities of transcript sequences for the purpose of gene discovery and functional genomic studies. Results The Anana...

  1. Transcriptome sequence analysis of an ornamental plant, Ananas comosus var. bracteatus, revealed the potential unigenes involved in terpenoid and phenylpropanoid biosynthesis.

    Directory of Open Access Journals (Sweden)

    Jun Ma

    Ananas comosus var. bracteatus (Red Pineapple) is an important ornamental plant for its colorful leaves and decorative red fruits. Because of its complex genome, it is difficult to understand the molecular mechanisms involved in its growth and development. Thus, high-throughput transcriptome sequencing of Ananas comosus var. bracteatus is necessary to generate large quantities of transcript sequences for gene discovery and functional genomic studies. The Ananas comosus var. bracteatus transcriptome was sequenced by Illumina paired-end sequencing technology. We obtained a total of 23.5 million high-quality sequencing reads, 1,555,808 contigs and 41,052 unigenes. Of the 41,052 unigenes, 23,275 were annotated in the NCBI non-redundant protein database and 23,134 were annotated in the Swiss-Prot database. Of these, 17,748 and 8,505 unigenes were assigned to gene ontology categories and clusters of orthologous groups, respectively. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes Pathway database identified 5,825 unigenes, which were mapped to 117 pathways. The assembly predicted many unigenes that were previously unknown. The annotated unigenes were compared against pineapple, rice, maize, Arabidopsis, and sorghum. Unigenes that did not match any of those five sequence datasets are considered unique to Ananas comosus var. bracteatus. We predicted unigenes encoding enzymes involved in terpenoid and phenylpropanoid biosynthesis. The sequence data provide the most comprehensive transcriptomic resource currently available for Ananas comosus var. bracteatus. To our knowledge, this is the first report on the de novo transcriptome sequencing of Ananas comosus var. bracteatus. The unigenes obtained in this study may help improve future gene expression, genetic and genomics studies in Ananas comosus var. bracteatus.

  2. Transcriptome sequencing of Mycosphaerella fijiensis during association with Musa acuminata reveals candidate pathogenicity genes.

    Science.gov (United States)

    Noar, Roslyn D; Daub, Margaret E

    2016-08-30

    Mycosphaerella fijiensis, causative agent of the black Sigatoka disease of banana, is considered the most economically damaging banana disease. Despite its importance, the genetics of pathogenicity are poorly understood. Previous studies have characterized polyketide pathways with possible roles in pathogenicity. To identify additional candidate pathogenicity genes, we compared the transcriptome of this fungus during the necrotrophic phase of infection with that during saprophytic growth in medium. Transcriptome analysis was conducted, and the functions of differentially expressed genes were predicted by identifying conserved domains, Gene Ontology (GO) annotation and GO enrichment analysis, Carbohydrate-Active EnZymes (CAZy) annotation, and identification of genes encoding effector-like proteins. The analysis showed that genes commonly involved in secondary metabolism have higher expression in infected leaf tissue, including genes encoding cytochrome P450s, short-chain dehydrogenases, and oxidoreductases in the 2-oxoglutarate and Fe(II)-dependent oxygenase superfamily. Other pathogenicity-related genes with higher expression in infected leaf tissue include genes encoding salicylate hydroxylase-like proteins, hydrophobic surface binding proteins, CFEM domain-containing proteins, and genes encoding secreted cysteine-rich proteins characteristic of effectors. More genes encoding amino acid transporters, oligopeptide transporters, peptidases, proteases, proteinases, sugar transporters, and proteins containing Domain of Unknown Function (DUF) 3328 had higher expression in infected leaf tissue, while more genes encoding inhibitors of peptidases and proteinases had higher expression in medium. Sixteen gene clusters with higher expression in leaf tissue were identified including clusters for the synthesis of a non-ribosomal peptide. A cluster encoding a novel fusicoccane was also identified. Two putative dispensable scaffolds were identified with a large proportion of

  3. Computational sequence analysis of predicted long dsRNA transcriptomes of major crops reveals sequence complementarity with human genes.

    Science.gov (United States)

    Jensen, Peter D; Zhang, Yuanji; Wiggins, B Elizabeth; Petrick, Jay S; Zhu, Jin; Kerstetter, Randall A; Heck, Gregory R; Ivashuta, Sergey I

    2013-01-01

    Long double-stranded RNAs (long dsRNAs) are precursors for the effector molecules of sequence-specific RNA-based gene silencing in eukaryotes. Plant cells can contain numerous endogenous long dsRNAs. This study demonstrates that such endogenous long dsRNAs in plants have sequence complementarity to human genes. Many of these complementary long dsRNAs have perfect sequence complementarity of at least 21 nucleotides to human genes, enough to potentially trigger gene silencing in targeted human cells if delivered in functional form. However, the number and diversity of long dsRNA molecules with complementarity to human genes in plant tissue from crops that have a long history of safe consumption, such as lettuce, tomato, corn, soy and rice, support the conclusion that long dsRNAs do not present a significant dietary risk.

  4. Constructing and sampling directed graphs with given degree sequences

    International Nuclear Information System (INIS)

    Kim, H; Del Genio, C I; Bassler, K E; Toroczkai, Z

    2012-01-01

    The interactions between the components of complex networks are often directed. Proper modeling of such systems frequently requires the construction of ensembles of digraphs with a given sequence of in- and out-degrees. As the number of simple labeled graphs with a given degree sequence is typically very large even for short sequences, sampling methods are needed for statistical studies. Currently, there are two main classes of methods that generate samples. One of the existing methods first generates a restricted class of graphs and then uses a Markov chain Monte Carlo algorithm based on edge swaps to generate other realizations. As the mixing time of this process is still unknown, the independence of the samples is not well controlled. The other class of methods is based on the configuration model that may lead to unacceptably many sample rejections due to self-loops and multiple edges. Here we present an algorithm that can directly construct all possible realizations of a given bi-degree sequence by simple digraphs. Our method is rejection-free, guarantees the independence of the constructed samples and provides their weight. The weights can then be used to compute statistical averages of network observables as if they were obtained from uniformly distributed sampling or from any other chosen distribution.
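
To make the rejection problem described in this abstract concrete, the following is a minimal sketch of the configuration-model baseline, not the rejection-free algorithm the authors present. The function name `sample_simple_digraph` and its parameters (`max_tries`, `seed`) are hypothetical. Out-stubs and in-stubs are paired at random, and any pairing that contains a self-loop or a repeated edge is discarded and retried.

```python
import random


def sample_simple_digraph(out_deg, in_deg, max_tries=10_000, seed=None):
    """Configuration-model sampling with rejection (illustrative baseline only).

    Returns a set of (source, target) edges realizing the bi-degree sequence
    as a simple digraph, or None if every attempt was rejected.
    """
    if len(out_deg) != len(in_deg) or sum(out_deg) != sum(in_deg):
        raise ValueError("degree sequences must have equal length and equal sums")
    rng = random.Random(seed)
    # One stub per unit of degree: node v appears out_deg[v] / in_deg[v] times.
    out_stubs = [v for v, d in enumerate(out_deg) for _ in range(d)]
    in_stubs = [v for v, d in enumerate(in_deg) for _ in range(d)]
    for _ in range(max_tries):
        rng.shuffle(in_stubs)
        edges = set()
        for u, v in zip(out_stubs, in_stubs):
            if u == v or (u, v) in edges:  # self-loop or multiple edge
                break  # reject this pairing and start over
            edges.add((u, v))
        else:
            return edges  # no break: the pairing is a simple digraph
    return None


print(sample_simple_digraph([1, 1, 1], [1, 1, 1], seed=0))
```

For dense or heterogeneous degree sequences most pairings are rejected, which is the inefficiency the abstract attributes to configuration-model methods.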

  5. Deep sequencing whole transcriptome exploration of the σE regulon in Neisseria meningitidis.

    Directory of Open Access Journals (Sweden)

    Robert Antonius Gerhardus Huis in 't Veld

    Bacteria live in an ever-changing environment and must alter protein expression promptly to adapt to these changes and survive. Specific response genes that are regulated by a subset of alternative σ70-like transcription factors have evolved in order to respond to this changing environment. Recently, we described the existence of a σE regulon including the anti-σ-factor MseR in the obligate human bacterial pathogen Neisseria meningitidis. To unravel the complete σE regulon in N. meningitidis, we sequenced the total RNA transcriptional content of wild-type meningococci and compared it with that of mseR mutant cells (ΔmseR), in which σE is highly expressed. Eleven coding genes and one non-coding gene were found to be differentially expressed between H44/76 wild-type and H44/76 ΔmseR cells. Five of the 6 genes of the σE operon, msrA/msrB, and the gene encoding a pepSY-associated TM helix family protein showed enhanced transcription, whilst aniA, encoding a nitrite reductase, and nspA, encoding the vaccine candidate Neisserial surface protein A, showed decreased transcription. Analysis of differential expression in IGRs showed enhanced transcription of a non-coding RNA molecule, identifying a σE-dependent small non-coding RNA. Together this constitutes the first complete exploration of an alternative σ-factor regulon in N. meningitidis. The results point to a relatively small regulon, indicative of a strictly defined response consistent with a relatively stable niche, the human throat, where N. meningitidis resides.

  6. Transcriptome sequencing in an ecologically important tree species: assembly, annotation, and marker discovery

    Directory of Open Access Journals (Sweden)

    Benkman Craig W

    2010-03-01

    Background: Massively parallel sequencing of cDNA is now an efficient route for generating enormous sequence collections that represent expressed genes. This approach provides a valuable starting point for characterizing functional genetic variation in non-model organisms, especially where whole genome sequencing efforts are currently cost and time prohibitive. The large and complex genomes of pines (Pinus spp.) have hindered the development of genomic resources, despite the ecological and economic importance of the group. While most genomic studies have focused on a single species (P. taeda), genomic-level resources for other pines are insufficiently developed to facilitate ecological genomic research. Lodgepole pine (P. contorta) is an ecologically important foundation species of montane forest ecosystems and exhibits substantial adaptive variation across its range in western North America. Here we describe a sequencing study of expressed genes from P. contorta, including their assembly and annotation, and their potential for molecular marker development to support population and association genetic studies. Results: We obtained 586,732 sequencing reads from a 454 GS XLR70 Titanium pyrosequencer (mean length: 306 base pairs). A combination of reference-based and de novo assemblies yielded 63,657 contigs, with 239,793 reads remaining as singletons. Based on sequence similarity with known proteins, these sequences represent approximately 17,000 unique genes, many of which are well covered by contig sequences. This sequence collection also included a surprisingly large number of retrotransposon sequences, suggesting that they are highly transcriptionally active in the tissues we sampled. We located and characterized thousands of simple sequence repeats and single nucleotide polymorphisms as potential molecular markers in our assembled and annotated sequences. High quality PCR primers were designed for a substantial number of the SSR loci

  7. Genome-Wide Analysis of Gene and microRNA Expression in Diploid and Autotetraploid Paulownia fortunei (Seem.) Hemsl. under Drought Stress by Transcriptome, microRNA, and Degradome Sequencing

    Directory of Open Access Journals (Sweden)

    Zhenli Zhao

    2018-02-01

    Drought is a common and recurring climatic condition in many parts of the world, and it can have disastrous impacts on plant growth and development. Many genes involved in the drought response of plants have been identified. Transcriptome, microRNA (miRNA), and degradome analyses are rapid ways of identifying drought-responsive genes. The reference genome sequence of Paulownia fortunei (Seem.) Hemsl. is now available, which makes it easier to explore gene expression, transcriptional regulation, and post-transcriptional regulation in this species. In this study, four transcriptome, four small RNA, and four degradome libraries were sequenced by Illumina sequencing. A total of 258 drought-responsive genes and 11 drought-responsive miRNAs were identified in P. fortunei. Degradome sequencing detected 28 miRNA target genes that were cleaved by members of nine conserved miRNA families and 12 novel miRNAs. These results will enrich our understanding of the response of Paulownia fortunei trees to drought stress and may provide new directions for further experimental studies related to the development of molecular markers, genetic map construction, and other genomic research projects in Paulownia.

  8. RNA sequencing of the human milk fat layer transcriptome reveals distinct gene expression profiles at three stages of lactation.

    Directory of Open Access Journals (Sweden)

    Danielle G Lemay

    Aware of the important benefits of human milk, most U.S. women initiate breastfeeding, but difficulties with milk supply lead some to quit earlier than intended. Yet the contribution of maternal physiology to lactation difficulties remains poorly understood. Human milk fat globules, by enveloping cell contents during their secretion into milk, are a rich source of mammary cell RNA. Here, we pair this non-invasive mRNA source with RNA-sequencing to probe the milk fat layer transcriptome during three stages of lactation: colostral, transitional, and mature milk production. The resulting transcriptomes paint an exquisite portrait of human lactation. The resulting transcriptional profiles cluster not by postpartum day, but by milk Na:K ratio, indicating that women sampled during similar postpartum time frames could be at markedly different stages of gene expression. Each stage of lactation is characterized by a dynamic range (10^5-fold) in transcript abundances not previously observed with microarray technology. We discovered that transcripts for isoferritins and cathepsins are strikingly abundant during colostrum production, highlighting the potential importance of these proteins for neonatal health. Two transcripts, encoding β-casein (CSN2) and α-lactalbumin (LALBA), make up 45% of the total pool of mRNA in mature lactation. Genes significantly expressed across all stages of lactation are associated with making, modifying, transporting, and packaging milk proteins. Stage-specific transcripts are associated with immune defense during the colostral stage, up-regulation of the machinery needed for milk protein synthesis during the transitional stage, and the production of lipids during mature lactation. We observed strong modulation of key genes involved in lactose synthesis and insulin signaling. In particular, protein tyrosine phosphatase, receptor type, F (PTPRF) may serve as a biomarker linking insulin resistance with insufficient milk supply.
This

  9. De novo transcriptome sequencing and comparative analysis to discover genes involved in ovarian maturity in Strongylocentrotus nudus.

    Science.gov (United States)

    Jia, Zhiying; Wang, Qiai; Wu, Kaikai; Wei, Zhenlin; Zhou, Zunchun; Liu, Xiaolin

    2017-09-01

    Strongylocentrotus nudus is an edible sea urchin, mainly harvested in China. Correlation studies indicated that S. nudus with larger diameter have a prolonged marketing time and better palatability owing to their precocious gonads and extended maturation process. However, the molecular mechanism underlying this phenomenon is still unknown. Here, transcriptome sequencing was applied to study the ovaries of adult S. nudus with different shell diameters to explore the possible mechanism. Four independent cDNA libraries, two from large urchins and two from small ones, were constructed and sequenced on a HiSeq™ 2500 platform. A total of 88,581 unigenes were acquired with a mean length of 1354 bp, of which 66,331 (74.88%) could be annotated using six major publicly available databases. Comparative analysis revealed that 353 unigenes were differentially expressed (with log2(ratio)≥1, FDR≤0.001) between the two groups. Of these, 20 differentially expressed genes (DEGs) were selected to confirm the accuracy of the RNA-seq data by quantitative real-time RT-PCR. Furthermore, gene ontology and KEGG pathway enrichment analyses were performed to find the putative genes and pathways related to ovarian maturity. Eight unigenes were identified as significant DEGs involved in reproduction-related pathways; these included Mos, Cdc20, Rec8, YP30, cytochrome P450 2U1, ovoperoxidase, proteoliaisin, and rendezvin. Our research fills the gap in studies on S. nudus ovaries using transcriptome analysis. Copyright © 2017 Elsevier Inc. All rights reserved.
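
The differential-expression cutoff quoted in this abstract (log2(ratio)≥1, FDR≤0.001) can be sketched as a simple filter. This is an illustration only: the function name `is_deg` and the expression values are hypothetical, and applying the threshold to the absolute log2 ratio (so that both up- and down-regulated unigenes pass) is an assumption, since the abstract does not say how the sign was handled or how the FDR was computed.

```python
import math


def is_deg(expr_a: float, expr_b: float, fdr: float,
           min_abs_log2: float = 1.0, max_fdr: float = 0.001) -> bool:
    """Return True if a unigene passes the fold-change and FDR thresholds.

    expr_a and expr_b are positive normalized expression values in the two
    groups (hypothetical inputs); fdr is a precomputed false discovery rate.
    Assumption: the cutoff is applied to |log2(expr_a / expr_b)|.
    """
    if expr_a <= 0 or expr_b <= 0:
        raise ValueError("expression values must be positive")
    log2_ratio = math.log2(expr_a / expr_b)
    return abs(log2_ratio) >= min_abs_log2 and fdr <= max_fdr


# Hypothetical unigenes: (name, expression in large urchins, in small urchins, FDR)
candidates = [
    ("unigene_A", 40.0, 10.0, 0.0005),   # 4-fold up, low FDR
    ("unigene_B", 15.0, 10.0, 0.0001),   # 1.5-fold, below the fold-change cutoff
    ("unigene_C", 10.0, 40.0, 0.0005),   # 4-fold down, low FDR
    ("unigene_D", 40.0, 10.0, 0.01),     # 4-fold up, FDR too high
]
print([name for name, a, b, q in candidates if is_deg(a, b, q)])
```

A log2 ratio of 1 corresponds to a two-fold change, so the filter keeps unigenes that at least double or halve between groups.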

  10. Development of SSR Markers Based on Transcriptome Sequencing and Association Analysis with Drought Tolerance in Perennial Grass Miscanthus from China

    Directory of Open Access Journals (Sweden)

    Gang Nie

    2017-05-01

    Drought has become a critical environmental stress affecting plants in temperate areas. As one of the promising bioenergy crops for sustainable biomass production, the genus Miscanthus has been widely studied around the world. However, the most widely used hybrid cultivar in this genus, Miscanthus × giganteus, has proved to have poor drought tolerance compared with some parental species. Here we focused mainly on Miscanthus sinensis, one of the progenitors of M. × giganteus, which provides a comparable yield and good abiotic stress tolerance in some places. The main objectives were to characterize the physiological and photosynthetic responses to drought stress and to develop simple sequence repeat (SSR) markers associated with drought tolerance by transcriptome sequencing within an original collection of 44 Miscanthus genotypes from southwest China. Significant phenotypic differences were observed among genotypes, and the average leaf relative water content (RWC) was severely affected by drought stress, decreasing from 88.27 to 43.21%, which helped separate the drought-resistant and drought-sensitive genotypes of Miscanthus. Furthermore, a total of 16,566 gene-associated SSR markers were identified based on Illumina RNA sequencing under drought conditions, and 93 of them were randomly selected for validation. In total, 70 (75.3%) SSRs were successfully amplified, and the loci generated from 30 polymorphic SSRs were used to estimate genetic differentiation and population structure. Finally, two optimum subgroups of the population were determined by structure analysis, and association analysis identified seven significant associations, including two markers with leaf RWC and five markers with photosynthetic traits. With the rich sequencing resources annotation, such associations would serve as an efficient tool for Miscanthus drought response mechanism study and facilitate genetic improvement of drought resistant for

  11. Integrated mRNA and microRNA transcriptome sequencing characterizes sequence variants and mRNA–microRNA regulatory network in nasopharyngeal carcinoma model systems

    Directory of Open Access Journals (Sweden)

    Carol Ying-Ying Szeto

    2014-01-01

    Nasopharyngeal carcinoma (NPC) is a prevalent malignancy in Southeast Asia among the Chinese population. Aberrant regulation of transcripts has been implicated in many types of cancers including NPC. Herein, we characterized mRNA and miRNA transcriptomes by RNA sequencing (RNASeq) of NPC model systems. Matched total mRNA and small RNA of the undifferentiated Epstein–Barr virus (EBV)-positive NPC xenograft X666 and its derived cell line C666, the well-differentiated NPC cell line HK1, and the immortalized nasopharyngeal epithelial cell line NP460 were sequenced by Solexa technology. We found 2812 genes and 149 miRNAs (human and EBV) to be differentially expressed in NP460, HK1, C666 and X666 with RNASeq; 533 miRNA–mRNA target pairs were inversely regulated in the three NPC cell lines compared to NP460. Integrated mRNA/miRNA expression profiling and pathway analysis show that extracellular matrix organization, Beta-1 integrin cell surface interactions, and the PI3K/AKT, EGFR, ErbB, and Wnt pathways were potentially deregulated in NPC. Real-time quantitative PCR was performed on selected mRNAs/miRNAs in order to validate their expression. Transcript sequence variants such as short insertions and deletions (INDEL), single nucleotide variants (SNV), and isomiRs were characterized in the NPC model systems. A novel TP53 transcript variant was identified in NP460, HK1, and C666. Three previously reported novel EBV-encoded BART miRNAs and their isomiRs were also detected. Meta-analysis of a model system to a clinical system aids the choice of different cell lines in NPC studies. This comprehensive characterization of mRNA and miRNA transcriptomes in NPC cell lines and the xenograft provides insights into miRNA regulation of mRNA and valuable resources on transcript variation and regulation in NPC, which are potentially useful for mechanistic and preclinical studies.

  12. Discovery of J chain in African lungfish (Protopterus dolloi, Sarcopterygii) using high throughput transcriptome sequencing: implications in mucosal immunity.

    Directory of Open Access Journals (Sweden)

    Luca Tacchi

    J chain is a small polypeptide responsible for immunoglobulin (Ig) polymerization and transport of Igs across mucosal surfaces in higher vertebrates. We identified a J chain in a dipnoid fish, the African lungfish (Protopterus dolloi), by high-throughput sequencing of the transcriptome. P. dolloi J chain is 161 aa long and contains six of the eight Cys residues present in mammalian J chain. Phylogenetic studies place the lungfish J chain closer to tetrapod J chain than to the coelacanth or nurse shark sequences. J chain expression occurs in all P. dolloi immune tissues examined, and it increases in the gut and kidney in response to an experimental bacterial infection. Double fluorescent in-situ hybridization shows that 88.5% of IgM⁺ cells in the gut co-express J chain, a significantly higher percentage than in the pre-pyloric spleen. Importantly, J chain expression is not restricted to the B-cell compartment, since gut epithelial cells also express J chain. These results improve our current view of J chain from a phylogenetic perspective.

  13. Molecular adaptation in the world's deepest-living animal: Insights from transcriptome sequencing of the hadal amphipod Hirondellea gigas.

    Science.gov (United States)

    Lan, Yi; Sun, Jin; Tian, Renmao; Bartlett, Douglas H; Li, Runsheng; Wong, Yue Him; Zhang, Weipeng; Qiu, Jian-Wen; Xu, Ting; He, Li-Sheng; Tabata, Harry G; Qian, Pei-Yuan

    2017-07-01

    The Challenger Deep in the Mariana Trench is the deepest point in the oceans of our planet. Understanding how animals adapt to this harsh environment, characterized by high hydrostatic pressure, food limitation, darkness and cold, is of great scientific interest. Of the animals dwelling in the Challenger Deep, amphipods have been captured using baited traps. In this study, we sequenced the transcriptome of the amphipod Hirondellea gigas collected at a depth of 10,929 m from the East Pond of the Challenger Deep. Assembly of these sequences resulted in 133,041 contigs and 22,046 translated proteins. Functional annotation of these contigs was made using the GO and KEGG databases. Comparison of these translated proteins with those of four shallow-water amphipods revealed 10,731 gene families, of which 5659 were single-copy orthologs. Base substitution analysis on these single-copy orthologs showed that 62 genes are positively selected in H. gigas, including genes related to β-alanine biosynthesis, energy metabolism and genetic information processing. For multiple-copy orthologous genes, gene family expansion analysis revealed that cold-inducible proteins (i.e., transcription factors II A and transcription elongation factor 1) as well as zinc finger domains are expanded in H. gigas. Overall, our results indicate that genetic adaptation to the hadal environment by H. gigas may be mediated by both gene family expansion and amino acid substitutions of specific proteins. © 2017 John Wiley & Sons Ltd.

  14. Transcriptome Sequencing and Development of Genic SSR Markers of an Endangered Chinese Endemic Genus Dipteronia Oliver (Aceraceae).

    Science.gov (United States)

    Zhou, Tao; Li, Zhong-Hu; Bai, Guo-Qing; Feng, Li; Chen, Chen; Wei, Yue; Chang, Yong-Xia; Zhao, Gui-Fang

    2016-02-23

    Dipteronia Oliver (Aceraceae) is an endangered Chinese endemic genus consisting of two living species, Dipteronia sinensis and Dipteronia dyeriana. However, studies on the population genetics and evolutionary analyses of Dipteronia have been hindered by limited genomic resources and genetic markers. Here, we describe the generation, de novo assembly and annotation of transcriptome datasets, and a large set of microsatellite or simple sequence repeat (SSR) markers derived from Dipteronia. After Illumina paired-end sequencing, approximately 93.2 million reads were generated and assembled to yield a total of 99,358 unigenes. A majority of these unigenes (53%, 52,789) had at least one BLAST hit against the public protein databases. Further, 12,377 SSR loci were detected and 4179 primer pairs were designed for experimental validation. Of these 4179 primer pairs, 435 were randomly selected to test polymorphism. Our results show that products from 132 primer pairs were polymorphic, of which 97 polymorphic SSR markers were further selected to analyze the genetic diversity of 10 natural populations of Dipteronia. The SSR markers identified in this research will provide valuable data for population genetic analyses and evolutionary studies in Dipteronia.
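
Detection of SSR loci in assembled unigenes, as described in this abstract, can be illustrated with a minimal regular-expression sketch. The abstract does not name the tool or the repeat thresholds used, so the function name `find_ssrs`, the motif length range (2 to 6 bp) and the minimum of four tandem repeats are all assumptions chosen for illustration; only perfect (uninterrupted) repeats are found.

```python
import re


def find_ssrs(seq: str, min_repeats: int = 4, min_motif: int = 2, max_motif: int = 6):
    """Find perfect tandem repeats (SSRs) in a nucleotide sequence.

    Returns a list of (start_index, motif, repeat_count) tuples.
    Thresholds are illustrative defaults, not those of the study.
    """
    # A short motif (tried shortest first) followed by at least
    # (min_repeats - 1) further copies of itself.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Toy sequence containing (AT)5 starting at index 3.
print(find_ssrs("GGCATATATATATCCG"))
```

Primer design would then target the unique flanking sequence on either side of each reported locus; that step is outside the scope of this sketch.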

  15. Transcriptome analysis of carnation (Dianthus caryophyllus L.) based on next-generation sequencing technology.

    Science.gov (United States)

    Tanase, Koji; Nishitani, Chikako; Hirakawa, Hideki; Isobe, Sachiko; Tabata, Satoshi; Ohmiya, Akemi; Onozaki, Takashi

    2012-07-02

    Carnation (Dianthus caryophyllus L.), in the family Caryophyllaceae, can be found in a wide range of colors and is a model system for studies of flower senescence. In addition, it is one of the most important flowers in the global floriculture industry. However, few genomics resources, such as sequences and markers are available for carnation or other members of the Caryophyllaceae. To increase our understanding of the genetic control of important characters in carnation, we generated an expressed sequence tag (EST) database for a carnation cultivar important in horticulture by high-throughput sequencing using 454 pyrosequencing technology. We constructed a normalized cDNA library and a 3'-UTR library of carnation, obtaining a total of 1,162,126 high-quality reads. These reads were assembled into 300,740 unigenes consisting of 37,844 contigs and 262,896 singlets. The contigs were searched against an Arabidopsis sequence database, and 61.8% (23,380) of them had at least one BLASTX hit. These contigs were also annotated with Gene Ontology (GO) and were found to cover a broad range of GO categories. Furthermore, we identified 17,362 potential simple sequence repeats (SSRs) in 14,291 of the unigenes. We focused on gene discovery in the areas of flower color and ethylene biosynthesis. Transcripts were identified for almost every gene involved in flower chlorophyll and carotenoid metabolism and in anthocyanin biosynthesis. Transcripts were also identified for every step in the ethylene biosynthesis pathway. We present the first large-scale sequence data set for carnation, generated using next-generation sequencing technology. The large EST database generated from these sequences is an informative resource for identifying genes involved in various biological processes in carnation and provides an EST resource for understanding the genetic diversity of this plant.

  16. Transcriptome analysis of carnation (Dianthus caryophyllus L.) based on next-generation sequencing technology

    Directory of Open Access Journals (Sweden)

    Tanase Koji

    2012-07-01

    Full Text Available. Background: Carnation (Dianthus caryophyllus L.), in the family Caryophyllaceae, can be found in a wide range of colors and is a model system for studies of flower senescence. In addition, it is one of the most important flowers in the global floriculture industry. However, few genomics resources, such as sequences and markers, are available for carnation or other members of the Caryophyllaceae. To increase our understanding of the genetic control of important characters in carnation, we generated an expressed sequence tag (EST) database for a carnation cultivar important in horticulture by high-throughput sequencing using 454 pyrosequencing technology. Results: We constructed a normalized cDNA library and a 3'-UTR library of carnation, obtaining a total of 1,162,126 high-quality reads. These reads were assembled into 300,740 unigenes consisting of 37,844 contigs and 262,896 singlets. The contigs were searched against an Arabidopsis sequence database, and 61.8% (23,380) of them had at least one BLASTX hit. These contigs were also annotated with Gene Ontology (GO) and were found to cover a broad range of GO categories. Furthermore, we identified 17,362 potential simple sequence repeats (SSRs) in 14,291 of the unigenes. We focused on gene discovery in the areas of flower color and ethylene biosynthesis. Transcripts were identified for almost every gene involved in flower chlorophyll and carotenoid metabolism and in anthocyanin biosynthesis. Transcripts were also identified for every step in the ethylene biosynthesis pathway. Conclusions: We present the first large-scale sequence data set for carnation, generated using next-generation sequencing technology. The large EST database generated from these sequences is an informative resource for identifying genes involved in various biological processes in carnation and provides an EST resource for understanding the genetic diversity of this plant.

  17. Salmon louse (Lepeophtheirus salmonis) transcriptomes during post molting maturation and egg production, revealed using EST-sequencing and microarray analysis

    Directory of Open Access Journals (Sweden)

    Jonassen Inge

    2008-03-01

    Full Text Available. Background: Lepeophtheirus salmonis is an ectoparasitic copepod feeding on skin, mucus and blood from salmonid hosts. Initial analysis of EST sequences from pre-adult and adult stages of L. salmonis revealed a large proportion of novel transcripts. In order to link unknown transcripts to biological functions, we have combined EST sequencing and microarray analysis to characterize female salmon louse transcriptomes during post molting maturation and egg production. Results: EST sequence analysis shows that 43% of the ESTs have no significant hits in GenBank. Sequenced ESTs assembled into 556 contigs and 1614 singletons, and whenever homologous genes were identified, no clear correlation with homologous genes from any specific animal group was evident. Sequence comparison of 27 L. salmonis proteins with homologous proteins in humans, zebrafish, insects and crustaceans revealed almost identical sequence identity across all species. Microarray analysis of maturing female adult salmon lice revealed two major transcription patterns: up-regulation during the final molting followed by down-regulation, and female-specific up-regulation during post molting growth and egg production. For a third minor group of ESTs, transcription decreased during molting from pre-adult II to immature adults. Genes regulated during molting typically gave hits with cuticula proteins, whilst transcripts up-regulated during post molting growth were female specific, including two vitellogenins. Conclusion: The copepod L. salmonis contains a high level of novel genes. Among analyzed L. salmonis proteins, sequence identities with homologous proteins in crustaceans are no higher than to homologous proteins in humans. Three distinct processes, molting, post molting growth and egg production, correlate with transcriptional regulation of three groups of transcripts: two including genes related to growth, one including genes related to egg production. The function of the regulated

  18. Next-generation transcriptome assembly

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Jeffrey A.; Wang, Zhong

    2011-09-01

    Transcriptomics studies often rely on partial reference transcriptomes that fail to capture the full catalog of transcripts and their variations. Recent advances in sequencing technologies and assembly algorithms have facilitated the reconstruction of the entire transcriptome by deep RNA sequencing (RNA-seq), even without a reference genome. However, transcriptome assembly from billions of RNA-seq reads, which are often very short, poses a significant informatics challenge. This Review summarizes the recent developments in transcriptome assembly approaches (reference-based, de novo and combined strategies), along with some perspectives on transcriptome assembly in the near future.

  19. Lactation transcriptomics in the Australian marsupial, Macropus eugenii: transcript sequencing and quantification

    Directory of Open Access Journals (Sweden)

    Whitley Jane C

    2007-11-01

    Full Text Available. Background: Lactation is an important aspect of mammalian biology and, amongst mammals, marsupials show one of the most complex lactation cycles. Marsupials, such as the tammar wallaby (Macropus eugenii), give birth to a relatively immature newborn, and progressive changes in milk composition and milk production regulate early stage development of the young. Results: In order to investigate gene expression in the marsupial mammary gland during lactation, a comprehensive set of cDNA libraries was derived from lactating tissues throughout the lactation cycle of the tammar wallaby. A total of 14,837 expressed sequence tags were produced by cDNA sequencing. Sequence analysis and sequence assembly were used to construct a comprehensive catalogue of mammary transcripts. Sequence data from pregnant and early or late lactating specific cDNA libraries, and data from early or late lactation massively parallel sequencing strategies, were combined to analyse the variation of milk protein gene expression during the lactation cycle. Conclusion: Results show a steady increase in expression of genes coding for secreted protein during the lactation cycle that is associated with a high proportion of transcripts coding for milk proteins. In addition, genes involved in immune function, translation and energy or anabolic metabolism are expressed across the lactation cycle. A number of potential new milk proteins or mammary gland remodelling markers, including noncoding RNAs, have been identified.

  20. Transcriptome sequencing for identification of diapause-associated genes in fall webworm, Hyphantria cunea Drury.

    Science.gov (United States)

    Deng, Yu; Li, Fei; Rieske, Lynne K; Sun, Li-Li; Sun, Shou-Hui

    2018-08-20

    Fall webworm, Hyphantria cunea Drury (Lepidoptera: Arctiidae), is extremely adaptable and highly invasive in China as a defoliator of ornamental and forest trees. Both voltinism and diapause strategies of fall webworm in China are variable, and this variability contributes to its invasiveness. Little is known about molecular regulation of diapause in fall webworm. To gain insight into possible mechanisms of diapause induction, high-throughput RNA-seq data were generated from non-diapause pupae (NDP) and diapause pupae (DP). A total of 58,151 unigenes were assembled and searched against nine public databases. In total, 29,013 up-regulated and 3451 down-regulated unigenes were differentially expressed by DP when compared with those of NDP. Genes encoding proteins such as UDP-glycosyl transferase (UGT), cytochrome P450 and Hsp70 were predicted to be involved in diapause. Moreover, GO function and KEGG pathway enrichments were performed on all differentially expressed genes (DEGs) and showed that cell cycle and insulin signaling pathways may be related to the diapause of the fall webworm. This study provides valuable information about the fall webworm transcriptome for future gene function research, especially as it relates to diapause. Copyright © 2018 Elsevier B.V. All rights reserved.

  1. Genome sequence and transcriptome analyses of the thermophilic zygomycete fungus Rhizomucor miehei.

    Science.gov (United States)

    Zhou, Peng; Zhang, Guoqiang; Chen, Shangwu; Jiang, Zhengqiang; Tang, Yanbin; Henrissat, Bernard; Yan, Qiaojuan; Yang, Shaoqing; Chen, Chin-Fu; Zhang, Bing; Du, Zhenglin

    2014-04-21

    Zygomycete fungi such as Rhizomucor miehei have been extensively exploited for the production of various enzymes. As a thermophilic fungus, R. miehei is capable of growing at temperatures that approach the upper limits for all eukaryotes. To date, hundreds of fungal genomes are publicly available. However, Zygomycetes have rarely been investigated either genetically or genomically. Here, we report the genome of R. miehei CAU432 to explore the thermostable enzymatic repertoire of this fungus. The assembled genome size is 27.6 million bases (Mb), with 10,345 predicted protein-coding genes. Although the fungus is thermophilic, the G + C contents of the whole genome (43.8%) and coding genes (47.4%) are less than 50%. Phylogenetically, R. miehei is more closely related to Phycomyces blakesleeanus than to Mucor circinelloides and Rhizopus oryzae. The genome of R. miehei harbors a large number of genes encoding secreted proteases, which is consistent with R. miehei being a rich producer of proteases. The transcriptome profile of R. miehei showed that the genes responsible for degrading starch, glucan, protein and lipid were highly expressed. The genome information of R. miehei will facilitate future studies to better understand the mechanisms of fungal thermophilic adaptation and to explore the potential of R. miehei in industrial-scale production of thermostable enzymes. Based on the existence of a large repertoire of amylolytic, proteolytic and lipolytic genes in the genome, R. miehei has potential in the production of a variety of such enzymes.

  2. Novel Kunitz-like Peptides Discovered in the Zoanthid Palythoa caribaeorum through Transcriptome Sequencing.

    Science.gov (United States)

    Liao, Qiwen; Li, Shengnan; Siu, Shirley Weng In; Yang, Binrui; Huang, Chen; Chan, Judy Yuet-Wa; Morlighem, Jean-Étienne R L; Wong, Clarence Tsun Ting; Rádis-Baptista, Gandhi; Lee, Simon Ming-Yuen

    2018-02-02

    Palythoa caribaeorum (class Anthozoa) is a zoanthid that, together with jellyfishes, hydra, and sea anemones, which are venomous and predatory, belongs to the phylum Cnidaria. The distinguishing feature of these marine animals is the cnidocytes in the body tissues, responsible for the production and injection of toxins that are used mainly for prey capture and defense. Unlike those of other anthozoans, the toxin cocktails of zoanthids have been scarcely studied and are poorly known. Here, on the basis of the analysis of the P. caribaeorum transcriptome, numerous predicted venom-featured polypeptides were identified, including allergens, neurotoxins, membrane-active, and Kunitz-like peptides (PcKuz). The three predicted PcKuz isotoxins (1-3) were selected for functional studies. Through computational processing comprising structural phylogenetic analysis, molecular docking, and dynamics simulation, PcKuz3 was shown to be a potential voltage-gated potassium-channel inhibitor. PcKuz3 fits well as a new functional Kunitz-type toxin, with strong antilocomotor activity as assessed in vivo in zebrafish larvae and a weak inhibitory effect toward proteases as evaluated in vitro. Notably, PcKuz3 can suppress, at low concentration, the 6-OHDA-induced neurotoxicity on the locomotive behavior of zebrafish, which indicates that PcKuz3 may have a neuroprotective effect. Taken together, PcKuz3 figures as a novel neurotoxin structure, which differs from known homologous peptides expressed in sea anemone. Moreover, the novel PcKuz3 provides an insightful hint for biodrug development for prospective neurodegenerative disease treatment.

  3. Ultra-fast evaluation of protein energies directly from sequence.

    Directory of Open Access Journals (Sweden)

    Gevorg Grigoryan

    2006-06-01

    Full Text Available. The structure, function, stability, and many other properties of a protein in a fixed environment are fully specified by its sequence, but in a manner that is difficult to discern. We present a general approach for rapidly mapping sequences directly to their energies on a pre-specified rigid backbone, an important sub-problem in computational protein design and in some methods for protein structure prediction. The cluster expansion (CE) method that we employ can, in principle, be extended to model any computable or measurable protein property directly as a function of sequence. Here we show how CE can be applied to the problem of computational protein design, and use it to derive excellent approximations of physical potentials. The approach provides several attractive advantages. First, following a one-time derivation of a CE expansion, the amount of time necessary to evaluate the energy of a sequence adopting a specified backbone conformation is reduced by a factor of 10^7 compared to standard full-atom methods for the same task. Second, the agreement between two full-atom methods that we tested and their CE sequence-based expressions is very high (root mean square deviation 1.1-4.7 kcal/mol, R² = 0.7-1.0). Third, the functional form of the CE energy expression is such that individual terms of the expansion have clear physical interpretations. We derived expressions for the energies of three classic protein design targets (a coiled coil, a zinc finger, and a WW domain) as functions of sequence, and examined the most significant terms. Single-residue and residue-pair interactions are sufficient to accurately capture the energetics of the dimeric coiled coil, whereas higher-order contributions are important for the two more globular folds. For the task of designing novel zinc-finger sequences, a CE-derived energy function provides significantly better solutions than a standard design protocol, in comparable computation time. Given these advantages
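
    The cluster expansion described in this abstract writes the energy of a sequence as a sum of a constant, single-residue terms, residue-pair terms, and (where needed) higher-order terms. The Python sketch below evaluates such an expansion truncated at pair terms, which the abstract reports as sufficient for the coiled coil but not for the more globular folds. The function name `ce_energy`, the dictionary layout, and every coefficient value are placeholders invented for illustration; they are not fitted values from the paper.

```python
from itertools import combinations


def ce_energy(seq, constant, point_terms, pair_terms):
    """Evaluate a cluster expansion truncated at residue-pair terms.

    point_terms maps (position, residue) -> energy contribution.
    pair_terms maps (pos_i, pos_j, residue_i, residue_j) -> contribution,
    with pos_i < pos_j. Missing entries contribute zero.
    """
    energy = constant
    for i, aa in enumerate(seq):
        energy += point_terms.get((i, aa), 0.0)
    for i, j in combinations(range(len(seq)), 2):
        energy += pair_terms.get((i, j, seq[i], seq[j]), 0.0)
    return energy


# Made-up coefficients for a two-position toy "protein".
constant = -1.0
point_terms = {(0, "L"): -0.5, (1, "E"): 0.25}
pair_terms = {(0, 1, "L", "E"): -0.75}

print(ce_energy("LE", constant, point_terms, pair_terms))
print(ce_energy("LK", constant, point_terms, pair_terms))
```

    Because evaluation reduces to table lookups, scoring a sequence costs a number of operations that grows only with the number of retained terms, which is consistent with the large speed-up the abstract reports over full-atom calculations.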

  4. De novo assembly and characterization of the leaf, bud, and fruit transcriptome from the vulnerable tree Juglans mandshurica for the development of 20 new microsatellite markers using Illumina sequencing

    Science.gov (United States)

    Zhuang Hu; Tian Zhang; Xiao-Xiao Gao; Yang Wang; Qiang Zhang; Hui-Juan Zhou; Gui-Fang Zhao; Ma-Li Wang; Keith E. Woeste; Peng Zhao

    2016-01-01

    Manchurian walnut (Juglans mandshurica Maxim.) is a vulnerable, temperate deciduous tree valued for its wood and nut, but transcriptomic and genomic data for the species are very limited. Next generation sequencing (NGS) has made it possible to develop molecular markers for this species rapidly and efficiently. Our goal is to use transcriptome...

  5. Genome and Transcriptome sequence of Finger millet (Eleusine coracana (L.) Gaertn.) provides insights into drought tolerance and nutraceutical properties.

    Science.gov (United States)

    Hittalmani, Shailaja; Mahesh, H B; Shirke, Meghana Deepak; Biradar, Hanamareddy; Uday, Govindareddy; Aruna, Y R; Lohithaswa, H C; Mohanrao, A

    2017-06-15

    Finger millet (Eleusine coracana (L.) Gaertn.) is an important staple food crop widely grown in Africa and South Asia. Among the millets, finger millet has high amounts of calcium, methionine, tryptophan, fiber, and sulphur-containing amino acids. In addition, it has a C4 photosynthetic carbon assimilation mechanism, which helps it to utilize water and nitrogen efficiently under hot and arid conditions without severely affecting yield. Therefore, development and utilization of genomic resources for genetic improvement of this crop is immensely useful. Whole genome sequencing and assembly of the ML-365 finger millet cultivar yielded 1196 Mb, covering approximately 82% of the total estimated genome size. Genome analysis showed the presence of 85,243 genes, and one half of the genome is repetitive in nature. The finger millet genome was found to have higher colinearity with foxtail millet and rice than with other Poaceae species. Mining of simple sequence repeats (SSRs) yielded an abundance of SSRs within the finger millet genome. Functional annotation and mining of transcription factors revealed that the finger millet genome harbors a large number of drought tolerance related genes. Transcriptome analysis of low moisture stress and non-stress samples identified several drought-induced candidate genes, which could be used in drought tolerance breeding. This genome sequencing effort will support plant breeders in allele discovery, genetic mapping, and identification of candidate genes for agronomically important traits. Availability of genomic resources for finger millet will enhance novel breeding possibilities to address potential challenges of finger millet improvement.

  6. Sex chromosomes and germline transcriptomics explored by single-cell sequencing and RNA-tomography

    NARCIS (Netherlands)

    Vértesy, Ábel

    2018-01-01

    In our study of germ cell differentiation, we applied two recently developed technologies on the germline of various model organisms: single-cell mRNA sequencing and RNA-tomography. For the first time we could look at gene expression with such a high resolution, and this led us to discover the

  7. High Rhodotorula sequences in skin transcriptome of patients with diffuse systemic sclerosis.

    Science.gov (United States)

    Arron, Sarah T; Dimon, Michelle T; Li, Zhenghui; Johnson, Michael E; Wood, Tammara A; Feeney, Luzviminda; Angeles, Jorge G; Lafyatis, Robert; Whitfield, Michael L

    2014-08-01

    Previous studies have suggested a role for pathogens as a trigger of systemic sclerosis (SSc), although neither a pathogen nor a mechanism of pathogenesis is known. Here we show enrichment of Rhodotorula sequences in the skin of patients with early, diffuse SSc compared with that in normal controls. RNA-seq was performed on four SSc patients and four controls, to a depth of 200 million reads per patient. Data were analyzed to quantify the nonhuman sequence reads in each sample. We found little difference between bacterial microbiome and viral read counts, but found a significant difference between the read counts for a mycobiome component, R. glutinis. Normal samples contained almost no detected R. glutinis or other Rhodotorula sequence reads (mean score 0.021 for R. glutinis, 0.024 for all Rhodotorula). In contrast, SSc samples had a mean score of 5.039 for R. glutinis (5.232 for Rhodotorula). We were able to assemble the D1-D2 hypervariable region of the 28S ribosomal RNA (rRNA) of R. glutinis from each of the SSc samples. Taken together, these results suggest that R. glutinis may be present in the skin of early SSc patients at higher levels than in normal skin, raising the possibility that it may be triggering the inflammatory response found in SSc.

  8. Transcriptome Sequencing of Dianthus spiculifolius and Analysis of the Genes Involved in Responses to Combined Cold and Drought Stress.

    Science.gov (United States)

    Zhou, Aimin; Ma, Hongping; Liu, Enhui; Jiang, Tongtong; Feng, Shuang; Gong, Shufang; Wang, Jingang

    2017-04-17

    Dianthus spiculifolius, a perennial herbaceous flower and a member of the Caryophyllaceae family, has strong resistance to cold and drought stresses. To explore the transcriptional responses of D. spiculifolius to individual and combined stresses, we performed transcriptome sequencing of seedlings under normal conditions or subjected to cold treatment (CT), simulated drought treatment (DT), or their combination (CTDT). After de novo assembly of the obtained reads, 112,015 unigenes were generated. Analysis of differentially expressed genes (DEGs) showed that 2026, 940, and 2346 genes were up-regulated and 1468, 707, and 1759 were down-regulated in CT, DT, and CTDT samples, respectively. Among all the DEGs, 182 up-regulated and 116 down-regulated genes were identified in all the treatment groups. Analysis of metabolic pathways and regulatory networks associated with the DEGs revealed overlaps and cross-talk between cold and drought stress response pathways. The expression profiles of the selected DEGs in CT, DT, and CTDT samples were characterized and confirmed by quantitative RT-PCR. These DEGs and metabolic pathways may play important roles in the response of D. spiculifolius to the combined stress. Functional characterization of these genes and pathways will provide new targets for enhancement of plant stress tolerance through genetic manipulation.

  9. De novo sequencing, assembly, and analysis of Iris lactea var. chinensis roots' transcriptome in response to salt stress.

    Science.gov (United States)

    Gu, Chunsun; Xu, Sheng; Wang, Zhiquan; Liu, Liangqin; Zhang, Yongxia; Deng, Yanming; Huang, Suzhen

    2018-04-01

    As a halophyte, Iris lactea var. chinensis (I. lactea var. chinensis) is widely distributed and has good drought and heavy metal resistance. Moreover, it is an excellent ornamental plant. I. lactea var. chinensis has extensive application prospects owing to the global impacts of salinization. To better understand the molecular mechanisms involved in its salt resistance, de novo sequencing, assembly, and analysis of the I. lactea var. chinensis root transcriptome in response to salt-stress conditions were performed. On average, 74.17% of the clean reads were mapped to unigenes. A total of 121,093 unigenes were constructed and 56,398 (46.57%) were annotated. Among these, 13,522 differentially expressed genes (DEGs) were identified between salt-treated and control samples. Compared to the transcriptional level of the control, 7037 DEGs were up-regulated and 6539 down-regulated. In addition, 129 up-regulated and 1609 down-regulated genes were simultaneously detected in all three pairwise comparisons between control and salt-stressed libraries. At least 247 and 250 DEGs encoding transcription factors and transporter proteins, respectively, were identified. Meanwhile, 130 DEGs related to the reactive oxygen species (ROS) scavenging system were also summarized. Based on real-time quantitative RT-PCR, we verified the changes in the expression patterns of 10 unigenes. Our study identified potential salt-responsive candidate genes and increased the understanding of halophyte responses to salinity stress. Copyright © 2018 Elsevier Masson SAS. All rights reserved.

  10. Transcriptome sequencing and de novo analysis of a cytoplasmic male sterile line and its near-isogenic restorer line in chili pepper (Capsicum annuum L.).

    Directory of Open Access Journals (Sweden)

    Chen Liu

    Full Text Available. BACKGROUND: The use of cytoplasmic male sterility (CMS) in F1 hybrid seed production of chili pepper is increasingly popular. However, the molecular mechanisms of cytoplasmic male sterility and fertility restoration remain poorly understood due to limited transcriptomic and genomic data. Therefore, we analyzed the difference between a CMS line 121A and its near-isogenic restorer line 121C at the transcriptome level using next generation sequencing technology (NGS), aiming to identify critical genes and pathways associated with the male sterility. RESULTS: We generated approximately 53 million sequencing reads and assembled them de novo, yielding 85,144 high quality unigenes with an average length of 643 bp. Among these unigenes, 27,191 were identified as putative homologs of annotated sequences in the public protein databases; 4,326 and 7,061 unigenes were found to be highly abundant in lines 121A and 121C, respectively. Many of the differentially expressed unigenes represent a set of potential candidate genes associated with the formation or abortion of pollen. CONCLUSIONS: Our study profiled anther transcriptomes of a chili pepper CMS line and its restorer line. The results shed light on the occurrence and recovery of the disturbances in nuclear-mitochondrial interaction and provide clues for further investigations.

  11. Transcriptome sequencing and de novo analysis of cytoplasmic male sterility and maintenance in JA-CMS cotton.

    Science.gov (United States)

    Yang, Peng; Han, Jinfeng; Huang, Jinling

    2014-01-01

    Cytoplasmic male sterility (CMS) is the maternally inherited failure to produce functional pollen. Anther development is known to be modulated through complicated interactions between nuclear and mitochondrial genes in sporophytic and gametophytic tissues. However, an unbiased transcriptome sequencing analysis of CMS in cotton is currently lacking in the literature. This study compared differentially expressed (DE) genes of floral buds at the sporogenous cells stage (SS) and microsporocyte stage (MS) (the two most important stages for pollen abortion in JA-CMS) between JA-CMS and its fertile maintainer line JB cotton plants, using the Illumina HiSeq 2000 sequencing platform. A total of 709 (1.8%) DE genes, including 293 up-regulated and 416 down-regulated genes, were identified in the JA-CMS line compared with its maintainer line at the SS stage, and 644 (1.6%) DE genes, with 263 up-regulated and 381 down-regulated genes, were detected at the MS stage. By comparing the two stages in the same material, there were 8 up-regulated and 9 down-regulated DE genes in the JA-CMS line and 29 up-regulated and 9 down-regulated DE genes in the JB maintainer line at the MS stage. Quantitative RT-PCR was used to validate 7 randomly selected DE genes. Bioinformatics analysis revealed that genes involved in reduction-oxidation reactions and alpha-linolenic acid metabolism were down-regulated, while genes pertaining to photosynthesis and flavonoid biosynthesis were up-regulated in JA-CMS floral buds compared with their JB counterparts at the SS and/or MS stages. All four of these biological processes play important roles in reactive oxygen species (ROS) homeostasis, which may be an important factor contributing to the sterile trait of JA-CMS. Further experiments are warranted to elucidate molecular mechanisms of these genes that lead to CMS.

  12. Deep sequencing reveals transcriptome re-programming of Taxus × media cells to the elicitation with methyl jasmonate.

    Science.gov (United States)

    Sun, Guiling; Yang, Yanfang; Xie, Fuliang; Wen, Jian-Fan; Wu, Jianqiang; Wilson, Iain W; Tang, Qi; Liu, Hongwei; Qiu, Deyou

    2013-01-01

    Plant cell culture represents an alternative source for producing high-value secondary metabolites including paclitaxel (Taxol®), which is mainly produced in Taxus and has been widely used in cancer chemotherapy. The phytohormone methyl jasmonate (MeJA) can significantly increase the production of paclitaxel, which is induced in plants as a secondary metabolite possibly in defense against herbivores and pathogens. In cell culture, MeJA also elicits the accumulation of paclitaxel; however, the mechanism is still largely unknown. To obtain insight into the global regulation mechanism of MeJA in the steady state of paclitaxel production (7 days after MeJA addition), especially on paclitaxel biosynthesis, we sequenced the transcriptomes of MeJA-treated and untreated Taxus × media cells and obtained ~32.5 M high quality reads, from which 40,348 unique sequences were obtained by de novo assembly. Expression level analysis indicated that a large number of genes were associated with transcriptional regulation, DNA and histone modification, and the MeJA signaling network. All 29 known genes involved in the biosynthesis of the terpenoid backbone and paclitaxel were found, with 18 genes showing increased transcript abundance following elicitation with MeJA. The significant up-regulation of 9 genes in paclitaxel biosynthesis was validated by qRT-PCR assays. According to the expression changes and the previously proposed enzyme functions, multiple candidates for the unknown steps in paclitaxel biosynthesis were identified. We also found some genes putatively involved in the transport and degradation of paclitaxel. Potential target prediction of miRNAs indicated that miRNAs may play an important role in the regulation of gene expression following elicitation with MeJA. Our results shed new light on the global regulation mechanism by which MeJA regulates the physiology of Taxus cells and are helpful for understanding how MeJA elicits responses in plant species other than Taxus.

  13. Next-generation sequencing-based transcriptome analysis of Helicoverpa armigera Larvae immune-primed with Photorhabdus luminescens TT01.

    Directory of Open Access Journals (Sweden)

    Zengyang Zhao

    Full Text Available. Although invertebrates are incapable of adaptive immunity, immune reactions that are functionally similar to the adaptive immunity of vertebrates have been described in many studies of invertebrates, including insects. The phenomenon was termed immune priming. In order to understand the molecular mechanism of immune priming, we employed the Illumina/Solexa platform to investigate the transcriptional changes of the hemocytes and fat body of Helicoverpa armigera larvae immune-primed with the pathogenic bacterium Photorhabdus luminescens TT01. A total of 43.6 and 65.1 million clean reads with 4.4 and 6.5 gigabases of sequence data were obtained from the TT01 (immune-primed) and PBS (non-primed) cDNA libraries and assembled into 35,707 all-unigenes (non-redundant transcripts), which varied in length from 201 to 16,947 bp with an N50 length of 1,997 bp. Of the 35,707 all-unigenes, 20,438 were functionally annotated and 2,494 were differentially expressed after immune priming. The differentially expressed genes (DEGs) are mainly related to immunity, detoxification, development and metabolism of the host insect. Analysis of the annotated immune-related DEGs supported a hypothesis that we proposed previously: the immune priming phenomenon observed in H. armigera larvae was achieved by regulation of key innate immune elements. The transcriptome profiling data sets (especially the sequences of 1,022 unannotated DEGs) and the clues (such as those on immune-related signal and regulatory pathways) obtained from this study will facilitate immune-related novel gene discovery and provide valuable information for further exploring the molecular mechanism of immune priming of invertebrates. All these will increase our understanding of invertebrate immunity, which may provide new approaches to control insect pests or prevent epidemics of infectious diseases in economic invertebrates in the future.
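
    The N50 length quoted in this abstract is a standard assembly summary statistic: the length L such that sequences of length L or longer together contain at least half of the total assembled bases. As a minimal sketch of that definition (the function name `n50` and the toy lengths are illustrative and not taken from the study), it can be computed as follows.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # First length at which the cumulative sum reaches half the total.
        if running >= half_total:
            return length
    raise ValueError("n50() requires at least one length")


# Toy unigene lengths in bp; total is 54, so half is 27.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

    N50 depends strongly on how many short sequences are retained, so values from assemblies with different minimum-length cutoffs (201 bp here) are not directly comparable.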

  14. A comprehensive assessment of the transcriptome of cork oak (Quercus suber) through EST sequencing.

    Science.gov (United States)

    Pereira-Leal, José B; Abreu, Isabel A; Alabaça, Cláudia S; Almeida, Maria Helena; Almeida, Paulo; Almeida, Tânia; Amorim, Maria Isabel; Araújo, Susana; Azevedo, Herlânder; Badia, Aleix; Batista, Dora; Bohn, Andreas; Capote, Tiago; Carrasquinho, Isabel; Chaves, Inês; Coelho, Ana Cristina; Costa, Maria Manuela Ribeiro; Costa, Rita; Cravador, Alfredo; Egas, Conceição; Faro, Carlos; Fortes, Ana M; Fortunato, Ana S; Gaspar, Maria João; Gonçalves, Sónia; Graça, José; Horta, Marília; Inácio, Vera; Leitão, José M; Lino-Neto, Teresa; Marum, Liliana; Matos, José; Mendonça, Diogo; Miguel, Andreia; Miguel, Célia M; Morais-Cecílio, Leonor; Neves, Isabel; Nóbrega, Filomena; Oliveira, Maria Margarida; Oliveira, Rute; Pais, Maria Salomé; Paiva, Jorge A; Paulo, Octávio S; Pinheiro, Miguel; Raimundo, João A P; Ramalho, José C; Ribeiro, Ana I; Ribeiro, Teresa; Rocheta, Margarida; Rodrigues, Ana Isabel; Rodrigues, José C; Saibo, Nelson J M; Santo, Tatiana E; Santos, Ana Margarida; Sá-Pereira, Paula; Sebastiana, Mónica; Simões, Fernanda; Sobral, Rómulo S; Tavares, Rui; Teixeira, Rita; Varela, Carolina; Veloso, Maria Manuela; Ricardo, Cândido P P

    2014-05-15

    Cork oak (Quercus suber) is one of the rare trees with the ability to produce cork, a material widely used to make wine bottle stoppers, flooring and insulation materials, among many other uses. The molecular mechanisms of cork formation are still poorly understood, in great part due to the difficulty in studying a species with a long life-cycle and for which there is scarce molecular/genomic information. Cork oak forests are of great ecological importance and represent a major economic and social resource in Southern Europe and Northern Africa. However, global warming is threatening the cork oak forests by imposing thermal, hydric and many types of novel biotic stresses. Despite the economic and social value of the Q. suber species, few genomic resources useful for biotechnological applications and improved forest management have been developed. We generated in excess of 7 million sequence reads by pyrosequencing 21 normalized cDNA libraries derived from multiple Q. suber tissues and organs, developmental stages and physiological conditions. We deployed a stringent sequence processing and assembly pipeline that resulted in the identification of ~159,000 unigenes. These were annotated according to their similarity to known plant genes, to known Interpro domains, GO classes and E.C. numbers. The phylogenetic extent of this EST set was investigated, and we found that cork oak revealed a significant new gene space that is not covered by other model species or EST sequencing projects. The raw data, as well as the full annotated assembly, are now available to the community in a dedicated web portal at http://www.corkoakdb.org. This genomic resource represents the first transcriptome study in a cork-producing species. It can be explored to develop new tools and approaches to understand stress responses and developmental processes in forest trees, as well as the molecular cascades underlying cork differentiation and disease response.

  15. Transcriptomic Modification in the Cerebral Cortex following Noninvasive Brain Stimulation: RNA-Sequencing Approach

    Science.gov (United States)

    2017-04-20

    2016; Accepted 14 November 2016 Academic Editor: Ming-Kuei Lu Copyright © 2016 Ben Holmes et al. This is an open access article distributed under the...associative learning in behaving rabbits,” Proceedings of the National Academy of Sciences of the United States of America, vol. 109, no. 17, pp. 6710...P. T. Pyl, and W. Huber, “HTSeq—a Python framework to work with high-throughput sequencing data,” Bioinformatics, vol. 31, no. 2, pp. 166–169, 2015

  16. Transcriptome profiling of the cancer, adjacent non-tumor and distant normal tissues from a colorectal cancer patient by deep sequencing.

    Directory of Open Access Journals (Sweden)

    Yan'an Wu

    Colorectal cancer (CRC) is one of the most commonly diagnosed cancers in the world. A genome-wide screening of transcriptome dysregulation between cancer and normal tissue would provide insight into the molecular basis of CRC initiation and progression. Compared with microarray technology, which is commonly used to identify transcriptional changes, the recently developed RNA-seq technique has the ability to detect other abnormal regulations in the cancer transcriptome, such as alternative splicing, novel transcripts or gene fusion. In this study, we performed high-throughput transcriptome sequencing at ~50× coverage on CRC, adjacent non-tumor and distant normal tissue. The results revealed cancer-specific, differentially expressed genes and differential alternative splicing, suggesting that the extracellular matrix and metabolic pathways are activated and the genes related to cell homeostasis are suppressed in CRC. In addition, one tumor-restricted gene fusion, PRTEN-NOTCH2, was also detected and experimentally confirmed. This study reveals some common features in tumor invasion and provides a comprehensive survey of the CRC transcriptome, which provides better insight into the complexity of regulatory changes during tumorigenesis.

  17. Transcriptomic characterization of soybean (Glycine max) roots in response to rhizobium infection by RNA sequencing

    International Nuclear Information System (INIS)

    He, Q.; Li, Z.; Wang, S.; Huang, S.; Yang, H.

    2018-01-01

    The interaction of legumes with rhizobia to convert N2 into ammonia for plant use has attracted worldwide interest. However, the basal plant nitrogen fixation mechanisms induced in response to Rhizobium, which give rise to differential gene expression in plants, are not yet fully understood. Genes differentially expressed between inoculated and mock-inoculated soybean were analyzed by RNA-Seq. The sequencing reads were aligned against the Williams 82 genome sequence, which contains 55,787 transcripts; 280 and 316 transcripts were found to be up- and down-regulated, respectively, between inoculated and mock-inoculated soybean roots at stage V1. Gene ontology (GO) analyses detected 104, 182 and 178 genes associated with the cell component category, molecular function category and biological process category, respectively. Pathway analysis revealed that 98 differentially expressed genes (115 transcripts) were involved in 169 biological pathways. We selected 19 differentially expressed genes and analyzed their expression in mock-inoculated roots and in roots inoculated with USDA110 and CCBAU45436 using qRT-PCR. The results agreed with those obtained from the RNA-Seq data of rhizobia-infected roots, indicating that the RNA-Seq results are reliable and broadly applicable. Additionally, this study revealed some novel genes associated with the nitrogen fixation process in comparison to previously identified QTLs. (author)

  18. Comparative transcriptome analysis within the Lolium/Festuca species complex reveals high sequence conservation

    DEFF Research Database (Denmark)

    Czaban, Adrian; Sharma, Sapna; Byrne, Stephen

    2015-01-01

    species from the Lolium-Festuca complex, ranging from 52,166 to 72,133 transcripts per assembly. We have also predicted a set of proteins and validated it with a high-confidence protein database from three closely related species (H. vulgare, B. distachyon and O. sativa). We have obtained gene family...... clusters for the four species using OrthoMCL and analyzed their inferred phylogenetic relationships. Our results indicate that VRN2 is a candidate gene for differentiating vernalization and non-vernalization types in the Lolium-Festuca complex. Grouping of the gene families based on their BLAST identity...... enabled us to divide ortholog groups into those that are very conserved and those that are more evolutionarily relaxed. The ratio of non-synonymous to synonymous substitutions enabled us to pinpoint protein sequences evolving in response to positive selection. These proteins may explain some...

  19. Genome-wide analysis of SRSF10-regulated alternative splicing by deep sequencing of chicken transcriptome

    Directory of Open Access Journals (Sweden)

    Xuexia Zhou

    2014-12-01

    Splicing factor SRSF10 is known to function as a sequence-specific splicing activator that is capable of regulating alternative splicing both in vitro and in vivo. We recently used an RNA-seq approach coupled with bioinformatics analysis to identify the extensive splicing network regulated by SRSF10 in chicken cells. We found that SRSF10 promoted both exon inclusion and exclusion. Functionally, many of the SRSF10-verified alternative exons are linked to pathways of response to external stimulus. Here we describe in detail the experimental design, bioinformatics analysis and GO/pathway enrichment analysis of SRSF10-regulated genes to correspond with our data in the Gene Expression Omnibus with accession number GSE53354. Our data thus provide a resource for studying regulation of alternative splicing in vivo that underlines biological functions of splicing regulatory proteins in cells.

  20. Deep sequencing-based transcriptome profiling analysis of bacteria-challenged Lateolabrax japonicus reveals insight into the immune-relevant genes in marine fish

    Directory of Open Access Journals (Sweden)

    Xiang Li-xin

    2010-08-01

    Background: Systematic research on fish immunogenetics is indispensable in understanding the origin and evolution of immune systems. This has long been a challenging task because of the limited number of deep sequencing technologies and genome backgrounds of non-model fish available. The newly developed Solexa/Illumina RNA-seq and Digital gene expression (DGE) are high-throughput sequencing approaches and are powerful tools for genomic studies at the transcriptome level. This study reports the transcriptome profiling analysis of bacteria-challenged Lateolabrax japonicus using RNA-seq and DGE in an attempt to gain insights into the immunogenetics of marine fish. Results: RNA-seq analysis generated 169,950 non-redundant consensus sequences, among which 48,987 functional transcripts with complete or various length encoding regions were identified. More than 52% of these transcripts are possibly involved in approximately 219 known metabolic or signalling pathways, while 2,673 transcripts were associated with immune-relevant genes. In addition, approximately 8% of the transcripts appeared to be fish-specific genes that have never been described before. DGE analysis revealed that the host transcriptome profile of Vibrio harveyi-challenged L. japonicus is considerably altered, as indicated by the significant up- or down-regulation of 1,224 strong infection-responsive transcripts. Results indicated an overall conservation of the components and transcriptome alterations underlying innate and adaptive immunity in fish and other vertebrate models. Analysis suggested the acquisition of numerous fish-specific immune system components during early vertebrate evolution. Conclusion: This study provided a global survey of host defence gene activities against bacterial challenge in a non-model marine fish. Results can contribute to the in-depth study of candidate genes in marine fish immunity, and help improve current understanding of host

  1. Genomic Aberrations in Crizotinib Resistant Lung Adenocarcinoma Samples Identified by Transcriptome Sequencing.

    Directory of Open Access Journals (Sweden)

    Ali Saber

    ALK-break positive non-small cell lung cancer (NSCLC) patients initially respond to crizotinib, but resistance occurs inevitably. In this study we aimed to identify fusion genes in crizotinib-resistant tumor samples. Re-biopsies of three patients were subjected to paired-end RNA sequencing to identify fusion genes using deFuse and EricScript. The IGV browser was used to determine the presence of known resistance-associated mutations. Sanger sequencing was used to validate fusion genes and digital droplet PCR to validate mutations. ALK fusion genes were detected in all three patients, with EML4 being the fusion partner. One patient had no additional fusion genes. Another patient had one additional fusion gene, but without a predicted open reading frame (ORF). The third patient had three additional fusion genes, of which two were derived from the same chromosomal region as the EML4-ALK. A predicted ORF was identified only in the CLIP4-VSNL1 fusion product. The fusion genes validated in the post-treatment sample were also present in the biopsy before crizotinib. ALK mutations (p.C1156Y and p.G1269A) detected in the re-biopsies of two patients were not detected in pre-treatment biopsies. In conclusion, fusion genes identified in our study are unlikely to be involved in crizotinib resistance, based on their presence in pre-treatment biopsies. The detection of ALK mutations in post-treatment tumor samples of two patients underlines their role in crizotinib resistance.

  2. Integrated analysis of 454 and Illumina transcriptomic sequencing characterizes carbon flux and energy source for fatty acid synthesis in developing Lindera glauca fruits for woody biodiesel.

    Science.gov (United States)

    Lin, Zixin; An, Jiyong; Wang, Jia; Niu, Jun; Ma, Chao; Wang, Libing; Yuan, Guanshen; Shi, Lingling; Liu, Lili; Zhang, Jinsong; Zhang, Zhixiang; Qi, Ji; Lin, Shanzhi

    2017-01-01

    Lindera glauca fruit with high quality and quantity of oil has emerged as a novel potential source of biodiesel in China, but the molecular regulatory mechanism of carbon flux and energy source for oil biosynthesis in developing fruits is still unknown. To better develop fruit oils of L. glauca as woody biodiesel, a combination of two different sequencing platforms (454 and Illumina) and qRT-PCR analysis was used to define a minimal reference transcriptome of developing L. glauca fruits, and to construct carbon and energy metabolic model for regulation of carbon partitioning and energy supply for FA biosynthesis and oil accumulation. We first analyzed the dynamic patterns of growth tendency, oil content, FA compositions, biodiesel properties, and the contents of ATP and pyridine nucleotide of L. glauca fruits from seven different developing stages. Comprehensive characterization of transcriptome of the developing L. glauca fruit was performed using a combination of two different next-generation sequencing platforms, of which three representative fruit samples (50, 125, and 150 DAF) and one mixed sample from seven developing stages were selected for Illumina and 454 sequencing, respectively. The unigenes separately obtained from long and short reads (201, and 259, respectively, in total) were reconciled using TGICL software, resulting in a total of 60,031 unigenes (mean length = 1061.95 bp) to describe a transcriptome for developing L. glauca fruits. Notably, 198 genes were annotated for photosynthesis, sucrose cleavage, carbon allocation, metabolite transport, acetyl-CoA formation, oil synthesis, and energy metabolism, among which some specific transporters, transcription factors, and enzymes were identified to be implicated in carbon partitioning and energy source for oil synthesis by an integrated analysis of transcriptomic sequencing and qRT-PCR. Importantly, the carbon and energy metabolic model was well established for oil biosynthesis of developing L

  3. Human genome sequencing with direct x-ray holographic imaging

    International Nuclear Information System (INIS)

    Rhodes, C.K.

    1993-01-01

    Direct holographic imaging of biological materials is widely applicable to the study of the structure, properties and action of genetic material. This particular application involves the sequencing of the human genome, where the prospective genomic imaging technology is composed of three subtechnologies, namely an x-ray holographic camera, suitable chemistry and enzymology for the preparation of tagged DNA samples, and the illuminator in the form of an x-ray laser. We report that an appropriate x-ray camera, embodied by the instrument developed by MCR, is available and that suitable chemical and enzymatic procedures exist for the preparation of the necessary tagged DNA strands. Concerning the future development of the x-ray illuminator, we find that a practical small-scale x-ray light source is indeed feasible. This outcome requires the use of unconventional physical processes in order to achieve the necessary power-compression in the amplifying medium. Importantly, although the x-ray source does not currently exist, the understanding of these new physical mechanisms is developing rapidly and the research has established the basic scaling laws that will determine the properties of the x-ray illuminator. When this x-ray source becomes available, an extremely rapid and cost-effective instrument for 3-D imaging of biological materials can be applied to a wide range of biological structural assays, including the base-pair sequencing of the human genome and many questions regarding its higher levels of organization.

  4. Discovery and validation of Barrett's esophagus microRNA transcriptome by next generation sequencing.

    Directory of Open Access Journals (Sweden)

    Ajay Bansal

    Barrett's esophagus (BE) is the transition from squamous to columnar mucosa as a result of gastroesophageal reflux disease (GERD). The role of microRNA during this transition has not been systematically studied. For initial screening, total RNA from 5 GERD and 6 BE patients was size fractionated. RNA <70 nucleotides was subjected to SOLiD 3 library preparation and next generation sequencing (NGS). Bioinformatics analysis was performed using the R package "DEseq". A p value<0.05 adjusted for a false discovery rate of 5% was considered significant. NGS-identified miRNA were validated using qRT-PCR in an independent group of 40 GERD and 27 BE patients. MicroRNA expression of human BE tissues was also compared with three BE cell lines. NGS detected 19.6 million raw reads per sample. 53.1% of filtered reads mapped to miRBase version 18. NGS analysis followed by qRT-PCR validation found 10 differentially expressed miRNA; several are novel (-708-5p, -944, -224-5p and -3065-5p). Up- or down-regulation predicted by NGS was matched by qRT-PCR in every case. Human BE tissues and BE cell lines showed a high degree of concordance (70-80%) in miRNA expression. Prediction analysis identified targets that mapped to developmental signaling pathways such as TGFβ and Notch and inflammatory pathways such as toll-like receptor signaling and TGFβ. Cluster analysis found similarly regulated (up or down) miRNA to share common targets, suggesting coordination between miRNA. Using highly sensitive next-generation sequencing, we have performed a comprehensive genome-wide analysis of microRNA in BE and GERD patients. Differentially expressed miRNA between BE and GERD have been further validated. Expression of miRNA between BE human tissues and BE cell lines is highly correlated. These miRNA should be studied in biological models to further understand BE development.
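
    The significance threshold in this abstract (p < 0.05 after adjustment for a 5% false discovery rate) can be illustrated with the Benjamini-Hochberg procedure, which is a widely used FDR adjustment. The sketch below assumes that procedure; the abstract does not name the adjustment method, and the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # taking a running minimum so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four miRNAs, for illustration only.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in adj_p]
```

    Under this sketch, a feature is called differentially expressed when its adjusted p-value falls below the chosen FDR level, here 0.05.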

  5. Global Transcriptome Sequencing Identifies Chlamydospore Specific Markers in Candida albicans and Candida dubliniensis

    LENUS (Irish Health Repository)

    Palige, Katja

    2013-04-15

    Candida albicans and Candida dubliniensis are pathogenic fungi that are highly related but differ in virulence and in some phenotypic traits. During in vitro growth on certain nutrient-poor media, C. albicans and C. dubliniensis are the only yeast species which are able to produce chlamydospores, large thick-walled cells of unknown function. Interestingly, only C. dubliniensis forms pseudohyphae with abundant chlamydospores when grown on Staib medium, while C. albicans grows exclusively as a budding yeast. In order to further our understanding of chlamydospore development and assembly, we compared the global transcriptional profile of both species during growth in liquid Staib medium by RNA sequencing. We also included a C. albicans mutant in our study which lacks the morphogenetic transcriptional repressor Nrg1. This strain, which is characterized by its constitutive pseudohyphal growth, specifically produces masses of chlamydospores in Staib medium, similar to C. dubliniensis. This comparative approach identified a set of putatively chlamydospore-related genes. Two of the homologous C. albicans and C. dubliniensis genes (CSP1 and CSP2) which were most strongly upregulated during chlamydospore development were analysed in more detail. By use of the green fluorescent protein as a reporter, the encoded putative cell wall related proteins were found to exclusively localize to C. albicans and C. dubliniensis chlamydospores. Our findings uncover the first chlamydospore specific markers in Candida species and provide novel insights in the complex morphogenetic development of these important fungal pathogens.

  6. Sequencing of the needle transcriptome from Norway spruce (Picea abies Karst L.) reveals lower substitution rates, but similar selective constraints in gymnosperms and angiosperms

    Directory of Open Access Journals (Sweden)

    Chen Jun

    2012-11-01

    Background: A detailed knowledge about spatial and temporal gene expression is important for understanding both the function of genes and their evolution. For the vast majority of species, transcriptomes are still largely uncharacterized and even in those where substantial information is available it is often in the form of partially sequenced transcriptomes. With the development of next generation sequencing, a single experiment can now simultaneously identify the transcribed part of a species genome and estimate levels of gene expression. Results: mRNA from actively growing needles of Norway spruce (Picea abies) was sequenced using next generation sequencing technology. In total, close to 70 million fragments with a length of 76 bp were sequenced, resulting in 5 Gbp of raw data. A de novo assembly of these reads, together with publicly available expressed sequence tag (EST) data from Norway spruce, was used to create a reference transcriptome. Of the 38,419 PUTs (putative unique transcripts) longer than 150 bp in this reference assembly, 83.5% show similarity to ESTs from other spruce species and of the remaining PUTs, 3,704 show similarity to protein sequences from other plant species, leaving 4,167 PUTs with limited similarity to currently available plant proteins. By predicting coding frames and comparing not only the Norway spruce PUTs, but also PUTs from the close relatives Picea glauca and Picea sitchensis to both Pinus taeda and Taxus mairei, we obtained estimates of synonymous and non-synonymous divergence among conifer species. In addition, we detected close to 15,000 SNPs of high quality and estimated gene expression differences between samples collected under dark and light conditions. Conclusions: Our study yielded a large number of single nucleotide polymorphisms as well as estimates of gene expression on transcriptome scale. In agreement with a recent study we find that the synonymous substitution rate per year (0.6 × 10

  7. Characterization of the Burkholderia thailandensis SOS response by using whole-transcriptome shotgun sequencing.

    Science.gov (United States)

    Ulrich, Ricky L; Deshazer, David; Kenny, Tara A; Ulrich, Melanie P; Moravusova, Anna; Opperman, Timothy; Bavari, Sina; Bowlin, Terry L; Moir, Donald T; Panchal, Rekha G

    2013-10-01

    The bacterial SOS response is a well-characterized regulatory network encoded by most prokaryotic bacterial species and is involved in DNA repair. In addition to nucleic acid repair, the SOS response is involved in pathogenicity, stress-induced mutagenesis, and the emergence and dissemination of antibiotic resistance. Using high-throughput sequencing technology (SOLiD RNA-Seq), we analyzed the Burkholderia thailandensis global SOS response to the fluoroquinolone antibiotic, ciprofloxacin (CIP), and the DNA-damaging chemical, mitomycin C (MMC). We demonstrate that a B. thailandensis recA mutant (RU0643) is ∼4-fold more sensitive to CIP in contrast to the parental strain B. thailandensis DW503. Our RNA-Seq results show that CIP and MMC treatment (P SOS response were induced and include lexA, uvrA, dnaE, dinB, recX, and recA. At the genome-wide level, we found an overall decrease in gene expression, especially for genes involved in amino acid and carbohydrate transport and metabolism, following both CIP and MMC exposure. Interestingly, we observed the upregulation of several genes involved in bacterial motility and enhanced transcription of a B. thailandensis genomic island encoding a Siphoviridae bacteriophage designated E264. Using B. thailandensis plaque assays and PCR with B. mallei ATCC 23344 as the host, we demonstrate that CIP and MMC exposure in B. thailandensis DW503 induces the transcription and translation of viable bacteriophage in a RecA-dependent manner. This is the first report of the SOS response in Burkholderia spp. to DNA-damaging agents. We have identified both common and unique adaptive responses of B. thailandensis to chemical stress and DNA damage.

  8. Exploration of the gene fusion landscape of glioblastoma using transcriptome sequencing and copy number data.

    Science.gov (United States)

    Shah, Nameeta; Lankerovich, Michael; Lee, Hwahyung; Yoon, Jae-Geun; Schroeder, Brett; Foltz, Greg

    2013-11-22

    RNA-seq has spurred important gene fusion discoveries in a number of different cancers, including lung, prostate, breast, brain, thyroid and bladder carcinomas. Gene fusion discovery can potentially lead to the development of novel treatments that target the underlying genetic abnormalities. In this study, we provide a comprehensive view of the gene fusion landscape in 185 glioblastoma multiforme patients from two independent cohorts. Fusions occur in approximately 30-50% of GBM patient samples. In the Ivy Center cohort of 24 patients, 33% of samples harbored fusions that were validated by qPCR and Sanger sequencing. We were able to identify high-confidence gene fusions from RNA-seq data in 53% of the samples in a TCGA cohort of 161 patients. We identified 13 cases (8%) with fusions retaining a tyrosine kinase domain in the TCGA cohort and one case in the Ivy Center cohort. Ours is the first study to describe recurrent fusions involving non-coding genes. Genomic locations 7p11 and 12q14-15 harbor the majority of the fusions. Fusions on 7p11 are formed in the focally amplified EGFR locus, whereas 12q14-15 fusions are formed by complex genomic rearrangements. All the fusions detected in this study can be further visualized and analyzed using our website: http://ivygap.swedish.org/fusions. Our study highlights the prevalence of gene fusions as one of the major genomic abnormalities in GBM. The majority of the fusions are private fusions, and a minority of these recur with low frequency. A small subset of patients with fusions of receptor tyrosine kinases can benefit from existing FDA-approved drugs and drugs available in various clinical trials. Due to the low frequency and rarity of clinically relevant fusions, RNA-seq of GBM patient samples will be a vital tool for the identification of patient-specific fusions that can drive personalized therapy.

  9. Genome Sequencing and Comparative Transcriptomics of the Model Entomopathogenic Fungi Metarhizium anisopliae and M. acridum

    Science.gov (United States)

    Shang, Yanfang; Duan, Zhibing; Hu, Xiao; Xie, Xue-Qin; Zhou, Gang; Peng, Guoxiong; Luo, Zhibing; Huang, Wei; Wang, Bing; Fang, Weiguo; Wang, Sibao; Zhong, Yi; Ma, Li-Jun; St. Leger, Raymond J.; Zhao, Guo-Ping; Pei, Yan; Feng, Ming-Guang; Xia, Yuxian; Wang, Chengshu

    2011-01-01

    Metarhizium spp. are being used as environmentally friendly alternatives to chemical insecticides, as model systems for studying insect-fungus interactions, and as a resource of genes for biotechnology. We present a comparative analysis of the genome sequences of the broad-spectrum insect pathogen Metarhizium anisopliae and the acridid-specific M. acridum. Whole-genome analyses indicate that the genome structures of these two species are highly syntenic and suggest that the genus Metarhizium evolved from plant endophytes or pathogens. Both M. anisopliae and M. acridum have a strikingly larger proportion of genes encoding secreted proteins than other fungi, while ∼30% of these have no functionally characterized homologs, suggesting hitherto unsuspected interactions between fungal pathogens and insects. The analysis of transposase genes provided evidence of repeat-induced point mutations occurring in M. acridum but not in M. anisopliae. With the help of pathogen-host interaction gene database, ∼16% of Metarhizium genes were identified that are similar to experimentally verified genes involved in pathogenicity in other fungi, particularly plant pathogens. However, relative to M. acridum, M. anisopliae has evolved with many expanded gene families of proteases, chitinases, cytochrome P450s, polyketide synthases, and nonribosomal peptide synthetases for cuticle-degradation, detoxification, and toxin biosynthesis that may facilitate its ability to adapt to heterogenous environments. Transcriptional analysis of both fungi during early infection processes provided further insights into the genes and pathways involved in infectivity and specificity. Of particular note, M. acridum transcribed distinct G-protein coupled receptors on cuticles from locusts (the natural hosts) and cockroaches, whereas M. anisopliae transcribed the same receptor on both hosts. This study will facilitate the identification of virulence genes and the development of improved biocontrol strains

  10. Maximum likelihood sequence estimation for optical complex direct modulation.

    Science.gov (United States)

    Che, Di; Yuan, Feng; Shieh, William

    2017-04-17

    Semiconductor lasers are versatile optical transmitters in nature. Through the direct modulation (DM), the intensity modulation is realized by the linear mapping between the injection current and the light power, while various angle modulations are enabled by the frequency chirp. Limited by the direct detection, DM lasers used to be exploited only as 1-D (intensity or angle) transmitters by suppressing or simply ignoring the other modulation. Nevertheless, through the digital coherent detection, simultaneous intensity and angle modulations (namely, 2-D complex DM, CDM) can be realized by a single laser diode. The crucial technique of CDM is the joint demodulation of intensity and differential phase with the maximum likelihood sequence estimation (MLSE), supported by a closed-form discrete signal approximation of frequency chirp to characterize the MLSE transition probability. This paper proposes a statistical method for the transition probability to significantly enhance the accuracy of the chirp model. Using the statistical estimation, we demonstrate the first single-channel 100-Gb/s PAM-4 transmission over 1600-km fiber with only 10G-class DM lasers.
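
    Maximum likelihood sequence estimation is commonly implemented with the Viterbi algorithm, which searches a trellis for the symbol sequence whose predicted samples are closest to the received ones. The sketch below is a generic toy version and not the authors' chirp model: it assumes real-valued samples, additive Gaussian noise (so the branch metric is squared error), a channel memory of one symbol, and a known initial state. The names `mlse_viterbi` and `toy_channel` are hypothetical.

```python
def mlse_viterbi(received, symbols, expected):
    """Toy MLSE over a channel with one symbol of memory.

    received: list of noisy samples.
    symbols:  list of candidate symbol values; symbols[0] is assumed to
              be the (known) symbol preceding the first transmitted one.
    expected: function (previous_symbol, current_symbol) -> noiseless sample.
    Returns the symbol sequence minimizing the total squared error.
    """
    n = len(symbols)
    inf = float("inf")
    # cost[s]: best accumulated metric of any path ending in state s.
    cost = [0.0 if s == 0 else inf for s in range(n)]
    paths = [[] for _ in range(n)]
    for sample in received:
        new_cost = [inf] * n
        new_paths = [None] * n
        for cur in range(n):
            for prev in range(n):
                if cost[prev] == inf:
                    continue  # state not reachable yet
                error = sample - expected(symbols[prev], symbols[cur])
                metric = cost[prev] + error * error
                if metric < new_cost[cur]:
                    new_cost[cur] = metric
                    new_paths[cur] = paths[prev] + [symbols[cur]]
        cost, paths = new_cost, new_paths
    best = min(range(n), key=lambda s: cost[s])
    return paths[best]


def toy_channel(prev, cur):
    # Hypothetical channel: current symbol plus half of the previous one.
    return cur + 0.5 * prev


# Noisy observations of the sequence [1, 0, 1, 1] sent through toy_channel.
noisy = [1.1, 0.4, 0.9, 1.6]
decoded = mlse_viterbi(noisy, [0, 1], toy_channel)
```

    In the setting the abstract describes, the branch metric would instead come from the transition probability of intensity and differential phase under the chirp model; the squared-error metric here is only a stand-in for that quantity.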

  11. SNP detection from de novo transcriptome sequencing in the bivalve Macoma balthica: marker development for evolutionary studies.

    Directory of Open Access Journals (Sweden)

    Eric Pante

    Hybrid zones are noteworthy systems for the study of environmental adaptation to fast-changing environments, as they constitute reservoirs of polymorphism and are key to the maintenance of biodiversity. They can move in relation to climate fluctuations, as temperature can affect both selection and migration, or remain trapped by environmental and physical barriers. There is therefore a very strong incentive to study the dynamics of hybrid zones subjected to climate variations. The infaunal bivalve Macoma balthica emerges as a noteworthy model species, as divergent lineages hybridize, and its native NE Atlantic range is currently contracting to the North. To investigate the dynamics and functioning of hybrid zones in M. balthica, we developed new molecular markers by sequencing the collective transcriptome of 30 individuals. Ten individuals were pooled for each of the three populations sampled at the margins of two hybrid zones. A single 454 run generated 277 Mb from which 17K SNPs were detected. SNP density averaged 1 polymorphic site every 14 to 19 bases, for mitochondrial and nuclear loci, respectively. An F_ST scan detected high genetic divergence among several hundred SNPs, some of them involved in energetic metabolism, cellular respiration and physiological stress. The high population differentiation, recorded for nuclear-encoded ATP synthase and NADH dehydrogenase as well as most mitochondrial loci, suggests cytonuclear genetic incompatibilities. Results from this study will help pave the way to a high-resolution study of hybrid zone dynamics in M. balthica, and the relative importance of endogenous and exogenous barriers to gene flow in this system.

  12. Transcriptome analysis of salinity responsiveness in contrasting genotypes of finger millet (Eleusine coracana L.) through RNA-sequencing.

    Science.gov (United States)

    Rahman, Hifzur; Jagadeeshselvam, N; Valarmathi, R; Sachin, B; Sasikala, R; Senthil, N; Sudhakar, D; Robin, S; Muthurajan, Raveendran

    2014-07-01

    Finger millet (Eleusine coracana L.) is a hardy cereal known for its superior level of tolerance against drought, salinity, diseases and its nutritional properties. In this study, attempts were made to unravel the physiological and molecular basis of salinity tolerance in two contrasting finger millet genotypes viz., CO 12 and Trichy 1. Physiological studies revealed that the tolerant genotype Trichy 1 had lower Na(+) to K(+) ratio in leaves and shoots, higher growth rate (osmotic tolerance) and ability to accumulate higher amount of total soluble sugar in leaves under salinity stress. We sequenced the salinity responsive leaf transcriptome of contrasting finger millet genotypes using IonProton platform and generated 27.91 million reads. Mapping and annotation of finger millet transcripts against rice gene models led to the identification of salinity responsive genes and genotype specific responses. Several functional groups of genes like transporters, transcription factors, genes involved in cell signaling, osmotic homeostasis and biosynthesis of compatible solutes were found to be highly up-regulated in the tolerant Trichy 1. Salinity stress inhibited photosynthetic capacity and photosynthesis related genes in the susceptible genotype CO 12. Several genes involved in cell growth and differentiation were found to be up-regulated in both the genotypes but more specifically in tolerant genotype. Genes involved in flavonoid biosynthesis were found to be down-regulated specifically in the salinity tolerant Trichy 1. This study provides a genome-wide transcriptional analysis of two finger millet genotypes differing in their level of salinity tolerance during a gradually progressing salinity stress under greenhouse conditions.

  13. Transcriptome profiling and digital gene expression by deep sequencing in early somatic embryogenesis of endangered medicinal Eleutherococcus senticosus Maxim.

    Science.gov (United States)

    Tao, Lei; Zhao, Yue; Wu, Ying; Wang, Qiuyu; Yuan, Hongmei; Zhao, Lijuan; Guo, Wendong; You, Xiangling

    2016-03-01

    Somatic embryogenesis (SE) has been studied as a model system to understand molecular events in physiology, biochemistry, and cytology during plant embryo development. In particular, it is exceedingly difficult to access the morphological and early regulatory events in zygotic embryos. To understand the molecular mechanisms regulating early SE in Eleutherococcus senticosus Maxim., we used high-throughput RNA-Seq technology to investigate its transcriptome. We obtained 58,327,688 reads, which were assembled into 75,803 unique unigenes. To better understand their functions, the unigenes were annotated using the Clusters of Orthologous Groups, Gene Ontology, and Kyoto Encyclopedia of Genes and Genomes databases. Digital gene expression libraries revealed differences in gene expression profiles at different developmental stages (embryogenic callus, yellow embryogenic callus, global embryo). We obtained a sequencing depth of >5.6 million tags per sample and identified many differentially expressed genes at various stages of SE. The initiation of SE affected gene expression in many KEGG pathways, but predominantly that in metabolic pathways, biosynthesis of secondary metabolites, and plant hormone signal transduction. This information on the changes in the multiple pathways related to SE induction in E. senticosus Maxim. embryogenic tissue will contribute to a more comprehensive understanding of the mechanisms involved in early SE. Additionally, the differentially expressed genes may act as molecular markers and could play very important roles in the early stage of SE. The results are a comprehensive molecular biology resource for investigating SE of E. senticosus Maxim. Copyright © 2015 Elsevier B.V. All rights reserved.

  14. High-throughput sequencing of microRNA transcriptome and expression assay in the sturgeon, Acipenser schrenckii.

    Directory of Open Access Journals (Sweden)

    Lihong Yuan

    Sturgeons are considered as living fossils and have very high evolutionary, economical and conservation values. The multiploidy of sturgeon that has been caused by chromosome duplication may lead to the emergence of new microRNAs (miRNAs) involved in the ploidy and physiological processes. In the present study, we performed the first sturgeon miRNAs analysis by RNA-seq high-throughput sequencing combined with expression assay of microarray and real-time PCR, and aimed to discover the sturgeon-specific miRNAs, confirm the expressed pattern of miRNAs and illustrate the potential role of miRNAs-targets on sturgeon biological processes. A total of 103 miRNAs were identified, including 58 miRNAs with strongly detected signals (signal >500 and P≤0.01), which were detected by microarray. Real-time PCR assay supported the expression pattern obtained by microarray. Moreover, co-expression of 21 miRNAs in all five tissues and tissue-specific expression of 16 miRNAs implied the crucial and particular function of them in sturgeon physiological processes. Target gene prediction, especially the enriched functional gene groups (369 GO terms) and pathways (37 KEGG) regulated by 58 miRNAs (P<0.05), illustrated the interaction of miRNAs and putative mRNAs, and also the potential mechanism involved in these biological processes. Our new findings of sturgeon miRNAs expand the public database of transcriptome information for this species, contribute to our understanding of sturgeon biology, and also provide invaluable data that may be applied in sturgeon breeding.

  15. High-throughput sequencing of microRNA transcriptome and expression assay in the sturgeon, Acipenser schrenckii.

    Science.gov (United States)

    Yuan, Lihong; Zhang, Xiujuan; Li, Linmiao; Jiang, Haiying; Chen, Jinping

    2014-01-01

    Sturgeons are considered as living fossils and have very high evolutionary, economical and conservation values. The multiploidy of sturgeon that has been caused by chromosome duplication may lead to the emergence of new microRNAs (miRNAs) involved in the ploidy and physiological processes. In the present study, we performed the first sturgeon miRNAs analysis by RNA-seq high-throughput sequencing combined with expression assay of microarray and real-time PCR, and aimed to discover the sturgeon-specific miRNAs, confirm the expressed pattern of miRNAs and illustrate the potential role of miRNAs-targets on sturgeon biological processes. A total of 103 miRNAs were identified, including 58 miRNAs with strongly detected signals (signal >500 and P≤0.01), which were detected by microarray. Real-time PCR assay supported the expression pattern obtained by microarray. Moreover, co-expression of 21 miRNAs in all five tissues and tissue-specific expression of 16 miRNAs implied the crucial and particular function of them in sturgeon physiological processes. Target gene prediction, especially the enriched functional gene groups (369 GO terms) and pathways (37 KEGG) regulated by 58 miRNAs (P<0.05), illustrated the interaction of miRNAs and putative mRNAs, and also the potential mechanism involved in these biological processes. Our new findings of sturgeon miRNAs expand the public database of transcriptome information for this species, contribute to our understanding of sturgeon biology, and also provide invaluable data that may be applied in sturgeon breeding.

  16. Transcriptome Analysis of Ceriops tagal in Saline Environments Using RNA-Sequencing.

    Directory of Open Access Journals (Sweden)

    Xiaorong Xiao

    Identification of genes involved in mangrove species' adaptation to salt stress can provide valuable information for developing salt-tolerant crops and understanding the molecular evolution of salt tolerance in halophiles. Ceriops tagal is a salt-tolerant mangrove tree growing in mudflats and marshes in tropical and subtropical areas, without any prior genome information. In this study, we assessed the biochemical and transcriptional responses of C. tagal to high salt treatment (500 mmol/L NaCl) by hydroponic experiments and RNA-seq. In C. tagal root tissues under salt stress, proline accumulated strongly from 3 to 12 h of treatment; meanwhile, malondialdehyde content progressively increased from 0 to 9 h, then dropped to lower than control levels by 24 h. These implied that C. tagal plants could survive salt stress through biochemical modification. Using the Illumina sequencing platform, approximately 27.39 million RNA-seq reads were obtained from three salt-treated and control (untreated) root samples. These reads were assembled into 47,111 transcripts with an average length of 514 bp and an N50 of 632 bp. Approximately 78% of the transcripts were annotated, and a total of 437 genes were putative transcription factors. Digital gene expression analysis was conducted by comparing transcripts from the untreated control to the three salt treated samples, and 7,330 differentially expressed transcripts were identified. Using k-means clustering, these transcripts were divided into six clusters that differed in their expression patterns across four treatment time points. The genes identified as being up- or downregulated are involved in salt stress responses, signal transduction, and DNA repair. Our study shows the main adaptive pathway of C. tagal in saline environments, under short-term and long-term treatments of salt stress. This provides vital clues as to which genes may be candidates for breeding salt-tolerant crops and clarifying molecular

  17. Transcriptome Sequencing Analysis and Functional Identification of Sex Differentiation Genes from the Mosquito Parasitic Nematode, Romanomermis wuchangensis.

    Directory of Open Access Journals (Sweden)

    Mingyue Duan

    Mosquito-transmitted diseases like malaria and dengue fever are a global problem, and an estimated 50-100 million cases of dengue or dengue hemorrhagic fever are reported worldwide every year. The mermithid nematode Romanomermis wuchangensis has been successfully used as an ecosystem-friendly biocontrol agent for mosquito prevention in laboratory studies. However, this nematode cannot undergo sex differentiation in in vitro culture, which has seriously limited its application as a biocontrol agent in the field. In this study, based on transcriptome sequencing analysis of R. wuchangensis, Rwucmab-3, Rwuclaf-1 and Rwuctra-2 were cloned and used to investigate the molecular regulation of sex differentiation. qRT-PCR results demonstrated that the expression level of Rwucmab-3 differed markedly between males and females on the 3rd day of the parasitic stage, which was earlier than for Rwuclaf-1 and Rwuctra-2, suggesting that the sex differentiation process may start on the 3rd day of the parasitic stage. In addition, FITC was used as a marker to test the dsRNA uptake efficiency of R. wuchangensis; fluorescence intensity increased with FITC concentration after 16 h incubation, indicating that this nematode can successfully ingest soaking solution via its cuticle. RNAi results revealed that the sex ratio of R. wuchangensis in groups soaked in Rwucmab-3 dsRNA was significantly higher than in gfp dsRNA treated groups and control groups, suggesting that RNAi of Rwucmab-3 may hinder the development of male nematodes. These results suggest that Rwucmab-3 is mainly involved in the initiation of sex differentiation and the development of male sexual dimorphism. Rwuclaf-1 and Rwuctra-2 may play a vital role in the nematode reproductive and developmental system. In conclusion, transcript sequences presented in this study could provide more bioinformatics resources for future studies on gene cloning and other molecular regulatory mechanisms in R. wuchangensis. Moreover, identification

  18. High-throughput sequencing of small RNA transcriptome reveals salt stress regulated microRNAs in sugarcane.

    Directory of Open Access Journals (Sweden)

    Mariana Carnavale Bottino

    Salt stress is a primary cause of crop losses worldwide, and it has been the subject of intense investigation to unravel the complex mechanisms responsible for salinity tolerance. MicroRNA is implicated in many developmental processes and in responses to various abiotic stresses, playing pivotal roles in plant adaptation. Deep sequencing technology was chosen to determine the small RNA transcriptome of Saccharum sp. cultivars grown under saline conditions. We constructed four small RNA libraries prepared from plants grown in hydroponic culture, exposed to 170 mM NaCl, and harvested after 1 h, 6 h and 24 h. Each library was sequenced individually, and together they generated more than 50 million short reads. Ninety-eight conserved miRNAs and 33 miRNAs* were identified by bioinformatics. Several of the microRNAs showed considerable differences of expression in the four libraries. To confirm the results of the bioinformatics-based analysis, we studied the expression of the 10 most abundant miRNAs and 1 miRNA* in plants treated with 170 mM NaCl and in plants with a severe treatment of 340 mM NaCl. The results showed that the 11 selected miRNAs had higher expression in samples given the severe salt treatment compared to the mild one. We also investigated the regulation of the same miRNAs in shoots of four cultivars grown on soil treated with 170 mM NaCl. Cultivars could be grouped according to miRNA expression in response to salt stress. Furthermore, the majority of the predicted target genes had an inverse regulation with their corresponding microRNAs. The targets encode a wide range of proteins, including transcription factors, metabolic enzymes and genes involved in hormone signaling, probably assisting the plants to develop tolerance to salinity. Our work provides insights into the regulatory functions of miRNAs, thereby expanding our knowledge on potential salt-stress-regulated genes.

  19. Deep mRNA sequencing of the Tritonia diomedea brain transcriptome provides access to gene homologues for neuronal excitability, synaptic transmission and peptidergic signalling.

    Directory of Open Access Journals (Sweden)

    Adriano Senatore

    The sea slug Tritonia diomedea (Mollusca, Gastropoda, Nudibranchia) has a simple and highly accessible nervous system, making it useful for studying neuronal and synaptic mechanisms underlying behavior. Although many important contributions have been made using Tritonia, until now, a lack of genetic information has impeded exploration at the molecular level. We performed Illumina sequencing of central nervous system mRNAs from Tritonia, generating 133.1 million 100 base pair, paired-end reads. De novo reconstruction of the RNA-Seq data yielded a total of 185,546 contigs, which partitioned into 123,154 non-redundant gene clusters (unigenes). BLAST comparison with RefSeq and Swiss-Prot protein databases, as well as mRNA data from other invertebrates (gastropod molluscs: Aplysia californica, Lymnaea stagnalis and Biomphalaria glabrata; cnidarian: Nematostella vectensis) revealed that up to 76,292 unigenes in the Tritonia transcriptome have putative homologues in other databases, 18,246 of which are below a more stringent E-value cut-off of 1x10^-6. In silico prediction of secreted proteins from the Tritonia transcriptome shotgun assembly (TSA) produced a database of 579 unique sequences of secreted proteins, which also exhibited markedly higher expression levels compared to other genes in the TSA. Our efforts greatly expand the availability of gene sequences available for Tritonia diomedea. We were able to extract full length protein sequences for most queried genes, including those involved in electrical excitability, synaptic vesicle release and neurotransmission, thus confirming that the transcriptome will serve as a useful tool for probing the molecular correlates of behavior in this species. We also generated a neurosecretome database that will serve as a useful tool for probing peptidergic signalling systems in the Tritonia brain.

  20. Transcriptome Sequence and Plasmid Copy Number Analysis of the Brewery Isolate Pediococcus claussenii ATCC BAA-344T during Growth in Beer

    Science.gov (United States)

    Pittet, Vanessa; Phister, Trevor G.; Ziola, Barry

    2013-01-01

    Growth of specific lactic acid bacteria in beer leads to spoiled product and economic loss for the brewing industry. Microbial growth is typically inhibited by the combined stresses found in beer (e.g., ethanol, hops, low pH, minimal nutrients); however, certain bacteria have adapted to grow in this harsh environment. Considering little is known about the mechanisms used by bacteria to grow in and spoil beer, transcriptome sequencing was performed on a variant of the beer-spoilage organism Pediococcus claussenii ATCC BAA-344T (Pc344-358). Illumina sequencing was used to compare the transcript levels in Pc344-358 growing mid-exponentially in beer to those in nutrient-rich MRS broth. Various operons demonstrated high gene expression in beer, several of which are involved in nutrient acquisition and overcoming the inhibitory effects of hop compounds. As well, genes functioning in cell membrane modification and biosynthesis demonstrated significantly higher transcript levels in Pc344-358 growing in beer. Three plasmids had the majority of their genes showing increased transcript levels in beer, whereas the two cryptic plasmids showed slightly decreased gene expression. Follow-up analysis of plasmid copy number in both growth environments revealed similar trends, where more copies of the three non-cryptic plasmids were found in Pc344-358 growing in beer. Transcriptome sequencing also enabled the addition of several genes to the P. claussenii ATCC BAA-344T genome annotation, some of which are putatively transcribed as non-coding RNAs. The sequencing results not only provide the first transcriptome description of a beer-spoilage organism while growing in beer, but they also highlight several targets for future exploration, including genes that may have a role in the general stress response of lactic acid bacteria. PMID:24040005

  1. De novo transcriptome sequencing of Acer palmatum and comprehensive analysis of differentially expressed genes under salt stress in two contrasting genotypes.

    Science.gov (United States)

    Rong, Liping; Li, Qianzhong; Li, Shushun; Tang, Ling; Wen, Jing

    2016-04-01

    Maple (Acer palmatum) is an important species for landscape planting worldwide. Salt stress affects the normal growth of the Maple leaf directly, leading to loss of esthetic value. However, the limited availability of Maple genomic information has hindered research on the mechanisms underlying salt tolerance. In this study, we performed comprehensive analyses of the salt tolerance in two genotypes of Maple using RNA-seq. Approximately 146.4 million paired-end reads, representing 181,769 unigenes, were obtained. The N50 length of the unigenes was 738 bp, and their total length exceeded 102.66 Mb. In total, 14,090 simple sequence repeats and over 500,000 single nucleotide polymorphisms were identified, which represent useful resources for marker development. Importantly, 181,769 genes were detected in at least one library, and 303 differentially expressed genes (DEGs) were identified between salt-sensitive and salt-tolerant genotypes. Among these DEGs, 125 were upregulated and 178 were downregulated. Two MYB-related proteins and one LEA protein were detected among the 10 most downregulated genes. Moreover, a methyltransferase-related gene was detected among the 10 most upregulated genes. The three most significantly enriched pathways were plant hormone signal transduction, arginine and proline metabolism, and photosynthesis. The transcriptome analysis provided a rich genetic resource for gene discovery related to salt tolerance in Maple, and in closely related species. The data will serve as an important public information platform to further our understanding of the molecular mechanisms involved in salt tolerance in Maple.
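    Several records in this listing report an N50 length for assembled unigenes or transcripts. As a minimal sketch of how that statistic is commonly computed (the function name and the toy lengths are illustrative assumptions, not data from any of the studies), N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases:

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)   # longest sequences first
    half_total = sum(ordered) / 2             # half of all assembled bases
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # crossed the 50% mark
            return length


# Toy lengths, invented for illustration only.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))  # 500 + 400 = 900 >= 750, so this prints 400
```

    Because N50 is weighted by length, it can differ markedly from the mean sequence length reported alongside it.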

  2. Detection of nucleic acid sequences by invader-directed cleavage

    Science.gov (United States)

    Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.

  3. Transcriptional Slippage and RNA Editing Increase the Diversity of Transcripts in Chloroplasts: Insight from Deep Sequencing of Vigna radiata Genome and Transcriptome.

    Directory of Open Access Journals (Sweden)

    Ching-Ping Lin

    We performed deep sequencing of the nuclear and organellar genomes of three mungbean genotypes: Vigna radiata ssp. sublobata TC1966, V. radiata var. radiata NM92 and the recombinant inbred line RIL59 derived from a cross between TC1966 and NM92. Moreover, we performed deep sequencing of the RIL59 transcriptome to investigate transcript variability. The mungbean chloroplast genome has a quadripartite structure including a pair of inverted repeats separated by two single copy regions. A total of 213 simple sequence repeats were identified in the chloroplast genomes of NM92 and RIL59; 78 single nucleotide variants and nine indels were discovered in comparing the chloroplast genomes of TC1966 and NM92. Analysis of the mungbean chloroplast transcriptome revealed mRNAs that were affected by transcriptional slippage and RNA editing. Transcriptional slippage frequency was positively correlated with the length of simple sequence repeats of the mungbean chloroplast genome (R² = 0.9911). In total, 41 C-to-U editing sites were found in 23 chloroplast genes and in one intergenic spacer. No editing site that swapped U to C was found. A combination of bioinformatics and experimental methods revealed that the plastid-encoded RNA polymerase-transcribed genes psbF and ndhA are affected by transcriptional slippage in mungbean and in main lineages of land plants, including three dicots (Glycine max, Brassica rapa, and Nicotiana tabacum), two monocots (Oryza sativa and Zea mays), two gymnosperms (Pinus taeda and Ginkgo biloba) and one moss (Physcomitrella patens). Transcript analysis of the rps2 gene showed that transcriptional slippage could affect transcripts at single sequence repeat regions with poly-A runs. It showed that transcriptional slippage together with incomplete RNA editing may cause sequence diversity of transcripts in chloroplasts of land plants.

  4. Differential transcriptome profiling of chilling stress response between shoots and rhizomes of Oryza longistaminata using RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Ting Zhang

    Rice (Oryza sativa) is very sensitive to chilling stress at seedling and reproductive stages, whereas wild rice, O. longistaminata, tolerates non-freezing cold temperatures and has overwintering ability. Elucidating the molecular mechanisms of chilling tolerance (CT) in O. longistaminata should thus provide a basis for rice CT improvement through molecular breeding. In this study, high-throughput RNA sequencing was performed to profile global transcriptome alterations and crucial genes involved in response to long-term low temperature in O. longistaminata shoots and rhizomes subjected to 7 days of chilling stress. A total of 605 and 403 genes were respectively identified as up- and down-regulated in O. longistaminata under 7 days of chilling stress, with 354 and 371 differentially expressed genes (DEGs) found exclusively in shoots and rhizomes, respectively. GO enrichment and KEGG pathway analyses revealed that multiple transcriptional regulatory pathways were enriched in commonly induced genes in both tissues; in contrast, only the photosynthesis pathway was prevalent in genes uniquely induced in shoots, whereas several key metabolic pathways and the programmed cell death process were enriched in genes induced only in rhizomes. Further analysis of these tissue-specific DEGs showed that the CBF/DREB1 regulon and other transcription factors (TFs), including AP2/EREBPs, MYBs, and WRKYs, were synergistically involved in transcriptional regulation of chilling stress response in shoots. Different sets of TFs, such as OsERF922, OsNAC9, OsWRKY25, and WRKY74, and eight genes encoding antioxidant enzymes were exclusively activated in rhizomes under long-term low-temperature treatment. Furthermore, several cis-regulatory elements, including the ICE1-binding site, the GATA element for phytochrome regulation, and the W-box for WRKY binding, were highly abundant in both tissues, confirming the involvement of multiple regulatory genes and complex networks in the

  5. Sequencing, De Novo Assembly, and Annotation of the Transcriptome of the Endangered Freshwater Pearl Bivalve, Cristaria plicata, Provides Novel Insights into Functional Genes and Marker Discovery.

    Directory of Open Access Journals (Sweden)

    Bharat Bhusan Patnaik

    The freshwater mussel Cristaria plicata (Bivalvia: Eulamellibranchia: Unionidae) is an economically important species in molluscan aquaculture due to its use in pearl farming. The species has been listed as endangered in South Korea due to the loss of natural habitats caused by anthropogenic activities. The decreasing population and a lack of genomic information on the species are concerning for environmentalists and conservationists. In this study, we conducted a de novo transcriptome sequencing and annotation analysis of C. plicata using Illumina HiSeq 2500 next-generation sequencing (NGS) technology, the Trinity assembler, and bioinformatics databases to prepare a sustainable resource for the identification of candidate genes involved in immunity, defense, and reproduction. The C. plicata transcriptome analysis included a total of 286,152,584 raw reads and 281,322,837 clean reads. The de novo assembly identified a total of 453,931 contigs and 374,794 non-redundant unigenes with average lengths of 731.2 and 737.1 bp, respectively. Furthermore, 100% coverage of C. plicata mitochondrial genes within two unigenes supported the quality of the assembler. In total, 84,274 unigenes showed homology to entries in at least one database, and 23,246 unigenes were allocated to one or more Gene Ontology (GO) terms. The most prominent GO biological process, cellular component, and molecular function categories (level 2) were cellular process, membrane, and binding, respectively. A total of 4,776 unigenes were mapped to 123 biological pathways in the KEGG database. Based on the GO terms and KEGG annotation, the unigenes were suggested to be involved in immunity, stress responses, sex-determination, and reproduction. A total of 17,251 cDNA simple sequence repeats (cSSRs) were identified from 61,141 unigenes (size of >1 kb), with the most abundant being dinucleotide repeats. This dataset represents the first transcriptome analysis of the endangered mollusc, C. plicata

  6. Optimization of de novo transcriptome assembly from high-throughput short read sequencing data improves functional annotation for non-model organisms

    Directory of Open Access Journals (Sweden)

    Haznedaroglu Berat Z

    2012-07-01

    Background: The k-mer hash length is a key factor affecting the output of de novo transcriptome assembly packages using de Bruijn graph algorithms. Assemblies constructed with varying single k-mer choices might result in the loss of unique contiguous sequences (contigs) and relevant biological information. A common solution to this problem is the clustering of single k-mer assemblies. Even though annotation is one of the primary goals of a transcriptome assembly, the success of assembly strategies does not consider the impact of k-mer selection on the annotation output. This study provides an in-depth k-mer selection analysis that is focused on the degree of functional annotation achieved for a non-model organism where no reference genome information is available. Individual k-mers and clustered assemblies (CA) were considered using three representative software packages. Pair-wise comparison analyses (between individual k-mers and CAs) were produced to reveal missing Kyoto Encyclopedia of Genes and Genomes (KEGG) ortholog identifiers (KOIs), and to determine a strategy that maximizes the recovery of biological information in a de novo transcriptome assembly. Results: Analyses of single k-mer assemblies resulted in the generation of various quantities of contigs and functional annotations within the selection window of k-mers (k-19 to k-63). For each k-mer in this window, generated assemblies contained certain unique contigs and KOIs that were not present in the other k-mer assemblies. Producing a non-redundant CA of k-mers 19 to 63 resulted in a more complete functional annotation than any single k-mer assembly. However, a fraction of unique annotations remained (~0.19 to 0.27% of total KOIs) in the assemblies of individual k-mers (k-19 to k-63) that were not present in the non-redundant CA. A workflow to recover these unique annotations is presented. Conclusions: This study demonstrated that different k-mer choices result in various quantities
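    The pair-wise comparison described in this record, between single k-mer assemblies and the clustered assembly (CA), amounts to set differences over annotation identifiers. The following is a minimal sketch under stated assumptions: the function names, the dictionary layout, and the placeholder identifiers ("K1", "K2", ...) are hypothetical and are not taken from the study or its workflow.

```python
def missing_from_clustered(koi_by_k, clustered_kois):
    """For each k, list KOIs found in that single k-mer assembly
    but absent from the clustered assembly."""
    result = {}
    for k, kois in koi_by_k.items():
        missing = kois - clustered_kois
        if missing:
            result[k] = sorted(missing)
    return result


def private_to_each_k(koi_by_k):
    """For each k, list KOIs found in that assembly and in no other
    single k-mer assembly."""
    result = {}
    for k, kois in koi_by_k.items():
        others = set()
        for other_k, other_kois in koi_by_k.items():
            if other_k != k:
                others |= other_kois
        result[k] = sorted(kois - others)
    return result


# Placeholder annotation sets, invented for illustration only.
koi_by_k = {
    19: {"K1", "K2", "K3"},
    31: {"K2", "K3", "K4"},
    63: {"K3", "K5"},
}
clustered = {"K1", "K2", "K3", "K4"}

print(missing_from_clustered(koi_by_k, clustered))  # {63: ['K5']}
print(private_to_each_k(koi_by_k))  # {19: ['K1'], 31: ['K4'], 63: ['K5']}
```

    In this toy case the identifier "K5" would be lost if only the clustered assembly were annotated, which is the kind of residual annotation the abstract says its workflow is meant to recover.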

  7. A NGS approach to the encrusting Mediterranean sponge Crella elegans (Porifera, Demospongiae, Poecilosclerida): transcriptome sequencing, characterization and overview of the gene expression along three life cycle stages.

    Science.gov (United States)

    Pérez-Porro, A R; Navarro-Gómez, D; Uriz, M J; Giribet, G

    2013-05-01

    Sponges can be dominant organisms in many marine and freshwater habitats where they play essential ecological roles. They also represent a key group to address important questions in early metazoan evolution. Recent approaches for improving knowledge on sponge biological and ecological functions as well as on animal evolution have focused on the genetic toolkits involved in ecological responses to environmental changes (biotic and abiotic), development and reproduction. These approaches are possible thanks to newly available, massive sequencing technologies, such as the Illumina platform, which facilitate genome and transcriptome sequencing in a cost-effective manner. Here we present the first NGS (next-generation sequencing) approach to understanding the life cycle of an encrusting marine sponge. For this we sequenced libraries of three different life cycle stages of the Mediterranean sponge Crella elegans and generated de novo transcriptome assemblies. Three assemblies were each based on sponge tissue of a particular life cycle stage: non-reproductive tissue, tissue with sperm cysts and tissue with larvae. The fourth assembly pooled the data from all three stages. By aggregating data from all the different life cycle stages we obtained a higher total number of contigs, contigs with BLAST hits and annotated contigs than from the single-stage assemblies. In that multi-stage assembly we obtained a larger number of the developmental regulatory genes known for metazoans than in any other assembly. We also advance the differential expression of selected genes in the three life cycle stages to explore the potential of RNA-seq for improving knowledge on functional processes along the sponge life cycle. © 2013 Blackwell Publishing Ltd.

  8. Transcriptomic SNP discovery for custom genotyping arrays: impacts of sequence data, SNP calling method and genotyping technology on the probability of validation success.

    Science.gov (United States)

    Humble, Emily; Thorne, Michael A S; Forcada, Jaume; Hoffman, Joseph I

    2016-08-26

    Single nucleotide polymorphism (SNP) discovery is an important goal of many studies. However, the number of 'putative' SNPs discovered from a sequence resource may not provide a reliable indication of the number that will successfully validate with a given genotyping technology. For this it may be necessary to account for factors such as the method used for SNP discovery and the type of sequence data from which it originates, suitability of the SNP flanking sequences for probe design, and genomic context. To explore the relative importance of these and other factors, we used Illumina sequencing to augment an existing Roche 454 transcriptome assembly for the Antarctic fur seal (Arctocephalus gazella). We then mapped the raw Illumina reads to the new hybrid transcriptome using BWA and BOWTIE2 before calling SNPs with GATK. The resulting markers were pooled with two existing sets of SNPs called from the original 454 assembly using NEWBLER and SWAP454. Finally, we explored the extent to which SNPs discovered using these four methods overlapped and predicted the corresponding validation outcomes for both Illumina Infinium iSelect HD and Affymetrix Axiom arrays. Collating markers across all discovery methods resulted in a global list of 34,718 SNPs. However, concordance between the methods was surprisingly poor, with only 51.0 % of SNPs being discovered by more than one method and 13.5 % being called from both the 454 and Illumina datasets. Using a predictive modeling approach, we could also show that SNPs called from the Illumina data were on average more likely to successfully validate, as were SNPs called by more than one method. Above and beyond this pattern, predicted validation outcomes were also consistently better for Affymetrix Axiom arrays. Our results suggest that focusing on SNPs called by more than one method could potentially improve validation outcomes. They also highlight possible differences between alternative genotyping technologies that could be

  9. Simultaneous sequencing of coding and noncoding RNA reveals a human transcriptome dominated by a small number of highly expressed noncoding genes.

    Science.gov (United States)

    Boivin, Vincent; Deschamps-Francoeur, Gabrielle; Couture, Sonia; Nottingham, Ryan M; Bouchard-Bourelle, Philia; Lambowitz, Alan M; Scott, Michelle S; Abou-Elela, Sherif

    2018-07-01

    Comparing the abundance of one RNA molecule to another is crucial for understanding cellular functions but most sequencing techniques can target only specific subsets of RNA. In this study, we used a new fragmented ribodepleted TGIRT sequencing (FRT) method that uses a thermostable group II intron reverse transcriptase (TGIRT) to generate a portrait of the human transcriptome depicting the quantitative relationship of all classes of nonribosomal RNA longer than 60 nt. Comparison between different sequencing methods indicated that FRT is more accurate in ranking both mRNA and noncoding RNA than viral reverse transcriptase-based sequencing methods, even those that specifically target these species. Measurements of RNA abundance in different cell lines using this method correlate with biochemical estimates, confirming tRNA as the most abundant nonribosomal RNA biotype. However, the single most abundant transcript is 7SL RNA, a component of the signal recognition particle. Structured noncoding RNAs (sncRNAs) associated with the same biological process are expressed at similar levels, with the exception of RNAs with multiple functions like U1 snRNA. In general, sncRNAs forming RNPs are hundreds to thousands of times more abundant than their mRNA counterparts. Surprisingly, only 50 sncRNA genes produce half of the non-rRNA transcripts detected in two different cell lines. Together the results indicate that the human transcriptome is dominated by a small number of highly expressed sncRNAs specializing in functions related to translation and splicing. © 2018 Boivin et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  10. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    DEFF Research Database (Denmark)

    Nielsen, Peter E.

    2008-01-01

    sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technology of protein dsDNA structures. (c) 2008 American Institute of Physics.

  11. High-throughput sequencing and pathway analysis reveal alteration of the pituitary transcriptome by 17α-ethynylestradiol (EE2) in female coho salmon, Oncorhynchus kisutch

    Energy Technology Data Exchange (ETDEWEB)

    Harding, Louisa B. [School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA 98195 (United States); Schultz, Irvin R. [Battelle, Marine Sciences Laboratory – Pacific Northwest National Laboratory, 1529 West Sequim Bay Road, Sequim, WA 98382 (United States); Goetz, Giles W. [School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA 98195 (United States); Luckenbach, J. Adam [Northwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, 2725 Montlake Blvd E, Seattle, WA 98112 (United States); Center for Reproductive Biology, Washington State University, Pullman, WA 98164 (United States); Young, Graham [School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA 98195 (United States); Center for Reproductive Biology, Washington State University, Pullman, WA 98164 (United States); Goetz, Frederick W. [Northwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Manchester Research Station, P.O. Box 130, Manchester, WA 98353 (United States); Swanson, Penny, E-mail: penny.swanson@noaa.gov [Northwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, 2725 Montlake Blvd E, Seattle, WA 98112 (United States); Center for Reproductive Biology, Washington State University, Pullman, WA 98164 (United States)

    2013-10-15

    Highlights: •Studied impacts of ethynylestradiol (EE2) exposure on salmon pituitary transcriptome. •High-throughput sequencing, RNAseq, and pathway analysis were performed. •EE2 altered mRNAs for genes in circadian rhythm, GnRH, and TGFβ signaling pathways. •LH and FSH beta subunit mRNAs were most highly up- and down-regulated by EE2, respectively. •Estrogens may alter processes associated with reproductive timing in salmon. -- Abstract: Considerable research has been done on the effects of endocrine disrupting chemicals (EDCs) on reproduction and gene expression in the brain, liver and gonads of teleost fish, but information on impacts to the pituitary gland is still limited despite its central role in regulating reproduction. The aim of this study was to further our understanding of the potential effects of natural and synthetic estrogens on the brain–pituitary–gonad axis in fish by determining the effects of 17α-ethynylestradiol (EE2) on the pituitary transcriptome. We exposed sub-adult coho salmon (Oncorhynchus kisutch) to 0 or 12 ng EE2/L for up to 6 weeks and effects on the pituitary transcriptome of females were assessed using high-throughput Illumina® sequencing, RNA-Seq and pathway analysis. After 1 or 6 weeks, 218 and 670 contiguous sequences (contigs) respectively, were differentially expressed in pituitaries of EE2-exposed fish relative to control. Two of the most highly up- and down-regulated contigs were luteinizing hormone β subunit (241-fold and 395-fold at 1 and 6 weeks, respectively) and follicle-stimulating hormone β subunit (−3.4-fold at 6 weeks). Additional contigs related to gonadotropin synthesis and release were differentially expressed in EE2-exposed fish relative to controls. These included contigs involved in gonadotropin releasing hormone (GNRH) and transforming growth factor-β signaling. There was an over-representation of significantly affected contigs in 33 and 18 canonical pathways at 1 and 6 weeks

  12. High-throughput sequencing and pathway analysis reveal alteration of the pituitary transcriptome by 17α-ethynylestradiol (EE2) in female coho salmon, Oncorhynchus kisutch

    International Nuclear Information System (INIS)

    Harding, Louisa B.; Schultz, Irvin R.; Goetz, Giles W.; Luckenbach, J. Adam; Young, Graham; Goetz, Frederick W.; Swanson, Penny

    2013-01-01

    Highlights: •Studied impacts of ethynylestradiol (EE2) exposure on salmon pituitary transcriptome. •High-throughput sequencing, RNAseq, and pathway analysis were performed. •EE2 altered mRNAs for genes in circadian rhythm, GnRH, and TGFβ signaling pathways. •LH and FSH beta subunit mRNAs were most highly up- and down-regulated by EE2, respectively. •Estrogens may alter processes associated with reproductive timing in salmon. -- Abstract: Considerable research has been done on the effects of endocrine disrupting chemicals (EDCs) on reproduction and gene expression in the brain, liver and gonads of teleost fish, but information on impacts to the pituitary gland is still limited despite its central role in regulating reproduction. The aim of this study was to further our understanding of the potential effects of natural and synthetic estrogens on the brain–pituitary–gonad axis in fish by determining the effects of 17α-ethynylestradiol (EE2) on the pituitary transcriptome. We exposed sub-adult coho salmon (Oncorhynchus kisutch) to 0 or 12 ng EE2/L for up to 6 weeks and effects on the pituitary transcriptome of females were assessed using high-throughput Illumina® sequencing, RNA-Seq and pathway analysis. After 1 or 6 weeks, 218 and 670 contiguous sequences (contigs) respectively, were differentially expressed in pituitaries of EE2-exposed fish relative to control. Two of the most highly up- and down-regulated contigs were luteinizing hormone β subunit (241-fold and 395-fold at 1 and 6 weeks, respectively) and follicle-stimulating hormone β subunit (−3.4-fold at 6 weeks). Additional contigs related to gonadotropin synthesis and release were differentially expressed in EE2-exposed fish relative to controls. These included contigs involved in gonadotropin releasing hormone (GNRH) and transforming growth factor-β signaling. There was an over-representation of significantly affected contigs in 33 and 18 canonical pathways at 1 and 6 weeks

  13. A resource of large-scale molecular markers for monitoring Agropyron cristatum chromatin introgression in wheat background based on transcriptome sequences.

    Science.gov (United States)

    Zhang, Jinpeng; Liu, Weihua; Lu, Yuqing; Liu, Qunxing; Yang, Xinming; Li, Xiuquan; Li, Lihui

    2017-09-20

    Agropyron cristatum is a wild grass of the tribe Triticeae and serves as a gene donor for wheat improvement. However, very few markers can be used to monitor A. cristatum chromatin introgressions in wheat. Here, we report a resource of large-scale molecular markers for tracking alien introgressions in wheat based on transcriptome sequences. By aligning A. cristatum unigenes with the Chinese Spring reference genome sequences, we designed 9602 A. cristatum expressed sequence tag-sequence-tagged site (EST-STS) markers for PCR amplification and experimental screening. As a result, 6063 polymorphic EST-STS markers were specific for the A. cristatum P genome in the single-receipt wheat background. A total of 4956 randomly selected polymorphic EST-STS markers were further tested in eight wheat variety backgrounds, and 3070 markers displaying stable and polymorphic amplification were validated. These markers covered more than 98% of the A. cristatum genome, and the marker distribution density was approximately 1.28 cM. An application case of all EST-STS markers was validated on the A. cristatum 6P chromosome. These markers were successfully applied in the tracking of alien A. cristatum chromatin. Altogether, this study provides a universal method of large-scale molecular marker development to monitor wild relative chromatin in wheat.

  14. Development and Evaluation of a Novel Set of EST-SSR Markers Based on Transcriptome Sequences of Black Locust (Robinia pseudoacacia L.).

    Science.gov (United States)

    Guo, Qi; Wang, Jin-Xing; Su, Li-Zhuo; Lv, Wei; Sun, Yu-Han; Li, Yun

    2017-07-07

    Black locust (Robinia pseudoacacia L. of the family Fabaceae) is an ecologically and economically important deciduous tree. However, few genomic resources are available for this forest species, and few effective expressed sequence tag-derived simple sequence repeat (EST-SSR) markers have been developed to date. In this study, paired-end sequencing was used to sequence transcriptomes of R. pseudoacacia on the Illumina HiSeq™ 2000 platform, and EST-SSR loci were identified by de novo assembly. Furthermore, a total of 1697 primer pairs were successfully designed, from which 286 primers met the selection screening criteria; 94 pairs were randomly selected and tested for validation using polymerase chain reaction amplification. Forty-five primers were verified as polymorphic, with clear bands. The polymorphism information content values were 0.033-0.765, the number of alleles per locus ranged from 2 to 10, and the observed and expected heterozygosities were 0.000-0.931 and 0.035-0.810, respectively, indicating a high level of informativeness. Subsequently, 45 polymorphic EST-SSR loci were tested for amplification efficiency, using the verified primers, in an additional nine species of Leguminosae; 23 loci were amplified in more than three species, of which two loci were amplified successfully in all species. These EST-SSR markers provide a valuable tool for investigating the genetic diversity and population structure of R. pseudoacacia, constructing a DNA fingerprint database, performing quantitative trait locus mapping, and preserving genetic information.
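    The abstract reports polymorphism information content (PIC) and expected heterozygosity per locus without giving formulas. As a hedged illustration, the sketch below uses the standard textbook definitions, He = 1 - Σ p_i² and PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², computed from allele frequencies. Whether the authors used exactly these estimators (or a sample-size-corrected variant) is an assumption, and the function names are hypothetical.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """Expected heterozygosity He = 1 - sum(p_i^2) for allele frequencies p_i."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.

    Standard definition; assumed here, not taken from the abstract.
    """
    homozygosity = sum(p * p for p in freqs)
    # Correction term summed over all unordered pairs of distinct alleles.
    cross_term = sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    # Illustrative locus with two equally frequent alleles (made-up values).
    locus = [0.5, 0.5]
    print(expected_heterozygosity(locus))
    print(polymorphism_information_content(locus))
```

    For a locus with a single fixed allele both quantities are zero, which matches the intuition that a monomorphic marker carries no information.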

  15. Exploiting sequence and stability information for directing nanobody stability engineering.

    Science.gov (United States)

    Kunz, Patrick; Flock, Tilman; Soler, Nicolas; Zaiss, Moritz; Vincke, Cécile; Sterckx, Yann; Kastelic, Damjana; Muyldermans, Serge; Hoheisel, Jörg D

    2017-09-01

    Variable domains of camelid heavy-chain antibodies, commonly named nanobodies, have high biotechnological potential. In view of their broad range of applications in research, diagnostics and therapy, engineering their stability is of particular interest. One important aspect is the improvement of thermostability, because it can have immediate effects on conformational stability, protease resistance and aggregation propensity of the protein. We analyzed the sequences and thermostabilities of 78 purified nanobody binders. From this data, potentially stabilizing amino acid variations were identified and studied experimentally. Some mutations improved the stability of nanobodies by up to 6.1°C, with an average of 2.3°C across eight modified nanobodies. The stabilizing mechanism involves an improvement of both conformational stability and aggregation behavior, explaining the variable degree of stabilization in individual molecules. In some instances, variations predicted to be stabilizing actually led to thermal destabilization of the proteins. The reasons for this contradiction between prediction and experiment were investigated. The results reveal a mutational strategy to improve the biophysical behavior of nanobody binders and indicate a species-specificity of nanobody architecture. This study illustrates the potential and limitations of engineering nanobody thermostability by merging sequence information with stability data, an aspect that is becoming increasingly important with the recent development of high-throughput biophysical methods. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  16. Transcriptome sequencing of the blind subterranean mole rat, Spalax galili: Utility and potential for the discovery of novel evolutionary patterns

    KAUST Repository

    Malik, Assaf; Korol, Abraham; Hübner, Sariel; Hernandez, Alvaro G.; Thimmapuram, Jyothi; Ali, Shahjahan; Glaser, Fabian; Paz, Arnon; Avivi, Aaron; Band, Mark

    2011-01-01

    sequencing of Spalax galili, a chromosomal type of S. ehrenbergi. cDNA pools from muscle and brain tissues isolated from animals exposed to hypoxic and normoxic conditions were sequenced using Sanger, GS FLX, and GS FLX Titanium technologies. Assembly

  17. SNP design from 454 sequencing of Podosphaera plantaginis transcriptome reveals a genetically diverse pathogen metapopulation with high levels of mixed-genotype infection.

    Directory of Open Access Journals (Sweden)

    Charlotte Tollenaere

    Molecular tools may greatly improve our understanding of pathogen evolution and epidemiology but technical constraints have hindered the development of genetic resources for parasites compared to free-living organisms. This study aims at developing molecular tools for Podosphaera plantaginis, an obligate fungal pathogen of Plantago lanceolata. This interaction has been intensively studied in the Åland archipelago of Finland with epidemiological data collected from over 4,000 host populations annually since 2001. A cDNA library of a pooled sample of fungal conidia was sequenced on the 454 GS-FLX platform. Over 549,411 reads were obtained and annotated into 45,245 contigs. Annotation data was acquired for 65.2% of the assembled sequences. The transcriptome assembly was screened for SNP loci, as well as for functionally important genes (mating-type genes and potential effector proteins). A genotyping assay of 27 SNP loci was designed and tested on 380 infected leaf samples from 80 populations within the Åland archipelago. With this panel we identified 85 multilocus genotypes (MLG) with uneven frequencies across the pathogen metapopulation. Approximately half of the sampled populations contain polymorphism. Our genotyping protocol revealed mixed-genotype infection within a single host leaf to be common. Mixed infection has been proposed as one of the main drivers of pathogen evolution, and hence may be an important process in this pathosystem. The developed SNP panel offers exciting research perspectives for future studies in this well-characterized pathosystem. Also, the transcriptome provides an invaluable novel genomic resource for powdery mildews, which cause significant yield losses on commercially important crops annually. Furthermore, the features that render genetic studies in this system a challenge are shared with the majority of obligate parasitic species, and hence our results provide methodological insights from SNP calling to field

  18. Whole transcriptome analysis of Acinetobacter baumannii assessed by RNA-sequencing reveals different mRNA expression profiles in biofilm compared to planktonic cells.

    Directory of Open Access Journals (Sweden)

    Soraya Rumbo-Feal

    Acinetobacter baumannii has emerged as a dangerous opportunistic pathogen, with many strains able to form biofilms and thus cause persistent infections. The aim of the present study was to use high-throughput sequencing techniques to establish complete transcriptome profiles of planktonic (free-living) and sessile (biofilm) forms of A. baumannii ATCC 17978 and thereby identify differences in their gene expression patterns. Collections of mRNA from planktonic (both exponential and stationary phase cultures) and sessile (biofilm) cells were sequenced. Six mRNA libraries were prepared following the mRNA-Seq protocols from Illumina. Reads were obtained in a HiScanSQ platform and mapped against the complete genome to describe the complete mRNA transcriptomes of planktonic and sessile cells. The results showed that the gene expression pattern of A. baumannii biofilm cells was distinct from that of planktonic cells, including 1621 genes over-expressed in biofilms relative to stationary phase cells and 55 genes expressed only in biofilms. These differences suggested important changes in amino acid and fatty acid metabolism, motility, active transport, DNA-methylation, iron acquisition, transcriptional regulation, and quorum sensing, among other processes. Disruption or deletion of five of these genes caused a significant decrease in biofilm formation ability in the corresponding mutant strains. Among the genes over-expressed in biofilm cells were those in an operon involved in quorum sensing. One of them, encoding an acyl carrier protein, was shown to be involved in biofilm formation as demonstrated by the significant decrease in biofilm formation by the corresponding knockout strain. The present work serves as a basis for future studies examining the complex network systems that regulate bacterial biofilm formation and maintenance.

  19. Transcriptome-Wide Analysis of Botrytis elliptica Responsive microRNAs and Their Targets in Lilium Regale Wilson by High-Throughput Sequencing and Degradome Analysis

    Directory of Open Access Journals (Sweden)

    Xue Gao

    2017-05-01

    MicroRNAs, as master regulators of gene expression, have been widely identified and play crucial roles in plant-pathogen interactions. A fatal pathogen, Botrytis elliptica, causes a serious foliar disease of lily, which reduces production because of the high susceptibility of most cultivated species. However, the miRNAs related to Botrytis infection of lily, and the miRNA-mediated gene regulatory networks providing resistance to B. elliptica in lily remain largely unexplored. To systematically dissect B. elliptica-responsive miRNAs and their target genes, three small RNA libraries were constructed from the leaves of Lilium regale, a promising Chinese wild Lilium species, which had been subjected to mock B. elliptica treatment or B. elliptica infection for 6 and 24 h. By high-throughput sequencing, 71 known miRNAs belonging to 47 conserved families and 24 novel miRNAs were identified, of which 18 miRNAs were downregulated and 13 were upregulated in response to B. elliptica. Moreover, based on the lily mRNA transcriptome, 22 targets for 9 known and 1 novel miRNAs were identified by the degradome sequencing approach. Most target genes for B. elliptica-responsive miRNAs were involved in metabolic processes, with a few encoding different transcription factors, including ELONGATION FACTOR 1 ALPHA (EF1a) and TEOSINTE BRANCHED1/CYCLOIDEA/PROLIFERATING CELL FACTOR 2 (TCP2). Furthermore, the expression patterns of a set of B. elliptica-responsive miRNAs and their targets were validated by quantitative real-time PCR. This study represents the first transcriptome-based analysis of miRNAs responsive to B. elliptica and their targets in lily. The results reveal the possible regulatory roles of miRNAs and their targets in B. elliptica interaction, which will extend our understanding of the mechanisms of this disease in lily.

  20. Gaucher disease: transcriptome analyses using microarray or mRNA sequencing in a Gba1 mutant mouse model treated with velaglucerase alfa or imiglucerase.

    Directory of Open Access Journals (Sweden)

    Nupur Dasgupta

    Gaucher disease type 1, an inherited lysosomal storage disorder, is caused by mutations in GBA1 leading to defective glucocerebrosidase (GCase) function and consequent excess accumulation of glucosylceramide/glucosylsphingosine in visceral organs. Enzyme replacement therapy (ERT) with the biosimilars, imiglucerase (imig) or velaglucerase alfa (vela), improves/reverses the visceral disease. Comparative transcriptomic effects (microarray and mRNA-Seq) of no ERT and ERT (imig or vela) were done with liver, lung, and spleen from mice having Gba1 mutant alleles, termed D409V/null. Disease-related molecular effects, dynamic ranges, and sensitivities were compared between mRNA-Seq and microarrays and their respective analytic tools, i.e. Mixed Model ANOVA (microarray), and DESeq and edgeR (mRNA-Seq). While similar gene expression patterns were observed with both platforms, mRNA-Seq identified more differentially expressed genes (DEGs) (∼3-fold) than the microarrays. Among the three analytic tools, DESeq identified the maximum number of DEGs for all tissues and treatments. DESeq and edgeR comparisons revealed differences in DEGs identified. In 9V/null liver, spleen and lung, post-therapy transcriptomes approximated WT, were partially reverted, and had little change, respectively, and were concordant with the corresponding histological and biochemical findings. DEG overlaps were only 8-20% between mRNA-Seq and microarray, but the biological pathways were similar. Cell growth and proliferation, cell cycle, heme metabolism, and mitochondrial dysfunction were most altered with the Gaucher disease process. Imig and vela differentially affected specific disease pathways. Differential molecular responses were observed in direct transcriptome comparisons from imig- and vela-treated tissues. These results provide cross-validation for the mRNA-Seq and microarray platforms, and show differences between the molecular effects of two highly structurally similar ERT

  1. Bioinformatic prediction of G protein-coupled receptor encoding sequences from the transcriptome of the foreleg, including the Haller's organ, of the cattle tick, Rhipicephalus australis.

    Directory of Open Access Journals (Sweden)

    Sergio Munoz

    The cattle tick of Australia, Rhipicephalus australis, is a vector for microbial parasites that cause serious bovine diseases. The Haller's organ, located in the tick's forelegs, is crucial for host detection and mating. To facilitate the development of new technologies for better control of this agricultural pest, we aimed to sequence and annotate the transcriptome of the R. australis forelegs and associated tissues, including the Haller's organ. As G protein-coupled receptors (GPCRs) are an important family of eukaryotic proteins studied as pharmaceutical targets in humans, we prioritized the identification and classification of the GPCRs expressed in the foreleg tissues. The two forelegs from adult R. australis were excised, RNA extracted, and pyrosequenced with 454 technology. Reads were assembled into unigenes and annotated by sequence similarity. Python scripts were written to find open reading frames (ORFs) from each unigene. These ORFs were analyzed by different GPCR prediction approaches based on sequence alignments, support vector machines, hidden Markov models, and principal component analysis. GPCRs consistently predicted by multiple methods were further studied by phylogenetic analysis and 3D homology modeling. From 4,782 assembled unigenes, 40,907 possible ORFs were predicted. Using Blastp, Pfam, GPCRpred, TMHMM, and PCA-GPCR, a basic set of 46 GPCR candidates were compiled and a phylogenetic tree was constructed. With further screening of tertiary structures predicted by RaptorX, 6 likely GPCRs emerged and the strongest candidate was classified by PCA-GPCR to be a GABAB receptor.
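    The abstract mentions Python scripts that find ORFs in each unigene but does not describe them. As a rough illustration only, one common approach is a six-frame scan for stretches that begin with ATG and end at the first in-frame stop codon. The sketch below follows that approach; the function names, the `min_codons` threshold, and the ATG-to-stop definition of an ORF are all assumptions and are not taken from the authors' scripts.

```python
# Hypothetical six-frame ORF scan; not the authors' actual scripts.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """List ORFs as (strand, frame, start, end, orf_sequence).

    An ORF here runs from an ATG to the first in-frame stop codon
    (stop included). start/end are 0-based, end-exclusive positions
    on the scanned strand. min_codons is an assumed length cutoff.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            # Walk the strand codon by codon in this reading frame.
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None
    return orfs


if __name__ == "__main__":
    # Toy sequence (made up) containing one short ORF on the forward strand.
    for orf in find_orfs("CCATGAAATTTGGGTAACC", min_codons=5):
        print(orf)
```

    A scan of this kind reports many candidate ORFs per transcript, which is consistent with the abstract's count of far more ORFs than unigenes; downstream filters such as the GPCR predictors named above would then narrow the set.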

  2. Characterization of Fusobacterium varium Fv113-g1 isolated from a patient with ulcerative colitis based on complete genome sequence and transcriptome analysis.

    Directory of Open Access Journals (Sweden)

    Tsuyoshi Sekizuka

    Fusobacterium spp. present in the oral and gut flora are carcinogenic and are associated with the risk of pancreatic and colorectal cancers. Fusobacterium spp. are also implicated in a broad spectrum of human pathologies, including Crohn's disease and ulcerative colitis (UC). Here we report the complete genome sequence of Fusobacterium varium Fv113-g1 (genome size, 3.96 Mb) isolated from a patient with UC. Comparative genome analyses suggested that Fv113-g1 is essentially assignable to F. varium; in particular, it could be reclassified as a notable F. varium subsp. similar to F. ulcerans because of partially shared orthologs. Compared with the genome sequences of F. varium ATCC 27725 (genome size, 3.30 Mb) and other strains of Fusobacterium spp., Fv113-g1 possesses many accessory pan-genome sequences with noteworthy multiple virulence factors, including 44 autotransporters (type V secretion system, T5SS) and 13 Fusobacterium adhesion (FadA) paralogs involved in potential mucosal inflammation. Indeed, transcriptome analysis demonstrated that Fv113-g1-specific accessory genes, such as multiple T5SS and fadA paralogs, showed notably increased expression in D-MEM cultivation compared with brain heart infusion broth. This implied that growth conditions may enhance the expression of such potential virulence factors, leading to remarkable survival against other gut microorganisms and to pathogenicity toward the human intestinal epithelium.

  3. De novo transcriptome sequence assembly and identification of AP2/ERF transcription factor related to abiotic stress in parsley (Petroselinum crispum).

    Directory of Open Access Journals (Sweden)

    Meng-Yao Li

    Parsley is an important biennial Apiaceae species that is widely cultivated as herb, spice, and vegetable. Previous studies on parsley principally focused on its physiological and biochemical properties, including phenolic compound and volatile oil contents. However, little is known about the molecular and genetic properties of parsley. In this study, 23,686,707 high-quality reads were obtained and assembled into 81,852 transcripts and 50,161 unigenes for the first time. Functional annotation showed that 30,516 unigenes had sequence similarity to known genes. In addition, 3,244 putative simple sequence repeats were detected in curly parsley. Finally, 1,569 of the identified unigenes belonged to 58 transcription factor families. Various abiotic stresses have a strong detrimental effect on the yield and quality of parsley. AP2/ERF transcription factors have important functions in plant development, hormonal regulation, and abiotic response. A total of 88 putative AP2/ERF factors were identified from the transcriptome sequence of parsley. Seven AP2/ERF transcription factors were selected in this study to analyze the expression profiles of parsley under different abiotic stresses. Our data provide a potentially valuable resource that can be used for intensive parsley research.

  4. De novo transcriptome sequence assembly and identification of AP2/ERF transcription factor related to abiotic stress in parsley (Petroselinum crispum).

    Science.gov (United States)

    Li, Meng-Yao; Tan, Hua-Wei; Wang, Feng; Jiang, Qian; Xu, Zhi-Sheng; Tian, Chang; Xiong, Ai-Sheng

    2014-01-01

    Parsley is an important biennial Apiaceae species that is widely cultivated as herb, spice, and vegetable. Previous studies on parsley principally focused on its physiological and biochemical properties, including phenolic compound and volatile oil contents. However, little is known about the molecular and genetic properties of parsley. In this study, 23,686,707 high-quality reads were obtained and assembled into 81,852 transcripts and 50,161 unigenes for the first time. Functional annotation showed that 30,516 unigenes had sequence similarity to known genes. In addition, 3,244 putative simple sequence repeats were detected in curly parsley. Finally, 1,569 of the identified unigenes belonged to 58 transcription factor families. Various abiotic stresses have a strong detrimental effect on the yield and quality of parsley. AP2/ERF transcription factors have important functions in plant development, hormonal regulation, and abiotic response. A total of 88 putative AP2/ERF factors were identified from the transcriptome sequence of parsley. Seven AP2/ERF transcription factors were selected in this study to analyze the expression profiles of parsley under different abiotic stresses. Our data provide a potentially valuable resource that can be used for intensive parsley research.

  5. An Extended Multilocus Sequence Typing (MLST) Scheme for Rapid Direct Typing of Leptospira from Clinical Samples

    OpenAIRE

    Weiss, Sabrina; Menezes, Angela; Woods, Kate; Chanthongthip, Anisone; Dittrich, Sabine; Opoku-Boateng, Agatha; Kimuli, Maimuna; Chalker, Victoria

    2016-01-01

    Background Rapid typing of Leptospira is currently impaired by requiring time consuming culture of leptospires. The objective of this study was to develop an assay that provides multilocus sequence typing (MLST) data direct from patient specimens while minimising costs for subsequent sequencing. Methodology and Findings An existing PCR based MLST scheme was modified by designing nested primers including anchors for facilitated subsequent sequencing. The assay was applied to various specimen t...

  6. Population genetics, phylogenomics and hybrid speciation of Juglans in China determined from whole chloroplast genomes, transcriptomes, and genotyping-by-sequencing (GBS).

    Science.gov (United States)

    Zhao, Peng; Zhou, Hui-Juan; Potter, Daniel; Hu, Yi-Heng; Feng, Xiao-Jia; Dang, Meng; Feng, Li; Zulfiqar, Saman; Liu, Wen-Zhe; Zhao, Gui-Fang; Woeste, Keith

    2018-04-18

    Genomic data are a powerful tool for elucidating the processes involved in the evolution and divergence of species. The speciation and phylogenetic relationships among Chinese Juglans remain unclear. Here, we used results from phylogenomic and population genetic analyses, transcriptomics, Genotyping-By-Sequencing (GBS), and whole chloroplast genomes (Cp genome) data to infer processes of lineage formation among the five native Chinese species of the walnut genus (Juglans, Juglandaceae), a widespread, economically important group. We found that the processes of isolation generated diversity during glaciations, but that the recent range expansion of J. regia, probably from multiple refugia, led to hybrid formation both within and between sections of the genus. In southern China, human dispersal of J. regia brought it into contact with J. sigillata, which we determined to be an ecotype of J. regia that is now maintained as a landrace. In northern China, walnut hybridized with a distinct lineage of J. mandshurica to form J. hopeiensis, a controversial taxon (considered threatened) that our data indicate is a horticultural variety. Comparisons among whole chloroplast genomes and nuclear transcriptome analyses provided conflicting evidence for the timing of the divergence of Chinese Juglans taxa. J. cathayensis and J. mandshurica are poorly differentiated based on our genomic data. Reconstruction of Juglans evolutionary history indicates that episodes of climatic variation over the past 4.5 to 33.80 million years, associated with glacial advances and retreats and population isolation, have shaped Chinese walnut demography and evolution, even in the presence of gene flow and introgression. Copyright © 2018 Elsevier Inc. All rights reserved.

  7. Next generation sequencing and de novo transcriptome analysis of Costus pictus D. Don, a non-model plant with potent anti-diabetic properties

    Directory of Open Access Journals (Sweden)

    Annadurai Ramasamy S

    2012-11-01

    Full Text Available Abstract Background Phyto-remedies for diabetic control are popular among patients with Type II Diabetes mellitus (DM), in addition to other diabetic control measures. A number of plant species are known to possess diabetic control properties. Costus pictus D. Don is popularly known as “Insulin Plant” in Southern India whose leaves have been reported to increase insulin pools in blood plasma. Next Generation Sequencing is employed as a powerful tool for identifying molecular signatures in the transcriptome related to physiological functions of plant tissues. We sequenced the leaf transcriptome of C. pictus using Illumina reversible dye terminator sequencing technology and used a combination of bioinformatics tools for identifying transcripts related to anti-diabetic properties of C. pictus. Results A total of 55,006 transcripts were identified, of which 69.15% transcripts could be annotated. We identified transcripts related to pathways of bixin biosynthesis and geraniol and geranial biosynthesis as major transcripts from the class of isoprenoid secondary metabolites and validated the presence of putative norbixin methyltransferase, a precursor of Bixin. The transcripts encoding these terpenoids are known to be Peroxisome Proliferator-Activated Receptor (PPAR) agonists and anti-glycation agents. Sequential extraction and High Performance Liquid Chromatography (HPLC) confirmed the presence of bixin in C. pictus methanolic extracts. Another significant transcript identified in relation to anti-diabetic, anti-obesity and immuno-modulation is of Abscisic Acid biosynthetic pathway. We also report many other transcripts for the biosynthesis of antitumor, anti-oxidant and antimicrobial metabolites of C. pictus leaves. Conclusion Solid molecular signatures (transcripts related to bixin, abscisic acid, and geranial and geraniol biosynthesis) for the anti-diabetic properties of C. pictus leaves and vital clues related to the other phytochemical functions

  8. Retrospective Identification of Herpes Simplex 2 Virus-Associated Acute Liver Failure in an Immunocompetent Patient Detected Using Whole Transcriptome Shotgun Sequencing.

    Science.gov (United States)

    Ono, Atsushi; Hayes, C Nelson; Akamatsu, Sakura; Imamura, Michio; Aikata, Hiroshi; Chayama, Kazuaki

    2017-01-01

    Acute liver failure (ALF) is a severe condition in which liver function rapidly deteriorates in individuals without prior history of liver disease. While most cases result from acetaminophen overdose or viral hepatitis, in up to a third of patients, no clear cause can be identified. Liver transplantation has greatly reduced mortality among these patients, but 40% of patients recover without liver transplantation. Therefore, there is an urgent need for rapid determination of the etiology of acute liver failure. In this case report, we present a case of herpes simplex 2 virus- (HSV-) associated ALF in an immunocompetent patient. The patient recovered without LT, but the presence of HSV was not suspected at the time, precluding more effective treatment with acyclovir. To determine the etiology, stored blood samples were analyzed using whole transcriptome shotgun sequencing followed by mapping to a panel of viral reference sequences. The presence of HSV-DNA in blood samples at the time of admission was confirmed using real-time polymerase chain reaction, and, at the time of discharge, HSV-DNA levels had decreased by a factor of 10^6. Conclusions. In ALF cases of undetermined etiology, uncommon causes should be considered, especially those for which an effective treatment is available.

  9. Retrospective Identification of Herpes Simplex 2 Virus-Associated Acute Liver Failure in an Immunocompetent Patient Detected Using Whole Transcriptome Shotgun Sequencing

    Directory of Open Access Journals (Sweden)

    Atsushi Ono

    2017-01-01

    Full Text Available Acute liver failure (ALF) is a severe condition in which liver function rapidly deteriorates in individuals without prior history of liver disease. While most cases result from acetaminophen overdose or viral hepatitis, in up to a third of patients, no clear cause can be identified. Liver transplantation has greatly reduced mortality among these patients, but 40% of patients recover without liver transplantation. Therefore, there is an urgent need for rapid determination of the etiology of acute liver failure. In this case report, we present a case of herpes simplex 2 virus- (HSV-) associated ALF in an immunocompetent patient. The patient recovered without LT, but the presence of HSV was not suspected at the time, precluding more effective treatment with acyclovir. To determine the etiology, stored blood samples were analyzed using whole transcriptome shotgun sequencing followed by mapping to a panel of viral reference sequences. The presence of HSV-DNA in blood samples at the time of admission was confirmed using real-time polymerase chain reaction, and, at the time of discharge, HSV-DNA levels had decreased by a factor of 10^6. Conclusions. In ALF cases of undetermined etiology, uncommon causes should be considered, especially those for which an effective treatment is available.

  10. De novo RNA sequencing transcriptome of Rhododendron obtusum identified the early heat response genes involved in the transcriptional regulation of photosynthesis.

    Directory of Open Access Journals (Sweden)

    Linchuan Fang

    Full Text Available Rhododendron spp. is an important ornamental species that is widely cultivated for landscape worldwide. Heat stress is a major obstacle for its cultivation in south China. Previous studies on rhododendron principally focused on its physiological and biochemical processes, which are involved in a series of stress tolerance. However, molecular or genetic properties of rhododendron's response to heat stress are still poorly understood. The phenotype and chlorophyll fluorescence kinetics parameters of four rhododendron cultivars were compared under normal or heat stress conditions, and a cultivar with highest heat tolerance, "Yanzhimi" (R. obtusum) was selected for transcriptome sequencing. A total of 325,429,240 high quality reads were obtained and assembled into 395,561 transcripts and 92,463 unigenes. Functional annotation showed that 38,724 unigenes had sequence similarity to known genes in at least one of the proteins or nucleotide databases used in this study. These 38,724 unigenes were categorized into 51 functional groups based on Gene Ontology classification and were blasted to 24 known cluster of orthologous groups. A total of 973 identified unigenes belonged to 57 transcription factor families, including the stress-related HSF, DREB, ZNF, and NAC genes. Photosynthesis was significantly enriched in the Kyoto Encyclopedia of Genes and Genomes pathway, and the changed expression pattern was illustrated. The key pathways and signaling components that contribute to heat tolerance in rhododendron were revealed. These results provide a potentially valuable resource that can be used for heat-tolerance breeding.

  11. De novo RNA sequencing transcriptome of Rhododendron obtusum identified the early heat response genes involved in the transcriptional regulation of photosynthesis

    Science.gov (United States)

    Tong, Jun; Dong, Yanfang; Xu, Dongyun; Mao, Jing; Zhou, Yuan

    2017-01-01

    Rhododendron spp. is an important ornamental species that is widely cultivated for landscape worldwide. Heat stress is a major obstacle for its cultivation in south China. Previous studies on rhododendron principally focused on its physiological and biochemical processes, which are involved in a series of stress tolerance. However, molecular or genetic properties of rhododendron’s response to heat stress are still poorly understood. The phenotype and chlorophyll fluorescence kinetics parameters of four rhododendron cultivars were compared under normal or heat stress conditions, and a cultivar with highest heat tolerance, “Yanzhimi” (R. obtusum) was selected for transcriptome sequencing. A total of 325,429,240 high quality reads were obtained and assembled into 395,561 transcripts and 92,463 unigenes. Functional annotation showed that 38,724 unigenes had sequence similarity to known genes in at least one of the proteins or nucleotide databases used in this study. These 38,724 unigenes were categorized into 51 functional groups based on Gene Ontology classification and were blasted to 24 known cluster of orthologous groups. A total of 973 identified unigenes belonged to 57 transcription factor families, including the stress-related HSF, DREB, ZNF, and NAC genes. Photosynthesis was significantly enriched in the Kyoto Encyclopedia of Genes and Genomes pathway, and the changed expression pattern was illustrated. The key pathways and signaling components that contribute to heat tolerance in rhododendron were revealed. These results provide a potentially valuable resource that can be used for heat-tolerance breeding. PMID:29059200

  12. Impact of SO(2) on Arabidopsis thaliana transcriptome in wildtype and sulfite oxidase knockout plants analyzed by RNA deep sequencing.

    Science.gov (United States)

    Hamisch, Domenica; Randewig, Dörte; Schliesky, Simon; Bräutigam, Andrea; Weber, Andreas P M; Geffers, Robert; Herschbach, Cornelia; Rennenberg, Heinz; Mendel, Ralf R; Hänsch, Robert

    2012-12-01

    High concentrations of sulfur dioxide (SO(2)) as an air pollutant, and its derivative sulfite, cause abiotic stress that can lead to cell death. It is currently unknown to what extent plant fumigation triggers specific transcriptional responses. To address this question, and to test the hypothesis that sulfite oxidase (SO) is acting in SO(2) detoxification, we compared Arabidopsis wildtype (WT) and SO knockout lines (SO-KO) facing the impact of 600 nl l(-1) SO(2), using RNAseq to quantify absolute transcript abundances. These transcriptome data were correlated to sulfur metabolism-related enzyme activities and metabolites obtained from identical samples in a previous study. SO-KO plants exhibited remarkable and broad regulative responses at the mRNA level, especially in transcripts related to sulfur metabolism enzymes, but also in those related to stress response and senescence. Focusing on SO regulation, no alterations were detectable in the WT, whereas in SO-KO plants we found up-regulation of two splice variants of the SO gene, although this gene is not functional in this line. Our data provide evidence for the highly specific coregulation between SO and sulfur-related enzymes like APS reductase, and suggest two novel candidates for involvement in SO(2) detoxification: an apoplastic peroxidase, and defensins as putative cysteine mass storages. © 2012 The Authors. New Phytologist © 2012 New Phytologist Trust.

  13. Direct typing of Canine parvovirus (CPV) from infected dog faeces by rapid mini sequencing technique.

    Science.gov (United States)

    V, Pavana Jyothi; S, Akila; Selvan, Malini K; Naidu, Hariprasad; Raghunathan, Shwethaa; Kota, Sathish; Sundaram, R C Raja; Rana, Samir Kumar; Raj, G Dhinakar; Srinivasan, V A; Mohana Subramanian, B

    2016-12-01

    Canine parvovirus (CPV) is a non-enveloped single stranded DNA virus with an icosahedral capsid. Mini-sequencing based CPV typing was developed earlier to detect and differentiate all the CPV types and FPV in a single reaction. This technique was further evaluated in the present study by performing the mini-sequencing directly from fecal samples, which avoided tedious virus isolation steps by cell culture system. Fecal swab samples were collected from 84 dogs with enteritis symptoms, suggestive of parvoviral infection, from different locations across India. Seventy six of these samples were positive by PCR; the subsequent mini-sequencing reaction typed 74 of them as type 2a virus, and 2 samples as type 2b. Additionally, 25 of the positive samples were typed by cycle sequencing of PCR products. Direct CPV typing from fecal samples using mini-sequencing showed 100% correlation with CPV typing by cycle sequencing. Moreover, CPV typing was achieved by mini-sequencing even with faintly positive PCR amplicons, which was not possible by cycle sequencing. Therefore, the mini-sequencing technique is recommended for regular epidemiological follow up of CPV types, since the technique is a rapid, highly sensitive and high-capacity method for CPV typing. Copyright © 2016. Published by Elsevier B.V.

  14. The genome sequence and transcriptome of Potentilla micrantha and their comparison to Fragaria vesca (the woodland strawberry).

    Science.gov (United States)

    Buti, Matteo; Moretto, Marco; Barghini, Elena; Mascagni, Flavia; Natali, Lucia; Brilli, Matteo; Lomsadze, Alexandre; Sonego, Paolo; Giongo, Lara; Alonge, Michael; Velasco, Riccardo; Varotto, Claudio; Šurbanovski, Nada; Borodovsky, Mark; Ward, Judson A; Engelen, Kristof; Cavallini, Andrea; Cestaro, Alessandro; Sargent, Daniel James

    2018-04-01

    The genus Potentilla is closely related to that of Fragaria, the economically important strawberry genus. Potentilla micrantha is a species that does not develop berries but shares numerous morphological and ecological characteristics with Fragaria vesca. These similarities make P. micrantha an attractive choice for comparative genomics studies with F. vesca. In this study, the P. micrantha genome was sequenced and annotated, and RNA-Seq data from the different developmental stages of flowering and fruiting were used to develop a set of gene predictions. A 327 Mbp sequence and annotation of the genome of P. micrantha, spanning 2674 sequence contigs, with an N50 size of 335,712, estimated to cover 80% of the total genome size of the species was developed. The genus Potentilla has a characteristically larger genome size than Fragaria, but the recovered sequence scaffolds were remarkably collinear at the micro-syntenic level with the genome of F. vesca, its closest sequenced relative. A total of 33,602 genes were predicted, and 95.1% of bench-marking universal single-copy orthologous genes were complete within the presented sequence. Thus, we argue that the majority of the gene-rich regions of the genome have been sequenced. Comparisons of RNA-Seq data from the stages of floral and fruit development revealed genes differentially expressed between P. micrantha and F. vesca. The data presented are a valuable resource for future studies of berry development in Fragaria and the Rosaceae and they also shed light on the evolution of genome size and organization in this family.
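
    The N50 quoted in this abstract is a standard contiguity statistic: the contig length L such that contigs of length L or longer together hold at least half of the assembled bases. A minimal Python sketch follows, assuming nothing about the authors' pipeline; the function name and the toy contig lengths are hypothetical and are not the P. micrantha data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half of the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly, for illustration only.
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))
```

    With these toy lengths the two longest contigs (100 + 80) already cover more than half of the 300 assembled bases, so the statistic is the length of the second contig.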

  15. Leaf Transcriptome Sequencing for Identifying Genic-SSR Markers and SNP Heterozygosity in Crossbred Mango Variety 'Amrapali' (Mangifera indica L.).

    Science.gov (United States)

    Mahato, Ajay Kumar; Sharma, Nimisha; Singh, Akshay; Srivastav, Manish; Jaiprakash; Singh, Sanjay Kumar; Singh, Anand Kumar; Sharma, Tilak Raj; Singh, Nagendra Kumar

    2016-01-01

    Mango (Mangifera indica L.) is called "king of fruits" due to its sweetness, richness of taste, diversity, large production volume and a variety of end usage. Despite its huge economic importance, genomic resources in mango are scarce and the genetics of useful horticultural traits are poorly understood. Here we generated deep coverage leaf RNA sequence data for mango parental varieties 'Neelam', 'Dashehari' and their hybrid 'Amrapali' using next generation sequencing technologies. De novo sequence assembly generated 27,528, 20,771 and 35,182 transcripts for the three genotypes, respectively. The transcripts were further assembled into a non-redundant set of 70,057 unigenes that were used for SSR and SNP identification and annotation. A total of 5,465 SSR loci were identified in 4,912 unigenes with 288 type I SSR (n ≥ 20 bp). One hundred type I SSR markers were randomly selected, of which 43 yielded PCR amplicons of expected size in the first round of validation and were designated as validated genic-SSR markers. Further, 22,306 SNPs were identified by aligning high quality sequence reads of the three mango varieties to the reference unigene set, revealing significantly enhanced SNP heterozygosity in the hybrid Amrapali. The present study on leaf RNA sequencing of mango varieties and their hybrid provides a useful genomic resource for genetic improvement of mango.

  16. Co-transcriptomic Analysis by RNA Sequencing to Simultaneously Measure Regulated Gene Expression in Host and Bacterial Pathogen

    KAUST Repository

    Ravasi, Timothy; Mavromatis, Charalampos Harris; Bokil, Nilesh J.; Schembri, Mark A.; Sweet, Matthew J.

    2016-01-01

    Intramacrophage pathogens subvert antimicrobial defence pathways using various mechanisms, including the targeting of host TLR-mediated transcriptional responses. Conversely, TLR-inducible host defence mechanisms subject intramacrophage pathogens to stress, thus altering pathogen gene expression programs. Important biological insights can thus be gained through the analysis of gene expression changes in both the host and the pathogen during an infection. Traditionally, research methods have involved the use of qPCR, microarrays and/or RNA sequencing to identify transcriptional changes in either the host or the pathogen. Here we describe the application of RNA sequencing using samples obtained from in vitro infection assays to simultaneously quantify both host and bacterial pathogen gene expression changes, as well as general approaches that can be undertaken to interpret the RNA sequencing data that is generated. These methods can be used to provide insights into host TLR-regulated transcriptional responses to microbial challenge, as well as pathogen subversion mechanisms against such responses.

  17. Co-transcriptomic Analysis by RNA Sequencing to Simultaneously Measure Regulated Gene Expression in Host and Bacterial Pathogen

    KAUST Repository

    Ravasi, Timothy

    2016-01-24

    Intramacrophage pathogens subvert antimicrobial defence pathways using various mechanisms, including the targeting of host TLR-mediated transcriptional responses. Conversely, TLR-inducible host defence mechanisms subject intramacrophage pathogens to stress, thus altering pathogen gene expression programs. Important biological insights can thus be gained through the analysis of gene expression changes in both the host and the pathogen during an infection. Traditionally, research methods have involved the use of qPCR, microarrays and/or RNA sequencing to identify transcriptional changes in either the host or the pathogen. Here we describe the application of RNA sequencing using samples obtained from in vitro infection assays to simultaneously quantify both host and bacterial pathogen gene expression changes, as well as general approaches that can be undertaken to interpret the RNA sequencing data that is generated. These methods can be used to provide insights into host TLR-regulated transcriptional responses to microbial challenge, as well as pathogen subversion mechanisms against such responses.

  18. Transcriptome sequencing of lentil based on second-generation technology permits large-scale unigene assembly and SSR marker discovery

    Directory of Open Access Journals (Sweden)

    Materne Michael

    2011-05-01

    Full Text Available Abstract Background Lentil (Lens culinaris Medik.) is a cool-season grain legume which provides a rich source of protein for human consumption. In terms of genomic resources, lentil is relatively underdeveloped, in comparison to other Fabaceae species, with limited available data. There is hence a significant need to enhance such resources in order to identify novel genes and alleles for molecular breeding to increase crop productivity and quality. Results Tissue-specific cDNA samples from six distinct lentil genotypes were sequenced using Roche 454 GS-FLX Titanium technology, generating c. 1.38 × 10^6 expressed sequence tags (ESTs). De novo assembly generated a total of 15,354 contigs and 68,715 singletons. The complete unigene set was sequence-analysed against genome drafts of the model legume species Medicago truncatula and Arabidopsis thaliana to identify 12,639, and 7,476 unique matches, respectively. When compared to the genome of Glycine max, a total of 20,419 unique hits were observed corresponding to c. 31% of the known gene space. A total of 25,592 lentil unigenes were subsequently annotated from GenBank. Simple sequence repeat (SSR)-containing ESTs were identified from consensus sequences and a total of 2,393 primer pairs were designed. A subset of 192 EST-SSR markers was screened for validation across a panel of 12 cultivated lentil genotypes and one wild relative species. A total of 166 primer pairs obtained successful amplification, of which 47.5% detected genetic polymorphism. Conclusions A substantial collection of ESTs has been developed from sequence analysis of lentil genotypes using second-generation technology, permitting unigene definition across a broad range of functional categories. As well as providing resources for functional genomics studies, the unigene set has permitted significant enhancement of the number of publicly-available molecular genetic markers as tools for improvement of this species.

  19. Sequencing, de novo assembly and characterization of the spotted scat Scatophagus argus (Linnaeus 1766) transcriptome for discovery of reproduction related genes and SSRs

    Science.gov (United States)

    Yang, Wei; Chen, Huapu; Cui, Xuefan; Zhang, Kewei; Jiang, Dongneng; Deng, Siping; Zhu, Chunhua; Li, Guangli

    2017-09-01

    Spotted scat (Scatophagus argus) is an economically important farmed fish, particularly in East and Southeast Asia. Because there has been little research on reproductive development and regulation in this species, the lack of a mature artificial reproduction technology remains a barrier for the sustainable development of the aquaculture industry. More genetic and genomic background knowledge is urgently needed for an in-depth understanding of the molecular mechanism of reproductive process and identification of functional genes related to sexual differentiation, gonad maturation and gametogenesis. For these reasons, we performed transcriptomic analysis on spotted scat using a multiple tissue sample mixing strategy. The Illumina RNA sequencing generated 118 510 486 raw reads. After trimming, de novo assembly was performed and yielded 99 888 unigenes with an average length of 905.75 bp. A total of 45 015 unigenes were successfully annotated to the Nr, Swiss-Prot, KOG and KEGG databases. Additionally, 23 783 and 27 183 annotated unigenes were assigned to 56 Gene Ontology (GO) functional groups and 228 KEGG pathways, respectively. Subsequently, 2 474 transcripts associated with reproduction were selected using GO term and KEGG pathway assignments, and a number of reproduction-related genes involved in sex differentiation, gonad development and gametogenesis were identified. Furthermore, 22 279 simple sequence repeat (SSR) loci were discovered and characterized. The comprehensive transcript dataset described here greatly increases the genetic information available for spotted scat and contributes valuable sequence resources for functional gene mining and analysis. Candidate transcripts involved in reproduction would make good starting points for future studies on reproductive mechanisms, and the putative sex differentiation-related genes will be helpful for sex-determining gene identification and sex-specific marker isolation. Lastly, the SSRs can serve as marker

  20. De novo sequencing and analysis of the Ulva linza transcriptome to discover putative mechanisms associated with its successful colonization of coastal ecosystems

    Directory of Open Access Journals (Sweden)

    Zhang Xiaowen

    2012-10-01

    Full Text Available Abstract Background The green algal genus Ulva Linnaeus (Ulvaceae, Ulvales, Chlorophyta) is well known for its wide distribution in marine, freshwater, and brackish environments throughout the world. The Ulva species are also highly tolerant of variations in salinity, temperature, and irradiance and are the main cause of green tides, which can have deleterious ecological effects. However, limited genomic information is currently available in this non-model and ecologically important species. Ulva linza is a species that inhabits bedrock in the mid to low intertidal zone, and it is a major contributor to biofouling. Here, we presented the global characterization of the U. linza transcriptome using the Roche GS FLX Titanium platform, with the aim of uncovering the genomic mechanisms underlying rapid and successful colonization of the coastal ecosystems. Results De novo assembly of 382,884 reads generated 13,426 contigs with an average length of 1,000 bases. Contiguous sequences were further assembled into 10,784 isotigs with an average length of 1,515 bases. A total of 304,101 reads were nominally identified by BLAST; 4,368 isotigs were functionally annotated with 13,550 GO terms, and 2,404 isotigs having enzyme commission (EC) numbers were assigned to 262 KEGG pathways. When compared with four other fully sequenced green algae, 3,457 unique isotigs were found in U. linza and 18 conserved in land plants. In addition, a specific photoprotective mechanism based on both LhcSR and PsbS proteins and a C4-like carbon-concentrating mechanism were found, which may help U. linza survive stress conditions. At least 19 transporters for essential inorganic nutrients (i.e., nitrogen, phosphorus, and sulphur) were responsible for its ability to take up inorganic nutrients, and at least 25 eukaryotic cytochrome P450s, which is a higher number than that found in other algae, may be related to their strong allelopathy.
Multi-origination of the stress related proteins

  1. Transcriptome sequencing of mung bean (Vigna radiate L.) genes and the identification of EST-SSR markers.

    Science.gov (United States)

    Chen, Honglin; Wang, Lixia; Wang, Suhua; Liu, Chunji; Blair, Matthew Wohlgemuth; Cheng, Xuzhen

    2015-01-01

    Mung bean (Vigna radiate (L.) Wilczek) is an important traditional food legume crop, with high economic and nutritional value. It is widely grown in China and other Asian countries. Despite its importance, genomic information is currently unavailable for this crop plant species or some of its close relatives in the Vigna genus. In this study, more than 103 million high quality cDNA sequence reads were obtained from mung bean using Illumina paired-end sequencing technology. The processed reads were assembled into 48,693 unigenes with an average length of 874 bp. Of these unigenes, 25,820 (53.0%) and 23,235 (47.7%) showed significant similarity to proteins in the NCBI non-redundant protein and nucleotide sequence databases, respectively. Furthermore, 19,242 (39.5%) could be classified into gene ontology categories, 18,316 (37.6%) into Swiss-Prot categories and 10,918 (22.4%) into KOG database categories (E-value SSR), and 2,303 sequences contained more than one SSR together in the same expressed sequence tag (EST). A total of 13,134 EST-SSRs were identified as potential molecular markers, with mono-nucleotide A/T repeats being the most abundant motif class and G/C repeats being rare. In this SSR analysis, we found five main repeat motifs: AG/CT (30.8%), GAA/TTC (12.6%), AAAT/ATTT (6.8%), AAAAT/ATTTT (6.2%) and AAAAAT/ATTTTT (1.9%). A total of 200 SSR loci were randomly selected for validation by PCR amplification as EST-SSR markers. Of these, 66 marker primer pairs produced reproducible amplicons that were polymorphic among 31 mung bean accessions selected from diverse geographical locations. The large number of SSR-containing sequences found in this study will be valuable for the construction of high-resolution genetic linkage maps, association or comparative mapping and genetic analyses of various Vigna species.
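
    The motif classes reported in this abstract (AG/CT, GAA/TTC and so on) are perfect tandem repeats, which can be located with a back-referencing regular expression. The Python sketch below illustrates the general idea only. It is not the tool the authors used, and the minimum repeat counts in MIN_REPEATS, the function name and the test sequence are all hypothetical, since the abstract does not state the thresholds.

```python
import re

# Hypothetical minimum repeat counts per motif length (mono- to hexa-nucleotide).
# The abstract does not give the thresholds used in the study.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq):
    """Return (start, motif, repeat_count) for perfect SSRs on the given strand."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # Group 2 is the k-base motif; \2{n,} requires n further copies of it.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" (mono-nucleotide) or "ATAT" (di-nucleotide).
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(1)) // k))
    return sorted(hits)


# Made-up sequence containing one (AG)7 and one (GAA)5 repeat.
example = "GGC" + "AG" * 7 + "TTC" + "GAA" * 5 + "C"
for start, motif, count in find_ssrs(example):
    print(start, motif, count)
```

    The sketch scans a single strand, so grouping complementary motifs into one class (for example AG with CT) would need an extra normalisation step that is left out here.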

  2. Uncovering the immune responses of Apis mellifera ligustica larval gut to Ascosphaera apis infection utilizing transcriptome sequencing.

    Science.gov (United States)

    Chen, Dafu; Guo, Rui; Xu, Xijian; Xiong, Cuiling; Liang, Qin; Zheng, Yanzhen; Luo, Qun; Zhang, Zhaonan; Huang, Zhijian; Kumar, Dhiraj; Xi, Weijun; Zou, Xuan; Liu, Min

    2017-07-20

    Honeybees are susceptible to a variety of diseases, including chalkbrood, which is capable of causing huge losses of both the number of bees and colony productivity. This research is designed to characterize the transcriptome profiles of Ascosphaera apis-treated and un-treated larval guts of Apis mellifera ligustica in an attempt to unravel the molecular mechanism underlying the immune responses of western honeybee larval guts to mycosis. In this study, 24, 296 and 2157 genes were observed to be differentially expressed in A. apis-treated Apis mellifera (4-, 5- and 6-day-old) compared with un-treated larval guts. Moreover, the expression patterns of differentially expressed genes (DEGs) were examined via trend analysis, and subsequently, gene ontology analysis and KEGG pathway enrichment analysis were conducted for DEGs involved in up- and down-regulated profiles. Immunity-related pathways were selected for further analysis, and our results demonstrated that a total of 13 and 50 DEGs were annotated in the humoral immune-related and cellular immune-related pathways, respectively. Additionally, we observed that many DEGs up-regulated in treated guts were part of cellular immune pathways, such as the lysosome, ubiquitin mediated proteolysis, and insect hormone biosynthesis pathways and were induced by A. apis invasion. However, more down-regulated DEGs were restrained. Surprisingly, a majority of DEGs within the Toll-like receptor signaling pathway, and the MAPK signaling pathway were up-regulated in treated guts, while all but two genes involved in the NF-κB signaling pathway were down-regulated, which suggested that most genes involved in humoral immune-related pathways were activated in response to the invasive fungal pathogen. This study's findings provide valuable information regarding the investigation of the molecular mechanism of immunity defenses of A. m. ligustica larval guts to infection with A. apis. Furthermore, these studies lay the groundwork for

  3. Transcriptome analysis of carnation (Dianthus caryophyllus L.) based on next-generation sequencing technology

    OpenAIRE

    Tanase Koji; Nishitani Chikako; Hirakawa Hideki; Isobe Sachiko; Tabata Satoshi; Ohmiya Akemi; Onozaki Takashi

    2012-01-01

    Background: Carnation (Dianthus caryophyllus L.), in the family Caryophyllaceae, can be found in a wide range of colors and is a model system for studies of flower senescence. In addition, it is one of the most important flowers in the global floriculture industry. However, few genomics resources, such as sequences and markers, are available for carnation or other members of the Caryophyllaceae. To increase our understanding of the genetic control of important characters in carnation, ...

  4. Transcriptome Characterization and Sequencing-Based Identification of Salt-Responsive Genes in Millettia pinnata, a Semi-Mangrove Plant

    OpenAIRE

    Huang, Jianzi; Lu, Xiang; Yan, Hao; Chen, Shouyi; Zhang, Wanke; Huang, Rongfeng; Zheng, Yizhi

    2012-01-01

    Semi-mangroves form a group of transitional species between glycophytes and halophytes, and hold unique potential for learning molecular mechanisms underlying plant salt tolerance. Millettia pinnata is a semi-mangrove plant that can survive a wide range of saline conditions in the absence of specialized morphological and physiological traits. By employing the Illumina sequencing platform, we generated ∼192 million short reads from four cDNA libraries of M. pinnata and processed them into 108 ...

  5. Identification of rat genes by TWINSCAN gene prediction, RT-PCR, and direct sequencing

    DEFF Research Database (Denmark)

    Wu, Jia Qian; Shteynberg, David; Arumugam, Manimozhiyan

    2004-01-01

    The publication of a draft sequence of a third mammalian genome--that of the rat--suggests a need to rethink genome annotation. New mammalian sequences will not receive the kind of labor-intensive annotation efforts that are currently being devoted to human. In this paper, we demonstrate an alternative approach: reverse transcription-polymerase chain reaction (RT-PCR) and direct sequencing based on dual-genome de novo predictions from TWINSCAN. We tested 444 TWINSCAN-predicted rat genes that showed significant homology to known human genes implicated in disease but that were partially ... in the single-intron experiment. Spliced sequences were amplified in 46 cases (34%). We conclude that this procedure for elucidating gene structures with native cDNA sequences is cost-effective and will become even more so as it is further optimized.

  6. DeepProbe: Information Directed Sequence Understanding and Chatbot Design via Recurrent Neural Networks

    OpenAIRE

    Yin, Zi; Chang, Keng-hao; Zhang, Ruofei

    2017-01-01

    Information extraction and user intention identification are central topics in modern query understanding and recommendation systems. In this paper, we propose DeepProbe, a generic information-directed interaction framework which is built around an attention-based sequence to sequence (seq2seq) recurrent neural network. DeepProbe can rephrase, evaluate, and even actively ask questions, leveraging the generative ability and likelihood estimation made possible by seq2seq models. DeepProbe makes...

  7. Profiling of secondary metabolite gene clusters regulated by LaeA in Aspergillus niger FGSC A1279 based on genome sequencing and transcriptome analysis.

    Science.gov (United States)

    Wang, Bin; Lv, Yangyong; Li, Xuejie; Lin, Yiying; Deng, Hai; Pan, Li

    The global regulator LaeA controls the production of many fungal secondary metabolites, possibly via chromatin remodeling. Here we aimed to survey the secondary metabolite profile regulated by LaeA in Aspergillus niger FGSC A1279 by genome sequencing and comparative transcriptomics between the laeA deletion (ΔlaeA) and overexpressing (OE-laeA) mutants. Genome sequencing revealed four putative polyketide synthase genes specific to FGSC A1279, suggesting that the corresponding polyketide compounds might be unique to FGSC A1279. RNA-seq data revealed 281 putative secondary metabolite genes upregulated in the OE-laeA mutants, including 22 secondary metabolite backbone genes. LC-MS chemical profiling illustrated that many secondary metabolites were produced in OE-laeA mutants compared to wild type and ΔlaeA mutants, providing potential resources for drug discovery. KEGG analysis annotated 16 secondary metabolite clusters putatively linked to metabolic pathways. Furthermore, 34 of 61 Zn2Cys6 transcription factors located in secondary metabolite clusters were differentially expressed between ΔlaeA and OE-laeA mutants. Three secondary metabolite clusters (clusters 18, 30 and 33) containing Zn2Cys6 transcription factors that were upregulated in OE-laeA mutants were putatively linked to KEGG pathways, suggesting that Zn2Cys6 transcription factors might play an important role in synthesizing secondary metabolites regulated by LaeA. Taken together, LaeA dramatically influences the secondary metabolite profile in FGSC A1279. Copyright © 2017 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  8. Diversity and Genome Analysis of Australian and Global Oilseed Brassica napus L. Germplasm Using Transcriptomics and Whole Genome Re-sequencing

    Directory of Open Access Journals (Sweden)

    M. Michelle Malmberg

    2018-04-01

    Intensive breeding of Brassica napus has resulted in relatively low diversity, such that B. napus would benefit from germplasm improvement schemes that sustain diversity. As such, samples representative of global germplasm pools need to be assessed for existing population structure, diversity and linkage disequilibrium (LD). Complexity reduction genotyping-by-sequencing (GBS) methods, including GBS-transcriptomics (GBS-t), enable cost-effective screening of a large number of samples, while whole genome re-sequencing (WGR) delivers the ability to generate large numbers of unbiased genomic single nucleotide polymorphisms (SNPs), and identify structural variants (SVs). Furthermore, the development of genomic tools based on whole genomes representative of global oilseed diversity and orientated by the reference genome has substantial industry relevance and will be highly beneficial for canola breeding. As recent studies have focused on European and Chinese varieties, a global diversity panel as well as a substantial number of Australian spring types were included in this study. Focusing on industry relevance, 633 varieties were initially genotyped using GBS-t to examine population structure using 61,037 SNPs. Subsequently, 149 samples representative of global diversity were selected for WGR and both data sets used for a side-by-side evaluation of diversity and LD. The WGR data was further used to develop genomic resources consisting of a list of 4,029,750 high-confidence SNPs annotated using SnpEff, and SVs in the form of 10,976 deletions and 2,556 insertions. These resources form the basis of a reliable and repeatable system allowing greater integration between canola genomics studies, with a strong focus on breeding germplasm and industry applicability.

  9. Transcriptome Sequencing Analysis Reveals a Difference in Monoterpene Biosynthesis between Scented Lilium ‘Siberia’ and Unscented Lilium ‘Novano’

    Directory of Open Access Journals (Sweden)

    Zenghui Hu

    2017-08-01

    Lilium is a world famous fragrant bulb flower with high ornamental and economic values, and significant differences in fragrance are found among different Lilium genotypes. In order to explore the mechanism underlying the different fragrances, the floral scents of Lilium ‘Siberia’, with a strong fragrance, and Lilium ‘Novano’, with a very faint fragrance, were collected in vivo using a dynamic headspace technique. These scents were identified using automated thermal desorption-gas chromatography/mass spectrometry (ATD-GC/MS) at different flowering stages. We used the RNA-Seq technique to determine the petal transcriptome at the full-bloom stage and analyzed differentially expressed genes (DEGs) to investigate the molecular mechanism of floral scent biosynthesis. The results showed that a significantly higher amount of Lilium ‘Siberia’ floral scent was released compared with Lilium ‘Novano’. Moreover, monoterpenes played a dominant role in the floral scent of Lilium ‘Siberia’; therefore, it is believed that the different emissions of monoterpenes mainly contributed to the difference in the floral scent between the two Lilium genotypes. Transcriptome sequencing analysis indicated that ~29.24 Gb of raw data were generated and assembled into 124,233 unigenes, of which 35,749 unigenes were annotated. Through a comparison of gene expression between these two Lilium genotypes, 6,496 DEGs were identified. The genes in the terpenoid backbone biosynthesis pathway showed significantly different expression levels. The gene expressions of 1-deoxy-D-xylulose 5-phosphate synthase (DXS), 1-deoxy-D-xylulose-5-phosphate reductoisomerase (DXR), 4-hydroxy-3-methylbut-2-enyl diphosphate synthase (HDS), 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HDR), isopentenyl diphosphate isomerase (IDI), and geranyl diphosphate synthase (GPS/GGPS) were upregulated in Lilium ‘Siberia’ compared to Lilium ‘Novano’, and two monoterpene synthase genes

  10. TCW: transcriptome computational workbench.

    Science.gov (United States)

    Soderlund, Carol; Nelson, William; Willer, Mark; Gang, David R

    2013-01-01

    The analysis of transcriptome data involves many steps and various programs, along with organization of large amounts of data and results. Without a methodical approach for storage, analysis and query, the resulting ad hoc analysis can lead to human error, loss of data and results, inefficient use of time, and lack of verifiability, repeatability, and extensibility. The Transcriptome Computational Workbench (TCW) provides Java graphical interfaces for methodical analysis for both single and comparative transcriptome data without the use of a reference genome (e.g. for non-model organisms). The singleTCW interface steps the user through importing transcript sequences (e.g. Illumina) or assembling long sequences (e.g. Sanger, 454, transcripts), annotating the sequences, and performing differential expression analysis using published statistical programs in R. The data, metadata, and results are stored in a MySQL database. The multiTCW interface builds a comparison database by importing sequence and annotation from one or more single TCW databases, executes the ESTscan program to translate the sequences into proteins, and then incorporates one or more clusterings, where the clustering options are to execute the orthoMCL program, compute transitive closure, or import clusters. Both singleTCW and multiTCW allow extensive query and display of the results, where singleTCW displays the alignment of annotation hits to transcript sequences, and multiTCW displays multiple transcript alignments with MUSCLE or pairwise alignments. The query programs can be executed on the desktop for fastest analysis, or from the web for sharing the results. It is now affordable to buy a multi-processor machine, and easy to install Java and MySQL. By simply downloading the TCW, the user can interactively analyze, query and view their data. The TCW allows in-depth data mining of the results, which can lead to a better understanding of the transcriptome. 
TCW is freely available from www.agcol.arizona.edu/software/tcw.
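    The "compute transitive closure" clustering option mentioned above can be illustrated with a small union-find sketch. This is not TCW's code (TCW is a Java application); the function name, the input format (pairs of sequence IDs that share a hit) and the example IDs are assumptions made purely for illustration.

```python
from collections import defaultdict


def transitive_closure_clusters(pairs):
    """Group IDs so that any two IDs connected by a chain of pairwise hits
    end up in the same cluster (hypothetical helper, not part of TCW)."""
    parent = {}

    def find(x):
        # Register unseen IDs as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = defaultdict(set)
    for seq_id in list(parent):
        clusters[find(seq_id)].add(seq_id)
    # Sort members and clusters so the output is deterministic.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Made-up pairwise hits between transcripts of three species.
hits = [("sp1_t1", "sp2_t7"), ("sp2_t7", "sp3_t2"), ("sp1_t9", "sp3_t4")]
for cluster in transitive_closure_clusters(hits):
    print(cluster)
```

    Here sp1_t1 and sp3_t2 land in one cluster even though they share no direct hit, which is what distinguishes transitive closure from requiring all-against-all similarity.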

  11. Laser Capture and Deep Sequencing Reveals the Transcriptomic Programmes Regulating the Onset of Pancreas and Liver Differentiation in Human Embryos

    Directory of Open Access Journals (Sweden)

    Rachel E. Jennings

    2017-11-01

    To interrogate the alternative fates of pancreas and liver in the earliest stages of human organogenesis, we developed laser capture, RNA amplification, and computational analysis of deep sequencing. Pancreas-enriched gene expression was less conserved between human and mouse than for liver. The dorsal pancreatic bud was enriched for components of Notch, Wnt, BMP, and FGF signaling, almost all genes known to cause pancreatic agenesis or hypoplasia, and over 30 unexplored transcription factors. SOX9 and RORA were imputed as key regulators in pancreas compared with EP300, HNF4A, and FOXA family members in liver. Analyses implied that current in vitro human stem cell differentiation follows a dorsal rather than a ventral pancreatic program and pointed to additional factors for hepatic differentiation. In summary, we provide the transcriptional codes regulating the start of human liver and pancreas development to facilitate stem cell research and clinical interpretation without inter-species extrapolation.

  12. The generation and utilization of a cancer-oriented representation of the human transcriptome by using expressed sequence tags

    DEFF Research Database (Denmark)

    Brentani, Helena; Caballero, Otávia L; Camargo, Anamaria A

    2003-01-01

    expressed sequence tags (ESTs) from human tumors and their corresponding normal tissues in the public databases. The data currently define approximately 23,500 genes, of which only approximately 1,250 are still represented only by ESTs. Examination of the EST coverage of known cancer-related (CR) genes...... reveals that ESTs, indicating that the representation of genes associated with commonly studied tumors is high. The careful recording of the origin of all ESTs we have produced has enabled detailed definition of where the genes they represent are expressed in the human body....... More than 100,000 ESTs are available for seven tissues, indicating a surprising variability of gene usage that has led to the discovery of a significant number of genes with restricted expression, and that may thus be therapeutically useful. The ESTs also reveal novel nonsynonymous germline variants...

  13. Transcriptome Sequencing of Diverse Peanut (Arachis) Wild Species and the Cultivated Species Reveals a Wealth of Untapped Genetic Variability

    Directory of Open Access Journals (Sweden)

    Ratan Chopra

    2016-12-01

    To test the hypothesis that the cultivated peanut species possesses almost no molecular variability, we sequenced a diverse panel of 22 Arachis accessions representing Arachis hypogaea botanical classes, A-, B-, and K- genome diploids, a synthetic amphidiploid, and a tetraploid wild species. RNASeq was performed on pools of three tissues, and de novo assembly was performed. Realignment of individual accession reads to transcripts of the cultivar OLin identified 306,820 biallelic SNPs. Among 10 naturally occurring tetraploid accessions, 40,382 unique homozygous SNPs were identified in 14,719 contigs. In eight diploid accessions, 291,115 unique SNPs were identified in 26,320 contigs. The average SNP rate among the 10 cultivated tetraploids was 0.5, and among eight diploids was 9.2 per 1000 bp. Diversity analysis indicated grouping of diploids according to genome classification, and cultivated tetraploids by subspecies. Cluster analysis of variants indicated that sequences of B genome species were the most similar to the tetraploids, and the next closest diploid accession belonged to the A genome species. A subset of 66 SNPs selected from the dataset was validated; of 782 SNP calls, 636 (81.32%) were confirmed using an allele-specific discrimination assay. We conclude that substantial genetic variability exists among wild species. Additionally, significant but lesser variability at the molecular level occurs among accessions of the cultivated species. This survey is the first to report significant SNP level diversity among transcripts, and may explain some of the phenotypic differences observed in germplasm surveys. Understanding SNP variants in the Arachis accessions will benefit in developing markers for selection.
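    The two summary statistics quoted in this abstract, SNPs per 1000 bp and the share of confirmed SNP calls, are simple ratios. The sketch below shows how they could be computed; the function names are illustrative, and the contig length used for the per-kb example is a made-up placeholder because the abstract does not report assembly lengths.

```python
def snps_per_kb(snp_count, total_bp):
    """SNP density expressed per 1000 bp of sequence."""
    if total_bp <= 0:
        raise ValueError("total_bp must be positive")
    return 1000.0 * snp_count / total_bp


def confirmation_rate(confirmed_calls, total_calls):
    """Percentage of SNP calls confirmed by a validation assay."""
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    return 100.0 * confirmed_calls / total_calls


# Counts taken from the abstract: 636 of 782 SNP calls were confirmed.
print(round(confirmation_rate(636, 782), 1))

# Hypothetical numbers only: 500 SNPs over 1,000,000 bp of aligned transcript.
print(snps_per_kb(500, 1_000_000))
```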

  14. SNP discovery and High Resolution Melting Analysis from massive transcriptome sequencing in the California red abalone Haliotis rufescens.

    Science.gov (United States)

    Valenzuela-Muñoz, Valentina; Araya-Garay, José Miguel; Gallardo-Escárate, Cristian

    2013-06-01

    The California red abalone, Haliotis rufescens, which belongs to the Haliotidae family, is the largest species of abalone in the world and has sustained the major fishery and aquaculture production in the USA and Mexico. This native mollusk has not been evaluated or assigned a conservation category even though in the last few decades it was heavily exploited until it disappeared in some areas along the California coast. In Chile, the red abalone was introduced in the 1970s from California wild abalone stocks for the purposes of aquaculture. Considering the number of years that the red abalone has been cultivated in Chile, crucial genetic information is scarce and critical issues remain unresolved. This study reports and validates novel single nucleotide polymorphism (SNP) markers for the red abalone H. rufescens using cDNA pyrosequencing. A total of 622 high quality SNPs were identified in 146 sequences, with an estimated frequency of 1 SNP per 1000 bp. Forty-five SNP markers with functional information for gene ontology were selected. Of these, 8 were polymorphic among the individuals screened: Heat shock protein 70 (HSP70), vitellogenin (VTG), lysin, alginate lyase enzyme (AL), Glucose-regulated protein 94 (GRP94), fructose-bisphosphate aldolase (FBA), sulfatase 1A precursor (S1AP) and ornithine decarboxylase antizyme (ODC). Two additional sequences with polymorphisms were also identified, but they showed no similarity to known proteins. To validate the putative SNP markers, High Resolution Melting Analysis (HRMA) was conducted in a wild and a hatchery-bred population. Additionally, SNP cross-amplifications were tested in two further native abalone species, Haliotis fulgens and Haliotis corrugata. This study provides novel candidate genes that could be used to evaluate loss of genetic diversity due to hatchery selection or inbreeding effects. Copyright © 2013 Elsevier B.V. All rights reserved.

  15. Chiron: translating nanopore raw signal directly into nucleotide sequence using deep learning

    KAUST Repository

    Teng, Haotian; Cao, Minh Duc; Hall, Michael B; Duarte, Tania; Wang, Sheng; Coin, Lachlan J M

    2018-01-01

    Sequencing by translocating DNA fragments through an array of nanopores is a rapidly maturing technology that offers faster and cheaper sequencing than other approaches. However, accurately deciphering the DNA sequence from the noisy and complex electrical signal is challenging. Here, we report Chiron, the first deep learning model to achieve end-to-end basecalling and directly translate the raw signal to DNA sequence without the error-prone segmentation step. Trained with only a small set of 4,000 reads, we show that our model provides state-of-the-art basecalling accuracy, even on previously unseen species. Chiron achieves basecalling speeds of more than 2,000 bases per second using desktop computer graphics processing units.

  16. Chiron: translating nanopore raw signal directly into nucleotide sequence using deep learning

    KAUST Repository

    Teng, Haotian

    2018-04-10

    Sequencing by translocating DNA fragments through an array of nanopores is a rapidly maturing technology that offers faster and cheaper sequencing than other approaches. However, accurately deciphering the DNA sequence from the noisy and complex electrical signal is challenging. Here, we report Chiron, the first deep learning model to achieve end-to-end basecalling and directly translate the raw signal to DNA sequence without the error-prone segmentation step. Trained with only a small set of 4,000 reads, we show that our model provides state-of-the-art basecalling accuracy, even on previously unseen species. Chiron achieves basecalling speeds of more than 2,000 bases per second using desktop computer graphics processing units.

  17. Direct whole-genome sequencing of Plasmodium falciparum specimens from dried erythrocyte spots

    DEFF Research Database (Denmark)

    Nag, Sidsel; Kofoed, Poul Erik; Ursing, Johan

    2018-01-01

    -infected individuals living in rural areas, away from main infrastructure and the electrical grid. The aim of this study was to describe a low-tech procedure to sample P. falciparum specimens for direct whole genome sequencing (WGS), without use of electricity and cold-chain. Methods: Venous blood samples were...

  18. Direct amplification, sequencing and profiling of Chlamydia trachomatis strains in single and mixed infection clinical samples.

    Directory of Open Access Journals (Sweden)

    Sandeep J Joseph

    Sequencing bacterial genomes from DNA isolated directly from clinical samples offers the promise of rapid and precise acquisition of informative genetic information. In the case of Chlamydia trachomatis, direct sequencing is particularly desirable because it obviates the requirement for culture in mammalian cells, saving time, cost and the possibility of missing low abundance strains. In this proof of concept study, we developed methodology that would allow genome-scale direct sequencing, using a multiplexed microdroplet PCR enrichment technology to amplify a 100 kb region of the C. trachomatis genome with 500 1.1-1.3 kb overlapping amplicons (5-fold amplicon redundancy). We integrated comparative genomic data into a pipeline to preferentially select conserved sites for amplicon design. The 100 kb target region could be amplified from clinical samples, including remnants from diagnostics tests, originating from the cervix, urethra and urine. For rapid analysis of these data, we developed a framework for whole-genome based genotyping called binstrain. We used binstrain to estimate the proportion of SNPs originating from 14 C. trachomatis reference serotype genomes in each sample. Direct DNA sequencing methods such as the one described here may have an important role in understanding the biology of C. trachomatis mixed infections and the natural genetic variation of the species within clinically relevant ecological niches.
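    To make the idea of estimating strain proportions from SNPs concrete, here is a deliberately naive sketch. It is not the binstrain algorithm, whose internals the abstract does not describe; it only assumes that, for each reference strain, read counts are available at positions where that strain carries a distinguishing allele. The function name, data layout and counts are all hypothetical.

```python
def strain_proportions(allele_counts):
    """Naive mixture estimate.

    allele_counts maps a strain name to a list of
    (reads_with_strain_specific_allele, total_reads) tuples, one per
    strain-distinguishing SNP position. Returns proportions that sum to 1
    (or all zeros when there is no coverage at all).
    """
    raw = {}
    for strain, sites in allele_counts.items():
        supporting = sum(s for s, _ in sites)
        depth = sum(t for _, t in sites)
        # Pooled fraction of reads that carry this strain's alleles.
        raw[strain] = supporting / depth if depth else 0.0
    total = sum(raw.values())
    return {k: (v / total if total else 0.0) for k, v in raw.items()}


# Made-up read counts for a sample containing two serotypes.
sample = {
    "D": [(80, 100), (70, 100)],
    "E": [(20, 100), (30, 100)],
}
print(strain_proportions(sample))
```

    A real genotyping framework would also need to handle sequencing error, alleles shared among several references and uneven amplicon coverage, none of which this sketch attempts.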

  19. Rapid whole genome sequencing for the detection and characterization of microorganisms directly from clinical samples

    DEFF Research Database (Denmark)

    Hasman, Henrik; Saputra, Dhany; Sicheritz-Pontén, Thomas

    2014-01-01

    Whole genome sequencing (WGS) is becoming available as a routine tool for clinical microbiology. If applied directly on clinical samples, this could further reduce diagnostic time and thereby improve control and treatment. A major bottleneck is the availability of fast and reliable bioinformatics...

  20. Next Generation Sequencing and Transcriptome Analysis Predicts Biosynthetic Pathway of Sennosides from Senna (Cassia angustifolia Vahl.), a Non-Model Plant with Potent Laxative Properties.

    Directory of Open Access Journals (Sweden)

    Nagaraja Reddy Rama Reddy

    Senna (Cassia angustifolia Vahl.) is one of the world's natural laxative medicinal plants. Its laxative properties are due to sennosides (anthraquinone glycosides), which are natural products. However, little genetic information is available for this species, especially concerning the biosynthetic pathways of sennosides. We present here the transcriptome sequencing of young and mature leaf tissue of Cassia angustifolia using the Illumina MiSeq platform, which resulted in a total of 6.34 Gb of raw nucleotide sequence. The sequence assembly resulted in 42230 and 37174 transcripts with an average length of 1119 bp and 1467 bp for young and mature leaf, respectively. The transcripts were annotated using NCBI BLAST with 'green plant database (txid 33090)', Swiss Prot, Kyoto Encyclopedia of Genes & Genomes (KEGG), Cluster of Orthologous Gene (COG) and Gene Ontology (GO). Out of the total transcripts, 40138 (95.0%) and 36349 (97.7%) from young and mature leaf, respectively, were annotated by BLASTX against the green plant database of NCBI. We used InterProScan to assess protein similarity at the domain level; a total of 34031 (young leaf) and 32077 (mature leaf) transcripts were annotated against the Pfam domains. All transcripts from young and mature leaf were assigned to 191 KEGG pathways. There were 166 and 159 CDS, respectively, from young and mature leaf involved in metabolism of terpenoids and polyketides. Many CDS encoding enzymes leading to biosynthesis of sennosides were identified. A total of 10,763 CDS were differentially expressed between the young and mature leaf libraries, of which 2,343 (21.7%) were up-regulated in young compared to mature leaf. Several differentially expressed genes were found to be functionally associated with sennoside biosynthesis. CDS encoding many CYPs and TF families were identified as having probable roles in metabolism of primary as well as secondary metabolites. We developed SSR markers for molecular breeding of senna. We have identified a set of putative genes involved in

  1. Next Generation Sequencing and Transcriptome Analysis Predicts Biosynthetic Pathway of Sennosides from Senna (Cassia angustifolia Vahl.), a Non-Model Plant with Potent Laxative Properties.

    Science.gov (United States)

    Rama Reddy, Nagaraja Reddy; Mehta, Rucha Harishbhai; Soni, Palak Harendrabhai; Makasana, Jayanti; Gajbhiye, Narendra Athamaram; Ponnuchamy, Manivel; Kumar, Jitendra

    2015-01-01

    Senna (Cassia angustifolia Vahl.) is one of the world's natural laxative medicinal plants. Its laxative properties are due to sennosides (anthraquinone glycosides), which are natural products. However, little genetic information is available for this species, especially concerning the biosynthetic pathways of sennosides. We present here the transcriptome sequencing of young and mature leaf tissue of Cassia angustifolia using the Illumina MiSeq platform, which resulted in a total of 6.34 Gb of raw nucleotide sequence. The sequence assembly resulted in 42230 and 37174 transcripts with an average length of 1119 bp and 1467 bp for young and mature leaf, respectively. The transcripts were annotated using NCBI BLAST with 'green plant database (txid 33090)', Swiss Prot, Kyoto Encyclopedia of Genes & Genomes (KEGG), Cluster of Orthologous Gene (COG) and Gene Ontology (GO). Out of the total transcripts, 40138 (95.0%) and 36349 (97.7%) from young and mature leaf, respectively, were annotated by BLASTX against the green plant database of NCBI. We used InterProScan to assess protein similarity at the domain level; a total of 34031 (young leaf) and 32077 (mature leaf) transcripts were annotated against the Pfam domains. All transcripts from young and mature leaf were assigned to 191 KEGG pathways. There were 166 and 159 CDS, respectively, from young and mature leaf involved in metabolism of terpenoids and polyketides. Many CDS encoding enzymes leading to biosynthesis of sennosides were identified. A total of 10,763 CDS were differentially expressed between the young and mature leaf libraries, of which 2,343 (21.7%) were up-regulated in young compared to mature leaf. Several differentially expressed genes were found to be functionally associated with sennoside biosynthesis. CDS encoding many CYPs and TF families were identified as having probable roles in metabolism of primary as well as secondary metabolites. We developed SSR markers for molecular breeding of senna. We have identified a set of putative genes involved in various

  2. Directional RNA deep sequencing sheds new light on the transcriptional response of Anabaena sp. strain PCC 7120 to combined-nitrogen deprivation

    Directory of Open Access Journals (Sweden)

    Head Steven R

    2011-06-01

    Background: Cyanobacteria are potential sources of renewable chemicals and biofuels and serve as model organisms for bacterial photosynthesis, nitrogen fixation, and responses to environmental changes. Anabaena (Nostoc) sp. strain PCC 7120 (hereafter Anabaena) is a multicellular filamentous cyanobacterium that can "fix" atmospheric nitrogen into ammonia when grown in the absence of a source of combined nitrogen. Because the nitrogenase enzyme is oxygen sensitive, Anabaena forms specialized cells called heterocysts that create a microoxic environment for nitrogen fixation. We have employed directional RNA-seq to map the Anabaena transcriptome during vegetative cell growth and in response to combined-nitrogen deprivation, which induces filaments to undergo heterocyst development. Our data provide an unprecedented view of transcriptional changes in Anabaena filaments during the induction of heterocyst development and transition to diazotrophic growth. Results: Using the Illumina short read platform and a directional RNA-seq protocol, we obtained deep sequencing data for RNA extracted from filaments at 0, 6, 12, and 21 hours after the removal of combined nitrogen. The RNA-seq data provided information on transcript abundance and boundaries for the entire transcriptome. From these data, we detected novel antisense transcripts within the UTRs (untranslated regions) and coding regions of key genes involved in heterocyst development, suggesting that antisense RNAs may be important regulators of the nitrogen response. In addition, many 5' UTRs were longer than anticipated, sometimes extending into upstream open reading frames (ORFs), and operons often showed complex structure and regulation. Finally, many genes that had not been previously identified as being involved in heterocyst development showed regulation, providing new candidates for future studies in this model organism. Conclusions: Directional RNA-seq data were obtained that provide
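    What makes antisense detection possible in a directional library is that each read keeps the strand of the RNA it came from. The sketch below shows that bookkeeping in its simplest form. It is an illustration only, not the study's pipeline: the function names are hypothetical, and the reads_match_transcript flag is an assumption standing in for the fact that some directional protocols report the transcript strand while others report its reverse complement.

```python
from collections import Counter


def classify_read(read_strand, gene_strand, reads_match_transcript=True):
    """Label a read overlapping an annotated gene as 'sense' or 'antisense'."""
    if read_strand not in ("+", "-") or gene_strand not in ("+", "-"):
        raise ValueError("strands must be '+' or '-'")
    if reads_match_transcript:
        transcript_strand = read_strand
    else:
        # Protocols that sequence the complementary strand need flipping.
        transcript_strand = "-" if read_strand == "+" else "+"
    return "sense" if transcript_strand == gene_strand else "antisense"


def antisense_fraction(read_strands, gene_strand, reads_match_transcript=True):
    """Fraction of overlapping reads that derive from the antisense strand."""
    counts = Counter(
        classify_read(s, gene_strand, reads_match_transcript) for s in read_strands
    )
    total = counts["sense"] + counts["antisense"]
    return counts["antisense"] / total if total else 0.0


# Made-up strands of four reads overlapping a gene annotated on '+'.
print(antisense_fraction(["+", "+", "-", "+"], "+"))
```

    With a non-directional library the read strand carries no information about the transcript strand, so this classification would not be meaningful.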

  3. Transcriptome-wide RNA sequencing analysis of rat skeletal muscle feed arteries. II. Impact of exercise training in obesity

    Science.gov (United States)

    Jenkins, Nathan T.; Thorne, Pamela K.; Martin, Jeffrey S.; Rector, R. Scott; Davis, J. Wade; Laughlin, M. Harold

    2014-01-01

    We employed next-generation RNA sequencing (RNA-Seq) technology to determine the extent to which exercise training alters global gene expression in skeletal muscle feed arteries and aortic endothelial cells of obese Otsuka Long-Evans Tokushima Fatty (OLETF) rats. Transcriptional profiles of the soleus and gastrocnemius muscle feed arteries (SFA and GFA, respectively) and aortic endothelial cell-enriched samples from rats that underwent an endurance exercise training program (EndEx; n = 12) or an interval sprint training program (IST; n = 12) or remained sedentary (Sed; n = 12) were examined. In response to EndEx, there were 39 upregulated (e.g., MANF) and 20 downregulated (e.g., ALOX15) genes in SFA and 1 upregulated (i.e., Wisp2) and 1 downregulated (i.e., Crem) gene in GFA [false discovery rate (FDR) exercise programs. Expression of only two genes (Tubb2b and Slc9a3r2) was altered (i.e., increased) by exercise in all three arteries. The finding that both EndEx and IST produced greater transcriptional changes in the SFA compared with the GFA is intriguing when considering the fact that treadmill bouts of exercise are associated with greater relative increases in blood flow to the gastrocnemius muscle compared with the soleus muscle. PMID:24408995

  4. An RNA Sequencing Transcriptome Analysis of Grasspea (Lathyrus sativus L.) and Development of SSR and KASP Markers.

    Science.gov (United States)

    Hao, Xiaopeng; Yang, Tao; Liu, Rong; Hu, Jinguo; Yao, Yang; Burlyaeva, Marina; Wang, Yan; Ren, Guixing; Zhang, Hongyan; Wang, Dong; Chang, Jianwu; Zong, Xuxiao

    2017-01-01

    Grasspea (Lathyrus sativus L., 2n = 14) has great agronomic potential because of its ability to survive under extreme conditions, such as drought and flood. However, this legume is less investigated because of its sparse genomic resources and very slow breeding process. In this study, 570 million quality-filtered and trimmed cDNA sequence reads with a total length of over 82 billion bp were obtained using the Illumina NextSeq 500 platform. Approximately two million contigs and 142,053 transcripts were assembled from our RNA-Seq data, which resulted in 27,431 unigenes with an average length of 1,250 bp and a maximum length of 48,515 bp. The unigenes were of high quality. For example, the stay-green (SGR) gene of grasspea was aligned with the SGR gene of pea with high similarity. From these unigenes, 3,204 EST-SSR primers were designed, 284 of which were randomly chosen for validation. Of these, 87 (30.6%) EST-SSR primers produced polymorphic amplicons among 43 grasspea accessions selected from different geographical locations. Meanwhile, 146,406 SNPs were screened and 50 SNP loci were randomly chosen for kompetitive allele-specific PCR (KASP) validation. Over 80% (42) of the SNP loci were successfully transformed to KASP markers. Comparison of the dendrograms according to the SSR and KASP markers showed that the different marker systems are partially consistent with the dendrogram constructed in our study.
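    EST-SSR marker development starts by locating simple sequence repeats in the assembled unigenes. The abstract does not say which tool or thresholds were used, so the following is only a minimal regex-based sketch; the function name, the 2 to 6 bp motif range and the minimum of four repeats are assumptions chosen for illustration.

```python
import re


def find_ssrs(seq, min_repeats=4):
    """Return (start, motif, repeat_count) for tandem repeats of 2-6 bp motifs.

    min_repeats is an illustrative threshold, not one taken from the study.
    """
    # Lazy motif group so the shortest repeating unit is reported,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


# Made-up sequence with an (AT)4 and a (GAG)4 repeat.
example = "GGCATATATATCCGAGGAGGAGGAGTT"
print(find_ssrs(example))
```

    Primer design around each hit, and the polymorphism screening across accessions reported in the abstract, are separate steps that this sketch does not cover.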

  5. Source coherence impairments in a direct detection direct sequence optical code-division multiple-access system.

    Science.gov (United States)

    Fsaifes, Ihsan; Lepers, Catherine; Lourdiane, Mounia; Gallion, Philippe; Beugin, Vincent; Guignard, Philippe

    2007-02-01

    We demonstrate that direct sequence optical code-division multiple-access (DS-OCDMA) encoders and decoders using sampled fiber Bragg gratings (S-FBGs) behave as multipath interferometers. In that case, chip pulses of the prime sequence codes generated by spreading in time-coherent data pulses can result from multiple reflections in the interferometers that can superimpose within a chip time duration. We show that the autocorrelation function has to be considered as the sum of complex amplitudes of the combined chip as the laser source coherence time is much greater than the integration time of the photodetector. To reduce the sensitivity of the DS-OCDMA system to the coherence time of the laser source, we analyze the use of sparse and nonperiodic quadratic congruence and extended quadratic congruence codes.
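
    The prime sequence codes and the coherence effect mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the textbook construction of prime sequence codes (for a prime p, code i places one pulse in each block of p chips at offset i*j mod p) and models two overlapping unit-amplitude pulses; the function names and the choice p = 5 are hypothetical and are not taken from the paper.

```python
import cmath


def prime_sequence_code(p, i):
    """Length p*p chip sequence for index i (textbook prime-code construction)."""
    code = [0] * (p * p)
    for j in range(p):
        # One pulse per block of p chips, at offset (i * j) mod p.
        code[j * p + (i * j) % p] = 1
    return code


def periodic_correlation(a, b):
    """Intensity (incoherent) periodic correlation of two chip sequences."""
    n = len(a)
    return [sum(a[k] * b[(k + shift) % n] for k in range(n)) for shift in range(n)]


def combined_chip_power(phase_difference, coherent):
    """Power of two unit-amplitude pulses that overlap within one chip time."""
    a1 = 1.0
    a2 = cmath.exp(1j * phase_difference)
    if coherent:
        # Coherent source: complex amplitudes add before detection.
        return abs(a1 + a2) ** 2
    # Incoherent source: detected powers add.
    return abs(a1) ** 2 + abs(a2) ** 2


p = 5  # illustrative prime, an assumption
codes = [prime_sequence_code(p, i) for i in range(p)]
auto = periodic_correlation(codes[1], codes[1])
cross = periodic_correlation(codes[1], codes[2])

print("code weight:", sum(codes[1]))
print("autocorrelation peak:", auto[0])
print("largest cross-correlation value:", max(cross))
print("coherent power, in phase:", combined_chip_power(0.0, coherent=True))
print("coherent power, out of phase:", combined_chip_power(cmath.pi, coherent=True))
print("incoherent power:", combined_chip_power(cmath.pi, coherent=False))
```

    In this toy model the coherent power of two overlapping pulses swings between roughly 0 and 4 depending on their relative phase, whereas the incoherent power stays at 2. That is one way to read the abstract's statement that the autocorrelation must be treated as a sum of complex amplitudes when the source coherence time is long.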

  7. Generation of mast cells from mouse fetus: analysis of differentiation and functionality, and transcriptome profiling using next generation sequencer.

    Directory of Open Access Journals (Sweden)

    Nobuyuki Fukuishi

    While gene knockout technology can reveal the roles of proteins in cellular functions, including in mast cells, fetal death due to gene manipulation frequently interrupts experimental analysis. We generated mast cells from mouse fetal liver (FLMC), and compared the fundamental functions of FLMC with those of bone marrow-derived mouse mast cells (BMMC). Under electron microscopy, numerous small and electron-dense granules were observed in FLMC. In FLMC, the expression levels of a subunit of the FcεRI receptor and degranulation by IgE cross-linking were comparable with BMMC. By flow cytometry we observed surface expression of c-Kit prior to that of FcεRI on FLMC, although on BMMC the expression of c-Kit came after FcεRI. The surface expression levels of Sca-1 and c-Kit, a marker of putative mast cell precursors, were slightly different between bone marrow cells and fetal liver cells, suggesting that differentiation stage or cell type are not necessarily equivalent between both lineages. Moreover, this indicates that phenotypically similar mast cells may not have undergone an identical process of differentiation. By comprehensive analysis using the next generation sequencer, the same frequency of gene expression was observed for 98.6% of all transcripts in both cell types. These results indicate that FLMC could represent a new and useful tool for exploring mast cell differentiation, and may help to elucidate the roles of individual proteins in the function of mast cells where gene manipulation can induce embryonic lethality in the mid to late stages of pregnancy.

  8. Generation of mast cells from mouse fetus: analysis of differentiation and functionality, and transcriptome profiling using next generation sequencer.

    Science.gov (United States)

    Fukuishi, Nobuyuki; Igawa, Yuusuke; Kunimi, Tomoyo; Hamano, Hirofumi; Toyota, Masao; Takahashi, Hironobu; Kenmoku, Hiromichi; Yagi, Yasuyuki; Matsui, Nobuaki; Akagi, Masaaki

    2013-01-01

    While gene knockout technology can reveal the roles of proteins in cellular functions, including in mast cells, fetal death due to gene manipulation frequently interrupts experimental analysis. We generated mast cells from mouse fetal liver (FLMC), and compared the fundamental functions of FLMC with those of bone marrow-derived mouse mast cells (BMMC). Under electron microscopy, numerous small and electron-dense granules were observed in FLMC. In FLMC, the expression levels of a subunit of the FcεRI receptor and degranulation by IgE cross-linking were comparable with BMMC. By flow cytometry we observed surface expression of c-Kit prior to that of FcεRI on FLMC, although on BMMC the expression of c-Kit came after FcεRI. The surface expression levels of Sca-1 and c-Kit, a marker of putative mast cell precursors, were slightly different between bone marrow cells and fetal liver cells, suggesting that differentiation stage or cell type are not necessarily equivalent between both lineages. Moreover, this indicates that phenotypically similar mast cells may not have undergone an identical process of differentiation. By comprehensive analysis using the next generation sequencer, the same frequency of gene expression was observed for 98.6% of all transcripts in both cell types. These results indicate that FLMC could represent a new and useful tool for exploring mast cell differentiation, and may help to elucidate the roles of individual proteins in the function of mast cells where gene manipulation can induce embryonic lethality in the mid to late stages of pregnancy.

  9. The neuropeptides and protein hormones of the agricultural pest fruit fly Bactrocera dorsalis: What do we learn from the genome sequencing and tissue-specific transcriptomes?

    Science.gov (United States)

    Gui, Shun-Hua; Jiang, Hong-Bo; Smagghe, Guy; Wang, Jin-Jun

    2017-12-01

    Neuropeptides and protein hormones are very important signaling molecules, and are involved in the regulation and coordination of various physiological processes in invertebrates and vertebrates. Using a bioinformatics approach, we screened the recently sequenced genome and six tissue-specific transcriptome databases (central nervous system, fat body, ovary, testes, male accessory glands, antennae) of the oriental fruit fly (Bactrocera dorsalis) that is economically one of the most important pest insects of tropical and subtropical fruit. Thirty-nine candidate genes were found to encode neuropeptides or protein hormones. These include most of the known insect neuropeptides and protein hormones, with the exception of adipokinetic hormone-corazonin-related peptide, allatropin, diuretic hormone 34, diuretic hormone 45, IMFamide, inotocin, and sex peptide. Our results showed the neuropeptides and protein hormones of Diptera insects appear to have a reduced repertoire compared to some other insects. Moreover, there are also differences between B. dorsalis and the super-model of Drosophila melanogaster. Interesting features of the oriental fruit fly are the absence of genes coding for sex peptide and the presence of neuroparsin and two genes coding neuropeptide F. The majority of the identified neuropeptides and protein hormones is present in the central nervous system, with only a limited number of these in the other tissues. Moreover, we predicted their physiological functions via comparing with data of FlyBase and FlyAtlas. Taken together, owing to the large number of identified peptides, this study can be used as a reference about structure, tissue distribution and physiological functions for comparative studies in other model and important pest insects. Copyright © 2017 Elsevier Inc. All rights reserved.

  10. The regulatory network of cluster-root function and development in phosphate-deficient white lupin (Lupinus albus) identified by transcriptome sequencing.

    Science.gov (United States)

    Wang, Zhengrui; Straub, Daniel; Yang, Huaiyu; Kania, Angelika; Shen, Jianbo; Ludewig, Uwe; Neumann, Günter

    2014-07-01

    Lupinus albus serves as model plant for root-induced mobilization of sparingly soluble soil phosphates via the formation of cluster-roots (CRs) that mediate secretion of protons, citrate, phenolics and acid phosphatases (APases). This study employed next-generation sequencing to investigate the molecular mechanisms behind these complex adaptive responses at the transcriptome level. We compared different stages of CR development, including pre-emergent (PE), juvenile (JU) and the mature (MA) stages. The results confirmed that the primary metabolism underwent significant modifications during CR maturation, promoting the biosynthesis of organic acids, as had been deduced from physiological studies. Citrate catabolism was downregulated, associated with citrate accumulation in MA clusters. Upregulation of the phenylpropanoid pathway reflected the accumulation of phenolics. Specific transcript expression of ALMT and MATE transporter genes correlated with the exudation of citrate and flavonoids. The expression of transcripts related to nucleotide degradation and APases in MA clusters coincided with the re-mobilization and hydrolysis of organic phosphate resources. Most interestingly, hormone-related gene expression suggested a central role of ethylene during CR maturation. This was associated with the upregulation of the iron (Fe)-deficiency regulated network that mediates ethylene-induced expression of Fe-deficiency responses in other species. Finally, transcripts related to abscisic acid and jasmonic acid were upregulated in MA clusters, while auxin- and brassinosteroid-related genes and cytokinin receptors were most strongly expressed during CR initiation. Key regulations proposed by the RNA-seq data were confirmed by quantitative real-time polymerase chain reaction (RT-qPCR) and some physiological analyses. A model for the gene network regulating CR development and function is presented. © 2014 Scandinavian Plant Physiology Society.

  11. High-Throughput Sequencing of Small RNA Transcriptomes in Maize Kernel Identifies miRNAs Involved in Embryo and Endosperm Development.

    Science.gov (United States)

    Xing, Lijuan; Zhu, Ming; Zhang, Min; Li, Wenzong; Jiang, Haiyang; Zou, Junjie; Wang, Lei; Xu, Miaoyun

    2017-12-14

    Maize kernel development is a complex biological process that involves the temporal and spatial expression of many genes and fine gene regulation at a transcriptional and post-transcriptional level, and microRNAs (miRNAs) play vital roles during this process. To gain insight into miRNA-mediated regulation of maize kernel development, a deep-sequencing technique was used to investigate the dynamic expression of miRNAs in the embryo and endosperm at three developmental stages in B73. By miRNA transcriptomic analysis, we characterized 132 known miRNAs and six novel miRNAs in developing maize kernel, among which, 15 and 14 miRNAs were commonly differentially expressed between the embryo and endosperm at 9 days after pollination (DAP), 15 DAP and 20 DAP respectively. Conserved miRNA families such as miR159, miR160, miR166, miR390, miR319, miR528 and miR529 were highly expressed in developing embryos; miR164, miR171, miR393 and miR2118 were highly expressed in developing endosperm. Genes targeted by those highly expressed miRNAs were found to be largely related to a regulation category, including the transcription, macromolecule biosynthetic and metabolic process in the embryo as well as the vitamin biosynthetic and metabolic process in the endosperm. Quantitative reverse transcription-PCR (qRT-PCR) analysis showed that these miRNAs displayed a negative correlation with the levels of their corresponding target genes. Importantly, our findings revealed that members of the miR169 family were highly and dynamically expressed in the developing kernel, which will help to exploit new players functioning in maize kernel development.

  12. Genome, transcriptome and methylome sequencing of a primitively eusocial wasp reveal a greatly reduced DNA methylation system in a social insect.

    Science.gov (United States)

    Standage, Daniel S; Berens, Ali J; Glastad, Karl M; Severin, Andrew J; Brendel, Volker P; Toth, Amy L

    2016-04-01

    Comparative genomics of social insects has been intensely pursued in recent years with the goal of providing insights into the evolution of social behaviour and its underlying genomic and epigenomic basis. However, the comparative approach has been hampered by a paucity of data on some of the most informative social forms (e.g. incipiently and primitively social) and taxa (especially members of the wasp family Vespidae) for studying social evolution. Here, we provide a draft genome of the primitively eusocial model insect Polistes dominula, accompanied by analysis of caste-related transcriptome and methylome sequence data for adult queens and workers. Polistes dominula possesses a fairly typical hymenopteran genome, but shows very low genomewide GC content and some evidence of reduced genome size. We found numerous caste-related differences in gene expression, with evidence that both conserved and novel genes are related to caste differences. Most strikingly, these -omics data reveal a major reduction in one of the major epigenetic mechanisms that has been previously suggested to be important for caste differences in social insects: DNA methylation. Along with a conspicuous loss of a key gene associated with environmentally responsive DNA methylation (the de novo DNA methyltransferase Dnmt3), these wasps have greatly reduced genomewide methylation to almost zero. In addition to providing a valuable resource for comparative analysis of social insect evolution, our integrative -omics data for this important behavioural and evolutionary model system call into question the general importance of DNA methylation in caste differences and evolution in social insects. © 2016 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.

  13. Infants learn better from left to right: a directional bias in infants' sequence learning.

    Science.gov (United States)

    Bulf, Hermann; de Hevia, Maria Dolores; Gariboldi, Valeria; Macchi Cassia, Viola

    2017-05-26

    A wealth of studies show that human adults map ordered information onto a directional spatial continuum. We asked whether mapping ordinal information into a directional space constitutes an early predisposition, already functional prior to the acquisition of symbolic knowledge and language. While it is known that preverbal infants represent numerical order along a left-to-right spatial continuum, no studies have investigated yet whether infants, like adults, organize any kind of ordinal information onto a directional space. We investigated whether 7-month-olds' ability to learn high-order rule-like patterns from visual sequences of geometric shapes was affected by the spatial orientation of the sequences (left-to-right vs. right-to-left). Results showed that infants readily learn rule-like patterns when visual sequences were presented from left to right, but not when presented from right to left. This result provides evidence that spatial orientation critically determines preverbal infants' ability to perceive and learn ordered information in visual sequences, opening to the idea that a left-to-right spatially organized mental representation of ordered dimensions might be rooted in biologically-determined constraints on human brain development.

  14. PMS2 gene mutational analysis: direct cDNA sequencing to circumvent pseudogene interference.

    Science.gov (United States)

    Wimmer, Katharina; Wernstedt, Annekatrin

    2014-01-01

    The presence of highly homologous pseudocopies can compromise the mutation analysis of a gene of interest. In particular, when using PCR-based strategies, pseudogene co-amplification has to be effectively prevented. This is often achieved by using primers designed to be parental gene specific according to the reference sequence and by applying stringent PCR conditions. However, there are cases in which this approach is of limited utility. For example, it has been shown that the PMS2 gene exchanges sequences with one of its pseudogenes, named PMS2CL. This results in functional PMS2 alleles containing pseudogene-derived sequences at their 3'-end and in nonfunctional PMS2CL pseudogene alleles that contain gene-derived sequences. Hence, the paralogues cannot be distinguished according to the reference sequence. This shortcoming can be effectively circumvented by using direct cDNA sequencing. This approach is based on the selective amplification of PMS2 transcripts in two overlapping 1.6-kb RT-PCR products. In addition to avoiding pseudogene co-amplification and allele dropout, this method has also the advantage that it allows to effectively identify deletions, splice mutations, and de novo retrotransposon insertions that escape the detection of most DNA-based mutation analysis protocols.

  15. Evaluating next-generation sequencing for direct clinical diagnostics in diarrhoeal disease

    DEFF Research Database (Denmark)

    Joensen, Katrine Grimstrup; Engsbro, A L Ø; Lukjancenko, Oksana

    2017-01-01

    The accurate microbiological diagnosis of diarrhoea involves numerous laboratory tests and, often, the pathogen is not identified in time to guide clinical management. With next-generation sequencing (NGS) becoming cheaper, it has huge potential in routine diagnostics. The aim of this study...... was to evaluate the potential of NGS-based diagnostics through direct sequencing of faecal samples. Fifty-eight clinical faecal samples were obtained from patients with diarrhoea as part of the routine diagnostics at Hvidovre University Hospital, Denmark. Ten samples from healthy individuals were also included...

  16. Engineering of a DNA Polymerase for Direct m6A Sequencing.

    Science.gov (United States)

    Aschenbrenner, Joos; Werner, Stephan; Marchand, Virginie; Adam, Martina; Motorin, Yuri; Helm, Mark; Marx, Andreas

    2018-01-08

    Methods for the detection of RNA modifications are of fundamental importance for advancing epitranscriptomics. N6-methyladenosine (m6A) is the most abundant RNA modification in mammalian mRNA and is involved in the regulation of gene expression. Current detection techniques are laborious and rely on antibody-based enrichment of m6A-containing RNA prior to sequencing, since m6A modifications are generally "erased" during reverse transcription (RT). To overcome the drawbacks associated with indirect detection, we aimed to generate novel DNA polymerase variants for direct m6A sequencing. Therefore, we developed a screen to evolve an RT-active KlenTaq DNA polymerase variant that sets a mark for N6-methylation. We identified a mutant that exhibits increased misincorporation opposite m6A compared to unmodified A. Application of the generated DNA polymerase in next-generation sequencing allowed the identification of m6A sites directly from the sequencing data of untreated RNA samples. © 2017 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.

  17. Verona Coding Definitions of Emotional Sequences (VR-CoDES): Conceptual framework and future directions.

    Science.gov (United States)

    Piccolo, Lidia Del; Finset, Arnstein; Mellblom, Anneli V; Figueiredo-Braga, Margarida; Korsvold, Live; Zhou, Yuefang; Zimmermann, Christa; Humphris, Gerald

    2017-12-01

    To discuss the theoretical and empirical framework of VR-CoDES and potential future direction in research based on the coding system. The paper is based on selective review of papers relevant to the construction and application of VR-CoDES. VR-CoDES system is rooted in patient-centered and biopsychosocial model of healthcare consultations and on a functional approach to emotion theory. According to the VR-CoDES, emotional interaction is studied in terms of sequences consisting of an eliciting event, an emotional expression by the patient and the immediate response by the clinician. The rationale for the emphasis on sequences, on detailed classification of cues and concerns, and on the choices of explicit vs. non-explicit responses and providing vs. reducing room for further disclosure, as basic categories of the clinician responses, is described. Results from research on VR-CoDES may help raise awareness of emotional sequences. Future directions in applying VR-CoDES in research may include studies on predicting patient and clinician behavior within the consultation, qualitative analyses of longer sequences including several VR-CoDES triads, and studies of effects of emotional communication on health outcomes. VR-CoDES may be applied to develop interventions to promote good handling of patients' emotions in healthcare encounters. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. Gradient-recalled echo sequences in direct shoulder MR arthrography for evaluating the labrum

    International Nuclear Information System (INIS)

    Lee, Marc J.; Motamedi, Kambiz; Chow, Kira; Seeger, Leanne L.

    2008-01-01

    The purpose of this study was to determine the utility of fat-suppressed gradient-recalled echo (GRE) compared with conventional spin echo T1-weighted (T1W) sequences in direct shoulder MR arthrography for evaluating labral tears. Three musculoskeletal radiologists retrospectively reviewed MR arthrograms performed over a 12-month period for which surgical correlation was available. Of 180 serial arthrograms, 31 patients had surgery with a mean of 48 days following imaging. Paired coronal oblique and axial T1W or GRE sequences were analyzed by consensus for labral tear (coronal oblique two-dimensional multi-echo data image combination, 2D MEDIC; and axial three-dimensional double-echo steady-state, 3D DESS; Siemens MAGNETOM Sonata 1.5-T MR system). Interpretations were correlated with operative reports. Of 31 shoulders, 25 had labral tears at surgery. The GRE sequences depicted labral tears in 22, while T1W images depicted tears in 16 (sensitivity 88% versus 64%; p 0.7). Specificities were somewhat lower for GRE. Thin section GRE sequences are more sensitive than T1W for the detection of anterior and posterior labral tears. As the specificity of GRE was lower, it should be considered as an adjunctive imaging sequence that may improve depiction of labral tears, particularly smaller tears, in routine MR arthrography protocols. (orig.)
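
    The sensitivities quoted in this abstract follow directly from the reported counts, as the short Python check below shows. It assumes that every tear depicted on imaging was among the 25 tears confirmed at surgery (no false positives among the 22 and 16 counted); the variable names are illustrative only. Specificity cannot be recomputed the same way because the abstract does not report false-positive counts.

```python
def sensitivity(true_positives, condition_positives):
    """Fraction of surgically confirmed tears that the sequence depicted."""
    return true_positives / condition_positives


tears_at_surgery = 25  # shoulders with labral tears at surgery
gre_detected = 22      # tears depicted on GRE sequences
t1w_detected = 16      # tears depicted on T1W sequences

gre_sensitivity = sensitivity(gre_detected, tears_at_surgery)
t1w_sensitivity = sensitivity(t1w_detected, tears_at_surgery)

print(f"GRE sensitivity: {gre_sensitivity:.0%}")
print(f"T1W sensitivity: {t1w_sensitivity:.0%}")
```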

  19. Synthetic biology for the directed evolution of protein biocatalysts: navigating sequence space intelligently

    Science.gov (United States)

    Currin, Andrew; Swainston, Neil; Day, Philip J.

    2015-01-01

    The amino acid sequence of a protein affects both its structure and its function. Thus, the ability to modify the sequence, and hence the structure and activity, of individual proteins in a systematic way, opens up many opportunities, both scientifically and (as we focus on here) for exploitation in biocatalysis. Modern methods of synthetic biology, whereby increasingly large sequences of DNA can be synthesised de novo, allow an unprecedented ability to engineer proteins with novel functions. However, the number of possible proteins is far too large to test individually, so we need means for navigating the ‘search space’ of possible protein sequences efficiently and reliably in order to find desirable activities and other properties. Enzymologists distinguish binding (Kd) and catalytic (kcat) steps. In a similar way, judicious strategies have blended design (for binding, specificity and active site modelling) with the more empirical methods of classical directed evolution (DE) for improving kcat (where natural evolution rarely seeks the highest values), especially with regard to residues distant from the active site and where the functional linkages underpinning enzyme dynamics are both unknown and hard to predict. Epistasis (where the ‘best’ amino acid at one site depends on that or those at others) is a notable feature of directed evolution. The aim of this review is to highlight some of the approaches that are being developed to allow us to use directed evolution to improve enzyme properties, often dramatically. We note that directed evolution differs in a number of ways from natural evolution, including in particular the available mechanisms and the likely selection pressures. Thus, we stress the opportunities afforded by techniques that enable one to map sequence to (structure and) activity in silico, as an effective means of modelling and exploring protein landscapes. Because known landscapes may be assessed and reasoned about as a whole

  20. Characterization of the Pathogenicity of Streptococcus intermedius TYG1620 Isolated from a Human Brain Abscess Based on the Complete Genome Sequence with Transcriptome Analysis and Transposon Mutagenesis in a Murine Subcutaneous Abscess Model.

    Science.gov (United States)

    Hasegawa, Noriko; Sekizuka, Tsuyoshi; Sugi, Yutaka; Kawakami, Nobuhiro; Ogasawara, Yumiko; Kato, Kengo; Yamashita, Akifumi; Takeuchi, Fumihiko; Kuroda, Makoto

    2017-02-01

    Streptococcus intermedius is known to cause periodontitis and pyogenic infections in the brain and liver. Here we report the complete genome sequence of strain TYG1620 (genome size, 2,006,877 bp; GC content, 37.6%; 2,020 predicted open reading frames [ORFs]) isolated from a brain abscess in an infant. Comparative analysis of S. intermedius genome sequences suggested that TYG1620 carries a notable type VII secretion system (T7SS), two long repeat regions, and 19 ORFs for cell wall-anchored proteins (CWAPs). To elucidate the genes responsible for the pathogenicity of TYG1620, transcriptome analysis was performed in a murine subcutaneous abscess model. The results suggest that the levels of expression of small hypothetical proteins similar to phenol-soluble modulin β1 (PSMβ1), a staphylococcal virulence factor, significantly increased in the abscess model. In addition, an experiment in a murine subcutaneous abscess model with random transposon (Tn) mutant attenuation suggested that Tn mutants with mutations in 212 ORFs in the Tn mutant library were attenuated in the murine abscess model (629 ORFs were disrupted in total); the 212 ORFs are putatively essential for abscess formation. Transcriptome analysis identified 37 ORFs, including paralogs of the T7SS and a putative glucan-binding CWAP in long repeat regions, to be upregulated and attenuated in vivo. This study provides a comprehensive characterization of S. intermedius pathogenicity based on the complete genome sequence and a murine subcutaneous abscess model with transcriptome and Tn mutagenesis, leading to the identification of pivotal targets for vaccines or antimicrobial agents for the control of S. intermedius infections. Copyright © 2017 American Society for Microbiology.

  1. Changes in the transcriptome of the human endometrial Ishikawa cancer cell line induced by estrogen, progesterone, tamoxifen, and mifepristone (RU486) as detected by RNA-sequencing.

    Directory of Open Access Journals (Sweden)

    Karin Tamm-Rosenstein

    BACKGROUND: Estrogen (E2) and progesterone (P4) are key players in the maturation of the human endometrium. The corresponding steroid hormone modulators, tamoxifen (TAM) and mifepristone (RU486), are widely used in breast cancer therapy and for contraception purposes, respectively. METHODOLOGY/PRINCIPAL FINDINGS: Gene expression profiling of the human endometrial Ishikawa cancer cell line treated with E2 and P4 for 3 h and 12 h, and TAM and RU486 for 12 h, was performed using RNA-sequencing. High levels of mRNA were detected for genes, including PSAP, ATP5G2, ATP5H, and GNB2L1 following E2 or P4 treatment. A total of 82 biomarkers for endometrial biology were identified among E2 induced genes, and 93 among P4 responsive genes. Identified biomarkers included: EZH2, MDK, MUC1, SLIT2, and IL6ST, which are genes previously associated with endometrial receptivity. Moreover, 98.8% and 98.6% of E2 and P4 responsive genes in Ishikawa cells, respectively, were also detected in two human mid-secretory endometrial biopsy samples. TAM treatment exhibited both antagonistic and agonistic effects of E2, and also regulated a subset of genes independently. The cell cycle regulator cyclin D1 (CCND1) showed significant up-regulation following treatment with TAM. RU486 did not appear to act as a pure antagonist of P4 and a functional analysis of RU486 response identified genes related to adhesion and apoptosis, including down-regulated genes associated with cell-cell contacts and adhesion such as CTNND1, JUP, CDH2, IQGAP1, and COL2A1. CONCLUSIONS: Significant changes in gene expression by the Ishikawa cell line were detected after treatments with E2, P4, TAM, and RU486. These transcriptome data provide valuable insight into potential biomarkers related to endometrial receptivity, and also facilitate an understanding of the molecular changes that take place in the endometrium in the early stages of breast cancer treatment and contraception usage.

  2. Transcriptome sequencing of Atlantic salmon (Salmo salar L.) notochord prior to development of the vertebrae provides clues to regulation of positional fate, chordoblast lineage and mineralisation.

    Science.gov (United States)

    Wang, Shou; Furmanek, Tomasz; Kryvi, Harald; Krossøy, Christel; Totland, Geir K; Grotmol, Sindre; Wargelius, Anna

    2014-02-19

    In teleosts such as Atlantic salmon (Salmo salar L.), segmentation and subsequent mineralisation of the notochord during embryonic stages are essential for normal vertebrae formation. However, the molecular mechanisms leading to segmentation and mineralisation of the notochord are poorly understood. The aim of this study was to identify genes/pathways acting in gradients over time and along the anterior-posterior axis during notochord segmentation and immediately prior to mineralisation of the vertebral bodies in Atlantic salmon. Notochord samples were collected from unsegmented, pre-segmented and segmented developmental stages. In each stage, the cellular core of the notochord was cut into three pieces along the longitudinal axis (anterior, mid, posterior). RNA was sequenced (22 million paired-end 100 bp reads per library) and mapped to the salmon genome. 66569 transcripts were predicted and 55775 were annotated. In order to identify possible gradients leading to segmentation of the notochord, all 71 notochord-expressed hox genes were investigated, most of them displaying a typical anterior-posterior expression pattern along the notochord axis. The clustering of hox genes revealed a pattern that could be related to notochord segmentation. We further investigated how mineralisation is initiated in the notochord, and several factors related to chondrogenic lineage were identified (sox9, sox5, sox6, tgfb3, ihhb and col2a1), suggesting a cartilage-like character of the notochord. KEGG analysis of differentially expressed genes between stages revealed down-regulation of pathways associated with ECM, cell division, metabolism and development at onset of notochord segmentation. This implies that inhibitory signals produce segmentation of the notochord. One such potential inhibitory signal was identified, col11a2, which was detected in segments of non-mineralising notochord. An incomplete salmon genome was successfully used to analyse RNA-seq data from the cellular core of the

  3. Developmental Transcriptome for a Facultatively Eusocial Bee, Megalopta genalis

    OpenAIRE

    Jones, Beryl M.; Wcislo, William T.; Robinson, Gene E.

    2015-01-01

    Transcriptomes provide excellent foundational resources for mechanistic and evolutionary analyses of complex traits. We present a developmental transcriptome for the facultatively eusocial bee Megalopta genalis, which represents a potential transition point in the evolution of eusociality. A de novo transcriptome assembly of Megalopta genalis was generated using paired-end Illumina sequencing and the Trinity assembler. Males and females of all life stages were aligned to this transcriptome fo...

  4. A Chaos-Based Secure Direct-Sequence/Spread-Spectrum Communication System

    Directory of Open Access Journals (Sweden)

    Nguyen Xuan Quyen

    2013-01-01

    This paper proposes a chaos-based secure direct-sequence/spread-spectrum (DS/SS) communication system which is based on a novel combination of the conventional DS/SS and chaos techniques. In the proposed system, bit duration is varied according to a chaotic behavior but is always equal to a multiple of the fixed chip duration in the communication process. Data bits with variable duration are spectrum-spread by multiplying directly with a pseudonoise (PN) sequence and then modulated onto a sinusoidal carrier by means of binary phase-shift keying (BPSK). To recover exactly the data bits, the receiver needs an identical regeneration of not only the PN sequence but also the chaotic behavior, and hence data security is improved significantly. Structure and operation of the proposed system are analyzed in detail. Theoretical evaluation of bit-error rate (BER) performance in presence of additive white Gaussian noise (AWGN) is provided. Parameter choice for different cases of simulation is also considered. Simulation and theoretical results are shown to verify the reliability and feasibility of the proposed system. Security of the proposed system is also discussed.
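
    The variable-bit-duration idea in this abstract can be illustrated with a minimal baseband sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the logistic map as the chaotic generator, the 4 to 8 chip range, the seeded pseudo-random generator standing in for a PN sequence, and the omission of the BPSK carrier and channel noise.

```python
import random


def logistic_map(x, r=3.99):
    """One step of the logistic map (assumed chaotic generator, not the paper's)."""
    return r * x * (1.0 - x)


def chaotic_durations(n_bits, x0, min_chips=4, max_chips=8):
    """Turn chaotic samples into bit durations that are whole multiples of one chip."""
    durations, x = [], x0
    for _ in range(n_bits):
        x = logistic_map(x)
        durations.append(min_chips + int(x * (max_chips - min_chips + 1)))
    return durations


def pn_sequence(length, seed):
    """Seeded +/-1 sequence standing in for a real PN code (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]


def spread(bits, x0, seed):
    """Spread each bit over its own chaotic number of chips."""
    durations = chaotic_durations(len(bits), x0)
    pn = pn_sequence(sum(durations), seed)
    chips, pos = [], 0
    for bit, d in zip(bits, durations):
        symbol = 1 if bit else -1
        chips.extend(symbol * c for c in pn[pos:pos + d])
        pos += d
    return chips


def despread(chips, n_bits, x0, seed):
    """Recover bits by regenerating the same durations and PN chips, then correlating."""
    durations = chaotic_durations(n_bits, x0)
    pn = pn_sequence(sum(durations), seed)
    bits, pos = [], 0
    for d in durations:
        corr = sum(r * c for r, c in zip(chips[pos:pos + d], pn[pos:pos + d]))
        bits.append(1 if corr > 0 else 0)
        pos += d
    return bits


data = [1, 0, 1, 1, 0, 0, 1, 0]
tx = spread(data, x0=0.3141, seed=42)
rx = despread(tx, len(data), x0=0.3141, seed=42)
```

    In this noise-free sketch the receiver recovers the data only because it regenerates both the chaotic durations (from `x0`) and the PN chips (from `seed`), which mirrors the abstract's point that both must be reproduced identically.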

  5. The transcriptome of Utricularia vulgaris, a rootless plant with minimalist genome, reveals extreme alternative splicing and only moderate sequence similarity with Utricularia gibba

    Czech Academy of Sciences Publication Activity Database

    Bárta, J.; Stone, James D.; Pech, J.; Sirová, D.; Adamec, Lubomír; Campbell, M. A.; Štorchová, H.

    2015-01-01

    Vol. 15, MAR 7 (2015), pp. 1-14, no. 78 ISSN 1471-2229 R&D Projects: GA ČR(CZ) GAP504/11/0783 Institutional support: RVO:67985939 Keywords : transcriptome * root-associated genes * alternative splicing Subject RIV: EF - Botanics Impact factor: 3.631, year: 2015

  6. The transcriptome of Utricularia vulgaris, a rootless plant with minimalist genome, reveals extreme alternative splicing and only moderate sequence similarity with Utricularia gibba

    Czech Academy of Sciences Publication Activity Database

    Bárta, J.; Stone, James D.; Pech, J.; Sirová, D.; Adamec, L.; Campbell, M. A.; Štorchová, Helena

    2015-01-01

    Vol. 15, MAR 7 2015 (2015) ISSN 1471-2229 R&D Projects: GA ČR(CZ) GAP504/11/0783 Institutional support: RVO:61389030 Keywords : Transcriptome * Root-associated genes * Alternative splicing Subject RIV: EF - Botanics Impact factor: 3.631, year: 2015

  7. Microbiome and ecotypic adaption of Holcus lanatus (L.) to extremes of its soil pH range, investigated through transcriptome sequencing.

    Science.gov (United States)

    Young, Ellen; Carey, Manus; Meharg, Andrew A; Meharg, Caroline

    2018-03-20

    Plants can adapt to edaphic stress, such as nutrient deficiency, toxicity and biotic challenges, by controlled transcriptomic responses, including microbiome interactions. Traditionally studied in model plant species with controlled microbiota inoculation treatments, molecular plant-microbiome interactions can be functionally investigated via RNA-Seq. Complex, natural plant-microbiome studies are limited, typically focusing on microbial rRNA and omitting functional microbiome investigations, presenting a fundamental knowledge gap. Here, root and shoot meta-transcriptome analyses, in tandem with shoot elemental content and root staining, were employed to investigate transcriptome responses in the wild grass Holcus lanatus and its associated natural multi-species eukaryotic microbiome. A full factorial reciprocal soil transplant experiment was employed, using plant ecotypes from two widely contrasting natural habitats, acid bog and limestone quarry soil, to investigate naturally occurring, and ecologically meaningful, edaphically driven molecular plant-microbiome interactions. Arbuscular mycorrhizal (AM) and non-AM fungal colonization was detected in roots in both soils. Staining showed greater levels of non-AM fungi, and transcriptomics indicated a predominance of Ascomycota-annotated genes. Roots in acid bog soil were dominated by Phialocephala-annotated transcripts, a putative growth-promoting endophyte, potentially involved in N nutrition and ion homeostasis. Limestone roots in acid bog soil had greater expression of other Ascomycete genera and Oomycetes and lower expression of Phialocephala-annotated transcripts compared to acid ecotype roots, which corresponded with reduced induction of pathogen defense processes, particularly lignin biosynthesis in limestone ecotypes. Ascomycota dominated in shoots and limestone soil roots, but Phialocephala-annotated transcripts were insignificant, and no single Ascomycete genus dominated. Fusarium-annotated transcripts were

  8. Gradient-recalled echo sequences in direct shoulder MR arthrography for evaluating the labrum

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Marc J.; Motamedi, Kambiz; Chow, Kira; Seeger, Leanne L. [David Geffen School of Medicine at UCLA, Department of Radiology, 200 UCLA Medical Plaza, Suite 165-59, Box 956952, Los Angeles, CA (United States)

    2008-01-15

    The purpose of this study was to determine the utility of fat-suppressed gradient-recalled echo (GRE) compared with conventional spin echo T1-weighted (T1W) sequences in direct shoulder MR arthrography for evaluating labral tears. Three musculoskeletal radiologists retrospectively reviewed MR arthrograms performed over a 12-month period for which surgical correlation was available. Of 180 serial arthrograms, 31 patients had surgery with a mean of 48 days following imaging. Paired coronal oblique and axial T1W or GRE sequences were analyzed by consensus for labral tear (coronal oblique two-dimensional multi-echo data image combination, 2D MEDIC; and axial three-dimensional double-echo steady-state, 3D DESS; Siemens MAGNETOM Sonata 1.5-T MR system). Interpretations were correlated with operative reports. Of 31 shoulders, 25 had labral tears at surgery. The GRE sequences depicted labral tears in 22, while T1W images depicted tears in 16 (sensitivity 88% versus 64%; p < 0.05). Subdividing the labrum, GRE was significantly more sensitive for the posterior labrum (75% versus 25%; p < 0.05) with a trend toward greater sensitivity at the anterior labrum (78% versus 56%; p = 0.157) but not significantly different for the superior labrum (50% versus 57%; p > 0.7). Specificities were somewhat lower for GRE. Thin section GRE sequences are more sensitive than T1W for the detection of anterior and posterior labral tears. As the specificity of GRE was lower, it should be considered as an adjunctive imaging sequence that may improve depiction of labral tears, particularly smaller tears, in routine MR arthrography protocols. (orig.)
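
    The sensitivity figures quoted in this abstract follow directly from the reported counts. As a small worked check, assuming the usual definition of sensitivity as detected tears divided by surgically confirmed tears (the function name is illustrative only):

```python
def sensitivity(detected, confirmed):
    """Fraction of surgically confirmed tears that the sequence depicted."""
    if confirmed <= 0:
        raise ValueError("confirmed must be positive")
    return detected / confirmed


confirmed_tears = 25  # shoulders with labral tears at surgery
gre = sensitivity(22, confirmed_tears)  # tears depicted on GRE
t1w = sensitivity(16, confirmed_tears)  # tears depicted on T1W

print(f"GRE {gre:.0%}, T1W {t1w:.0%}")
```

    Specificity cannot be recomputed the same way, because the abstract does not report the false-positive counts among the six shoulders without tears.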

  9. The floral transcriptome of Eucalyptus grandis

    CSIR Research Space (South Africa)

    Vining, KJ

    2015-10-01

    As a step toward functional annotation of genes required for floral initiation and development within the Eucalyptus genome, we used short read sequencing to analyze transcriptomes of floral buds from early and late developmental stages...

  10. TCW: transcriptome computational workbench.

    Directory of Open Access Journals (Sweden)

    Carol Soderlund

    BACKGROUND: The analysis of transcriptome data involves many steps and various programs, along with organization of large amounts of data and results. Without a methodical approach for storage, analysis and query, the resulting ad hoc analysis can lead to human error, loss of data and results, inefficient use of time, and lack of verifiability, repeatability, and extensibility. METHODOLOGY: The Transcriptome Computational Workbench (TCW) provides Java graphical interfaces for methodical analysis for both single and comparative transcriptome data without the use of a reference genome (e.g. for non-model organisms). The singleTCW interface steps the user through importing transcript sequences (e.g. Illumina) or assembling long sequences (e.g. Sanger, 454, transcripts), annotating the sequences, and performing differential expression analysis using published statistical programs in R. The data, metadata, and results are stored in a MySQL database. The multiTCW interface builds a comparison database by importing sequence and annotation from one or more single TCW databases, executes the ESTscan program to translate the sequences into proteins, and then incorporates one or more clusterings, where the clustering options are to execute the orthoMCL program, compute transitive closure, or import clusters. Both singleTCW and multiTCW allow extensive query and display of the results, where singleTCW displays the alignment of annotation hits to transcript sequences, and multiTCW displays multiple transcript alignments with MUSCLE or pairwise alignments. The query programs can be executed on the desktop for fastest analysis, or from the web for sharing the results. CONCLUSION: It is now affordable to buy a multi-processor machine, and easy to install Java and MySQL. By simply downloading the TCW, the user can interactively analyze, query and view their data. The TCW allows in-depth data mining of the results, which can lead to a better understanding of the

  11. An adaptive digital suppression filter for direct-sequence spread-spectrum communications

    Science.gov (United States)

    Saulnier, G. J.; Das, P. K.; Milstein, L. B.

    1985-09-01

    This paper describes the structure of a digital implementation of the Widrow-Hoff LMS algorithm which uses a burst processing technique to obtain some hardware simplification. This adaptive system is used to suppress narrow-band interference in a direct-sequence spread-spectrum communication system. Several different narrow-band interferers are considered, and probability of error results are presented for all cases. While, in general, the results show significant improvement in performance when the LMS algorithm is used, certain disadvantages are also present and are discussed in this paper.
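
    The interference-suppression idea can be sketched with a plain Widrow-Hoff LMS linear predictor: a narrow-band interferer is predictable from past samples while the wide-band chip stream is not, so the prediction error keeps the chips and sheds most of the interferer. This is a textbook LMS sketch and not the burst-processing hardware structure the paper describes; the tap count, step size, and test signal are assumptions chosen for illustration.

```python
import math
import random


def lms_suppress(received, n_taps=8, mu=0.002):
    """Return the prediction error of an LMS one-step linear predictor."""
    weights = [0.0] * n_taps
    output = []
    for n in range(len(received)):
        # The n_taps most recent past samples (zero before the start of the record).
        past = [received[n - k] if n - k >= 0 else 0.0 for k in range(1, n_taps + 1)]
        prediction = sum(w * x for w, x in zip(weights, past))
        error = received[n] - prediction
        # Widrow-Hoff update: move the weights along the instantaneous gradient.
        weights = [w + 2.0 * mu * error * x for w, x in zip(weights, past)]
        output.append(error)
    return output


def demo(n_samples=2000, seed=0):
    """Compare mean-squared error against the true chips before and after filtering."""
    rng = random.Random(seed)
    chips = [rng.choice((-1.0, 1.0)) for _ in range(n_samples)]
    interferer = [3.0 * math.sin(0.3 * n) for n in range(n_samples)]  # assumed tone
    received = [c + i for c, i in zip(chips, interferer)]
    cleaned = lms_suppress(received)
    half = n_samples // 2  # skip the adaptation transient

    def mse(signal):
        return sum((s - c) ** 2 for s, c in zip(signal[half:], chips[half:])) / (n_samples - half)

    return mse(received), mse(cleaned)


mse_before, mse_after = demo()
```

    The filter output still carries some distortion of the chip stream introduced by the predictor taps, which is one plausible reading of the "certain disadvantages" the abstract mentions, although the abstract itself does not specify them.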

  12. Underground localization using dual magnetic field sequence measurement and pose graph SLAM for directional drilling

    International Nuclear Information System (INIS)

    Park, Byeolteo; Myung, Hyun

    2014-01-01

    With the development of unconventional gas, the technology of directional drilling has become more advanced. Underground localization is the key technique of directional drilling for real-time path following and system control. However, there are problems such as vibration, disconnection with external infrastructure, and magnetic field distortion. Conventional methods cannot solve these problems in real time or in various environments. In this paper, a novel underground localization algorithm using a re-measurement of the sequence of the magnetic field and pose graph SLAM (simultaneous localization and mapping) is introduced. The proposed algorithm exploits the property of the drilling system that the body passes through the previous pass. By comparing the recorded measurement from one magnetic sensor and the current re-measurement from another magnetic sensor, the proposed algorithm predicts the pose of the drilling system. The performance of the algorithm is validated through simulations and experiments. (paper)

  13. Underground localization using dual magnetic field sequence measurement and pose graph SLAM for directional drilling

    Science.gov (United States)

    Park, Byeolteo; Myung, Hyun

    2014-12-01

    With the development of unconventional gas, the technology of directional drilling has become more advanced. Underground localization is the key technique of directional drilling for real-time path following and system control. However, there are problems such as vibration, disconnection with external infrastructure, and magnetic field distortion. Conventional methods cannot solve these problems in real time or in various environments. In this paper, a novel underground localization algorithm using a re-measurement of the sequence of the magnetic field and pose graph SLAM (simultaneous localization and mapping) is introduced. The proposed algorithm exploits the property of the drilling system that the body passes through the previous pass. By comparing the recorded measurement from one magnetic sensor and the current re-measurement from another magnetic sensor, the proposed algorithm predicts the pose of the drilling system. The performance of the algorithm is validated through simulations and experiments.

  14. An Extended Multilocus Sequence Typing (MLST) Scheme for Rapid Direct Typing of Leptospira from Clinical Samples.

    Directory of Open Access Journals (Sweden)

    Sabrina Weiss

    2016-09-01

    Rapid typing of Leptospira is currently impaired by requiring time-consuming culture of leptospires. The objective of this study was to develop an assay that provides multilocus sequence typing (MLST) data direct from patient specimens while minimising costs for subsequent sequencing. An existing PCR based MLST scheme was modified by designing nested primers including anchors for facilitated subsequent sequencing. The assay was applied to various specimen types from patients diagnosed with leptospirosis between 2014 and 2015 in the United Kingdom (UK) and the Lao People's Democratic Republic (Lao PDR). Of 44 clinical samples (23 serum, 6 whole blood, 3 buffy coat, 12 urine) PCR positive for pathogenic Leptospira spp., at least one allele was amplified in 22 samples (50%) and used for phylogenetic inference. Full allelic profiles were obtained from ten specimens, representing all sample types (23%). No nonspecific amplicons were observed in any of the samples. Of twelve PCR positive urine specimens three gave full allelic profiles (25%) and two a partial profile. Phylogenetic analysis allowed for species assignment. The predominant species detected was L. interrogans (10/14 and 7/8 from UK and Lao PDR, respectively). All other species were detected in samples from only one country (Lao PDR: L. borgpetersenii [1/8]; UK: L. kirschneri [1/14], L. santarosai [1/14], L. weilii [2/14]). Typing information of pathogenic Leptospira spp. was obtained directly from a variety of clinical samples using a modified MLST assay. This assay negates the need for time-consuming culture of Leptospira prior to typing and will be of use both in surveillance, as single alleles enable species determination, and outbreaks for the rapid identification of clusters.

  15. Current Knowledge and Recent Advances in Marine Dinoflagellate Transcriptomic Research

    Directory of Open Access Journals (Sweden)

    Muhamad Afiq Akbar

    2018-02-01

    Dinoflagellates are essential components in marine ecosystems, and they possess two dissimilar flagella to facilitate movement. Dinoflagellates are major components of marine food webs and of extreme importance in balancing the ecosystem energy flux in oceans. They have been reported to be the primary cause of harmful algal bloom (HAB) events around the world, causing seafood poisoning and therefore having a direct impact on human health. Interestingly, dinoflagellates in the genus Symbiodinium are major components of coral reef foundations. Knowledge regarding their genes and genome organization is currently limited due to their large genome size and other genetic and cytological characteristics that hinder whole genome sequencing of dinoflagellates. Transcriptomic approaches and genetic analyses have been employed to unravel the physiological and metabolic characteristics of dinoflagellates and their complexity. In this review, we summarize the current knowledge and findings from transcriptomic studies to understand the cell growth, effects of environmental stress, toxin biosynthesis, dynamics of HABs, phylogeny and endosymbiosis of dinoflagellates. With the advancement of high throughput sequencing technologies and lower cost of sequencing, transcriptomic approaches will likely deepen our understanding in other aspects of dinoflagellates’ molecular biology such as gene functional analysis, systems biology and development of model organisms.

  16. De novo transcriptome sequencing and comparative analysis of midgut tissues of four non-model insects pertaining to Hemiptera, Coleoptera, Diptera and Lepidoptera.

    Science.gov (United States)

    Gazara, Rajesh K; Cardoso, Christiane; Bellieny-Rabelo, Daniel; Ferreira, Clélia; Terra, Walter R; Venancio, Thiago M

    2017-09-05

    Despite the great morphological diversity of insects, there is a regularity in their digestive functions, which is apparently related to their physiology. In the present work we report the de novo midgut transcriptomes of four non-model insects from four distinct orders: Spodoptera frugiperda (Lepidoptera), Musca domestica (Diptera), Tenebrio molitor (Coleoptera) and Dysdercus peruvianus (Hemiptera). We employed a computational strategy to merge assemblies obtained with two different algorithms, which substantially increased the quality of the final transcriptomes. Unigenes were annotated and analyzed using the eggNOG database, which allowed us to assign some level of functional and evolutionary information to 79.7% to 93.1% of the transcriptomes. We found interesting transcriptional patterns, such as: i) the intense use of lysozymes in digestive functions of M. domestica larvae, which are streamlined and adapted to feed on bacteria; ii) the up-regulation of orthologous UDP-glycosyl transferase and cytochrome P450 genes in the whole midguts of different species, supporting the existence of an ancient defense frontline to counter xenobiotics; iii) evidence supporting roles for juvenile hormone binding proteins in the midgut physiology, probably as a way to activate genes that help fight anti-nutritional substances (e.g. protease inhibitors). The results presented here shed light on the digestive and structural properties of the digestive systems of these distantly related species. Furthermore, the produced datasets will also be useful for scientists studying these insects. Copyright © 2017. Published by Elsevier B.V.

  17. Genome sequencing and transcriptome analysis of Trichoderma reesei QM9978 strain reveals a distal chromosome translocation to be responsible for loss of vib1 expression and loss of cellulase induction.

    Science.gov (United States)

    Ivanova, Christa; Ramoni, Jonas; Aouam, Thiziri; Frischmann, Alexa; Seiboth, Bernhard; Baker, Scott E; Le Crom, Stéphane; Lemoine, Sophie; Margeot, Antoine; Bidard, Frédérique

    2017-01-01

    The hydrolysis of biomass to simple sugars used for the production of biofuels in biorefineries requires the action of cellulolytic enzyme mixtures. During the last 50 years, the ascomycete Trichoderma reesei, the main source of industrial cellulase and hemicellulase cocktails, has been subjected to several rounds of classical mutagenesis with the aim to obtain higher production levels. During these random genetic events, strains unable to produce cellulases were generated. Here, whole genome sequencing and transcriptomic analyses of the cellulase-negative strain QM9978 were used for the identification of mutations underlying this cellulase-negative phenotype. Sequence comparison of the cellulase-negative strain QM9978 to the reference strain QM6a identified a total of 43 mutations, of which 33 were located either close to or in coding regions. From those, we identified 23 single-nucleotide variants, nine InDels, and one translocation. The translocation occurred between chromosomes V and VII, is located upstream of the putative transcription factor vib1, and abolishes its expression in QM9978 as detected during the transcriptomic analyses. Ectopic expression of vib1 under the control of its native promoter as well as overexpression of vib1 under the control of a strong constitutive promoter restored cellulase expression in QM9978, thus confirming that the translocation event is the reason for the cellulase-negative phenotype. Gene deletion of vib1 in the moderate producer strain QM9414 and in the high producer strain Rut-C30 reduced cellulase expression in both cases. Overexpression of vib1 in QM9414 and Rut-C30 had no effect on cellulase production, most likely because vib1 is already expressed at an optimal level under normal conditions. We were able to establish a link between a chromosomal translocation in QM9978 and the cellulase-negative phenotype of the strain. We identified the transcription factor vib1 as a key regulator of cellulases in T. reesei whose

  18. Development of a Single Locus Sequence Typing (SLST) Scheme for Typing Bacterial Species Directly from Complex Communities.

    Science.gov (United States)

    Scholz, Christian F P; Jensen, Anders

    2017-01-01

    The protocol describes a computational method to develop a Single Locus Sequence Typing (SLST) scheme for typing bacterial species. The resulting scheme can be used to type bacterial isolates as well as bacterial species directly from complex communities using next-generation sequencing technologies.

  19. Complete Genome Sequence of the Goatpox Virus Strain Gorgan Obtained Directly from a Commercial Live Attenuated Vaccine

    Science.gov (United States)

    Mathijs, Elisabeth; Vandenbussche, Frank; Haegeman, Andy; Al-Majali, Ahmad; De Clercq, Kris

    2016-01-01

    This is a report of the complete genome sequence of the goatpox virus strain Gorgan, which was obtained directly from a commercial live attenuated vaccine (Caprivac, Jordan Bio-Industries Centre). PMID:27738031

  20. Collective properties of injection-induced earthquake sequences: 1. Model description and directivity bias

    Science.gov (United States)

    Dempsey, David; Suckale, Jenny

    2016-05-01

    Induced seismicity is of increasing concern for oil and gas, geothermal, and carbon sequestration operations, with several M > 5 events triggered in recent years. Modeling plays an important role in understanding the causes of this seismicity and in constraining seismic hazard. Here we study the collective properties of induced earthquake sequences and the physics underpinning them. In this first paper of a two-part series, we focus on the directivity ratio, which quantifies whether fault rupture is dominated by one (unilateral) or two (bilateral) propagating fronts. In a second paper, we focus on the spatiotemporal and magnitude-frequency distributions of induced seismicity. We develop a model that couples a fracture mechanics description of 1-D fault rupture with fractal stress heterogeneity and the evolving pore pressure distribution around an injection well that triggers earthquakes. The extent of fault rupture is calculated from the equations of motion for two tips of an expanding crack centered at the earthquake hypocenter. Under tectonic loading conditions, our model exhibits a preference for unilateral rupture and a normal distribution of hypocenter locations, two features that are consistent with seismological observations. On the other hand, catalogs of induced events when injection occurs directly onto a fault exhibit a bias toward ruptures that propagate toward the injection well. This bias is due to relatively favorable conditions for rupture that exist within the high-pressure plume. The strength of the directivity bias depends on a number of factors including the style of pressure buildup, the proximity of the fault to failure and event magnitude. For injection off a fault that triggers earthquakes, the modeled directivity bias is small and may be too weak for practical detection. For two hypothetical injection scenarios, we estimate the number of earthquake observations required to detect directivity bias.

  1. Direct sequencing of mitochondrial DNA detects highly divergent haplotypes in blue marlin (Makaira nigricans).

    Science.gov (United States)

    Finnerty, J R; Block, B A

    1992-06-01

    We were able to differentiate between species of billfish (Istiophoridae family) and to detect considerable intraspecific variation in the blue marlin (Makaira nigricans) by directly sequencing a polymerase chain reaction (PCR)-amplified, 612-bp fragment of the mitochondrial cytochrome b gene. Thirteen variable nucleotide sites separated blue marlin (n = 26) into 7 genotypes. On average, these genotypes differed by 5.7 base substitutions. A smaller sample of swordfish from an equally broad geographic distribution displayed relatively little intraspecific variation, with an average of 1.3 substitutions separating different genotypes. A cladistic analysis of blue marlin cytochrome b variants indicates two major divergent evolutionary lines within the species. The frequencies of these two major evolutionary lines differ significantly between Atlantic and Pacific ocean basins. This finding is important given that the Atlantic stocks of blue marlin are considered endangered. Migration from the Pacific can help replenish the numbers of blue marlin in the Atlantic, but the loss of certain mitochondrial DNA haplotypes in the Atlantic due to overfishing probably could not be remedied by an influx of Pacific fish because of their absence in the Pacific population. Fishery management strategies should attempt to preserve the genetic diversity within the species. The detection of DNA sequence polymorphism indicates the utility of PCR technology in pelagic fishery genetics.
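
    A figure such as "genotypes differed by 5.7 base substitutions on average" is typically a mean over pairwise comparisons of aligned sequences. The abstract does not state exactly how its averages were computed, so the sketch below is only one plausible reading, and the short toy sequences are invented placeholders, not blue marlin data.

```python
from itertools import combinations


def substitutions(a, b):
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def mean_pairwise_substitutions(haplotypes):
    """Average substitution count over all unordered pairs of haplotypes."""
    pairs = list(combinations(haplotypes, 2))
    if not pairs:
        raise ValueError("need at least two haplotypes")
    return sum(substitutions(a, b) for a, b in pairs) / len(pairs)


# Toy aligned fragments (hypothetical), standing in for 612-bp cytochrome b reads.
toy_haplotypes = ["ACGTACGT", "ACGTACGA", "TCGTACGA"]
mean_diff = mean_pairwise_substitutions(toy_haplotypes)
```

    With these three toy sequences the pairwise counts are 1, 2 and 1, giving a mean of 4/3 substitutions.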

  2. A Remote Direct Sequence Spread Spectrum Communications Lab Utilising the Emona DATEx

    Directory of Open Access Journals (Sweden)

    Cosmas Mwikirize

    2012-12-01

    Remote labs have become popular learning aids due to their versatility and considerable ease of utilisation as compared to their physical counterparts. At Makerere University, the remote labs are based on the standard Massachusetts Institute of Technology (MIT) iLabs Shared Architecture (ISA), a scalable and generic platform. Presented in this paper is such a lab, addressing the key practical aspects of Direct Sequence Spread Spectrum (DSSS) communication. The lab is built on the National Instruments Educational Laboratory Virtual Instrumentation Suite (NI ELVIS) with the Emona Digital and Analog Telecommunications Experimenter (DATEx) add-on board. It also incorporates switching hardware. The lab facilitates real-time control of the equipment, with users able to set, manipulate and observe signal parameters in both the frequency and the time domains. Simulation and data acquisition modes of the experiment are supported to provide a richer learning experience.

  3. Electrochemical direct immobilization of DNA sequences for label-free herpes virus detection

    Science.gov (United States)

    Tam, Phuong Dinh; Trung, Tran; Tuan, Mai Anh; Chien, Nguyen Duc

    2009-09-01

    DNA sequences/bio-macromolecules of herpes virus (5'-AT CAC CGA CCC GGA GAG GGA C-3') were directly immobilized into a polypyrrole matrix by using the cyclic voltammetry method, and grafted onto arrays of interdigitated platinum microelectrodes. The surface morphology of the obtained PPy/DNA of herpes virus composite films was investigated by a FESEM Hitachi-S 4800. Fourier transform infrared spectroscopy (FTIR) was used to characterize the PPy/DNA film and to study the specific interactions that may exist between DNA biomacromolecules and PPy chains. Attempts to use these PPy/DNA composite films for label-free herpes virus detection revealed a response time of 60 s in solutions containing as little as 2 nM DNA, and a shelf life of six months when immersed in double-distilled water and kept refrigerated.

  4. Chromatic dispersion compensation and Coherent Direct-Sequence OCDMA operation on a single super structured FBG.

    Science.gov (United States)

    Baños, Rocío; Pastor, Daniel; Amaya, Waldimar; Garcia-Munoz, Victor

    2012-06-18

    We have proposed, fabricated and demonstrated experimentally a set of Coherent Direct Sequence-OCDMA en/decoders based on Super Structured Fiber Bragg Gratings (SSFBGs) which are able to compensate the fiber chromatic dispersion at the same time that they perform the en/decoding task. The proposed devices avoid the use of additional dispersion compensation stages reducing system complexity and losses. This performance was evaluated for 5.4, 11.4 and 16.8 km of SSMF. The twofold performance was verified in Low Reflectivity regime employing only one GVD compensating device at decoder or sharing out the function between encoder and decoder devices. Shared functionality requires shorter SSFBGs designs and also provides added flexibility to the optical network design. Moreover, dispersion compensated en/decoders were also designed into the High Reflectivity regime employing synthesis methods achieving more than 9 dB reduction of insertion loss for each device.

  5. Electrochemical direct immobilization of DNA sequences for label-free herpes virus detection

    International Nuclear Information System (INIS)

    Phuong Dinh Tam; Mai Anh Tuan; Tran Trung; Nguyen Duc Chien

    2009-01-01

    DNA sequences/bio-macromolecules of herpes virus (5'-AT CAC CGA CCC GGA GAG GGA C-3') were directly immobilized into a polypyrrole matrix by using the cyclic voltammetry method, and grafted onto arrays of interdigitated platinum microelectrodes. The surface morphology of the obtained PPy/DNA of herpes virus composite films was investigated by a FESEM Hitachi-S 4800. Fourier transform infrared spectroscopy (FTIR) was used to characterize the PPy/DNA film and to study the specific interactions that may exist between DNA biomacromolecules and PPy chains. Attempts to use these PPy/DNA composite films for label-free herpes virus detection revealed a response time of 60 s in solutions containing as little as 2 nM DNA, and a shelf life of six months when immersed in double-distilled water and kept refrigerated.

  6. Electrochemical direct immobilization of DNA sequences for label-free herpes virus detection

    Energy Technology Data Exchange (ETDEWEB)

    Phuong Dinh Tam; Mai Anh Tuan [International Training Institute for Materials Science (Viet Nam); Tran Trung [Department of Electrochemistry, Hung-Yen University of Technology and Education (Viet Nam); Nguyen Duc Chien [Institute of Engineering Physics, Hanoi University of Technology, 1 Dai Co Viet Road, Hanoi (Viet Nam)], E-mail: tr_trunghut@yahoo.com

    2009-09-01

    DNA sequences/bio-macromolecules of herpes virus (5'-AT CAC CGA CCC GGA GAG GGA C-3') were directly immobilized into a polypyrrole matrix by using the cyclic voltammetry method, and grafted onto arrays of interdigitated platinum microelectrodes. The surface morphology of the obtained PPy/DNA of herpes virus composite films was investigated by a FESEM Hitachi-S 4800. Fourier transform infrared spectroscopy (FTIR) was used to characterize the PPy/DNA film and to study the specific interactions that may exist between DNA biomacromolecules and PPy chains. Attempts to use these PPy/DNA composite films for label-free herpes virus detection revealed a response time of 60 s in solutions containing as little as 2 nM DNA, and a shelf life of six months when immersed in double-distilled water and kept refrigerated.

  7. Massive sequencing of Ulmus minor's transcriptome provides new molecular tools for a genus under the constant threat of Dutch elm disease

    Directory of Open Access Journals (Sweden)

    Pedro Perdiguero

    2015-07-01

    Elms, especially Ulmus minor and Ulmus americana, are fighting a hard battle against Dutch elm disease (DED). This vascular wilt disease, caused by Ophiostoma ulmi and O. novo-ulmi, appeared in the twentieth century and killed millions of elms across North America and Europe. Elm breeding and conservation programmes have identified a reduced number of DED-tolerant genotypes. In this study, three U. minor genotypes with contrasting levels of tolerance to DED were exposed to several biotic and abiotic stresses in order to (i) obtain a de novo assembled transcriptome of U. minor using 454 pyrosequencing, (ii) perform a functional annotation of the assembled transcriptome, (iii) identify genes potentially involved in the molecular response to environmental stress, and (iv) develop gene-based markers to support breeding programmes. A total of 58,429 putative unigenes were identified after assembly and filtering of the transcriptome. Of these unigenes, 32,152 showed homology with proteins identified in the genomes of the most common plant model species. Well-known protein families and transcription factors involved in abiotic, biotic or both stresses were identified after functional annotation. A total of 30,693 polymorphisms were identified in 7,125 isotigs, a large number of them corresponding to SNPs (27,359). In a subset randomly selected for validation, 87% of the SNPs were confirmed. The material generated may be valuable for future Ulmus gene expression, population genomics and association genetics studies, especially taking into account the scarce molecular information available for this genus and the great impact that DED has on elm populations.

  8. Genetic Barrier to Direct Acting Antivirals in HCV Sequences Deposited in the European Databank.

    Directory of Open Access Journals (Sweden)

    Dimas Alexandre Kliemann

    /N/R positions required only one transition for up to 98.8% of the sequences analyzed. A single variant in position 448 in genotype 1a is less likely to become the resistance variant 448H because it requires two transversions. Also, at position 559 a transversion and a transition were necessary to generate the resistance mutant D559H. Results revealed that in 14 out of 16 positions, conversion to a drug-resistant variant of HCV required only a single nucleotide substitution, threatening direct-acting antivirals from all three classes.
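
    The abstract above ranks resistance variants by how many transitions and transversions separate the observed codon from a resistant one. A minimal Python sketch of that classification follows; it is not the authors' pipeline, the function names are hypothetical, and the codon pair in the usage note is purely illustrative (the record does not state which codons were observed).

```python
# Purines (A, G) and pyrimidines (C, T): a transition stays within one class,
# a transversion crosses between the two classes.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'none', 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in PURINES | PYRIMIDINES or alt not in PURINES | PYRIMIDINES:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        return "none"
    if {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES:
        return "transition"
    return "transversion"


def codon_changes(codon_from: str, codon_to: str) -> list:
    """List the substitution type at each codon position that differs."""
    if len(codon_from) != 3 or len(codon_to) != 3:
        raise ValueError("codons must be exactly three bases long")
    return [
        classify_substitution(a, b)
        for a, b in zip(codon_from, codon_to)
        if a.upper() != b.upper()
    ]


# Hypothetical example: GAT (Asp) to CAC (His) needs two changes.
print(codon_changes("GAT", "CAC"))
```

    For the illustrative pair GAT to CAC, the first position changes G to C (a transversion) and the third changes T to C (a transition), the kind of two-step path the abstract describes as a higher genetic barrier.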

  9. De novo transcriptome sequencing and digital gene expression analysis predict biosynthetic pathway of rhynchophylline and isorhynchophylline from Uncaria rhynchophylla, a non-model plant with potent anti-alzheimer's properties.

    Science.gov (United States)

    Guo, Qianqian; Ma, Xiaojun; Wei, Shugen; Qiu, Deyou; Wilson, Iain W; Wu, Peng; Tang, Qi; Liu, Lijun; Dong, Shoukun; Zu, Wei

    2014-08-12

    The major medicinal alkaloids isolated from Uncaria rhynchophylla (Gouteng in Chinese) capsules are rhynchophylline (RIN) and isorhynchophylline (IRN). Extracts containing these terpene indole alkaloids (TIAs) can inhibit the formation and destabilize preformed fibrils of amyloid β protein (a pathological marker of Alzheimer's disease), and have been shown to improve the cognitive function of mice with Alzheimer-like symptoms. The biosynthetic pathways of RIN and IRN are largely unknown. In this study, RNA-sequencing was performed on pooled Uncaria capsule RNA samples taken at three developmental stages that accumulate different amounts of RIN and IRN. More than 50 million high-quality reads from a cDNA library were generated and de novo assembled. Sequences for all of the known enzymes involved in TIA synthesis were identified. Additionally, 193 cytochrome P450 (CYP450), 280 methyltransferase and 144 isomerase genes were identified that are potential candidates for enzymes involved in RIN and IRN synthesis. Digital gene expression profile (DGE) analysis was performed on the three capsule developmental stages, and based on genes possessing expression profiles consistent with RIN and IRN levels, four CYP450s, three methyltransferases and three isomerases were identified as the candidates most likely to be involved in the later steps of RIN and IRN biosynthesis. A combination of de novo transcriptome assembly and DGE analysis was shown to be a powerful method for identifying genes encoding enzymes potentially involved in the biosynthesis of important secondary metabolites in a non-model plant. The transcriptome data from this study provide an important resource for understanding the formation of major bioactive constituents in the capsule extract from Uncaria, and provide information that may aid in metabolic engineering to increase yields of these important alkaloids.

  10. Structured Semantic Knowledge Can Emerge Automatically from Predicting Word Sequences in Child-Directed Speech

    Directory of Open Access Journals (Sweden)

    Philip A. Huebner

    2018-02-01

    Previous research has suggested that distributional learning mechanisms may contribute to the acquisition of semantic knowledge. However, distributional learning mechanisms, statistical learning, and contemporary “deep learning” approaches have been criticized for being incapable of learning the kind of abstract and structured knowledge that many think is required for acquisition of semantic knowledge. In this paper, we show that recurrent neural networks, trained on noisy naturalistic speech to children, do in fact learn what appears to be abstract and structured knowledge. We trained two types of recurrent neural networks (Simple Recurrent Network, and Long Short-Term Memory) to predict word sequences in a 5-million-word corpus of speech directed to children ages 0–3 years old, and assessed what semantic knowledge they acquired. We found that learned internal representations are encoding various abstract grammatical and semantic features that are useful for predicting word sequences. Assessing the organization of semantic knowledge in terms of the similarity structure, we found evidence of emergent categorical and hierarchical structure in both models. We found that the Long Short-term Memory (LSTM) and SRN are both learning very similar kinds of representations, but the LSTM achieved higher levels of performance on a quantitative evaluation. We also trained a non-recurrent neural network, Skip-gram, on the same input to compare our results to the state-of-the-art in machine learning. We found that Skip-gram achieves relatively similar performance to the LSTM, but is representing words more in terms of thematic compared to taxonomic relations, and we provide reasons why this might be the case. Our findings show that a learning system that derives abstract, distributed representations for the purpose of predicting sequential dependencies in naturalistic language may provide insight into emergence of many properties of the developing

  11. Structured Semantic Knowledge Can Emerge Automatically from Predicting Word Sequences in Child-Directed Speech

    Science.gov (United States)

    Huebner, Philip A.; Willits, Jon A.

    2018-01-01

    Previous research has suggested that distributional learning mechanisms may contribute to the acquisition of semantic knowledge. However, distributional learning mechanisms, statistical learning, and contemporary “deep learning” approaches have been criticized for being incapable of learning the kind of abstract and structured knowledge that many think is required for acquisition of semantic knowledge. In this paper, we show that recurrent neural networks, trained on noisy naturalistic speech to children, do in fact learn what appears to be abstract and structured knowledge. We trained two types of recurrent neural networks (Simple Recurrent Network, and Long Short-Term Memory) to predict word sequences in a 5-million-word corpus of speech directed to children ages 0–3 years old, and assessed what semantic knowledge they acquired. We found that learned internal representations are encoding various abstract grammatical and semantic features that are useful for predicting word sequences. Assessing the organization of semantic knowledge in terms of the similarity structure, we found evidence of emergent categorical and hierarchical structure in both models. We found that the Long Short-term Memory (LSTM) and SRN are both learning very similar kinds of representations, but the LSTM achieved higher levels of performance on a quantitative evaluation. We also trained a non-recurrent neural network, Skip-gram, on the same input to compare our results to the state-of-the-art in machine learning. We found that Skip-gram achieves relatively similar performance to the LSTM, but is representing words more in terms of thematic compared to taxonomic relations, and we provide reasons why this might be the case. Our findings show that a learning system that derives abstract, distributed representations for the purpose of predicting sequential dependencies in naturalistic language may provide insight into emergence of many properties of the developing semantic system.

  12. Direct sequencing of FAH gene in Pakistani tyrosinemia type 1 families reveals a novel mutation.

    Science.gov (United States)

    Ijaz, Sadaqat; Zahoor, Muhammad Yasir; Imran, Muhammad; Afzal, Sibtain; Bhinder, Munir A; Ullah, Ihsan; Cheema, Huma Arshad; Ramzan, Khushnooda; Shehzad, Wasim

    2016-03-01

    Hereditary tyrosinemia type 1 (HT1) is a rare inborn error of tyrosine catabolism with a worldwide prevalence of one out of 100,000 live births. HT1 is clinically characterized by hepatic and renal dysfunction resulting from the deficiency of fumarylacetoacetate hydrolase (FAH) enzyme, caused by recessive mutations in the FAH gene. We present here the first report on identification of FAH mutations in HT1 patients from Pakistan, including a novel one. Three Pakistani families, each having one child affected with HT1, were enrolled over a period of 1.5 years. Two of the affected children had died, having presented late with the acute form. All regions of the FAH gene spanning exons and splicing sites were amplified by polymerase chain reaction (PCR) and mutation analysis was carried out by direct sequencing. Results of sequencing were confirmed by restriction fragment length polymorphism (PCR-RFLP) analysis. Three different FAH mutations, one in each family, were found to co-segregate with the disease phenotype. Two of these FAH mutations were previously known (c.192G>T and c.1062+5G>A [IVS12+5G>A]), while c.67T>C (p.Ser23Pro) was a novel mutation. The novel variant was not detected in any of 120 chromosomes from normal ethnically matched individuals. Most HT1 patients in Pakistan die before they present to hospitals, as indicated by the enrollment of only three families in 1.5 years. Most of those with late clinical presentation do not survive because of delayed diagnosis followed by untimely treatment. This tragic situation argues for the establishment of an expanded newborn screening program for HT1 within Pakistan.

  13. CBrowse: a SAM/BAM-based contig browser for transcriptome assembly visualization and analysis.

    Science.gov (United States)

    Li, Pei; Ji, Guoli; Dong, Min; Schmidt, Emily; Lenox, Douglas; Chen, Liangliang; Liu, Qi; Liu, Lin; Zhang, Jie; Liang, Chun

    2012-09-15

    To address the pressing need to explore the rapidly growing transcriptomics data generated for non-model organisms, we developed CBrowse, an AJAX-based web browser for visualizing and analyzing transcriptome assemblies and contigs. Designed in a standard three-tier architecture with a data pre-processing pipeline, CBrowse is essentially a Rich Internet Application that offers many seamlessly integrated web interfaces and allows users to navigate, sort, filter, search and visualize data smoothly. The pre-processing pipeline takes the contig sequence file in FASTA format and its relevant SAM/BAM file as the input; detects putative polymorphisms, simple sequence repeats and sequencing errors in contigs; and generates image, JSON and database-compatible CSV text files that are directly utilized by different web interfaces. CBrowse is a generic visualization and analysis tool that facilitates close examination of assembly quality, genetic polymorphisms, sequence repeats and/or sequencing errors in transcriptome sequencing projects. CBrowse is distributed under the GNU General Public License, available at http://bioinfolab.muohio.edu/CBrowse/. Contact: liangc@muohio.edu or liangc.mu@gmail.com; glji@xmu.edu.cn. Supplementary data are available at Bioinformatics online.

  14. A comparison of EGFR mutation testing methods in lung carcinoma: direct sequencing, real-time PCR and immunohistochemistry.

    Directory of Open Access Journals (Sweden)

    Bárbara Angulo

    The objective of this study is to compare two EGFR testing methodologies (a commercial real-time PCR kit and a specific EGFR mutant immunohistochemistry) with direct sequencing and to investigate the limit of detection (LOD) of both PCR-based methods. We identified EGFR mutations in 21 (16%) of the 136 tumours analyzed by direct sequencing. Interestingly, the Therascreen EGFR Mutation Test kit was able to characterize as wild-type one tumour that could not be analyzed by direct sequencing of the PCR product. We then compared the LOD of the kit and that of direct sequencing using the available mutant tumours. The kit was able to detect the presence of a mutation in a 1% dilution of the total DNA in nine of the 18 tumours (50%) that tested positive with the real-time quantitative PCR method. In all cases, EGFR mutation was identified at a dilution of 5%. Where the mutant DNA represented 30% of the total DNA, sequencing was able to detect mutations in 12 out of 19 cases (63%). Additional experiments with genetically defined standards (EGFR ΔE746-A750/+ and EGFR L858R/+) yielded similar results. Immunohistochemistry (IHC) staining with exon 19-specific antibody was seen in eight out of nine cases with E746-A750del detected by direct sequencing. Neither of the two tumours with complex deletions was positive. Of the five L858R-mutated tumours detected by the PCR methods, only two were positive for the exon 21-specific antibody. The specificity was 100% for both antibodies. The LOD of the real-time PCR method was lower than that of direct sequencing. The mutation-specific IHC produced excellent specificity.

  15. Transcriptome sequencing revealed differences in the response of renal cancer cells to hypoxia and CoCl2 treatment [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Nadezhda Zhigalova

    2015-12-01

    Human cancer cells are subjected to hypoxic conditions in many tumours. Hypoxia causes alterations in the glycolytic pathway activation through stabilization of hypoxia-inducible factor 1. Currently, two approaches are commonly used to model hypoxia: as an alternative to generating low-oxygen conditions in an incubator, cells can be treated with CoCl2. We performed RNA-seq experiments to study transcriptomes of human Caki-1 cells under real hypoxia and after CoCl2 treatment. Despite causing transcriptional changes of a much higher order of magnitude for the genes in the hypoxia regulation pathway, CoCl2 treatment fails to induce alterations in the glycolysis / gluconeogenesis pathway. Moreover, CoCl2 caused aberrant activation of other oxidoreductases in glycine, serine and threonine metabolism pathways.

  16. Developmental gene discovery in a hemimetabolous insect: de novo assembly and annotation of a transcriptome for the cricket Gryllus bimaculatus.

    Directory of Open Access Journals (Sweden)

    Victor Zeng

    Most genomic resources available for insects represent the Holometabola, which are insects that undergo complete metamorphosis like beetles and flies. In contrast, the Hemimetabola (direct developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We annotated the transcriptome manually for over 400 conserved genes involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is to our knowledge the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in

  17. Deep sequencing analysis of the transcriptomes of peanut aerial and subterranean young pods identifies candidate genes related to early embryo abortion.

    Science.gov (United States)

    Chen, Xiaoping; Zhu, Wei; Azam, Sarwar; Li, Heying; Zhu, Fanghe; Li, Haifen; Hong, Yanbin; Liu, Haiyan; Zhang, Erhua; Wu, Hong; Yu, Shanlin; Zhou, Guiyuan; Li, Shaoxiong; Zhong, Ni; Wen, Shijie; Li, Xingyu; Knapp, Steve J; Ozias-Akins, Peggy; Varshney, Rajeev K; Liang, Xuanqiang

    2013-01-01

    The failure of peg penetration into the soil leads to seed abortion in peanut. Knowledge of genes involved in these processes is comparatively deficient. Here, we used RNA-seq to gain insights into transcriptomes of aerial and subterranean pods. More than 2 million transcript reads with an average length of 396 bp were generated from one aerial (AP) and two subterranean (SP1 and SP2) pod libraries using pyrosequencing technology. After assembly, sets of 49 632, 49 952 and 50 494 from a total of 74 974 transcript assembly contigs (TACs) were identified in AP, SP1 and SP2, respectively. A clear linear relationship in the gene expression level was observed between these data sets. In brief, 2194 differentially expressed TACs with a 99.0% true-positive rate were identified, among which 859 and 1068 TACs were up-regulated in aerial and subterranean pods, respectively. Functional analysis showed that putative function based on similarity with proteins catalogued in UniProt and gene ontology term classification could be determined for 59 342 (79.2%) and 42 955 (57.3%) TACs, respectively. A total of 2968 TACs were mapped to 174 KEGG pathways, of which 168 were shared by aerial and subterranean transcriptomes. TACs involved in photosynthesis were significantly up-regulated and enriched in the aerial pod. In addition, two senescence-associated genes were identified as significantly up-regulated in the aerial pod, which potentially contribute to embryo abortion in aerial pods, and in turn, to cessation of swelling. The data set generated in this study provides evidence for some functional genes as robust candidates underlying aerial and subterranean pod development and contributes to an elucidation of the evolutionary implications resulting from fruit development under light and dark conditions. © 2012 The Authors Plant Biotechnology Journal © 2012 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.

  18. De novo transcriptome sequencing of black pepper (Piper nigrum L.) and an analysis of genes involved in phenylpropanoid metabolism in response to Phytophthora capsici.

    Science.gov (United States)

    Hao, Chaoyun; Xia, Zhiqiang; Fan, Rui; Tan, Lehe; Hu, Lisong; Wu, Baoduo; Wu, Huasong

    2016-10-21

    Piper nigrum L., or "black pepper", is an economically important spice crop in tropical regions. Black pepper production is markedly affected by foot rot disease caused by Phytophthora capsici, and genetic improvement of black pepper is essential for combating foot rot diseases. However, little is known about the mechanism of resistance to P. capsici in black pepper. The molecular mechanisms underlying foot rot susceptibility were studied by comparing the transcriptomes of resistant (Piper flaviflorum) and susceptible (Piper nigrum cv. Reyin-1) black pepper species. A total of 116,432 unigenes were acquired from six libraries (three replicates of resistant and susceptible black pepper samples), which were integrated by applying BLAST similarity searches and annotated with Kyoto Encyclopedia of Genes and Genomes orthology identifiers and Gene Ontology (GO) terms. The reference transcriptome was mapped using two sets of digital gene expression data. Using GO enrichment analysis for the differentially expressed genes, the majority of the genes associated with the phenylpropanoid biosynthesis pathway were identified in P. flaviflorum. In addition, gene expression analysis revealed that after susceptible and resistant species were inoculated with P. capsici, the majority of genes in the phenylpropanoid metabolism pathway were up-regulated in both species. Among various treatments and organs, all the genes were up-regulated to a relatively high degree in the resistant species. Phenylalanine ammonia lyase and peroxidase enzyme activity increased in susceptible and resistant species after inoculation with P. capsici, and increased faster in the resistant species. Histochemical analysis of lignin revealed that the resistant plants retained their vascular structure. Our data provide critical information regarding target genes and a technological basis for future studies of black pepper genetic improvement, including transgenic breeding.

  19. De novo assembly and characterization of global transcriptome of coconut palm (Cocos nucifera L.) embryogenic calli using Illumina paired-end sequencing.

    Science.gov (United States)

    Rajesh, M K; Fayas, T P; Naganeeswaran, S; Rachana, K E; Bhavyashree, U; Sajini, K K; Karun, Anitha

    2016-05-01

    Production and supply of quality planting material is significant to coconut cultivation but is one of the major constraints in coconut productivity. Rapid multiplication of coconut through in vitro techniques, therefore, is of paramount importance. Although somatic embryogenesis in coconut is a promising technique that will allow for the mass production of high quality palms, coconut is highly recalcitrant to in vitro culture. In order to overcome the bottlenecks in coconut somatic embryogenesis and to develop a repeatable protocol, it is imperative to understand, identify, and characterize molecular events involved in the coconut somatic embryogenesis pathway. Transcriptome analysis (RNA-Seq) of coconut embryogenic calli, derived from plumular explants of West Coast Tall cultivar, was undertaken on an Illumina HiSeq 2000 platform. After de novo transcriptome assembly and functional annotation, we obtained 40,367 transcripts which showed significant BLASTx matches with similarity greater than 40% and E value of ≤10^-5. Fourteen genes known to be involved in somatic embryogenesis were identified. Quantitative real-time PCR (qRT-PCR) analyses of these 14 genes were carried out in six developmental stages. The results showed that CLV was upregulated in the initial stage of callogenesis. Transcripts GLP, GST, PKL, WUS, and WRKY were expressed more in the somatic embryo stage. The expression of SERK, MAPK, AP2, SAUR, ECP, AGP, LEA, and ANT was higher in the embryogenic callus stage compared to initial culture and somatic embryo stages. This study provides the first insights into the gene expression patterns during somatic embryogenesis in coconut.

  20. Direct evidence for sequence-dependent attraction between double-stranded DNA controlled by methylation.

    Science.gov (United States)

    Yoo, Jejoong; Kim, Hajin; Aksimentiev, Aleksei; Ha, Taekjip

    2016-03-22

    Although proteins mediate highly ordered DNA organization in vivo, theoretical studies suggest that homologous DNA duplexes can preferentially associate with one another even in the absence of proteins. Here we combine molecular dynamics simulations with single-molecule fluorescence resonance energy transfer experiments to examine the interactions between duplex DNA in the presence of spermine, a biological polycation. We find that AT-rich DNA duplexes associate more strongly than GC-rich duplexes, regardless of the sequence homology. Methyl groups of thymine act as a steric block, relocating spermine from major grooves to interhelical regions, thereby increasing DNA-DNA attraction. Indeed, methylation of cytosines makes attraction between GC-rich DNA as strong as that between AT-rich DNA. Recent genome-wide chromosome organization studies showed that remote contact frequencies are higher for AT-rich and methylated DNA, suggesting that direct DNA-DNA interactions that we report here may play a role in the chromosome organization and gene regulation.

  1. Radiation-induced germ-line mutations detected by a direct comparison of parents and children DNA sequences containing SNPs

    International Nuclear Information System (INIS)

    Morimyo, M.; Hongo, E.; Higashi, T.; Wu, J.; Matsumoto, I.; Okamoto, M.; Kawano, A.; Tsuji, S.

    2003-01-01

    Germ-line mutation is detected in mice but not in humans. To estimate genetic risk of humans, a new approach to extrapolate from animal data to humans or to directly detect radiation-induced mutations in man is expected. We have developed a new method to detect germ-line mutations by directly comparing DNA sequences of parents and children. The nucleotide sequences among mouse strains are almost identical except SNP markers that are detected at 1/1000 frequency. When gamma-irradiated male mice are mated with female mice, heterogeneous nucleotide sequences induced in children DNA are a candidate of mutation, whose assignment can be done by SNP analysis. This system can easily detect all types of mutations such as transition, transversion, frameshift and deletion induced by radiation and can be applied to humans having genetically heterogeneous nucleotide sequences and many SNP markers. C3H male mice of 8 weeks of gestation were irradiated with gamma rays of 3 and 1 Gy and after 3 weeks, they were mated with the same aged C57BL female mice. After 3 weeks breeding, DNA was extracted from parents and children mice. The nucleotide sequences of 150 STS markers containing 300-900 bp and SNPs of parents and children DNA were determined by a direct sequencing; amplification of STS markers by Taq DNA polymerase, purification of PCR products, and DNA sequencing with a dye-terminator method. At each radiation dose, a total amount of 5 Mb DNA sequences were examined to detect radiation-induced mutations. We could find 6 deletions in 3 Gy irradiated mice but not in 1 Gy and control mice. The mutation frequency was about 4.0 x 10^-7 /bp/Gy or 1.6 x 10^-4 /locus/Gy, and suggested the non-linear increase of mutation rate with dose
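
    The per-base frequency quoted above follows directly from the counts in the abstract: 6 deletions in 5 Mb screened at 3 Gy. The short Python check below reproduces it. The second step is only a reading of the abstract, not something it states: it assumes the per-locus figure is the per-base figure scaled by a mean locus length, and reports what length that would imply.

```python
# Values taken from the abstract.
deletions = 6               # deletions found in offspring of 3 Gy irradiated males
bases_screened = 5_000_000  # 5 Mb of sequence examined at each dose
dose_gy = 3.0               # radiation dose in Gy

# Mutation frequency per base pair per Gy.
rate_per_bp_per_gy = deletions / (bases_screened * dose_gy)
print(f"{rate_per_bp_per_gy:.1e} /bp/Gy")  # 4.0e-07 /bp/Gy

# Assumption (not stated in the abstract): per-locus rate = per-bp rate x mean
# locus length. The reported per-locus value then implies this mean length.
reported_per_locus_per_gy = 1.6e-4
implied_locus_bp = reported_per_locus_per_gy / rate_per_bp_per_gy
print(f"implied mean locus length: {implied_locus_bp:.0f} bp")
```

    Under that assumption the implied mean locus length is about 400 bp, which falls inside the 300-900 bp range given for the STS markers.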

  2. A 135-kilodalton surface antigen of Mycoplasma hominis PG21 contains multiple directly repeated sequences

    DEFF Research Database (Denmark)

    Ladefoged, Søren; Birkelund, Svend; Hauge, S

    1995-01-01

    gene was sequenced, and its gene product was characterized with the goal of elucidating the structure and function of Lmp1. A total of 7,196 bp in the lmp1 region was sequenced. An open reading frame of 4,032 bp, encoding a protein of 1,344 amino acids with a calculated molecular weight of 147...

  3. The impact of cerebellar transcranial direct current stimulation (tDCS) on learning fine-motor sequences.

    Science.gov (United States)

    Shimizu, Renee E; Wu, Allan D; Samra, Jasmine K; Knowlton, Barbara J

    2017-01-05

    The cerebellum has been shown to be important for skill learning, including the learning of motor sequences. We investigated whether cerebellar transcranial direct current stimulation (tDCS) would enhance learning of fine motor sequences. Because the ability to generalize or transfer to novel task variations or circumstances is a crucial goal of real world training, we also examined the effect of tDCS on performance of novel sequences after training. In Study 1, participants received either anodal, cathodal or sham stimulation while simultaneously practising three eight-element key press sequences in a non-repeating, interleaved order. Immediately after sequence practice with concurrent tDCS, a transfer session was given in which participants practised three interleaved novel sequences. No stimulation was given during transfer. An inhibitory effect of cathodal tDCS was found during practice, such that the rate of learning was slowed in comparison to the anodal and sham groups. In Study 2, participants received anodal or sham stimulation and a 24 h delay was added between the practice and transfer sessions to reduce mental fatigue. Although this consolidation period benefitted subsequent transfer for both tDCS groups, anodal tDCS enhanced transfer performance. Together, these studies demonstrate polarity-specific effects on fine motor sequence learning and generalization. This article is part of the themed issue 'New frontiers for statistical learning in the cognitive sciences'. © 2016 The Author(s).

  4. The testes transcriptome derived from the New World Screwworm, Cochliomyia hominivorax TSA

    Science.gov (United States)

    In a collaboration with National Center for Genome Resources researchers, we sequenced and assembled the testes transcriptome derived from the Pacora, Panama, production plant strain of the New World Screwworm, Cochliomyia hominivorax. This transcriptome contains 4,149 unigenes and the Transcriptome...

  5. Transcriptome complexity in a genome-reduced bacterium

    DEFF Research Database (Denmark)

    Güell, Marc; van Noort, Vera; Yus, Eva

    2009-01-01

    To study basic principles of transcriptome organization in bacteria, we analyzed one of the smallest self-replicating organisms, Mycoplasma pneumoniae. We combined strand-specific tiling arrays, complemented by transcriptome sequencing, with more than 252 spotted arrays. We detected 117 previousl...

  6. Leaf Transcriptome Sequencing for Identifying Genic-SSR Markers and SNP Heterozygosity in Crossbred Mango Variety ‘Amrapali’ (Mangifera indica L.)

    Science.gov (United States)

    Mahato, Ajay Kumar; Sharma, Nimisha; Singh, Akshay; Srivastav, Manish; Jaiprakash; Singh, Sanjay Kumar; Singh, Anand Kumar; Sharma, Tilak Raj; Singh, Nagendra Kumar

    2016-01-01

    Mango (Mangifera indica L.) is called “king of fruits” due to its sweetness, richness of taste, diversity, large production volume and variety of end uses. Despite its huge economic importance, genomic resources in mango are scarce and the genetics of useful horticultural traits are poorly understood. Here we generated deep coverage leaf RNA sequence data for mango parental varieties ‘Neelam’, ‘Dashehari’ and their hybrid ‘Amrapali’ using next generation sequencing technologies. De-novo sequence assembly generated 27,528, 20,771 and 35,182 transcripts for the three genotypes, respectively. The transcripts were further assembled into a non-redundant set of 70,057 unigenes that were used for SSR and SNP identification and annotation. A total of 5,465 SSR loci were identified in 4,912 unigenes, with 288 type I SSRs (n ≥ 20 bp). One hundred type I SSR markers were randomly selected, of which 43 yielded PCR amplicons of expected size in the first round of validation and were designated as validated genic-SSR markers. Further, 22,306 SNPs were identified by aligning high quality sequence reads of the three mango varieties to the reference unigene set, revealing significantly enhanced SNP heterozygosity in the hybrid Amrapali. The present study on leaf RNA sequencing of mango varieties and their hybrid provides a useful genomic resource for genetic improvement of mango. PMID:27736892
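
    The abstract above counts simple sequence repeats (SSRs) and singles out type I SSRs, defined there as repeats of at least 20 bp. As a rough illustration of how such loci can be located in a unigene, here is a regex-based Python sketch. It is not the tool the authors used (the record does not name one); the function names, the 1-6 bp motif range and the example sequence are assumptions made for illustration only.

```python
import re


def is_primitive(unit: str) -> bool:
    """True if the motif is not itself a repeat of a shorter motif (e.g. 'AGAG')."""
    n = len(unit)
    return not any(n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq: str, min_len: int = 20, max_unit: int = 6) -> list:
    """Return (start, motif, length) for perfect tandem repeats of >= min_len bp.

    min_len=20 mirrors the type I threshold quoted in the abstract;
    max_unit=6 (mono- to hexanucleotide motifs) is an assumed convention.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(1, max_unit + 1):
        # A motif of fixed length followed by one or more copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1+" % unit_len)
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if is_primitive(unit) and len(match.group(0)) >= min_len:
                hits.append((match.start(), unit, len(match.group(0))))
    return sorted(hits)


# Hypothetical sequence: a 24 bp (AG)12 repeat flanked by short unique sequence.
example = "CCGT" + "AG" * 12 + "TTCA"
print(find_ssrs(example))
```

    The primitive-motif check keeps an (AG)n repeat from also being reported as (AGAG)n or (AGAGAG)n. Because the regex scan is non-overlapping and only finds perfect repeats, a production SSR finder would handle motif phase and imperfect repeats more carefully than this sketch does.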

  7. Transcriptome-based gene profiling provides novel insights into the characteristics of radish root response to Cr stress with next-generation sequencing

    Directory of Open Access Journals (Sweden)

    Yang Xie

    2015-03-01

    Radish (Raphanus sativus L.) is an important root vegetable crop worldwide with high nutrient value, and it is adversely affected by non-essential heavy metals including chromium (Cr). Little is known about the molecular mechanism underlying the Cr stress response in radish. In this study, the RNA-Seq technique was employed to identify differentially expressed genes (DEGs) under Cr stress. Based on de novo transcriptome assembly, 30,676 unigenes representing 60,881 transcripts were isolated from radish root under Cr stress. Differential gene analysis revealed that 2,985 unigenes were significantly differentially expressed between the Cr-free (CK) and Cr-treated (Cr600) libraries, of which 1,424 were up-regulated and 1,561 down-regulated. Gene ontology (GO) analysis revealed that these DEGs were mainly involved in primary metabolic process, response to abiotic stimulus, cellular metabolic process and small molecule metabolic process. Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis showed that the DEGs were mainly involved in protein processing in endoplasmic reticulum, starch and sucrose metabolism, amino acid metabolism, glutathione metabolism, and drug and xenobiotic metabolism by cytochrome P450. RT-qPCR analysis showed that the expression patterns of 12 randomly selected DEGs were highly consistent with the RNA-seq results. Furthermore, many candidate genes, including those encoding signaling protein kinases, transcription factors and metal transporters and those involved in chelate compound biosynthesis and the antioxidant system, were involved in the defense and detoxification mechanisms of the Cr stress response regulatory networks. These results provide novel insight into the molecular mechanisms underlying plant responsiveness to Cr stress and should facilitate further genetic manipulation of Cr uptake and accumulation in radish.

  8. Direct comparisons of Illumina vs. Roche 454 sequencing technologies on the same microbial community DNA sample.

    Directory of Open Access Journals (Sweden)

    Chengwei Luo

    Next-generation sequencing (NGS) is commonly used in metagenomic studies of complex microbial communities, but whether different NGS platforms recover the same diversity from a sample and whether their assembled sequences are of comparable quality remains unclear. We compared the two most frequently used platforms, the Roche 454 FLX Titanium and the Illumina Genome Analyzer (GA) II, on the same DNA sample obtained from a complex freshwater planktonic community. Despite the substantial differences in read length and sequencing protocols, the platforms provided a comparable view of the community sampled. For instance, derived assemblies overlapped in ~90% of their total sequences, and in situ abundances of genes and genotypes (estimated based on sequence coverage) correlated highly between the two platforms (R² > 0.9). Evaluation of base-call error, frameshift frequency, and contig length suggested that Illumina offered equivalent, if not better, assemblies than Roche 454. The results from metagenomic samples were further validated against DNA samples of eighteen isolate genomes, which showed a range of genome sizes and G+C% content. We also provide quantitative estimates of the errors in gene and contig sequences assembled from datasets characterized by different levels of complexity and G+C% content. For instance, we noted that homopolymer-associated, single-base errors affected ~1% of the protein sequences recovered in Illumina contigs of 10× coverage and 50% G+C; this frequency increased to ~3% when non-homopolymer errors were also considered. Collectively, our results should serve as a useful practical guide for choosing proper sampling strategies and data processing protocols for future metagenomic studies.

  9. Direct comparisons of Illumina vs. Roche 454 sequencing technologies on the same microbial community DNA sample.

    Science.gov (United States)

    Luo, Chengwei; Tsementzi, Despina; Kyrpides, Nikos; Read, Timothy; Konstantinidis, Konstantinos T

    2012-01-01

    Next-generation sequencing (NGS) is commonly used in metagenomic studies of complex microbial communities, but whether different NGS platforms recover the same diversity from a sample and whether their assembled sequences are of comparable quality remains unclear. We compared the two most frequently used platforms, the Roche 454 FLX Titanium and the Illumina Genome Analyzer (GA) II, on the same DNA sample obtained from a complex freshwater planktonic community. Despite the substantial differences in read length and sequencing protocols, the platforms provided a comparable view of the community sampled. For instance, derived assemblies overlapped in ~90% of their total sequences, and in situ abundances of genes and genotypes (estimated based on sequence coverage) correlated highly between the two platforms (R² > 0.9). Evaluation of base-call error, frameshift frequency, and contig length suggested that Illumina offered equivalent, if not better, assemblies than Roche 454. The results from metagenomic samples were further validated against DNA samples of eighteen isolate genomes, which showed a range of genome sizes and G+C% content. We also provide quantitative estimates of the errors in gene and contig sequences assembled from datasets characterized by different levels of complexity and G+C% content. For instance, we noted that homopolymer-associated, single-base errors affected ~1% of the protein sequences recovered in Illumina contigs of 10× coverage and 50% G+C; this frequency increased to ~3% when non-homopolymer errors were also considered. Collectively, our results should serve as a useful practical guide for choosing proper sampling strategies and data processing protocols for future metagenomic studies.
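
The abstract reports the cross-platform agreement of coverage-based abundances as a coefficient of determination. As a rough illustration only, and assuming the reported value is the squared Pearson correlation (the abstract does not say how it was computed), such a statistic could be obtained as sketched below. The function name `r_squared` and the coverage values are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical per-gene coverage on two platforms (not data from the study).
    coverage_platform_a = [12.0, 30.5, 8.2, 55.1, 21.7]
    coverage_platform_b = [10.9, 33.0, 9.5, 50.2, 24.3]
    print(r_squared(coverage_platform_a, coverage_platform_b))
```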

  10. Characterization of the Kenaf (Hibiscus cannabinus) Global Transcriptome Using Illumina Paired-End Sequencing and Development of EST-SSR Markers

    Science.gov (United States)

    Li, Hui; Li, Defang; Chen, Anguo; Tang, Huijuan; Li, Jianjun; Huang, Siqi

    2016-01-01

    Kenaf (Hibiscus cannabinus L.) is an economically important natural fiber crop grown worldwide. However, only 20 expressed sequence tags (ESTs) for kenaf are available in public databases. The aim of this study was to develop large-scale simple sequence repeat (SSR) markers to lay a solid foundation for the construction of genetic linkage maps and marker-assisted breeding in kenaf. We used Illumina paired-end sequencing technology to generate new EST-simple sequences and MISA software to mine SSR markers. We identified 71,318 unigenes with an average length of 1143 nt and annotated these unigenes using four different protein databases. Overall, 9324 complementary pairs were designated as EST-SSR markers, and their quality was validated using 100 randomly selected SSR markers. In total, 72 primer pairs reproducibly amplified target amplicons, and 61 of these primer pairs detected significant polymorphism among 28 kenaf accessions. Thus, in this study, we have developed large-scale SSR markers for kenaf, and this new resource will facilitate construction of genetic linkage maps and investigation of fiber growth and development in kenaf, and will also be of value for novel gene discovery and functional genomic studies. PMID:26960153

  11. Characterization of the Kenaf (Hibiscus cannabinus) Global Transcriptome Using Illumina Paired-End Sequencing and Development of EST-SSR Markers.

    Science.gov (United States)

    Li, Hui; Li, Defang; Chen, Anguo; Tang, Huijuan; Li, Jianjun; Huang, Siqi

    2016-01-01

    Kenaf (Hibiscus cannabinus L.) is an economically important natural fiber crop grown worldwide. However, only 20 expressed sequence tags (ESTs) for kenaf are available in public databases. The aim of this study was to develop large-scale simple sequence repeat (SSR) markers to lay a solid foundation for the construction of genetic linkage maps and marker-assisted breeding in kenaf. We used Illumina paired-end sequencing technology to generate new EST-simple sequences and MISA software to mine SSR markers. We identified 71,318 unigenes with an average length of 1143 nt and annotated these unigenes using four different protein databases. Overall, 9324 complementary pairs were designated as EST-SSR markers, and their quality was validated using 100 randomly selected SSR markers. In total, 72 primer pairs reproducibly amplified target amplicons, and 61 of these primer pairs detected significant polymorphism among 28 kenaf accessions. Thus, in this study, we have developed large-scale SSR markers for kenaf, and this new resource will facilitate construction of genetic linkage maps and investigation of fiber growth and development in kenaf, and will also be of value for novel gene discovery and functional genomic studies.

  12. The Transcriptomics of Secondary Growth and Wood Formation in Conifers

    Science.gov (United States)

    Carvalho, Ana; Paiva, Jorge; Louzada, José; Lima-Brito, José

    2013-01-01

    In recent years, forestry scientists have adapted genomics and next-generation sequencing (NGS) technologies to the search for candidate genes related to secondary growth and wood formation in several tree species. Gymnosperms, in particular the conifers, are ecologically and economically important, notably for the production of wood and other forestry end products. Until very recently, no whole-genome sequence of a conifer was available. Owing to the gradual improvement of NGS technologies and the associated bioinformatics tools, two draft whole-genome assemblies, of Picea abies and Picea glauca, appeared in the current year. These draft genome assemblies will bring new insights into the structure, content, and evolution of conifer genomes. Furthermore, new directions in the forestry, breeding and research of conifers are discussed. The identification of genes associated with the xylem transcriptome and knowledge of their regulatory mechanisms will enable less time-consuming breeding cycles and more accurate selection of traits related to wood production and quality. PMID:24288610

  13. Immunomodulatory Effects of Dietary Seaweeds in LPS Challenged Atlantic Salmon Salmo salar as Determined by Deep RNA Sequencing of the Head Kidney Transcriptome

    Science.gov (United States)

    Palstra, Arjan P.; Kals, Jeroen; Blanco Garcia, Ainhoa; Dirks, Ron P.; Poelman, Marnix

    2018-01-01

    Seaweeds may represent immuno-stimulants that could be used as health-promoting fish feed components. This study was performed to gain insights into the immunomodulatory effects of dietary seaweeds in Atlantic salmon. Specifically tested were 10% inclusion levels of Laminaria digitata (SW1) and a commercial blend of seaweeds (Oceanfeed®) (SW2) against a fishmeal-based control diet (FMC). Differences between groups were assessed in growth, feed conversion ratio and the blood parameters hematocrit and hemoglobin. After an LPS challenge of fish representing each of the three groups, RNAseq was performed on the head kidney, as a major immune organ, to determine transcriptomic differences in response to the immune activation. Atlantic salmon fed with dietary seaweeds did not show major differences in performance in comparison with fishmeal-fed fish. RNAseq resulted in ∼154 million reads, which were mapped against an NCBI Salmo salar reference and against a de novo assembled S. salar reference for analyses of expression of immune genes and ontology of immune processes among the 87,600 cDNA contigs. The dietary seaweeds provoked a more efficient immune response, which involved more efficient identification of the infection site and processing and presentation of antigens. More specifically, chemotaxis and chemokine-mediated signaling were improved and, as a result, the defense response to Gram-positive bacterium was reduced. Specific Laminaria digitata effects included reduction of interferon-gamma-mediated signaling. Highly upregulated and specific for this diet was the expression of major histocompatibility complex class I-related gene protein. The commercial blend of seaweeds caused more differential expression than Laminaria digitata and improved immune processes such as receptor-mediated endocytosis and cell adhesion, and increased the expression of genes involved in response to lipopolysaccharide and inflammatory response. Particularly, expression of many important immune

  14. A direct method for computing extreme value (Gumbel) parameters for gapped biological sequence alignments.

    Science.gov (United States)

    Quinn, Terrance; Sinkala, Zachariah

    2014-01-01

    We develop a general method for computing extreme value distribution (Gumbel, 1958) parameters for gapped alignments. Our approach uses mixture distribution theory to obtain associated BLOSUM matrices for gapped alignments, which in turn are used for determining significance of gapped alignment scores for pairs of biological sequences. We compare our results with parameters already obtained in the literature.
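
The abstract does not reproduce the formulas involved. As a hedged illustration of how Gumbel parameters are typically used once they have been estimated, the sketch below assumes the standard Karlin-Altschul form, in which the probability of a score of at least x between sequences of lengths m and n is 1 − exp(−K·m·n·e^(−λx)). The function name `gumbel_pvalue` and the parameter values in the usage example are placeholders, not results from the paper.

```python
import math


def gumbel_pvalue(score, lam, K, m, n):
    """P(S >= score) under an extreme value (Gumbel) model.

    lam and K are the Gumbel parameters; m and n are the sequence lengths.
    Assumes the Karlin-Altschul form E = K*m*n*exp(-lam*score), P = 1 - exp(-E).
    """
    expected_hits = K * m * n * math.exp(-lam * score)
    # -expm1(-E) equals 1 - exp(-E) but stays accurate when E is tiny.
    return -math.expm1(-expected_hits)


if __name__ == "__main__":
    # Placeholder parameters and lengths chosen only for illustration.
    for s in (40, 60, 80):
        print(s, gumbel_pvalue(s, lam=0.27, K=0.04, m=300, n=300))
```

Using `math.expm1` rather than `1 - math.exp(...)` avoids loss of precision for highly significant scores, where the expected number of chance hits is very small.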

  15. DNA-directed alkylating ligands as potential antitumor agents: sequence specificity of alkylation by intercalating aniline mustards.

    Science.gov (United States)

    Prakash, A S; Denny, W A; Gourdie, T A; Valu, K K; Woodgate, P D; Wakelin, L P

    1990-10-23

    The sequence preferences for alkylation of a series of novel para-substituted aniline mustards linked to the DNA-intercalating chromophore 9-aminoacridine by an alkyl chain of variable length were studied by using procedures analogous to Maxam-Gilbert reactions. The compounds alkylate DNA at both guanine and adenine sites. For mustards linked to the acridine by a short alkyl chain through a para O- or S-link group, 5'-GT sequences are the most preferred sites at which N7-guanine alkylation occurs. For analogues with longer chain lengths, the preference for 5'-GT sequences diminishes in favor of N7-adenine alkylation at the complementary 5'-AC sequence. Magnesium ions are shown to selectively inhibit alkylation at the N7 of adenine (in the major groove) by these compounds but not the alkylation at the N3 of adenine (in the minor groove) by the antitumor antibiotic CC-1065. Effects of chromophore variation were also studied by using aniline mustards linked to quinazoline and sterically hindered tert-butyl-9-aminoacridine chromophores. The results demonstrate that in this series of DNA-directed mustards the noncovalent interactions of the carrier chromophores with DNA significantly modify the sequence selectivity of alkylation by the mustard. Relationships between the DNA alkylation patterns of these compounds and their biological activities are discussed.

  16. Characterization of Human Cytomegalovirus Genome Diversity in Immunocompromised Hosts by Whole-Genome Sequencing Directly From Clinical Specimens.

    Science.gov (United States)

    Hage, Elias; Wilkie, Gavin S; Linnenweber-Held, Silvia; Dhingra, Akshay; Suárez, Nicolás M; Schmidt, Julius J; Kay-Fedorov, Penelope C; Mischak-Weissinger, Eva; Heim, Albert; Schwarz, Anke; Schulz, Thomas F; Davison, Andrew J; Ganzenmueller, Tina

    2017-06-01

    Advances in next-generation sequencing (NGS) technologies allow comprehensive studies of genetic diversity over the entire genome of human cytomegalovirus (HCMV), a significant pathogen for immunocompromised individuals. Next-generation sequencing was performed on target enriched sequence libraries prepared directly from a variety of clinical specimens (blood, urine, breast milk, respiratory samples, biopsies, and vitreous humor) obtained longitudinally or from different anatomical compartments from 20 HCMV-infected patients (renal transplant recipients, stem cell transplant recipients, and congenitally infected children). De novo-assembled HCMV genome sequences were obtained for 57 of 68 sequenced samples. Analysis of longitudinal or compartmental HCMV diversity revealed various patterns: no major differences were detected among longitudinal, intraindividual blood samples from 9 of 15 patients and in most of the patients with compartmental samples, whereas a switch of the major HCMV population was observed in 6 individuals with sequential blood samples and upon compartmental analysis of 1 patient with HCMV retinitis. Variant analysis revealed additional aspects of minor virus population dynamics and antiviral-resistance mutations. In immunosuppressed patients, HCMV can remain relatively stable or undergo drastic genomic changes that are suggestive of the emergence of minor resident strains or de novo infection. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail journals.permissions@oup.com.

  17. The miRNAs and their regulatory networks responsible for pollen abortion in Ogura-CMS Chinese cabbage revealed by high-throughput sequencing of miRNAs, degradomes, and transcriptomes.

    Science.gov (United States)

    Wei, Xiaochun; Zhang, Xiaohui; Yao, Qiuju; Yuan, Yuxiang; Li, Xixiang; Wei, Fang; Zhao, Yanyan; Zhang, Qiang; Wang, Zhiyong; Jiang, Wusheng; Zhang, Xiaowei

    2015-01-01

    Chinese cabbage (Brassica rapa ssp. pekinensis) is one of the most important vegetables in Asia and is cultivated across the world. Ogura-type cytoplasmic male sterility (Ogura-CMS) has been widely used in the hybrid breeding industry for Chinese cabbage and many other cruciferous vegetables. Although the cause of Ogura-CMS has been localized to the orf138 locus in the mitochondrial genome, the mechanism by which nuclear genes respond to the mutation of the mitochondrial orf138 locus is unclear. In this study, a series of whole-genome small RNA, degradome and transcriptome analyses were performed on buds of both Ogura-CMS Chinese cabbage and its maintainer using deep sequencing technology. A total of 289 known miRNAs derived from 69 families (including 23 new families first reported in B. rapa) and 426 novel miRNAs were identified. Among these novel miRNAs, both 3-p and 5-p miRNAs were detected on the hairpin arms of 138 precursors. Ten known and 49 novel miRNAs were down-regulated, while one known and 27 novel miRNAs were up-regulated in Ogura-CMS buds compared to the fertile plants. Using degradome analysis, a total of 376 mRNAs were identified as targets of 30 known miRNA families and 100 novel miRNAs. A large fraction of the targets were annotated as related to reproductive development. Our transcriptome profiling revealed that the expression of the targets was finely tuned by the miRNAs. Two novel miRNAs were identified that were specifically highly expressed in Ogura-CMS buds and sufficiently suppressed two genes essential for pollen development: sucrose transporter SUC1 and H+-ATPase 6. These findings provide clues for the contribution of a potential miRNA regulatory network to bud development and pollen formation. This study contributes new insights to the communication between the mitochondria and chromosome and takes one step toward filling the gap in the regulatory network from the orf138 locus to pollen abortion in Ogura-CMS plants from a mi

  18. The miRNAs and their regulatory networks responsible for pollen abortion in Ogura-CMS Chinese cabbage revealed by high-throughput sequencing of miRNAs, degradomes and transcriptomes

    Directory of Open Access Journals (Sweden)

    Xiaochun Wei

    2015-10-01

    Chinese cabbage (Brassica rapa ssp. pekinensis) is one of the most important vegetables in Asia and is cultivated across the world. Ogura-type cytoplasmic male sterility (Ogura-CMS) has been widely used in the hybrid breeding industry for Chinese cabbage and many other cruciferous vegetables. Although the cause of Ogura-CMS has been localized to the orf138 locus in the mitochondrial genome, the mechanism by which nuclear genes respond to the mutation of the mitochondrial orf138 locus is unclear. In this study, a series of whole-genome small RNA, degradome and transcriptome analyses were performed on buds of both Ogura-CMS Chinese cabbage and its maintainer using deep sequencing technology. A total of 289 known miRNAs derived from 69 families (including 23 new families first reported in B. rapa) and 426 novel miRNAs were identified. Among these novel miRNAs, both 3-p and 5-p miRNAs were detected on the hairpin arms of 138 precursors. Ten known and 49 novel miRNAs were down-regulated, while one known and 27 novel miRNAs were up-regulated in Ogura-CMS buds compared to the fertile plants. Using degradome analysis, a total of 376 mRNAs were identified as targets of 30 known miRNA families and 100 novel miRNAs. A large fraction of the targets were annotated as related to reproductive development. Our transcriptome profiling revealed that the expression of the targets was finely tuned by the miRNAs. Two novel miRNAs were identified that were specifically highly expressed in Ogura-CMS buds and sufficiently suppressed two genes essential for pollen development: sucrose transporter SUC1 and H+-ATPase 6. These findings provide clues for the contribution of a potential miRNA regulatory network to bud development and pollen formation. This study contributes new insights to the communication between the mitochondria and chromosome and takes one step toward filling the gap in the regulatory network from the orf138 locus to pollen abortion in Ogura-CMS plants

  19. RNA sequencing based analysis of the spleen transcriptome following the infectious bronchitis virus infection of chickens selected for different mannose-binding lectin serum concentrations

    DEFF Research Database (Denmark)

    Hamzic, Edin; Kjærup, Rikke Brødsgaard; Mach, Núria

    2016-01-01

    in strategies to control IB. To this end, two chicken lines, selected for high and low serum concentration of mannose-binding lectin (MBL), a soluble pattern recognition receptor, were studied. In total, 32 animals from each line (designated L10H for high and L10L for low MBL serum concentration) were used....... Sixteen birds from each line were infected with IBV on day 1 and birds were euthanized at 1 week and 3 weeks post infection, 8 uninfected controls and 8 infected birds from each line at each occasion. RNA sequencing was performed on spleen samples from all 64 birds used in the experiment. Differential...

  20. Multiple aspects of ATP-dependent nucleosome translocation by RSC and Mi-2 are directed by the underlying DNA sequence.

    Directory of Open Access Journals (Sweden)

    Joke J F A van Vugt

    BACKGROUND: Chromosome structure, DNA metabolic processes and cell type identity can all be affected by changing the positions of nucleosomes along chromosomal DNA, a reaction that is catalysed by SNF2-type ATP-driven chromatin remodelers. Recently it was suggested that in vivo, more than 50% of the nucleosome positions can be predicted simply by DNA sequence, especially within promoter regions. This seemingly contrasts with remodeler-induced nucleosome mobility. The ability of remodeling enzymes to mobilise nucleosomes over short DNA distances is well documented. However, the nucleosome translocation processivity along DNA remains elusive. Furthermore, it is unknown what determines the initial direction of movement and how new nucleosome positions are adopted. METHODOLOGY/PRINCIPAL FINDINGS: We have used AFM imaging and high resolution PAGE of mononucleosomes on 600 and 2500 bp DNA molecules to analyze ATP-dependent nucleosome repositioning by native and recombinant SNF2-type enzymes. We report that the underlying DNA sequence can control the initial direction of translocation, translocation distance, as well as the new positions adopted by nucleosomes upon enzymatic mobilization. Within a strong nucleosomal positioning sequence both recombinant Drosophila Mi-2 (CHD-type) and native RSC from yeast (SWI/SNF-type) repositioned the nucleosome at 10 bp intervals, which are intrinsic to the positioning sequence. Furthermore, RSC-catalyzed nucleosome translocation was noticeably more efficient when beyond the influence of this sequence. Interestingly, under limiting ATP conditions RSC preferred to position the nucleosome with 20 bp intervals within the positioning sequence, suggesting that native RSC preferentially translocates nucleosomes with 15 to 25 bp DNA steps. CONCLUSIONS/SIGNIFICANCE: Nucleosome repositioning thus appears to be influenced by both remodeler intrinsic and DNA sequence specific properties that interplay to define ATPase

  1. Sequence Directionality Dramatically Affects LCST Behavior of Elastin-Like Polypeptides.

    Science.gov (United States)

    Li, Nan K; Roberts, Stefan; Quiroz, Felipe Garcia; Chilkoti, Ashutosh; Yingling, Yaroslava G

    2018-04-30

    Elastin-like polypeptides (ELP) exhibit an inverse temperature transition or lower critical solution temperature (LCST) transition phase behavior in aqueous solutions. In this paper, the thermal responsive properties of the canonical ELP, poly(VPGVG), and its reverse sequence poly(VGPVG) were investigated by turbidity measurements of the cloud point behavior, circular dichroism (CD) measurements, and all-atom molecular dynamics (MD) simulations to gain a molecular understanding of the mechanism that controls hysteretic phase behavior. It was shown experimentally that both poly(VPGVG) and poly(VGPVG) undergo a transition from soluble to insoluble in aqueous solution upon heating above the transition temperature (Tt). However, poly(VPGVG) resolubilizes upon cooling below its Tt, whereas the reverse sequence, poly(VGPVG), remains aggregated despite significant undercooling below the Tt. The results from MD simulations indicated that a change in sequence order results in significant differences in the dynamics of the specific residues, especially valines, which lead to extensive changes in the conformations of VPGVG and VGPVG pentamers and, consequently, dissimilar propensities for secondary structure formation and overall structure of polypeptides. These changes affected the relative hydrophilicities of polypeptides above Tt, where poly(VGPVG) is more hydrophilic than poly(VPGVG) with more extended conformation and larger surface area, which led to formation of strong interchain hydrogen bonds responsible for stabilization of the aggregated phase and the observed thermal hysteresis for poly(VGPVG).

  2. Genome-wide transcriptome analysis between small-tail Han sheep and the Surabaya fur sheep using high-throughput RNA sequencing.

    Science.gov (United States)

    Miao, Xiangyang; Luo, Qingmiao

    2013-06-01

    The small-tail Han sheep and the Surabaya fur sheep are two local breeds in north China, characterized as high-fecundity and low-prolificacy breeds, respectively. Significant genetic differences between these two breeds have prompted increasing interest in the identification and utilization of major prolificacy genes in these sheep. High prolificacy is a complex trait, and it is difficult to comprehensively identify the candidate genes related to this trait using a single molecular biology technique. To understand the molecular mechanisms of fecundity and provide more information about high-prolificacy candidate genes in high- and low-fecundity sheep, we explored the utility of next-generation sequencing technology in this work. A total of 1.8 Gb of sequencing reads was obtained, yielding more than 20,000 contigs that averaged ∼300 bp in length. Ten differentially expressed genes were further verified by quantitative real-time RT-PCR to confirm the reliability of the RNA-seq results. Our work will provide a basis for future research on sheep reproduction.

  3. Scrimer: designing primers from transcriptome data

    Czech Academy of Sciences Publication Activity Database

    Mořkovský, Libor; Pačes, Jan; Rídl, Jakub; Reifová, R.

    2015-01-01

    Vol. 15, No. 6 (2015), pp. 1415-1420. ISSN 1755-098X. R&D Projects: GA MŠk EE2.3.20.0303. Institutional support: RVO:68081766; RVO:68378050. Keywords: next-generation sequencing; primer design; SNaPshot; SNP genotyping; transcriptome. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 5.298, year: 2015