WorldWideScience

Sample records for genome scale transcriptome

  1. Genome Scale Transcriptomics of Baculovirus-Insect Interactions

    Directory of Open Access Journals (Sweden)

    Steven Reid

    2013-11-01

    Full Text Available Baculovirus-insect cell technologies are applied in the production of complex proteins, veterinary and human vaccines, gene delivery vectors, and biopesticides. Better understanding of how baculoviruses and insect cells interact would facilitate baculovirus-based production. While complete genomic sequences are available for over 58 baculovirus species, little insect genomic information is known. The release of the Bombyx mori and Plutella xylostella genomes, the accumulation of EST sequences for several Lepidopteran species, and especially the availability of two genome-scale analysis tools, namely oligonucleotide microarrays and next generation sequencing (NGS), have facilitated expression studies to generate a rich picture of insect gene responses to baculovirus infections. This review presents current knowledge on the interaction dynamics of the baculovirus-insect system, which is relatively well studied in relation to nucleocapsid transportation, apoptosis, and heat shock responses, but is still poorly understood regarding responses involved in pro-survival pathways, DNA damage pathways, protein degradation, translation, signaling pathways, RNAi pathways, and importantly metabolic pathways for energy, nucleotide and amino acid production. We discuss how the two genome-scale transcriptomic tools can be applied for studying such pathways and suggest that proteomics and metabolomics can produce complementary findings to transcriptomic studies.

  2. Genome scale transcriptomics of baculovirus-insect interactions.

    Science.gov (United States)

    Nguyen, Quan; Nielsen, Lars K; Reid, Steven

    2013-11-12

    Baculovirus-insect cell technologies are applied in the production of complex proteins, veterinary and human vaccines, gene delivery vectors, and biopesticides. Better understanding of how baculoviruses and insect cells interact would facilitate baculovirus-based production. While complete genomic sequences are available for over 58 baculovirus species, little insect genomic information is known. The release of the Bombyx mori and Plutella xylostella genomes, the accumulation of EST sequences for several Lepidopteran species, and especially the availability of two genome-scale analysis tools, namely oligonucleotide microarrays and next generation sequencing (NGS), have facilitated expression studies to generate a rich picture of insect gene responses to baculovirus infections. This review presents current knowledge on the interaction dynamics of the baculovirus-insect system, which is relatively well studied in relation to nucleocapsid transportation, apoptosis, and heat shock responses, but is still poorly understood regarding responses involved in pro-survival pathways, DNA damage pathways, protein degradation, translation, signaling pathways, RNAi pathways, and importantly metabolic pathways for energy, nucleotide and amino acid production. We discuss how the two genome-scale transcriptomic tools can be applied for studying such pathways and suggest that proteomics and metabolomics can produce complementary findings to transcriptomic studies.

  3. Inferring the choreography of parental genomes during fertilization from ultralarge-scale whole-transcriptome analysis.

    Science.gov (United States)

    Park, Sung-Joon; Komata, Makiko; Inoue, Fukashi; Yamada, Kaori; Nakai, Kenta; Ohsugi, Miho; Shirahige, Katsuhiko

    2013-12-15

    Fertilization precisely choreographs parental genomes by using gamete-derived cellular factors and activating genome regulatory programs. However, the mechanism remains elusive owing to the technical difficulties of preparing large numbers of high-quality preimplantation cells. Here, we collected >14 × 10^4 high-quality mouse metaphase II oocytes and used these to establish detailed transcriptional profiles for four early embryo stages and parthenogenetic development. By combining these profiles with other public resources, we found evidence that gene silencing appeared to be mediated in part by noncoding RNAs and that this was a prerequisite for post-fertilization development. Notably, we identified 817 genes that were differentially expressed in embryos after fertilization compared with parthenotes. The regulation of these genes was distinctly different from those expressed in parthenotes, suggesting functional specialization of particular transcription factors prior to first cell cleavage. We identified five transcription factors that were potentially necessary for developmental progression: Foxd1, Nkx2-5, Sox18, Myod1, and Runx1. Our very large-scale whole-transcriptome profile of early mouse embryos yielded a novel and valuable resource for studies in developmental biology and stem cell research. The database is available at http://dbtmee.hgc.jp.

  4. Genome-scale transcriptome analysis in response to nitric oxide in birch cells: implications of the triterpene biosynthetic pathway.

    Directory of Open Access Journals (Sweden)

    Fansuo Zeng

    Full Text Available Evidence supporting nitric oxide (NO) as a mediator of plant biochemistry continues to grow, but its functions at the molecular level remain poorly understood and, in some cases, controversial. To study the role of NO at the transcriptional level in Betula platyphylla cells, we conducted a genome-scale transcriptome analysis of these cells. The transcriptomes of untreated birch cells and those treated with sodium nitroprusside (SNP) were analyzed using Solexa sequencing. Data were collected by sequencing cDNA libraries of birch cells, which had a long period to adapt to the suspension culture conditions before SNP-treated cells and untreated cells were sampled. Among the 34,100 UniGenes detected, BLASTX search revealed that 20,631 genes showed significant (E-values ≤ 10^-5) sequence similarity with proteins from the NR-database. Numerous expressed sequence tags (i.e., 1374) were identified as differentially expressed between the 12 h SNP-treated cells and control cell samples: 403 up-regulated and 971 down-regulated. From this, we specifically examined a core set of NO-related transcripts. The altered expression levels of several transcripts, as determined by transcriptome analysis, were confirmed by qRT-PCR. The results of transcriptome analysis, gene expression quantification, triterpenoid content, and defensive enzyme activities indicated that NO has a significant effect on many processes including triterpenoid production, carbohydrate metabolism and cell wall biosynthesis.

  5. Genome-scale transcriptomic insights into early-stage fruit development in woodland strawberry Fragaria vesca.

    Science.gov (United States)

    Kang, Chunying; Darwish, Omar; Geretz, Aviva; Shahan, Rachel; Alkharouf, Nadim; Liu, Zhongchi

    2013-06-01

    Fragaria vesca, a diploid woodland strawberry with a small and sequenced genome, is an excellent model for studying fruit development. The strawberry fruit is unique in that the edible flesh is actually enlarged receptacle tissue. The true fruit are the numerous dry achenes dotting the receptacle's surface. Auxin produced from the achene is essential for the receptacle fruit set, a paradigm for studying crosstalk between hormone signaling and development. To investigate the molecular mechanism underlying strawberry fruit set, next-generation sequencing was employed to profile early-stage fruit development with five fruit tissue types and five developmental stages from floral anthesis to enlarged fruits. This two-dimensional data set provides a systems-level view of molecular events with precise spatial and temporal resolution. The data suggest that the endosperm and seed coat may play a more prominent role than the embryo in auxin and gibberellin biosynthesis for fruit set. A model is proposed to illustrate how hormonal signals produced in the endosperm and seed coat coordinate seed, ovary wall, and receptacle fruit development. The comprehensive fruit transcriptome data set provides a wealth of genomic resources for the strawberry and Rosaceae communities as well as unprecedented molecular insight into fruit set and early stage fruit development.

  6. Genome-Scale Transcriptome Analysis of the Desert Shrub Artemisia sphaerocephala.

    Directory of Open Access Journals (Sweden)

    Lijing Zhang

    Full Text Available Artemisia sphaerocephala, a semi-shrub belonging to the Artemisia genus of the Compositae family, is an important pioneer plant that inhabits moving and semi-stable sand dunes in the deserts and steppes of northwest and north-central China. It is very resilient in extreme environments. Additionally, its seeds have excellent nutritional value, and the abundant lipids and polysaccharides in the seeds make this plant a potential valuable source of bio-energy. However, partly due to the scarcity of genetic information, the genetic mechanisms controlling the traits and environmental adaptation capacity of A. sphaerocephala are unknown. Here, we present the first in-depth transcriptomic analysis of A. sphaerocephala. To maximize the representation of conditional transcripts, mRNA was obtained from 17 samples, including living tissues of desert-growing A. sphaerocephala, seeds germinated in the laboratory, and calli subjected to no stress (control) and high and low temperature, high and low osmotic, and salt stresses. De novo transcriptome assembly performed using an Illumina HiSeq 2500 platform resulted in the generation of 68,373 unigenes. We analyzed the key genes involved in the unsaturated fatty acid synthesis pathway and identified 26 A. sphaerocephala fad2 genes, which is the largest fad2 gene family reported to date. Furthermore, a set of genes responsible for resistance to extreme temperatures, salt, drought and a combination of stresses was identified. The present work provides abundant genomic information for functional dissection of the important traits of A. sphaerocephala and contributes to the current understanding of molecular adaptive mechanisms of A. sphaerocephala in the desert environment. Identification of the key genes in the unsaturated fatty acid synthesis pathway could increase understanding of the biological regulatory mechanisms of fatty acid composition traits in plants and facilitate genetic manipulation of the fatty acid

  7. Genome-Scale Transcriptome Analysis of the Desert Shrub Artemisia sphaerocephala.

    Science.gov (United States)

    Zhang, Lijing; Hu, Xiaowei; Miao, Xiumei; Chen, Xiaolong; Nan, Shuzhen; Fu, Hua

    2016-01-01

    Artemisia sphaerocephala, a semi-shrub belonging to the Artemisia genus of the Compositae family, is an important pioneer plant that inhabits moving and semi-stable sand dunes in the deserts and steppes of northwest and north-central China. It is very resilient in extreme environments. Additionally, its seeds have excellent nutritional value, and the abundant lipids and polysaccharides in the seeds make this plant a potential valuable source of bio-energy. However, partly due to the scarcity of genetic information, the genetic mechanisms controlling the traits and environmental adaptation capacity of A. sphaerocephala are unknown. Here, we present the first in-depth transcriptomic analysis of A. sphaerocephala. To maximize the representation of conditional transcripts, mRNA was obtained from 17 samples, including living tissues of desert-growing A. sphaerocephala, seeds germinated in the laboratory, and calli subjected to no stress (control) and high and low temperature, high and low osmotic, and salt stresses. De novo transcriptome assembly performed using an Illumina HiSeq 2500 platform resulted in the generation of 68,373 unigenes. We analyzed the key genes involved in the unsaturated fatty acid synthesis pathway and identified 26 A. sphaerocephala fad2 genes, which is the largest fad2 gene family reported to date. Furthermore, a set of genes responsible for resistance to extreme temperatures, salt, drought and a combination of stresses was identified. The present work provides abundant genomic information for functional dissection of the important traits of A. sphaerocephala and contributes to the current understanding of molecular adaptive mechanisms of A. sphaerocephala in the desert environment. Identification of the key genes in the unsaturated fatty acid synthesis pathway could increase understanding of the biological regulatory mechanisms of fatty acid composition traits in plants and facilitate genetic manipulation of the fatty acid composition of oil

  8. Chromosome Scale Genome Assembly and Transcriptome Profiling of Nannochloropsis gaditana in Nitrogen Depletion

    Institute of Scientific and Technical Information of China (English)

    2014-01-01

    Nannochloropsis is rapidly emerging as a model organism for the study of biofuel production in microalgae. Here, we report a high-quality genomic assembly of Nannochloropsis gaditana, consisting of large contigs, up to 500 kbp long, and scaffolds that in most cases span the entire length of the chromosomes. We identified 10646 complete genes and characterized possible alternative transcripts. The annotation of the predicted genes and the analysis of cellular processes revealed traits relevant for the genetic improvement of this organism such as genes involved in DNA recombination, RNA silencing, and cell wall synthesis. We also analyzed the modification of the transcriptional profile in nitrogen deficiency, a condition known to stimulate lipid accumulation. While the content of lipids increased, we did not detect major changes in expression of the genes involved in their biosynthesis. At the same time, we observed a very significant down-regulation of mitochondrial gene expression, suggesting that part of the Acetyl-CoA and NAD(P)H, normally oxidized through the mitochondrial respiration, would be made available for fatty acids synthesis, increasing the flux through the lipid biosynthetic pathway. Finally, we released an information resource of the genomic data of N. gaditana, available online at www.nannochloropsis.org.

  9. Genome-Scale Models

    DEFF Research Database (Denmark)

    Bergdahl, Basti; Sonnenschein, Nikolaus; Machado, Daniel

    2016-01-01

    This chapter gives an introduction to genome-scale models and to how they are built and used. Genome-scale models have become an important part of systems biology and metabolic engineering, and are increasingly used in research, both in academia and in industry, both for modeling chemical pr...

  10. Genome-wide transcriptome analysis of 150 cell samples

    Science.gov (United States)

    Russom, Aman; Xiao, Wenzhong; Wilhelmy, Julie; Wang, Shenglong; Heath, Joe Don; Kurn, Nurith; Tompkins, Ronald G.; Davis, Ronald W.; Toner, Mehmet

    2013-01-01

    A major challenge in molecular biology is interrogating the human transcriptome on a genome wide scale when only a limited amount of biological sample is available for analysis. Current methodologies using microarray technologies for simultaneously monitoring mRNA transcription levels require nanogram amounts of total RNA. To overcome the sample size limitation of current technologies, we have developed a method to probe the global gene expression in biological samples as small as 150 cells, or the equivalent of approximately 300 pg total RNA. The new method employs microfluidic devices for the purification of total RNA from mammalian cells and ultra-sensitive whole transcriptome amplification techniques. We verified that the RNA integrity is preserved through the isolation process, accomplished highly reproducible whole transcriptome analysis, and established high correlation between repeated isolations of 150 cells and the same cell culture sample. We validated the technology by demonstrating that the combined microfluidic and amplification protocol is capable of identifying biological pathways perturbed by stimulation, which are consistent with the information recognized in bulk-isolated samples. PMID:20023796

  11. Genome-wide transcriptome analysis of 150 cell samples.

    Science.gov (United States)

    Irimia, Daniel; Mindrinos, Michael; Russom, Aman; Xiao, Wenzhong; Wilhelmy, Julie; Wang, Shenglong; Heath, Joe Don; Kurn, Nurith; Tompkins, Ronald G; Davis, Ronald W; Toner, Mehmet

    2009-01-01

    A major challenge in molecular biology is interrogating the human transcriptome on a genome wide scale when only a limited amount of biological sample is available for analysis. Current methodologies using microarray technologies for simultaneously monitoring mRNA transcription levels require nanogram amounts of total RNA. To overcome the sample size limitation of current technologies, we have developed a method to probe the global gene expression in biological samples as small as 150 cells, or the equivalent of approximately 300 pg total RNA. The new method employs microfluidic devices for the purification of total RNA from mammalian cells and ultra-sensitive whole transcriptome amplification techniques. We verified that the RNA integrity is preserved through the isolation process, accomplished highly reproducible whole transcriptome analysis, and established high correlation between repeated isolations of 150 cells and the same cell culture sample. We validated the technology by demonstrating that the combined microfluidic and amplification protocol is capable of identifying biological pathways perturbed by stimulation, which are consistent with the information recognized in bulk-isolated samples.

  12. Genome Annotation and Transcriptomics of Oil-Producing Algae

    Science.gov (United States)

    2015-03-16

    AFRL-OSR-VA-TR-2015-0103. Genome Annotation and Transcriptomics of Oil-Producing Algae. Sabeeha Merchant, University of California Los Angeles. Final...2010 to 12-31-2014. Contract number: FA9550-10-1-0095. Abstract: Most algae accumulate triacylglycerols (TAGs) when they are starved for essential nutrients like N, S, P (or Si in the case of some

  13. Genomic and Transcriptomic Analyses of Foodborne Bacterial Pathogens

    Science.gov (United States)

    Zhang, Wei; Dudley, Edward G.; Wade, Joseph T.

    DNA microarrays (often interchangeably called DNA chips or DNA arrays) are among the most popular analytical tools for high-throughput comparative genomic and transcriptomic analyses of foodborne bacterial pathogens. A typical DNA microarray contains hundreds to millions of small DNA probes that are chemically attached (or "printed") onto the surface of a microscopic glass slide. Depending on the specific "printing" and probe synthesis technologies for different microarray platforms, such DNA probes can be PCR amplicons or in situ synthesized short oligonucleotides. DNA microarray technologies have revolutionized the way that we investigate the biology of foodborne bacterial pathogens. The major advantage of these technologies is that DNA microarrays allow comparison of subtle genomic or transcriptomic variations between two bacterial samples, such as genomic variations between two different bacterial strains or transcriptomic alterations of the same bacterial strain under two different treatments. Some applications of comparative genomic hybridization microarrays and global gene expression microarrays have been covered in previous chapters of this book.

  14. Characterization of mango (Mangifera indica L.) transcriptome and chloroplast genome.

    Science.gov (United States)

    Azim, M Kamran; Khan, Ishtaiq A; Zhang, Yong

    2014-05-01

    We characterized mango leaf transcriptome and chloroplast genome using next generation DNA sequencing. The RNA-seq output of mango transcriptome generated >12 million reads (total nucleotides sequenced >1 Gb). De novo transcriptome assembly generated 30,509 unigenes with lengths in the range of 300 to ≥3,000 nt and 67× depth of coverage. Blast searching against nonredundant nucleotide databases and several Viridiplantae genomic datasets annotated 24,593 mango unigenes (80% of total) and identified Citrus sinensis as closest neighbor of mango with 9,141 (37%) matched sequences. The annotation with gene ontology and Clusters of Orthologous Group terms categorized unigene sequences into 57 and 25 classes, respectively. More than 13,500 unigenes were assigned to 293 KEGG pathways. Besides major plant biology related pathways, KEGG based gene annotation pointed out active presence of an array of biochemical pathways involved in (a) biosynthesis of bioactive flavonoids, flavones and flavonols, (b) biosynthesis of terpenoids and lignins and (c) plant hormone signal transduction. The mango transcriptome sequences revealed 235 proteases belonging to five catalytic classes of proteolytic enzymes. The draft genome of mango chloroplast (cp) was obtained by a combination of Sanger and next generation sequencing. The draft mango cp genome size is 151,173 bp with a pair of inverted repeats of 27,093 bp separated by small and large single copy regions, respectively. Out of 139 genes in mango cp genome, 91 were found to be protein coding. Sequence analysis revealed cp genome of C. sinensis as closest neighbor of mango. We found 51 short repeats in mango cp genome supposed to be associated with extensive rearrangements. This is the first report of transcriptome and chloroplast genome analysis of any Anacardiaceae family member.

  15. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Full Text Available Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, transcriptional consequences from these genomic alterations in cancer genome remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with/without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate genomic alterations in cancer genome have diverse transcriptomic effects, and integrated analysis of WGS and RNA-Seq can facilitate the interpretation of a large number of genomic alterations detected in cancer genome.

  16. Genome interplay in the grain transcriptome of hexaploid bread wheat.

    Science.gov (United States)

    Pfeifer, Matthias; Kugler, Karl G; Sandve, Simen R; Zhan, Bujie; Rudi, Heidi; Hvidsten, Torgeir R; Mayer, Klaus F X; Olsen, Odd-Arne

    2014-07-18

    Allohexaploid bread wheat (Triticum aestivum L.) provides approximately 20% of calories consumed by humans. Lack of genome sequence for the three homeologous and highly similar bread wheat genomes (A, B, and D) has impeded expression analysis of the grain transcriptome. We used previously unknown genome information to analyze the cell type-specific expression of homeologous genes in the developing wheat grain and identified distinct co-expression clusters reflecting the spatiotemporal progression during endosperm development. We observed no global but cell type- and stage-dependent genome dominance, organization of the wheat genome into transcriptionally active chromosomal regions, and asymmetric expression in gene families related to baking quality. Our findings give insight into the transcriptional dynamics and genome interplay among individual grain cell types in a polyploid cereal genome. Copyright © 2014, American Association for the Advancement of Science.

  17. Pichia stipitis genomics, transcriptomics, and gene clusters

    Science.gov (United States)

    Thomas W. Jeffries; Jennifer R. Headman Van Vleet

    2009-01-01

    Genome sequencing and subsequent global gene expression studies have advanced our understanding of the lignocellulose-fermenting yeast Pichia stipitis. These studies have provided an insight into its central carbon metabolism, and analysis of its genome has revealed numerous functional gene clusters and tandem repeats. Specialized physiological traits are often the...

  18. Genomic and transcriptomic resources for assassin flies including the complete genome sequence of Proctacanthus coquilletti (Insecta: Diptera: Asilidae) and 16 representative transcriptomes

    Directory of Open Access Journals (Sweden)

    Rebecca B. Dikow

    2017-01-01

    Full Text Available A high-quality draft genome for Proctacanthus coquilletti (Insecta: Diptera: Asilidae) is presented along with transcriptomes for 16 Diptera species from five families: Asilidae, Apioceridae, Bombyliidae, Mydidae, and Tabanidae. Genome sequencing reveals that P. coquilletti has a genome size of approximately 210 Mbp and remarkably low heterozygosity (0.47%) and few repeats (15%). These characteristics helped produce a highly contiguous (N50 = 862 kbp) assembly, particularly given that only a single 2 × 250 bp PCR-free Illumina library was sequenced. A phylogenomic hypothesis is presented based on thousands of putative orthologs across the 16 transcriptomes. Phylogenetic relationships support the sister group relationship of Apioceridae + Mydidae to Asilidae. A time-calibrated phylogeny is also presented, with seven fossil calibration points, which suggests an older age of the split among Apioceridae, Asilidae, and Mydidae (158 mya) and Apioceridae and Mydidae (135 mya) than proposed in the AToL FlyTree project. Future studies will be able to take advantage of the resources presented here in order to produce large scale phylogenomic and evolutionary studies of assassin fly phylogeny, life histories, or venom. The bioinformatics tools and workflow presented here will be useful to others wishing to generate de novo genomic resources in species-rich taxa without a closely-related reference genome.

  19. Genomic and transcriptomic resources for assassin flies including the complete genome sequence of Proctacanthus coquilletti (Insecta: Diptera: Asilidae) and 16 representative transcriptomes

    Science.gov (United States)

    Frandsen, Paul B.; Turcatel, Mauren

    2017-01-01

    A high-quality draft genome for Proctacanthus coquilletti (Insecta: Diptera: Asilidae) is presented along with transcriptomes for 16 Diptera species from five families: Asilidae, Apioceridae, Bombyliidae, Mydidae, and Tabanidae. Genome sequencing reveals that P. coquilletti has a genome size of approximately 210 Mbp and remarkably low heterozygosity (0.47%) and few repeats (15%). These characteristics helped produce a highly contiguous (N50 = 862 kbp) assembly, particularly given that only a single 2 × 250 bp PCR-free Illumina library was sequenced. A phylogenomic hypothesis is presented based on thousands of putative orthologs across the 16 transcriptomes. Phylogenetic relationships support the sister group relationship of Apioceridae + Mydidae to Asilidae. A time-calibrated phylogeny is also presented, with seven fossil calibration points, which suggests an older age of the split among Apioceridae, Asilidae, and Mydidae (158 mya) and Apioceridae and Mydidae (135 mya) than proposed in the AToL FlyTree project. Future studies will be able to take advantage of the resources presented here in order to produce large scale phylogenomic and evolutionary studies of assassin fly phylogeny, life histories, or venom. The bioinformatics tools and workflow presented here will be useful to others wishing to generate de novo genomic resources in species-rich taxa without a closely-related reference genome.

  20. Bacillus anthracis genome organization in light of whole transcriptome sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Jeffrey; Zhu, Wenhan; Passalacqua, Karla D.; Bergman, Nicholas; Borodovsky, Mark

    2010-03-22

    Emerging knowledge of whole prokaryotic transcriptomes could validate a number of theoretical concepts introduced in the early days of genomics. What are the rules connecting gene expression levels with sequence determinants such as quantitative scores of promoters and terminators? Are translation efficiency measures, e.g. codon adaptation index and RBS score related to gene expression? We used the whole transcriptome shotgun sequencing of a bacterial pathogen Bacillus anthracis to assess correlation of gene expression level with promoter, terminator and RBS scores, codon adaptation index, as well as with a new measure of gene translational efficiency, average translation speed. We compared computational predictions of operon topologies with the transcript borders inferred from RNA-Seq reads. Transcriptome mapping may also improve existing gene annotation. Upon assessment of accuracy of current annotation of protein-coding genes in the B. anthracis genome we have shown that the transcriptome data indicate existence of more than a hundred genes missing in the annotation though predicted by an ab initio gene finder. Interestingly, we observed that many pseudogenes possess not only a sequence with detectable coding potential but also promoters that maintain transcriptional activity.

  1. DNA Microarrays in Comparative Genomics and Transcriptomics

    DEFF Research Database (Denmark)

    Willenbrock, Hanni

    2007-01-01

    of each method’s ability to analyze DNA copy number data. Moreover, our study shows that analysis methods developed for cancer research may also successfully be applied to DNA copy number profiles from bacterial genomes. However, here the purpose is to characterize variations in the gene content...... to verify predictions of highly expressed genes. Moreover, the codon bias of microbial genomes was found to constitute an environmental signature. For example, soil bacteria have very similar codon bias....

  2. Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree

    Directory of Open Access Journals (Sweden)

    Nagesh A. Kuravadi

    2015-08-01

    Full Text Available Neem (Azadirachta indica A. Juss) is one of the most versatile tropical evergreen tree species known in India since the Vedic period (1500 BC–600 BC). Neem tree is a rich source of limonoids, having a wide spectrum of activity against insect pests and microbial pathogens. Complex tetranortriterpenoids such as azadirachtin, salanin and nimbin are the major active principles isolated from neem seed. Absolutely nothing is known about the biochemical pathways of these metabolites in neem tree. To identify genes and pathways in neem, we sequenced neem genomes and transcriptomes using next generation sequencing technologies. Assembly of Illumina and 454 sequencing reads resulted in 267 Mb, which accounts for 70% of estimated size of neem genome. We predicted 44,495 genes in the neem genome, of which 32,278 genes were expressed in neem tissues. Neem genome consists of about 32.5% (87 Mb) of repetitive DNA elements. Neem tree is phylogenetically related to citrus, Citrus sinensis. Comparative analysis anchored 62% (161 Mb) of assembled neem genomic contigs onto citrus chromosomes. Ultrahigh performance liquid chromatography-mass spectrometry-selected reaction monitoring (UHPLC-MS/SRM) method was used to quantify azadirachtin, nimbin, and salanin from neem tissues. Weighted Correlation Network Analysis (WCGNA) of expressed genes and metabolites resulted in identification of possible candidate genes involved in azadirachtin biosynthesis pathway. This study provides genomic, transcriptomic and quantity of top three neem metabolites resource, which will accelerate basic research in neem to understand biochemical pathways.

  3. Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree.

    Science.gov (United States)

    Kuravadi, Nagesh A; Yenagi, Vijay; Rangiah, Kannan; Mahesh, H B; Rajamani, Anantharamanan; Shirke, Meghana D; Russiachand, Heikham; Loganathan, Ramya Malarini; Shankara Lingu, Chandana; Siddappa, Shilpa; Ramamurthy, Aishwarya; Sathyanarayana, B N; Gowda, Malali

    2015-01-01

    Neem (Azadirachta indica A. Juss) is one of the most versatile tropical evergreen tree species known in India since the Vedic period (1500 BC-600 BC). Neem tree is a rich source of limonoids, having a wide spectrum of activity against insect pests and microbial pathogens. Complex tetranortriterpenoids such as azadirachtin, salanin and nimbin are the major active principles isolated from neem seed. Absolutely nothing is known about the biochemical pathways of these metabolites in neem tree. To identify genes and pathways in neem, we sequenced neem genomes and transcriptomes using next generation sequencing technologies. Assembly of Illumina and 454 sequencing reads resulted in 267 Mb, which accounts for 70% of the estimated size of the neem genome. We predicted 44,495 genes in the neem genome, of which 32,278 genes were expressed in neem tissues. The neem genome consists of about 32.5% (87 Mb) repetitive DNA elements. Neem tree is phylogenetically related to citrus, Citrus sinensis. Comparative analysis anchored 62% (161 Mb) of assembled neem genomic contigs onto citrus chromosomes. An ultrahigh performance liquid chromatography-mass spectrometry-selected reaction monitoring (UHPLC-MS/SRM) method was used to quantify azadirachtin, nimbin, and salanin from neem tissues. Weighted Correlation Network Analysis (WGCNA) of expressed genes and metabolites resulted in identification of possible candidate genes involved in the azadirachtin biosynthesis pathway. This study provides genomic and transcriptomic resources, together with quantification of the top three neem metabolites, which will accelerate basic research in neem to understand biochemical pathways.

  4. RNAseq versus genome-predicted transcriptomes: a large population of novel transcripts identified in an Illumina-454 Hydra transcriptome.

    Science.gov (United States)

    Wenger, Yvan; Galliot, Brigitte

    2013-03-25

    Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48,909 unique sequences including splice variants, representing approximately 24,450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10,597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encodes short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11,270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events.

  5. genomic and transcriptomic approaches towards the genetic ...

    African Journals Online (AJOL)

    USER

    to the complex nature of these stresses, and the genotype x environment interaction (GxE). .... collection (Azam-Ali et al., 2001); (vi) biological .... Integrative platform to study gene function and gene evolution in legumes ..... a powerful dissection of the genetic control of ... complemented by a new approach called genomic.

  6. Single Cell Genomics and Transcriptomics for Unicellular Eukaryotes

    Energy Technology Data Exchange (ETDEWEB)

    Ciobanu, Doina; Clum, Alicia; Singh, Vasanth; Salamov, Asaf; Han, James; Copeland, Alex; Grigoriev, Igor; James, Timothy; Singer, Steven; Woyke, Tanja; Malmstrom, Rex; Cheng, Jan-Fang

    2014-03-14

    Despite their small size, unicellular eukaryotes have complex genomes with a high degree of plasticity that allows them to adapt quickly to environmental changes. Unicellular eukaryotes live with prokaryotes and higher eukaryotes, frequently in symbiotic or parasitic niches. To this day their contribution to the dynamics of the environmental communities remains to be understood. Unfortunately, the vast majority of eukaryotic microorganisms are either uncultured or unculturable, making genome sequencing impossible using traditional approaches. We have developed an approach to isolate unicellular eukaryotes of interest from environmental samples, and to sequence and analyze their genomes and transcriptomes. We have tested our methods with six species: an uncharacterized protist from cellulose-enriched compost identified as Platyophrya, a close relative of P. vorax; the fungus Metschnikowia bicuspidata, a parasite of the water flea Daphnia; the mycoparasitic fungus Piptocephalis cylindrospora, a parasite of Cokeromyces and Mucor; Caulochytrium protosteloides, a parasite of Sordaria; Rozella allomycis, a parasite of the water mold Allomyces; and the microalga Chlamydomonas reinhardtii. Here, we present the four components of our approach: pre-sequencing methods, sequence analysis for single cell genome assembly, sequence analysis of single cell transcriptomes, and genome annotation. This technology has the potential to uncover the complexity of single cell eukaryotes and their role in the environmental samples.

  7. Introduction to Nematode Genome and Transcriptome Announcements in the Journal of Nematology.

    Science.gov (United States)

    Denver, Dee R; Ragsdale, Erik J; Thomas, W Kelley; Zasada, Inga A

    2017-06-01

    The Journal of Nematology now offers publication of Nematode Genome Announcements (NGA) and Nematode Transcriptome Announcements (NTA). These brief reports announce the sequencing and assembly of a nematode genome or transcriptome resource, along with basic technical information on DNA sequencing and bioinformatic methods used. This publishing initiative offers a new avenue to openly and concisely communicate the availability and relevance of genome and transcriptome sequence resources to the broader scientific community.

  8. Applied bioinformatics: Genome annotation and transcriptome analysis

    DEFF Research Database (Denmark)

    Gupta, Vikas

    and dhurrin, which have not previously been characterized in blueberries. There are more than 44,500 spider species with distinct habitats and unique characteristics. Spiders are masters of producing silk webs to catch prey and using venom to neutralize. The exploration of the genetics behind these properties...... japonicus (Lotus), Vaccinium corymbosum (blueberry), Stegodyphus mimosarum (spider) and Trifolium occidentale (clover). From a bioinformatics data analysis perspective, my work can be divided into three parts; genome annotation, small RNA, and gene expression analysis. Lotus is a legume of significant...... has just started. We have assembled and annotated the first two spider genomes to facilitate our understanding of spiders at the molecular level. The need for analyzing the large and increasing amount of sequencing data has increased the demand for efficient, user friendly, and broadly applicable...

  9. Whole genome and transcriptome sequencing of a B3 thymoma.

    Directory of Open Access Journals (Sweden)

    Iacopo Petrini

    Full Text Available Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomic hybridization using the Agilent platform, transcriptome sequencing using HiSeq 2000 (Illumina) and whole genome sequencing using the Complete Genomics Inc platform. Whole genome sequencing determined, in tumor and normal, the sequence of both alleles in more than 95% of the reference genome (NCBI Build 37). Copy number (CN) aberrations were comparable with those previously described for B3 thymomas, with CN gain of chromosome 1q, 5, 7 and X and CN loss of 3p, 6, 11q42.2-qter and q13. One translocation t(11;X) was identified by whole genome sequencing and confirmed by PCR and Sanger sequencing. Ten single nucleotide variations (SNVs) and 2 insertion/deletions (INDELs) were identified; these mutations resulted in non-synonymous amino acid changes or affected splicing sites. The lack of common cancer-associated mutations in this patient suggests that thymomas may evolve through mechanisms distinctive from other tumor types, and supports the rationale for additional high-throughput sequencing screens to better understand the somatic genetic architecture of thymoma.

  10. Bamboo Flowering from the Perspective of Comparative Genomics and Transcriptomics.

    Science.gov (United States)

    Biswas, Prasun; Chakraborty, Sukanya; Dutta, Smritikana; Pal, Amita; Das, Malay

    2016-01-01

    Bamboos are an important member of the subfamily Bambusoideae, family Poaceae. The plant group exhibits wide variation with respect to the timing (1-120 years) and nature (sporadic vs. gregarious) of flowering among species. Usually flowering in woody bamboos is synchronous across culms growing over a large area, known as gregarious flowering. In many monocarpic bamboos this is followed by mass death and seed setting. In sporadic flowering, by contrast, an isolated wild clump may flower, set little or no seed and remain alive. Such wide variation in flowering time and extent means that the plant group serves as a repository for genes and expression patterns that are unique to bamboo. Due to the dearth of available genomic and transcriptomic resources, limited studies have been undertaken to identify the potential molecular players in bamboo flowering. The public release of the first bamboo genome sequence (Phyllostachys heterocycla) and the availability of the related genomes of Brachypodium distachyon and Oryza sativa provide us the opportunity to study this long-standing biological problem in a comparative and functional genomics framework. We identified bamboo genes homologous to those of Oryza and Brachypodium that are involved in established pathways such as vernalization, photoperiod, autonomous, and hormonal regulation of flowering. Additionally, we investigated triggers like stress (drought), physiological maturity and micro RNAs that may play crucial roles in flowering. We also analyzed available transcriptome datasets of different bamboo species to identify genes and their involvement in bamboo flowering. Finally, we summarize potential research hurdles that need to be addressed in future research.

  11. Transcriptome and genome size analysis of the Venus flytrap.

    Science.gov (United States)

    Jensen, Michael Krogh; Vogt, Josef Korbinian; Bressendorff, Simon; Seguin-Orlando, Andaine; Petersen, Morten; Sicheritz-Pontén, Thomas; Mundy, John

    2015-01-01

    The insectivorous Venus flytrap (Dionaea muscipula) is renowned from Darwin's studies of plant carnivory and the origins of species. To provide tools to analyze the evolution and functional genomics of D. muscipula, we sequenced a normalized cDNA library synthesized from mRNA isolated from D. muscipula flowers and traps. Using the Oases transcriptome assembler, 79,165,657 quality-trimmed reads were assembled into 80,806 cDNA contigs, with an average length of 679 bp and an N50 length of 1,051 bp. A total of 17,047 unique proteins were identified, assigned to Gene Ontology (GO) and classified into functional categories. A total of 15,547 full-length cDNA sequences were identified, from which open reading frames were detected in 10,941. Comparative GO analyses revealed that D. muscipula is highly represented in molecular functions related to catalytic, antioxidant, and electron carrier activities. Also, using a single copy sequence PCR-based method, we estimated that the genome size of D. muscipula is approx. 3 Gb. Our genome size estimate and transcriptome analyses will contribute to future research on this fascinating, monotypic species and its heterotrophic adaptations.
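    The abstract above reports both an average contig length and an N50 length for the assembly. As a reminder of how the two statistics differ, here is a minimal Python sketch of an N50 calculation. It is not taken from the paper or from the Oases assembler; the function name `n50` and the toy contig lengths are hypothetical, and the definition assumed is the common one (the length of the contig at which the running sum of lengths, sorted longest first, reaches half of the total assembly length).

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Assumed definition: sort contigs from longest to shortest and report the
    length of the contig at which the running total first reaches at least
    half of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths, NOT data from the study.
toy_contigs = [2000, 1500, 1000, 800, 500, 200]
toy_mean = sum(toy_contigs) / len(toy_contigs)
toy_n50 = n50(toy_contigs)
print(f"mean = {toy_mean:.0f} bp, N50 = {toy_n50} bp")
```

    Because N50 is weighted by contig length, it is typically larger than the arithmetic mean when an assembly contains many short contigs, which is consistent with the pattern of the two figures reported in the abstract.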

  12. Transcriptome and genome size analysis of the Venus flytrap.

    Directory of Open Access Journals (Sweden)

    Michael Krogh Jensen

    Full Text Available The insectivorous Venus flytrap (Dionaea muscipula) is renowned from Darwin's studies of plant carnivory and the origins of species. To provide tools to analyze the evolution and functional genomics of D. muscipula, we sequenced a normalized cDNA library synthesized from mRNA isolated from D. muscipula flowers and traps. Using the Oases transcriptome assembler, 79,165,657 quality-trimmed reads were assembled into 80,806 cDNA contigs, with an average length of 679 bp and an N50 length of 1,051 bp. A total of 17,047 unique proteins were identified, assigned to Gene Ontology (GO) and classified into functional categories. A total of 15,547 full-length cDNA sequences were identified, from which open reading frames were detected in 10,941. Comparative GO analyses revealed that D. muscipula is highly represented in molecular functions related to catalytic, antioxidant, and electron carrier activities. Also, using a single copy sequence PCR-based method, we estimated that the genome size of D. muscipula is approx. 3 Gb. Our genome size estimate and transcriptome analyses will contribute to future research on this fascinating, monotypic species and its heterotrophic adaptations.

  13. The draft genome and transcriptome of Cannabis sativa.

    Science.gov (United States)

    van Bakel, Harm; Stout, Jake M; Cote, Atina G; Tallon, Carling M; Sharpe, Andrew G; Hughes, Timothy R; Page, Jonathan E

    2011-10-20

    Cannabis sativa has been cultivated throughout human history as a source of fiber, oil and food, and for its medicinal and intoxicating properties. Selective breeding has produced cannabis plants for specific uses, including high-potency marijuana strains and hemp cultivars for fiber and seed production. The molecular biology underlying cannabinoid biosynthesis and other traits of interest is largely unexplored. We sequenced genomic DNA and RNA from the marijuana strain Purple Kush using short-read approaches. We report a draft haploid genome sequence of 534 Mb and a transcriptome of 30,000 genes. Comparison of the transcriptome of Purple Kush with that of the hemp cultivar 'Finola' revealed that many genes encoding proteins involved in cannabinoid and precursor pathways are more highly expressed in Purple Kush than in 'Finola'. The exclusive occurrence of Δ9-tetrahydrocannabinolic acid synthase in the Purple Kush transcriptome, and its replacement by cannabidiolic acid synthase in 'Finola', may explain why the psychoactive cannabinoid Δ9-tetrahydrocannabinol (THC) is produced in marijuana but not in hemp. Resequencing the hemp cultivars 'Finola' and 'USO-31' showed little difference in gene copy numbers of cannabinoid pathway enzymes. However, single nucleotide variant analysis uncovered a relatively high level of variation among four cannabis types, and supported a separation of marijuana and hemp. The availability of the Cannabis sativa genome enables the study of a multifunctional plant that occupies a unique role in human culture. Its availability will aid the development of therapeutic marijuana strains with tailored cannabinoid profiles and provide a basis for the breeding of hemp with improved agronomic characteristics.

  14. Genomics and transcriptomics across the diversity of the Nematoda.

    Science.gov (United States)

    Blaxter, M; Kumar, S; Kaur, G; Koutsovoulos, G; Elsworth, B

    2012-01-01

    The diversity of biology in nematodes is reflected in the diversity of their genomes. Parasitic species in particular have evolved mechanisms to invade and outwit their hosts, and these offer opportunities for the development of control measures. Genomic analyses can reveal the molecular underpinnings of phenotypes such as parasitism and thus initiate and support research programmes that explore the manipulation of host and parasite physiologies to achieve favourable outcomes. Wide sampling across nematode diversity allows phylogenetically informed formulation of research hypotheses, identification of core features shared by all species or important evolutionary novelties present in isolated clades. Many nematode species have been investigated through the use of the expressed sequence tag approach, which samples from the transcribed genome. Gene catalogues generated in this way can be explored to reveal the patterns of expression associated with parasitism and candidates for testing as drug targets or vaccine components. Analysis environments, such as NEMBASE, facilitate exploitation of these data. The development of new high-throughput DNA-sequencing technologies has facilitated transcriptomic and genomic approaches to parasite biology. Whole genome sequencing offers more complete catalogues of genes and assists a systems approach to phenotype dissection. These efforts are being coordinated through the 959 Nematode Genomes initiative.

  15. Cajal body function in genome organization and transcriptome diversity.

    Science.gov (United States)

    Sawyer, Iain A; Sturgill, David; Sung, Myong-Hee; Hager, Gordon L; Dundr, Miroslav

    2016-12-01

    Nuclear bodies contribute to non-random organization of the human genome and nuclear function. Using a major prototypical nuclear body, the Cajal body, as an example, we suggest that these structures assemble at specific gene loci located across the genome as a result of high transcriptional activity. Subsequently, target genes are physically clustered in close proximity in Cajal body-containing cells. However, Cajal bodies are observed in only a limited number of human cell types, including neuronal and cancer cells. Ultimately, Cajal body depletion perturbs splicing kinetics by reducing target small nuclear RNA (snRNA) transcription and limiting the levels of spliceosomal snRNPs, including their modification and turnover following each round of RNA splicing. As such, Cajal bodies are capable of shaping the chromatin interaction landscape and the transcriptome by influencing spliceosome kinetics. Future studies should concentrate on characterizing the direct influence of Cajal bodies upon snRNA gene transcriptional dynamics.

  16. The capsicum transcriptome DB: a "hot" tool for genomic research.

    Science.gov (United States)

    Góngora-Castillo, Elsa; Fajardo-Jaime, Rubén; Fernández-Cortes, Araceli; Jofre-Garfias, Alba E; Lozoya-Gloria, Edmundo; Martínez, Octavio; Ochoa-Alejo, Neftalí; Rivera-Bustamante, Rafael

    2012-01-01

    Chili pepper (Capsicum annuum) is an economically important crop with no available public genome sequence. We describe a genomic resource to facilitate Capsicum annuum research. A collection of Expressed Sequence Tags (ESTs) derived from five C. annuum organs (root, stem, leaf, flower and fruit) was sequenced using the Sanger method, and multiple leaf transcriptomes were deeply sampled using GS-pyrosequencing. A hybrid assembly of 1,324,516 raw reads yielded 32,314 high quality contigs as validated by coverage and identity analysis with existing pepper sequences. Overall, 75.5% of the contigs had significant sequence similarity to entries in nucleic acid and protein databases; 23% of the sequences have not been previously reported for C. annuum and expand sequence resources for this species. A MySQL database and a user-friendly Web interface were constructed with search-tools that permit queries of the ESTs including sequence, functional annotation, Gene Ontology classification, metabolic pathways, and assembly information. The Capsicum Transcriptome DB is freely available from http://www.bioingenios.ira.cinvestav.mx:81/Joomla/

  17. LegumeIP: an integrative database for comparative genomics and transcriptomics of model legumes.

    Science.gov (United States)

    Li, Jun; Dai, Xinbin; Liu, Tingsong; Zhao, Patrick Xuechun

    2012-01-01

    Legumes play a vital role in maintaining the nitrogen cycle of the biosphere. They conduct symbiotic nitrogen fixation through endosymbiotic relationships with bacteria in root nodules. However, this and other characteristics of legumes, including mycorrhization, compound leaf development and profuse secondary metabolism, are absent in the typical model plant Arabidopsis thaliana. We present LegumeIP (http://plantgrn.noble.org/LegumeIP/), an integrative database for comparative genomics and transcriptomics of model legumes, for studying gene function and genome evolution in legumes. LegumeIP compiles gene and gene family information, syntenic and phylogenetic context and tissue-specific transcriptomic profiles. The database holds the genomic sequences of three model legumes, Medicago truncatula, Glycine max and Lotus japonicus, plus two reference plant species, A. thaliana and Populus trichocarpa, with annotations based on UniProt, InterProScan, Gene Ontology and the Kyoto Encyclopedia of Genes and Genomes databases. LegumeIP also contains large-scale microarray and RNA-Seq-based gene expression data. Our new database is capable of systematic synteny analysis across M. truncatula, G. max, L. japonicus and A. thaliana, as well as construction and phylogenetic analysis of gene families across the five hosted species. Finally, LegumeIP provides comprehensive search and visualization tools that enable flexible queries based on gene annotation, gene family, synteny and relative gene expression.

  18. Parallel WGA and WTA for Comparative Genome and Transcriptome NGS Analysis Using Tiny Cell Numbers.

    Science.gov (United States)

    Korfhage, Christian; Fricke, Evelyn; Meier, Andreas

    2015-07-01

    Genomic DNA determines how and when the transcriptome is changed by a trigger or environmental change and how cellular metabolism is influenced. Comparative genome and transcriptome analysis of the same cell sample links a defined genome with all changes in the bases, structure, or numbers of the transcriptome. However, comparative genome and transcriptome analysis using next-generation sequencing (NGS) or real-time PCR is often limited by the small amount of sample available. In mammals, the amount of DNA and RNA in a single cell is ∼10 picograms, but deep analysis of the genome and transcriptome currently requires several hundred nanograms of nucleic acids for library preparation for NGS sequencing. Consequently, accurate whole-genome amplification (WGA) and whole-transcriptome amplification (WTA) are required for such quantitative analysis. This unit describes how the genome and the transcriptome of a tiny number of cells can be amplified in a highly parallel and comparable process. Protocols for quality control of amplified DNA and application of amplified DNA for NGS are included.

  19. TraV: a genome context sensitive transcriptome browser.

    Science.gov (United States)

    Dietrich, Sascha; Wiegand, Sandra; Liesegang, Heiko

    2014-01-01

    Next-generation sequencing (NGS) technologies like Illumina and ABI SOLiD enable the investigation of transcriptional activities of genomes. While read mapping tools have been continually improved to enable the processing of the increasing number of reads generated by NGS technologies, analysis and visualization tools are struggling with the amount of data they are presented with. Current tools are capable of handling at most two to three datasets simultaneously before they are limited by available memory or by processing overhead. In order to process fifteen transcriptome sequencing experiments of Bacillus licheniformis DSM13 obtained in a previous study, we developed TraV, an RNA-Seq analysis and visualization tool. The analytical methods are designed for prokaryotic RNA-seq experiments. TraV calculates single nucleotide activities from the mapping information to visualize and analyze multiple transcriptome sequencing experiments. The use of nucleotide activities instead of single read mapping information is highly memory efficient without incurring a processing overhead. TraV is available at http://appmibio.uni-goettingen.de/index.php?sec=serv.
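    The abstract above does not define how TraV computes "single nucleotide activities", so the following Python sketch is only an illustration of the general idea, under the assumption that a nucleotide activity is the number of mapped reads covering each genome position. The function name `nucleotide_activity`, the 0-based half-open coordinates, and the toy alignments are all hypothetical and are not TraV's API or data. The sketch shows why such a profile is memory efficient: it needs one counter per genome position, regardless of how many reads were mapped.

```python
def nucleotide_activity(genome_length, alignments):
    """Collapse read alignments into a per-position coverage profile.

    alignments: iterable of (start, end) pairs in 0-based, half-open
    coordinates (an assumed convention). Uses a difference array, so memory
    scales with genome length rather than with the number of reads.
    """
    diff = [0] * (genome_length + 1)
    for start, end in alignments:
        if not 0 <= start < end <= genome_length:
            raise ValueError(f"alignment ({start}, {end}) is out of range")
        diff[start] += 1   # coverage rises where the read begins
        diff[end] -= 1     # and falls just past where it ends
    activity = []
    depth = 0
    for delta in diff[:genome_length]:
        depth += delta
        activity.append(depth)
    return activity


# Hypothetical alignments on a toy 8-bp genome, NOT data from the study.
toy_alignments = [(0, 4), (2, 6), (3, 5)]
profile = nucleotide_activity(8, toy_alignments)
print(profile)
```

    With this representation, several experiments can be compared by holding one such profile per experiment, instead of keeping every individual read alignment in memory.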

  20. A Universal Genome Array and Transcriptome Atlas for Brachypodium Distachyon

    Energy Technology Data Exchange (ETDEWEB)

    Mockler, Todd [Oregon State Univ., Corvallis, OR (United States)]

    2017-04-17

    Brachypodium distachyon is the premier experimental model grass platform and is related to candidate feedstock crops for bioethanol production. Based on the DOE-JGI Brachypodium Bd21 genome sequence and annotation we designed a whole genome DNA microarray platform. The quality of this array platform is unprecedented due to the exceptional quality of the Brachypodium genome assembly and annotation and the stringent probe selection criteria employed in the design. We worked with members of the international community and the bioinformatics/design team at Affymetrix at all stages in the development of the array. We used the Brachypodium arrays to interrogate the transcriptomes of plants grown in a variety of environmental conditions, including diurnal and circadian light/temperature conditions. We examined the transcriptional responses of Brachypodium seedlings subjected to various abiotic stresses including heat, cold, salt, and high intensity light. We generated a gene expression atlas representing various organs and developmental stages. The results of these efforts, including all microarray datasets, are published and available at online public databases.

  1. Multiple reference genomes and transcriptomes for Arabidopsis thaliana

    KAUST Repository

    Gan, Xiangchao

    2011-08-28

    Genetic differences between Arabidopsis thaliana accessions underlie the plant's extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions.

  2. Consequences of normalizing transcriptomic and genomic libraries of plant genomes using a duplex-specific nuclease and tetramethylammonium chloride.

    Directory of Open Access Journals (Sweden)

    Marta Matvienko

    Full Text Available Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce.

  3. Consequences of normalizing transcriptomic and genomic libraries of plant genomes using a duplex-specific nuclease and tetramethylammonium chloride.

    Science.gov (United States)

    Matvienko, Marta; Kozik, Alexander; Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard

    2013-01-01

    Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce.

  4. Metaplastic breast carcinomas display genomic and transcriptomic heterogeneity [corrected].

    Science.gov (United States)

    Weigelt, Britta; Ng, Charlotte K Y; Shen, Ronglai; Popova, Tatiana; Schizas, Michail; Natrajan, Rachael; Mariani, Odette; Stern, Marc-Henri; Norton, Larry; Vincent-Salomon, Anne; Reis-Filho, Jorge S

    2015-03-01

    features of metaplastic breast carcinomas is reflected at the transcriptomic level, and an association between molecular subtypes and histology was observed. BRCA1-like genomic profiles were found only in a subset (31%) of metaplastic breast cancers, and were not associated with a specific molecular or histologic subtype.

  5. Transcriptome complexity in a genome-reduced bacterium

    DEFF Research Database (Denmark)

    Güell, Marc; van Noort, Vera; Yus, Eva;

    2009-01-01

    To study basic principles of transcriptome organization in bacteria, we analyzed one of the smallest self-replicating organisms, Mycoplasma pneumoniae. We combined strand-specific tiling arrays, complemented by transcriptome sequencing, with more than 252 spotted arrays. We detected 117 previously...

  6. Analysis Of Transcriptomes In A Porcine Tissue Collection Using RNA-Seq And Genome Assembly 10

    DEFF Research Database (Denmark)

    Hornshøj, Henrik; Thomsen, Bo; Hedegaard, Jakob

    2011-01-01

    The release of Sus scrofa genome assembly 10 supports improvement of the pig genome annotation and in depth transcriptome analyses using next-generation sequencing technologies. In this study we analyze RNA-seq reads from a tissue collection, including 10 separate tissues from Duroc boars and 10...

  7. Tools for the Validation of Genomes and Transcriptomes with Proteomics data

    DEFF Research Database (Denmark)

    Pang, Chi Nam Ignatius; Aya, Carlos; Tay, Aidan

    Aims With the large amount of genomics and proteomics data currently available, there remains a lack of tools to integrate data from these two fields. This project aims to provide a ‘nexus’ for integrating genomics and transcriptomics data generated from next-generation sequencing with proteomics...

  8. Genomic and transcriptomic alterations following hybridisation and genome doubling in trigenomic allohexaploid Brassica carinata × Brassica rapa.

    Science.gov (United States)

    Xu, Y; Zhao, Q; Mei, S; Wang, J

    2012-09-01

    Allopolyploidisation is a prominent evolutionary force that involves two major events: interspecific hybridisation and genome doubling. Both events have important functional consequences in shaping the genomic architecture of the neo-allopolyploids. The respective effects of hybridisation and genome doubling upon genomic and transcriptomic changes in Brassica allopolyploids are unresolved. In this study, amplified fragment length polymorphism (AFLP), methylation-sensitive amplification polymorphism (MSAP) and cDNA-AFLP approaches were used to track genetic, epigenetic and transcriptional changes in both allohexaploid Brassica (ArArBcBcCcCc genome) and triploid hybrids (ArBcCc genome). Results from these groups were compared with each other and also to their parents Brassica carinata (BBCC genome) and Brassica rapa (AA genome). Rapid and dramatic genetic, DNA methylation and gene expression changes were detected in the triploid hybrids. During the shift from triploidy to allohexaploidy, some of the hybridisation-induced alterations underwent reversion. Additionally, novel genetic, epigenetic and transcriptional alterations were also detected. The proportions of A-genome-specific DNA methylation and gene expression alterations were significantly greater than those of BC-genome-specific alterations in the triploid hybrids. However, the two parental genomes were equally affected during the ploidy shift. Hemi-CCG methylation changes induced by hybridisation were recovered after genome doubling. Full-CG methylation changes were a more general process initiated in the hybrid and continued after genome doubling. These results indicate that genome doubling could ameliorate genomic and transcriptomic alterations induced by hybridisation and instigate additional alterations in trigenomic Brassica allohexaploids. Moreover, genome doubling also modified hybridisation-induced progenitor genome-biased alterations and epigenetic alteration characteristics.

  9. Multilevel comparative analysis of the contributions of genome reduction and heat shock to the Escherichia coli transcriptome

    Directory of Open Access Journals (Sweden)

    Ying Bei-Wen

    2013-01-01

    Full Text Available Abstract Background Both large deletions in genome and heat shock stress would lead to alterations in the gene expression profile; however, whether there is any potential linkage between these disturbances to the transcriptome has not been discovered. Here, the relationship between the genomic and environmental contributions to the transcriptome was analyzed by comparing the transcriptomes of the bacterium Escherichia coli (strain MG1655) and its extensive genomic deletion derivative, MDS42, grown in regular and transient heat shock conditions. Results The transcriptome analysis showed the following: (i) there was a reorganization of the transcriptome in accordance with preferred chromosomal periodicity upon genomic or heat shock perturbation; (ii) there was a considerable overlap between the perturbed regulatory networks and the categories enriched for differentially expressed genes (DEGs) following genome reduction and heat shock; (iii) the genes sensitive to genome reduction tended to be located close to genomic scars, and some were also highly responsive to heat shock; and (iv) the genomic and environmental contributions to the transcriptome displayed not only a positive correlation but also a negatively compensated relationship (i.e., antagonistic epistasis). Conclusion The contributions of genome reduction and heat shock to the Escherichia coli transcriptome were evaluated at multiple levels. The observations of overlapping perturbed networks, directional similarity in transcriptional changes, positive correlation and epistatic nature linked the two contributions and suggest somehow a crosstalk guiding transcriptional reorganization in response to both genetic and environmental disturbances in bacterium E. coli.

  10. Whole-genome duplication and molecular evolution in Cornus L. (Cornaceae) – Insights from transcriptome sequences

    Science.gov (United States)

    Yu, Yan; Xiang, Qiuyun; Manos, Paul S.; Soltis, Douglas E.; Soltis, Pamela S.; Song, Bao-Hua; Cheng, Shifeng; Liu, Xin; Wong, Gane

    2017-01-01

    The pattern and rate of genome evolution have profound consequences in organismal evolution. Whole-genome duplication (WGD), or polyploidy, has been recognized as an important evolutionary mechanism of plant diversification. However, in non-model plants the molecular signals of genome duplications have remained largely unexplored. High-throughput transcriptome data from next-generation sequencing have set the stage for novel investigations of genome evolution using new bioinformatic and methodological tools in a phylogenetic framework. Here we compare ten de novo-assembled transcriptomes representing the major lineages of the angiosperm genus Cornus (dogwood) and relevant outgroups using a customized pipeline for analyses. Using three distinct approaches, molecular dating of orthologous genes, analyses of the distribution of synonymous substitutions between paralogous genes, and examination of substitution rates through time, we detected a shared WGD event in the late Cretaceous across all taxa sampled. The inferred doubling event coincides temporally with the paleoclimatic changes associated with the initial divergence of the genus into three major lineages. Analyses also showed an acceleration of rates of molecular evolution after WGD. The highest rates of molecular evolution were observed in the transcriptome of the herbaceous lineage, C. canadensis, a species commonly found at higher latitudes, including the Arctic. Our study demonstrates the value of transcriptome data for understanding genome evolution in closely related species. The results suggest dramatic increase in sea surface temperature in the late Cretaceous may have contributed to the evolution and diversification of flowering plants. PMID:28225773

  11. A genome resource to address mechanisms of developmental programming: determination of the fetal sheep heart transcriptome.

    Science.gov (United States)

    Cox, Laura A; Glenn, Jeremy P; Spradling, Kimberly D; Nijland, Mark J; Garcia, Roy; Nathanielsz, Peter W; Ford, Stephen P

    2012-06-15

    The pregnant sheep has provided seminal insights into reproduction related to animal and human development (ovarian function, fertility, implantation, fetal growth, parturition and lactation). Fetal sheep physiology has been extensively studied since 1950, contributing significantly to the basis for our understanding of many aspects of fetal development and behaviour that remain in use in clinical practice today. Understanding mechanisms requires the combination of systems approaches uniquely available in fetal sheep with the power of genomic studies. Absence of the full range of sheep genomic resources has limited the full realization of the power of this model, impeding progress in emerging areas of pregnancy biology such as developmental programming. We have examined the expressed fetal sheep heart transcriptome using high-throughput sequencing technologies. In so doing we identified 36,737 novel transcripts and describe genes, gene variants and pathways relevant to fundamental developmental mechanisms. Genes with the highest expression levels and with novel exons in the fetal heart transcriptome are known to play central roles in muscle development. We show that high-throughput sequencing methods can generate extensive transcriptome information in the absence of an assembled and annotated genome for that species. The gene sequence data obtained provide a unique genomic resource for sheep specific genetic technology development and, combined with the polymorphism data, augment annotation and assembly of the sheep genome. In addition, identification and pathway analysis of novel fetal sheep heart transcriptome splice variants is a first step towards revealing mechanisms of genetic variation and gene environment interactions during fetal heart development.

  12. Primary analysis of repeat elements of the Asian seabass (Lates calcarifer) transcriptome and genome

    Directory of Open Access Journals (Sweden)

    Inna Kuznetsova

    2014-07-01

    Full Text Available As part of our Asian seabass genome project, we are generating an inventory of repeat elements in the genome and transcriptome. The karyotype showed a diploid number of 2n=24 chromosomes with a variable number of B-chromosomes. The transcriptome and genome of Asian seabass were searched for repetitive elements with experimental and bioinformatics tools. Six different types of repeats constituting 8-14% of the genome were characterized. Repetitive elements were clustered in the pericentromeric heterochromatin of all chromosomes, but some of them were preferentially accumulated in pretelomeric and pericentromeric regions of several chromosome pairs and have chromosome-specific arrangements. From the dispersed class of fish-specific non-LTR retrotransposon elements, Rex1 and MAUI-like repeats were analyzed. They were widespread both in the genome and transcriptome, and accumulated on the pericentromeric and peritelomeric areas of all chromosomes. Every analyzed repeat was represented in the Asian seabass transcriptome, and some showed differential expression between the gonads. The other group of repeats analyzed belongs to the rRNA multigene family. FISH signal for 5S rDNA was located on a single pair of chromosomes, whereas that for 18S rDNA was found on two pairs. A BAC-derived contig containing rDNA was sequenced and assembled into a scaffold containing incomplete fragments of 18S rDNA. Their assembly and chromosomal position revealed that this part of Asian seabass genome is extremely rich in repeats containing evolutionally conserved and novel sequences. In summary, transcriptome assemblies and cDNA data are suitable for the identification of repetitive DNA from unknown genomes and for comparative investigation of conserved elements between teleosts and other vertebrates.

  13. Transcriptome complexity in a genome-reduced bacterium

    DEFF Research Database (Denmark)

    Güell, Marc; van Noort, Vera; Yus, Eva

    2009-01-01

    To study basic principles of transcriptome organization in bacteria, we analyzed one of the smallest self-replicating organisms, Mycoplasma pneumoniae. We combined strand-specific tiling arrays, complemented by transcriptome sequencing, with more than 252 spotted arrays. We detected 117 previously undescribed, mostly noncoding transcripts, 89 of them in antisense configuration to known genes. We identified 341 operons, of which 139 are polycistronic; almost half of the latter show decaying expression in a staircase-like manner. Under various conditions, operons could be divided into 447 smaller transcriptional units, resulting in many alternative transcripts. Frequent antisense transcripts, alternative transcripts, and multiple regulators per gene imply a highly dynamic transcriptome, more similar to that of eukaryotes than previously thought.

  14. Transcriptome complexity in a genome-reduced bacterium.

    Science.gov (United States)

    Güell, Marc; van Noort, Vera; Yus, Eva; Chen, Wei-Hua; Leigh-Bell, Justine; Michalodimitrakis, Konstantinos; Yamada, Takuji; Arumugam, Manimozhiyan; Doerks, Tobias; Kühner, Sebastian; Rode, Michaela; Suyama, Mikita; Schmidt, Sabine; Gavin, Anne-Claude; Bork, Peer; Serrano, Luis

    2009-11-27

    To study basic principles of transcriptome organization in bacteria, we analyzed one of the smallest self-replicating organisms, Mycoplasma pneumoniae. We combined strand-specific tiling arrays, complemented by transcriptome sequencing, with more than 252 spotted arrays. We detected 117 previously undescribed, mostly noncoding transcripts, 89 of them in antisense configuration to known genes. We identified 341 operons, of which 139 are polycistronic; almost half of the latter show decaying expression in a staircase-like manner. Under various conditions, operons could be divided into 447 smaller transcriptional units, resulting in many alternative transcripts. Frequent antisense transcripts, alternative transcripts, and multiple regulators per gene imply a highly dynamic transcriptome, more similar to that of eukaryotes than previously thought.

  15. Tools for the Validation of Genomes and Transcriptomes with Proteomics data

    DEFF Research Database (Denmark)

    Pang, Chi Nam Ignatius; Aya, Carlos; Tay, Aidan

    Aims With the large amount of genomics and proteomics data currently available, there remains a lack of tools to integrate data from these two fields. This project aims to provide a ‘nexus’ for integrating genomics and transcriptomics data generated from next-generation sequencing with proteomics data generated from protein mass spectrometry. We are developing a set of tools which allow users to:
    • Co-visualise genomics, transcriptomics, and proteomics data using the Integrative Genomics Viewer (IGV).
    • Validate the existence of genes and mRNAs using peptides identified from mass spectrometry experiments.
    • Validate alternatively spliced mRNA isoforms by searching for peptides that span across exon-exon junctions.

  16. [Genomics and transcriptomics of the Chinese liver fluke Clonorchis sinensis (Opisthorchiidae, Trematoda)].

    Science.gov (United States)

    Chelomina, G N

    2017-01-01

    The review summarizes the results of first genomic and transcriptomic investigations of the liver fluke Clonorchis sinensis (Opisthorchiidae, Trematoda). The studies mark the dawn of the genomic era for opisthorchiids, which cause severe hepatobiliary diseases in humans and animals. Their results aided in understanding the molecular mechanisms of adaptation to parasitism, parasite survival in mammalian biliary tracts, and genome dynamics in the individual development and the development of parasite-host relationships. Special attention is paid to the achievements in studying the codon usage bias and the roles of mobile genetic elements (MGEs) and small interfering RNAs (siRNAs). Interspecific comparisons at the genomic and transcriptomic levels revealed molecular differences, which may contribute to understanding the specialized niches and physiological needs of the respective species. The studies in C. sinensis provide a basis for further basic and applied research in liver flukes and, in particular, the development of efficient means to prevent, diagnose, and treat clonorchiasis.

  17. Transcriptome and genome size analysis of the venus flytrap

    DEFF Research Database (Denmark)

    Jensen, Michael Krogh; Vogt, Josef Korbinian; Bressendorff, Simon

    2015-01-01

    . muscipula flowers and traps. Using the Oases transcriptome assembler 79,165,657 quality trimmed reads were assembled into 80,806 cDNA contigs, with an average length of 679 bp and an N50 length of 1,051 bp. A total of 17,047 unique proteins were identified, and assigned to Gene Ontology (GO) and classified...

  18. VESPA: Software to Facilitate Genomic Annotation of Prokaryotic Organisms Through Integration of Proteomic and Transcriptomic Data

    Energy Technology Data Exchange (ETDEWEB)

    Peterson, Elena S.; McCue, Lee Ann; Rutledge, Alexandra C.; Jensen, Jeffrey L.; Walker, Julia; Kobold, Mark A.; Webb, Samantha R.; Payne, Samuel H.; Ansong, Charles; Adkins, Joshua N.; Cannon, William R.; Webb-Robertson, Bobbie-Jo M.

    2012-04-25

    Visual Exploration and Statistics to Promote Annotation (VESPA) is an interactive visual analysis software tool that facilitates the discovery of structural mis-annotations in prokaryotic genomes. VESPA integrates high-throughput peptide-centric proteomics data and oligo-centric or RNA-Seq transcriptomics data into a genomic context. The data may be interrogated via visual analysis across multiple levels of genomic resolution, linked searches, exports and interaction with BLAST to rapidly identify location of interest within the genome and evaluate potential mis-annotations.

  19. De novo Transcriptome Assemblies of Rana (Lithobates) catesbeiana and Xenopus laevis Tadpole Livers for Comparative Genomics without Reference Genomes.

    Directory of Open Access Journals (Sweden)

    Inanc Birol

    Full Text Available In this work we studied the liver transcriptomes of two frog species, the American bullfrog (Rana (Lithobates) catesbeiana) and the African clawed frog (Xenopus laevis). We used high throughput RNA sequencing (RNA-seq) data to assemble and annotate these transcriptomes, and compared how their baseline expression profiles change when tadpoles of the two species are exposed to thyroid hormone. We generated more than 1.5 billion RNA-seq reads in total for the two species under two conditions as treatment/control pairs. We de novo assembled these reads using Trans-ABySS to reconstruct reference transcriptomes, obtaining over 350,000 and 130,000 putative transcripts for R. catesbeiana and X. laevis, respectively. Using available genomics resources for X. laevis, we annotated over 97% of our X. laevis transcriptome contigs, demonstrating the utility and efficacy of our methodology. Leveraging this validated analysis pipeline, we also annotated the assembled R. catesbeiana transcriptome. We used the expression profiles of the annotated genes of the two species to examine the similarities and differences between the tadpole liver transcriptomes. We also compared the gene ontology terms of expressed genes to measure how the animals react to a challenge by thyroid hormone. Our study reports three main conclusions. First, de novo assembly of RNA-seq data is a powerful method for annotating and establishing transcriptomes of non-model organisms. Second, the liver transcriptomes of the two frog species, R. catesbeiana and X. laevis, show many common features, and the distribution of their gene ontology profiles are statistically indistinguishable. Third, although they broadly respond the same way to the presence of thyroid hormone in their environment, their receptor/signal transduction pathways display marked differences.

  20. Breast cancer genome and transcriptome integration implicates specific mutational signatures with immune cell infiltration

    NARCIS (Netherlands)

    M. Smid (Marcel); F.G. Rodriguez-Gonzalez (F. German); A.M. Sieuwerts (Anieta); R. Salgado (Roberto); W.J.C. Prager-van der Smissen (Wendy); Vlugt-Daane, M.V.D. (Michelle Van Der); A. van Galen (Anne); S. Nik-Zainal (Serena); J. Staaf (Johan); A.B. Brinkman (Arie B.); M.J. Vijver (Marc ); A.L. Richardson (Andrea); A. Fatima (Aquila); Berentsen, K. (Kim); A. Butler (Adam); S. Martin (Sandra); H. Davies (Helen); J.E.M.A. Debets (Reno); M.E.M.-V. Gelder (Marion E. Meijer-Van); C.H.M. van Deurzen (Carolien); Macgrogan, G. (Gaëtan); Van Den Eynden, G.G.G.M. (Gert G. G. M.); C.A. Purdie (Colin A.); A.M. Thompson (Alastair M.); C. Caldas (Carlos); P.N. Span (Paul); Simpson, P.T. (Peter T.); S. Lakhani (Sunil); S.J. van Laere (Steven); C. Desmedt (Christine); Ringnér, M. (Markus); Tommasi, S. (Stefania); Eyford, J. (Jorunn); A. Broeks (Annegien); A. Vincent-Salomon (Anne); Futreal, P.A. (P. Andrew); S. Knappskog (Stian); King, T. (Tari); G. Thomas (Gilles); Viari, A. (Alain); Langerød, A. (Anita); A.-L. Borresen-Dale (Anne-Lise); E. Birney (Ewan); H. Stunnenberg (Henk); M.R. Stratton (Michael); J.A. Foekens (John); J.W.M. Martens (John)

    2016-01-01

    A recent comprehensive whole genome analysis of a large breast cancer cohort was used to link known and novel drivers and substitution signatures to the transcriptome of 266 cases. Here, we validate that subtype-specific aberrations show concordant expression changes for, for example, TP

  1. Breast cancer genome and transcriptome integration implicates specific mutational signatures with immune cell infiltration

    NARCIS (Netherlands)

    Smid, M.; Rodriguez-Gonzalez, F.G.; Sieuwerts, A.M.; Salgado, R.; Smissen, W.J. Prager-Van der; Vlugt-Daane, M.V.; Galen, A. van; Nik-Zainal, S.; Staaf, J.; Brinkman, A.B.; Vijver, M.J. van de; Richardson, A.L.; Fatima, A.; Berentsen, K.; Butler, A.; Martin, S.; Davies, H.R.; Debets, R.; Gelder, M.E. Meijer-van; Deurzen, C.H. van; MacGrogan, G.; Eynden, G.G. Van den; Purdie, C.; Thompson, A.M.; Caldas, C.; Span, P.N; Simpson, P.T.; Lakhani, S.R.; Laere, S. van; Desmedt, C.; Ringner, M.; Tommasi, S.; Eyford, J.; Broeks, A.; Vincent-Salomon, A.; Futreal, P.A.; Knappskog, S.; King, T.; Thomas, G; Viari, A.; Langerod, A.; Borresen-Dale, A.L.; Birney, E.; Stunnenberg, H.G.; Stratton, M.; Foekens, J.A.; Martens, J.W.M.

    2016-01-01

    A recent comprehensive whole genome analysis of a large breast cancer cohort was used to link known and novel drivers and substitution signatures to the transcriptome of 266 cases. Here, we validate that subtype-specific aberrations show concordant expression changes for, for example, TP53, PIK3CA,

  2. Comparative transcriptome analyses and genome assembly of Fusarium oxysporum f. sp. cubense

    NARCIS (Netherlands)

    Dita, M.A.; Herai, R.; Waalwijk, C.; Yamagishi, M.; Giachetto, P.; Ferreira, G.; Souza, de M.; Kema, G.H.J.

    2013-01-01

    Fusarium oxysporum f. sp. cubense (Foc), the causal agent of Fusarium wilt of banana, is a highly destructive and genetically diverse pathogen. Despite its economic importance, genomic information about Foc is limited and no transcriptomic analyses have been reported so far. By using 454 sequencing

  3. Comparative genomics and transcriptomics of lineages I, II, and III strains of Listeria monocytogenes

    Science.gov (United States)

    2012-01-01

    Background Listeria monocytogenes is a food-borne pathogen that causes infections with a high-mortality rate and has served as an invaluable model for intracellular parasitism. Here, we report complete genome sequences for two L. monocytogenes strains belonging to serotype 4a (L99) and 4b (CLIP80459), and transcriptomes of representative strains from lineages I, II, and III, thereby permitting in-depth comparison of genome- and transcriptome-based data from three lineages of L. monocytogenes. Lineage III, represented by the 4a L99 genome is known to contain strains less virulent for humans. Results The genome analysis of the weakly pathogenic L99 serotype 4a provides extensive evidence of virulence gene decay, including loss of several important surface proteins. The 4b CLIP80459 genome, unlike the previously sequenced 4b F2365 genome harbours an intact inlB invasion gene. These lineage I strains are characterized by the lack of prophage genes, as they share only a single prophage locus with other L. monocytogenes genomes 1/2a EGD-e and 4a L99. Comparative transcriptome analysis during intracellular growth uncovered adaptive expression level differences in lineages I, II and III of Listeria, notable amongst which was a strong intracellular induction of flagellar genes in strain 4a L99 compared to the other lineages. Furthermore, extensive differences between strains are manifest at levels of metabolic flux control and phosphorylated sugar uptake. Intriguingly, prophage gene expression was found to be a hallmark of intracellular gene expression. Deletion mutants in the single shared prophage locus of lineage II strain EGD-e 1/2a, the lma operon, revealed severe attenuation of virulence in a murine infection model. Conclusion Comparative genomics and transcriptome analysis of L. monocytogenes strains from three lineages implicate prophage genes in intracellular adaptation and indicate that gene loss and decay may have led to the emergence of attenuated lineages

  4. Comparative genomics and transcriptomics of lineages I, II, and III strains of Listeria monocytogenes

    Directory of Open Access Journals (Sweden)

    Hain Torsten

    2012-04-01

    Full Text Available Abstract Background Listeria monocytogenes is a food-borne pathogen that causes infections with a high-mortality rate and has served as an invaluable model for intracellular parasitism. Here, we report complete genome sequences for two L. monocytogenes strains belonging to serotype 4a (L99) and 4b (CLIP80459), and transcriptomes of representative strains from lineages I, II, and III, thereby permitting in-depth comparison of genome- and transcriptome-based data from three lineages of L. monocytogenes. Lineage III, represented by the 4a L99 genome is known to contain strains less virulent for humans. Results The genome analysis of the weakly pathogenic L99 serotype 4a provides extensive evidence of virulence gene decay, including loss of several important surface proteins. The 4b CLIP80459 genome, unlike the previously sequenced 4b F2365 genome harbours an intact inlB invasion gene. These lineage I strains are characterized by the lack of prophage genes, as they share only a single prophage locus with other L. monocytogenes genomes 1/2a EGD-e and 4a L99. Comparative transcriptome analysis during intracellular growth uncovered adaptive expression level differences in lineages I, II and III of Listeria, notable amongst which was a strong intracellular induction of flagellar genes in strain 4a L99 compared to the other lineages. Furthermore, extensive differences between strains are manifest at levels of metabolic flux control and phosphorylated sugar uptake. Intriguingly, prophage gene expression was found to be a hallmark of intracellular gene expression. Deletion mutants in the single shared prophage locus of lineage II strain EGD-e 1/2a, the lma operon, revealed severe attenuation of virulence in a murine infection model. Conclusion Comparative genomics and transcriptome analysis of L. monocytogenes strains from three lineages implicate prophage genes in intracellular adaptation and indicate that gene loss and decay may have led to the emergence

  5. SIGMA2: A system for the integrative genomic multi-dimensional analysis of cancer genomes, epigenomes, and transcriptomes

    Directory of Open Access Journals (Sweden)

    MacAulay Calum

    2008-10-01

    Full Text Available Abstract Background High throughput microarray technologies have afforded the investigation of genomes, epigenomes, and transcriptomes at unprecedented resolution. However, software packages to handle, analyze, and visualize data from these multiple 'omics disciplines have not been adequately developed. Results Here, we present SIGMA2, a system for the integrative genomic multi-dimensional analysis of cancer genomes, epigenomes, and transcriptomes. Multi-dimensional datasets can be simultaneously visualized and analyzed with respect to each dimension, allowing combinatorial integration of the different assays belonging to the different 'omics. Conclusion The identification of genes altered at multiple levels such as copy number, loss of heterozygosity (LOH), DNA methylation and the detection of consequential changes in gene expression can be concertedly performed, establishing SIGMA2 as a novel tool to facilitate the high throughput systems biology analysis of cancer.

  6. Genomic resources for a model in adaptation and speciation research: characterization of the Poecilia mexicana transcriptome

    Directory of Open Access Journals (Sweden)

    Kelley Joanna L

    2012-11-01

    Full Text Available Abstract Background Elucidating the genomic basis of adaptation and speciation is a major challenge in natural systems with large quantities of environmental and phenotypic data, mostly because of the scarcity of genomic resources for non-model organisms. The Atlantic molly (Poecilia mexicana, Poeciliidae) is a small livebearing fish that has been extensively studied for evolutionary ecology research, particularly because this species has repeatedly colonized extreme environments in the form of caves and toxic hydrogen sulfide containing springs. In such extreme environments, populations show strong patterns of adaptive trait divergence and the emergence of reproductive isolation. Here, we used RNA-sequencing to assemble and annotate the first transcriptome of P. mexicana to facilitate ecological genomics studies in the future and aid the identification of genes underlying adaptation and speciation in the system. Description We provide the first annotated reference transcriptome of P. mexicana. Our transcriptome shows high congruence with other published fish transcriptomes, including that of the guppy, medaka, zebrafish, and stickleback. Transcriptome annotation uncovered the presence of candidate genes relevant in the study of adaptation to extreme environments. We describe general and oxidative stress response genes as well as genes involved in pathways induced by hypoxia or involved in sulfide metabolism. To facilitate future comparative analyses, we also conducted quantitative comparisons between P. mexicana from different river drainages. 106,524 single nucleotide polymorphisms were detected in our dataset, including potential markers that are putatively fixed across drainages. Furthermore, specimens from different drainages exhibited some consistent differences in gene regulation. Conclusions Our study provides a valuable genomic resource to study the molecular underpinnings of adaptation to extreme environments in replicated sulfide

  7. Next generation transcriptomics and genomics elucidate biological complexity of microglia in health and disease.

    Science.gov (United States)

    Wes, Paul D; Holtman, Inge R; Boddeke, Erik W G M; Möller, Thomas; Eggen, Bart J L

    2016-02-01

    Genome-wide expression profiling technology has resulted in detailed transcriptome data for a wide range of tissues, conditions and diseases. In neuroscience, expression datasets were mostly generated using whole brain tissue samples, resulting in data from a mixture of cell types, including glial cells and neurons. Over the past few years, a rapidly increasing number of expression profiling studies using isolated microglial cell populations have been reported. In these studies, the microglia transcriptome was compared to other cell types, such as other brain cells and peripheral tissue macrophages, and related to aging and neurodegenerative conditions. A commonality found in many of these studies was that microglia possess distinct gene expression signatures. This repertoire of selectively-expressed microglial genes highlights functions beyond immune responses, such as synaptic modulation and neurotrophic support, and opens up avenues to explore as-yet-unexpected roles. These data provide improved understanding of disease pathology, and complement not only the aforementioned whole brain tissue transcriptome studies, but also genome- and epigenome-wide association studies. In this review, insights obtained from isolated microglia transcriptome studies are presented, and compared to studies using other genome-wide approaches. The relation of microglia to other tissue macrophages and glial cell populations, as well as the role of microglia in the aging brain and in neurodegenerative conditions, will be discussed. Many more of these types of studies are expected in the near future, hopefully leading to the identification of novel genes and targets for neurodegenerative conditions.

  8. CoryneCenter – An online resource for the integrated analysis of corynebacterial genome and transcriptome data

    Directory of Open Access Journals (Sweden)

    Hüser Andrea T

    2007-11-01

    Background: The introduction of high-throughput genome sequencing and post-genome analysis technologies, e.g. DNA microarray approaches, has created the potential to unravel and scrutinize complex gene-regulatory networks on a large scale. The discovery of transcriptional regulatory interactions has become a major topic in modern functional genomics. Results: To facilitate the analysis of gene-regulatory networks, we have developed CoryneCenter, a web-based resource for the systematic integration and analysis of genome, transcriptome, and gene regulatory information for prokaryotes, especially corynebacteria. For this purpose, we extended and combined the following systems into a common platform: (1) GenDB, an open source genome annotation system, (2) EMMA, a MAGE compliant application for high-throughput transcriptome data storage and analysis, and (3) CoryneRegNet, an ontology-based data warehouse designed to facilitate the reconstruction and analysis of gene regulatory interactions. We demonstrate the potential of CoryneCenter by means of an application example. Using microarray hybridization data, we compare the gene expression of Corynebacterium glutamicum under acetate and glucose feeding conditions: Known regulatory networks are confirmed, but moreover CoryneCenter points out additional regulatory interactions. Conclusion: CoryneCenter provides more than the sum of its parts. Its novel analysis and visualization features significantly simplify the process of obtaining new biological insights into complex regulatory systems. Although the platform currently focusses on corynebacteria, the integrated tools are by no means restricted to these species, and the presented approach offers a general strategy for the analysis and verification of gene regulatory networks. CoryneCenter provides freely accessible projects with the underlying genome annotation, gene expression, and gene regulation data. The system is publicly available at http://www.CoryneCenter.de.

  9. CaPSID: A bioinformatics platform for computational pathogen sequence identification in human genomes and transcriptomes

    Directory of Open Access Journals (Sweden)

    Borozan Ivan

    2012-08-01

    Background: It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opportunity for pathogen detection and discovery in human tissues but requires development of new genome-wide bioinformatics tools. Results: Here we present CaPSID (Computational Pathogen Sequence IDentification), a comprehensive bioinformatics platform for identifying, querying and visualizing both exogenous and endogenous pathogen nucleotide sequences in tumor genomes and transcriptomes. CaPSID includes a scalable, high performance database for data storage and a web application that integrates the genome browser JBrowse. CaPSID also provides useful metrics for sequence analysis of pre-aligned BAM files, such as gene and genome coverage, and is optimized to run efficiently on multiprocessor computers with low memory usage. Conclusions: To demonstrate the usefulness and efficiency of CaPSID, we carried out a comprehensive analysis of both a simulated dataset and transcriptome samples from ovarian cancer. CaPSID correctly identified all of the human and pathogen sequences in the simulated dataset, while in the ovarian dataset CaPSID’s predictions were successfully validated in vitro.

  10. A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica

    Science.gov (United States)

    2012-01-01

    Background: The Azadirachta indica (neem) tree is a source of a wide number of natural products, including the potent biopesticide azadirachtin. In spite of its widespread applications in agriculture and medicine, the molecular aspects of the biosynthesis of neem terpenoids remain largely unexplored. The current report describes the draft genome and four transcriptomes of A. indica and attempts to contextualise the sequence information in terms of its molecular phylogeny, transcript expression and terpenoid biosynthesis pathways. A. indica is the first member of the family Meliaceae to be sequenced using a next generation sequencing approach. Results: The genome and transcriptomes of A. indica were sequenced using multiple sequencing platforms and libraries. The A. indica genome is AT-rich, bears few repetitive DNA elements and comprises about 20,000 genes. The molecular phylogenetic analyses grouped A. indica together with Citrus sinensis from the Rutaceae family validating its conventional taxonomic classification. Comparative transcript expression analysis showed either exclusive or enhanced expression of known genes involved in neem terpenoid biosynthesis pathways compared to other sequenced angiosperms. Genome and transcriptome analyses in A. indica led to the identification of repeat elements, nucleotide composition and expression profiles of genes in various organs. Conclusions: This study on A. indica genome and transcriptomes will provide a model for characterization of metabolic pathways involved in synthesis of bioactive compounds, comparative evolutionary studies among various Meliaceae family members and help annotate their genomes. A better understanding of molecular pathways involved in the azadirachtin synthesis in A. indica will pave ways for bulk production of environment friendly biopesticides. PMID:22958331

  11. A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica

    Directory of Open Access Journals (Sweden)

    Krishnan Neeraja M

    2012-09-01

    Background: The Azadirachta indica (neem) tree is a source of a wide number of natural products, including the potent biopesticide azadirachtin. In spite of its widespread applications in agriculture and medicine, the molecular aspects of the biosynthesis of neem terpenoids remain largely unexplored. The current report describes the draft genome and four transcriptomes of A. indica and attempts to contextualise the sequence information in terms of its molecular phylogeny, transcript expression and terpenoid biosynthesis pathways. A. indica is the first member of the family Meliaceae to be sequenced using a next generation sequencing approach. Results: The genome and transcriptomes of A. indica were sequenced using multiple sequencing platforms and libraries. The A. indica genome is AT-rich, bears few repetitive DNA elements and comprises about 20,000 genes. The molecular phylogenetic analyses grouped A. indica together with Citrus sinensis from the Rutaceae family validating its conventional taxonomic classification. Comparative transcript expression analysis showed either exclusive or enhanced expression of known genes involved in neem terpenoid biosynthesis pathways compared to other sequenced angiosperms. Genome and transcriptome analyses in A. indica led to the identification of repeat elements, nucleotide composition and expression profiles of genes in various organs. Conclusions: This study on A. indica genome and transcriptomes will provide a model for characterization of metabolic pathways involved in synthesis of bioactive compounds, comparative evolutionary studies among various Meliaceae family members and help annotate their genomes. A better understanding of molecular pathways involved in the azadirachtin synthesis in A. indica will pave ways for bulk production of environment friendly biopesticides.

  12. The genome and transcriptome of perennial ryegrass mitochondria

    DEFF Research Database (Denmark)

    Islam, Md. Shofiqul; Studer, Bruno; Byrne, Stephen

    2013-01-01

    and annotation of the complete mitochondrial genome from perennial ryegrass. Results: Intact mitochondria from perennial ryegrass leaves were isolated and used for mtDNA extraction. The mitochondrial genome was sequenced to a 167-fold coverage using the Roche 454 GS-FLX Titanium platform, and assembled...... of the mitochondrial genome from perennial ryegrass presented here constitutes an important tool for future attempts to compare mitochondrial genomes within and between grass species. Our results also demonstrate that mitochondria of perennial ryegrass contain genes crucial for energy production that are well...

  13. VESPA: software to facilitate genomic annotation of prokaryotic organisms through integration of proteomic and transcriptomic data

    Directory of Open Access Journals (Sweden)

    Peterson Elena S

    2012-04-01

    Background: The procedural aspects of genome sequencing and assembly have become relatively inexpensive, yet the full, accurate structural annotation of these genomes remains a challenge. Next-generation sequencing transcriptomics (RNA-Seq), global microarrays, and tandem mass spectrometry (MS/MS)-based proteomics have demonstrated immense value to genome curators as individual sources of information; however, integrating these data types to validate and improve structural annotation remains a major challenge. Current visual and statistical analytic tools are focused on a single data type, or existing software tools are retrofitted to analyze new data forms. We present Visual Exploration and Statistics to Promote Annotation (VESPA), a new interactive visual analysis software tool focused on assisting scientists with the annotation of prokaryotic genomes through the integration of proteomics and transcriptomics data with current genome location coordinates. Results: VESPA is a desktop Java™ application that integrates high-throughput proteomics data (peptide-centric) and transcriptomics (probe or RNA-Seq) data into a genomic context, all of which can be visualized at three levels of genomic resolution. Data is interrogated via searches linked to the genome visualizations to find regions with high likelihood of mis-annotation. Search results are linked to exports for further validation outside of VESPA or potential coding-regions can be analyzed concurrently with the software through interaction with BLAST. VESPA is demonstrated on two use cases (Yersinia pestis Pestoides F and Synechococcus sp. PCC 7002) to demonstrate the rapid manner in which mis-annotations can be found and explored in VESPA using either proteomics data alone, or in combination with transcriptomic data. Conclusions: VESPA is an interactive visual analytics tool that integrates high-throughput data into a genomic context to facilitate the discovery of structural mis

  14. A new RNASeq-based reference transcriptome for sugar beet and its application in transcriptome-scale analysis of vernalization and gibberellin responses

    Science.gov (United States)

    2012-01-01

    Background: Sugar beet (Beta vulgaris sp. vulgaris) crops account for about 30% of world sugar. Sugar yield is compromised by reproductive growth hence crops must remain vegetative until harvest. Prolonged exposure to cold temperature (vernalization) in the range 6°C to 12°C induces reproductive growth, leading to bolting (rapid elongation of the main stem) and flowering. Spring cultivation of crops in cool temperate climates makes them vulnerable to vernalization and hence bolting, which is initiated in the apical shoot meristem in processes involving interaction between gibberellin (GA) hormones and vernalization. The underlying mechanisms are unknown and genome scale next generation sequencing approaches now offer comprehensive strategies to investigate them; enabling the identification of novel targets for bolting control in sugar beet crops. In this study, we demonstrate the application of an mRNA-Seq based strategy for this purpose. Results: There is no sugar beet reference genome, or public expression array platforms. We therefore used RNA-Seq to generate the first reference transcriptome. We next performed digital gene expression profiling using shoot apex mRNA from two sugar beet cultivars with and without applied GA, and also a vernalized cultivar with and without applied GA. Subsequent bioinformatics analyses identified transcriptional changes associated with genotypic difference and experimental treatments. Analysis of expression profiles in response to vernalization and GA treatment suggested previously unsuspected roles for a RAV1-like AP2/B3 domain protein in vernalization and efflux transporters in the GA response. Conclusions: Next generation RNA-Seq enabled the generation of the first reference transcriptome for sugar beet and the study of global transcriptional responses in the shoot apex to vernalization and GA treatment, without the need for a reference genome or established array platforms. Comprehensive bioinformatic analysis identified

  15. The channel catfish genome sequence provides insights into the evolution of scale formation in teleosts

    Science.gov (United States)

    Liu, Zhanjiang; Liu, Shikai; Yao, Jun; Bao, Lisui; Zhang, Jiaren; Li, Yun; Jiang, Chen; Sun, Luyang; Wang, Ruijia; Zhang, Yu; Zhou, Tao; Zeng, Qifan; Fu, Qiang; Gao, Sen; Li, Ning; Koren, Sergey; Jiang, Yanliang; Zimin, Aleksey; Xu, Peng; Phillippy, Adam M.; Geng, Xin; Song, Lin; Sun, Fanyue; Li, Chao; Wang, Xiaozhu; Chen, Ailu; Jin, Yulin; Yuan, Zihao; Yang, Yujia; Tan, Suxu; Peatman, Eric; Lu, Jianguo; Qin, Zhenkui; Dunham, Rex; Li, Zhaoxia; Sonstegard, Tad; Feng, Jianbin; Danzmann, Roy G.; Schroeder, Steven; Scheffler, Brian; Duke, Mary V.; Ballard, Linda; Kucuktas, Huseyin; Kaltenboeck, Ludmilla; Liu, Haixia; Armbruster, Jonathan; Xie, Yangjie; Kirby, Mona L.; Tian, Yi; Flanagan, Mary Elizabeth; Mu, Weijie; Waldbieser, Geoffrey C.

    2016-01-01

    Catfish represent 12% of teleost or 6.3% of all vertebrate species, and are of enormous economic value. Here we report a high-quality reference genome sequence of channel catfish (Ictalurus punctatus), the major aquaculture species in the US. The reference genome sequence was validated by genetic mapping of 54,000 SNPs, and annotated with 26,661 predicted protein-coding genes. Through comparative analysis of genomes and transcriptomes of scaled and scaleless fish and scale regeneration experiments, we address the genomic basis for the most striking physical characteristic of catfish, the evolutionary loss of scales and provide evidence that lack of secretory calcium-binding phosphoproteins accounts for the evolutionary loss of scales in catfish. The channel catfish reference genome sequence, along with two additional genome sequences and transcriptomes of scaled catfishes, provide crucial resources for evolutionary and biological studies. This work also demonstrates the power of comparative subtraction of candidate genes for traits of structural significance. PMID:27249958

  16. Genome reannotation of the lizard Anolis carolinensis based on 14 adult and embryonic deep transcriptomes

    Directory of Open Access Journals (Sweden)

    Eckalbar Walter L

    2013-01-01

    Background: The green anole lizard, Anolis carolinensis, is a key species for both laboratory and field-based studies of evolutionary genetics, development, neurobiology, physiology, behavior, and ecology. As the first non-avian reptilian genome sequenced, A. carolinensis is also a prime reptilian model for comparison with other vertebrate genomes. The public databases of Ensembl and NCBI have provided a first generation gene annotation of the anole genome that relies primarily on sequence conservation with related species. A second generation annotation based on tissue-specific transcriptomes would provide a valuable resource for molecular studies. Results: Here we provide an annotation of the A. carolinensis genome based on de novo assembly of deep transcriptomes of 14 adult and embryonic tissues. This revised annotation describes 59,373 transcripts, compared to 16,533 and 18,939 currently for Ensembl and NCBI, and 22,962 predicted protein-coding genes. A key improvement in this revised annotation is coverage of untranslated region (UTR) sequences, with 79% and 59% of transcripts containing 5’ and 3’ UTRs, respectively. Gaps in genome sequence from the current A. carolinensis build (Anocar2.0) are highlighted by our identification of 16,542 unmapped transcripts, representing 6,695 orthologues, with less than 70% genomic coverage. Conclusions: Incorporation of tissue-specific transcriptome sequence into the A. carolinensis genome annotation has markedly improved its utility for comparative and functional studies. Increased UTR coverage allows for more accurate predicted protein sequence and regulatory analysis. This revised annotation also provides an atlas of gene expression specific to adult and embryonic tissues.

  17. Large-scale transcriptome analysis of retroelements in the migratory locust, Locusta migratoria.

    Directory of Open Access Journals (Sweden)

    Feng Jiang

    BACKGROUND: Retroelements can successfully colonize eukaryotic genomes through RNA-mediated transposition, and are considered to be some of the major mediators of genome size. The migratory locust Locusta migratoria is an insect with a large genome size, and its genome is probably subject to the proliferation of retroelements. An analysis of deep-sequencing transcriptome data will elucidate the structure, diversity and expression characteristics of retroelements. RESULTS: We performed a de novo assembly from deep sequencing RNA-seq data and identified 105 retroelements in the locust transcriptome. Phylogenetic analysis of reverse transcriptase sequences revealed 1 copia, 1 BEL, 8 gypsy and 23 non-long terminal repeat (LTR) retroelements in the locust transcriptome. A novel approach was developed to identify full-length LTR retroelements. A total of 5 full-length LTR retroelements and 2 full-length non-LTR retroelements that contained complete structures for retrotransposition were identified. Structural analysis indicated that all these retroelements may have been activated or deprived of retrotransposition activities very recently. Expression profiling analysis revealed that the retroelements exhibited a unique expression pattern at the egg stage and showed differential expression profiles between the solitarious and gregarious phases at the fifth instar and adult stage. CONCLUSION: We hereby present the first de novo transcriptome analysis of retroelements in a species whose genome is not available. This work contributes to a comprehensive understanding of the landscape of retroelements in the locust transcriptome. More importantly, the results reveal that non-LTR retroelements are abundant and diverse in the locust transcriptome.

  18. Genome scale engineering techniques for metabolic engineering.

    Science.gov (United States)

    Liu, Rongming; Bassalo, Marcelo C; Zeitoun, Ramsey I; Gill, Ryan T

    2015-11-01

    Metabolic engineering has expanded from a focus on designs requiring a small number of genetic modifications to increasingly complex designs driven by advances in genome-scale engineering technologies. Metabolic engineering has been generally defined by the use of iterative cycles of rational genome modifications, strain analysis and characterization, and a synthesis step that fuels additional hypothesis generation. This cycle mirrors the Design-Build-Test-Learn cycle followed throughout various engineering fields that has recently become a defining aspect of synthetic biology. This review will attempt to summarize recent genome-scale design, build, test, and learn technologies and relate their use to a range of metabolic engineering applications.

  19. CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics.

    Directory of Open Access Journals (Sweden)

    Mohit Verma

    Chickpea is an important grain legume used as a rich source of protein in human diet. The narrow genetic diversity and limited availability of genomic resources are the major constraints in implementing breeding strategies and biotechnological interventions for genetic enhancement of chickpea. We developed an integrated Chickpea Transcriptome Database (CTDB), which provides the comprehensive web interface for visualization and easy retrieval of transcriptome data in chickpea. The database features many tools for similarity search, functional annotation (putative function, PFAM domain and gene ontology) search and comparative gene expression analysis. The current release of CTDB (v2.0) hosts transcriptome datasets with high quality functional annotation from cultivated (desi and kabuli types) and wild chickpea. A catalog of transcription factor families and their expression profiles in chickpea are available in the database. The gene expression data have been integrated to study the expression profiles of chickpea transcripts in major tissues/organs and various stages of flower development. The utilities, such as similarity search, ortholog identification and comparative gene expression have also been implemented in the database to facilitate comparative genomic studies among different legumes and Arabidopsis. Furthermore, the CTDB represents a resource for the discovery of functional molecular markers (microsatellites and single nucleotide polymorphisms) between different chickpea types. We anticipate that integrated information content of this database will accelerate the functional and applied genomic research for improvement of chickpea. The CTDB web service is freely available at http://nipgr.res.in/ctdb.html.

  20. The Draft Genome and Transcriptome of the Atlantic Horseshoe Crab, Limulus polyphemus

    Directory of Open Access Journals (Sweden)

    Stephen D. Simpson

    2017-01-01

    The horseshoe crab, Limulus polyphemus, exhibits robust circadian and circatidal rhythms, but little is known about the molecular mechanisms underlying those rhythms. In this study, horseshoe crabs were collected during the day and night as well as high and low tides, and their muscle and central nervous system tissues were processed for genome and transcriptome sequencing, respectively. The genome assembly resulted in 7.4 × 10^5 contigs with N50 of 4,736, while the transcriptome assembly resulted in 9.3 × 10^4 contigs and N50 of 3,497. Analysis of functional completeness by the identification of putative universal orthologs suggests that the transcriptome has three times more total expected orthologs than the genome. Interestingly, RNA-Seq analysis indicated no statistically significant changes in expression level for any circadian core or accessory gene, but there was significant cycling of several noncircadian transcripts. Overall, these assemblies provide a resource to investigate the Limulus clock systems and provide a large dataset for further exploration into the taxonomy and biology of the Atlantic horseshoe crab.
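
    The N50 values quoted in this abstract summarize assembly contiguity: N50 is the contig length at which contigs of that length or longer together cover at least half of the total assembly. The following is a minimal sketch of that standard definition, not the authors' own pipeline; the function name and the toy contig lengths are illustrative assumptions.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths (in bp).

    Standard definition: sort contigs from longest to shortest and walk
    down the list until the running sum reaches half of the total length;
    the length of the contig reached at that point is the N50.
    """
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("cannot compute N50 of an empty assembly")


# Toy example with made-up contig lengths (not data from the study).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_lengths)
```

    Because N50 depends only on the length distribution, two assemblies with very different contig counts can share the same N50, which is why the abstract reports both figures.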

  1. The Draft Genome and Transcriptome of the Atlantic Horseshoe Crab, Limulus polyphemus

    Science.gov (United States)

    Ramsdell, Jordan S.; Watson III, Winsor H.; Chabot, Christopher C.

    2017-01-01

    The horseshoe crab, Limulus polyphemus, exhibits robust circadian and circatidal rhythms, but little is known about the molecular mechanisms underlying those rhythms. In this study, horseshoe crabs were collected during the day and night as well as high and low tides, and their muscle and central nervous system tissues were processed for genome and transcriptome sequencing, respectively. The genome assembly resulted in 7.4 × 10^5 contigs with N50 of 4,736, while the transcriptome assembly resulted in 9.3 × 10^4 contigs and N50 of 3,497. Analysis of functional completeness by the identification of putative universal orthologs suggests that the transcriptome has three times more total expected orthologs than the genome. Interestingly, RNA-Seq analysis indicated no statistically significant changes in expression level for any circadian core or accessory gene, but there was significant cycling of several noncircadian transcripts. Overall, these assemblies provide a resource to investigate the Limulus clock systems and provide a large dataset for further exploration into the taxonomy and biology of the Atlantic horseshoe crab. PMID:28265565

  2. Comparative transcriptome and chloroplast genome analyses of two related Dipteronia species

    Directory of Open Access Journals (Sweden)

    Tao Zhou

    2016-10-01

    Dipteronia (order Sapindales) is an endangered genus endemic to China and has two living species, D. sinensis and D. dyeriana. The plants are closely related to the genus Acer, which is also classified in the order Sapindales. Evolutionary studies on Dipteronia have been hindered by the paucity of information on their genomes and plastids. Here, we used next generation sequencing to characterize the transcriptomes and complete chloroplast genomes of both Dipteronia species. A comparison of the transcriptomes of both species identified a total of 7,814 orthologs. Estimation of selection pressures using Ka/Ks ratios showed that only 30 of 5,435 orthologous pairs had a ratio significantly greater than 1, i.e., showing positive selection. However, 4,041 orthologs had a Ka/Ks < 0.5 (p < 0.05), suggesting that most genes had likely undergone purifying selection. Based on orthologous unigenes, 314 single copy nuclear genes were identified. Through a combination of de novo and reference guided assembly, plastid genomes were obtained; that of D. sinensis was 157,080 bp and that of D. dyeriana was 157,071 bp. Both plastid genomes encoded 87 protein coding genes, 40 tRNAs, and 8 rRNAs; no significant differences were detected in the size, gene content, and organization of the two plastomes. We used the whole chloroplast genomes to determine the phylogeny of D. sinensis and D. dyeriana and confirmed that the two species were highly divergent. Overall, our study provides comprehensive transcriptomic and chloroplast genomic resources, which will be valuable for future evolutionary studies of Dipteronia.
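
    The Ka/Ks reasoning in this abstract can be illustrated with a small sketch that bins orthologous pairs by their ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates. The thresholds of 1 and 0.5 come from the abstract; the function name, the category labels and the example values are assumptions made for illustration, and the significance testing the authors report is deliberately not reproduced here.

```python
def classify_ka_ks(ka, ks, purifying_cutoff=0.5):
    """Bin one orthologous pair by its Ka/Ks ratio.

    Ratios above 1 are read as candidate positive selection and ratios
    below `purifying_cutoff` as purifying selection; values in between
    are left as "intermediate". No statistical test is applied.
    """
    if ks <= 0:
        return "undefined"  # ratio cannot be computed without synonymous changes
    ratio = ka / ks
    if ratio > 1:
        return "positive"
    if ratio < purifying_cutoff:
        return "purifying"
    return "intermediate"


# Hypothetical (Ka, Ks) pairs, not values from the study.
example_pairs = [(0.30, 0.20), (0.01, 0.10), (0.06, 0.10), (0.02, 0.0)]
example_labels = [classify_ka_ks(ka, ks) for ka, ks in example_pairs]
```

    In practice a ratio is only reported as evidence of selection when it also passes a statistical test, as the p < 0.05 in the abstract indicates.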

  3. Chromosome-level genome assembly and transcriptome of the green alga Chromochloris zofingiensis illuminates astaxanthin production

    Science.gov (United States)

    Roth, Melissa S.; Cokus, Shawn J.; Gallaher, Sean D.; Walter, Andreas; Lopez, David; Erickson, Erika; Endelman, Benjamin; Westcott, Daniel; Larabell, Carolyn A.; Merchant, Sabeeha S.; Pellegrini, Matteo

    2017-01-01

    Microalgae have potential to help meet energy and food demands without exacerbating environmental problems. There is interest in the unicellular green alga Chromochloris zofingiensis, because it produces lipids for biofuels and a highly valuable carotenoid nutraceutical, astaxanthin. To advance understanding of its biology and facilitate commercial development, we present a C. zofingiensis chromosome-level nuclear genome, organelle genomes, and transcriptome from diverse growth conditions. The assembly, derived from a combination of short- and long-read sequencing in conjunction with optical mapping, revealed a compact genome of ∼58 Mbp distributed over 19 chromosomes containing 15,274 predicted protein-coding genes. The genome has uniform gene density over chromosomes, low repetitive sequence content (∼6%), and a high fraction of protein-coding sequence (∼39%) with relatively long coding exons and few coding introns. Functional annotation of gene models identified orthologous families for the majority (∼73%) of genes. Synteny analysis uncovered localized but scrambled blocks of genes in putative orthologous relationships with other green algae. Two genes encoding beta-ketolase (BKT), the key enzyme synthesizing astaxanthin, were found in the genome, and both were up-regulated by high light. Isolation and molecular analysis of astaxanthin-deficient mutants showed that BKT1 is required for the production of astaxanthin. Moreover, the transcriptome under high light exposure revealed candidate genes that could be involved in critical yet missing steps of astaxanthin biosynthesis, including ABC transporters, cytochrome P450 enzymes, and an acyltransferase. The high-quality genome and transcriptome provide insight into the green algal lineage and carotenoid production. PMID:28484037

  4. Dynamic probe selection for studying microbial transcriptome with high-density genomic tiling microarrays

    Directory of Open Access Journals (Sweden)

    Chen Tsute

    2010-02-01

    Background: Current commercial high-density oligonucleotide microarrays can hold millions of probe spots on a single microscopic glass slide and are ideal for studying the transcriptome of microbial genomes using a tiling probe design. This paper describes a comprehensive computational pipeline implemented specifically for designing tiling probe sets to study microbial transcriptome profiles. Results: The pipeline identifies every possible probe sequence from both forward and reverse-complement strands of all DNA sequences in the target genome including circular or linear chromosomes and plasmids. Final probe sequence lengths are adjusted based on the maximal oligonucleotide synthesis cycles and best isothermality allowed. Optimal probes are then selected in two stages: sequential and gap-filling. In the sequential stage, probes are selected from sequence windows tiled alongside the genome. In the gap-filling stage, additional probes are selected from the largest gaps between adjacent probes that have already been selected, until a predefined number of probes is reached. Selection of the highest quality probe within each window and gap is based on five criteria: sequence uniqueness, probe self-annealing, melting temperature, oligonucleotide length, and probe position. Conclusions: The probe selection pipeline evaluates global and local probe sequence properties and selects a set of probes dynamically and evenly distributed along the target genome. Unlike other similar methods, an exact number of non-redundant probes can be designed to utilize all the available probe spots on any chosen microarray platform. The pipeline can be applied to microbial genomes when designing high-density tiling arrays for comparative genomics, ChIP-chip, gene expression and comprehensive transcriptome studies.
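
    The two-stage selection described in this abstract (one probe per tiled window, then filling the largest remaining gaps until a target count is reached) can be sketched roughly as follows. This is an illustrative reconstruction under simplifying assumptions rather than the published pipeline: the `Probe` class, the single combined `score` standing in for the five quality criteria, and the treatment of the genome as one linear sequence are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    start: int    # 0-based start position on a linear genome (assumption)
    score: float  # hypothetical combined quality score, higher is better


def select_probes(candidates, genome_length, window, target):
    """Pick up to `target` probes: sequential stage, then gap-filling stage."""
    # Sequential stage: best-scoring candidate in each non-overlapping window.
    selected = []
    for w_start in range(0, genome_length, window):
        in_window = [p for p in candidates if w_start <= p.start < w_start + window]
        if in_window:
            selected.append(max(in_window, key=lambda p: p.score))

    # Gap-filling stage: repeatedly add the best unused candidate found in
    # the largest gap between adjacent selected probes.
    chosen = set(selected)
    while len(selected) < target:
        selected.sort(key=lambda p: p.start)
        gaps = sorted(
            ((b.start - a.start, a.start, b.start)
             for a, b in zip(selected, selected[1:])),
            reverse=True,
        )
        for _, lo, hi in gaps:
            inside = [p for p in candidates if lo < p.start < hi and p not in chosen]
            if inside:
                best = max(inside, key=lambda p: p.score)
                selected.append(best)
                chosen.add(best)
                break
        else:
            break  # no gap contains an unused candidate, so stop early
    return sorted(selected, key=lambda p: p.start)


# Toy candidates with made-up positions and scores.
toy = [Probe(2, 0.9), Probe(12, 0.8), Probe(18, 0.3), Probe(21, 0.5), Probe(29, 0.7)]
picked = select_probes(toy, genome_length=30, window=10, target=4)
```

    With these toy inputs the sequential stage keeps the probes at positions 2, 12 and 29, and the gap-filling stage then adds the best candidate from the widest gap (between 12 and 29). Setting `target` to the number of spots on the array mirrors the abstract's point that an exact probe count can be requested.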

  5. Discovery of germline-related genes in Cephalochordate amphioxus: A genome wide survey using genome annotation and transcriptome data.

    Science.gov (United States)

    Yue, Jia-Xing; Li, Kun-Lung; Yu, Jr-Kai

    2015-12-01

    The generation of germline cells is a critical process in the reproduction of multicellular organisms. Studies in animal models have identified a common repertoire of genes that play essential roles in primordial germ cell (PGC) formation. However, comparative studies also indicate that the timing and regulation of this core genetic program vary considerably in different animals, raising intriguing questions regarding the evolution of PGC developmental mechanisms in metazoans. Cephalochordates (commonly called amphioxus or lancelets) represent one of the invertebrate chordate groups and can provide important information about the evolution of developmental mechanisms in the chordate lineage. In this study, we used genome and transcriptome data to identify germline-related genes in two distantly related cephalochordate species, Branchiostoma floridae and Asymmetron lucayanum. Branchiostoma and Asymmetron diverged more than 120 MYA, and the most conspicuous difference between them is their gonadal morphology. We used important germline developmental genes in several model animals to search the amphioxus genome and transcriptome dataset for conserved homologs. We also annotated the assembled transcriptome data using Gene Ontology (GO) terms to facilitate the discovery of putative genes associated with germ cell development and reproductive functions in amphioxus. We further confirmed the expression of 14 genes in developing oocytes or mature eggs using whole mount in situ hybridization, suggesting their potential functions in amphioxus germ cell development. The results of this global survey provide a useful resource for testing potential functions of candidate germline-related genes in cephalochordates and for investigating differences in gonad developmental mechanisms between Branchiostoma and Asymmetron species.

  6. Transcriptome and genome size analysis of the venus flytrap

    DEFF Research Database (Denmark)

    Jensen, Michael Krogh; Vogt, Josef Korbinian; Bressendorff, Simon

    2015-01-01

    The insectivorous Venus flytrap (Dionaea muscipula) is renowned from Darwin's studies of plant carnivory and the origins of species. To provide tools to analyze the evolution and functional genomics of D. muscipula, we sequenced a normalized cDNA library synthesized from mRNA isolated from D...

  7. Large-scale Gene Ontology analysis of plant transcriptome-derived sequences retrieved by AFLP technology

    NARCIS (Netherlands)

    Botton, A.; Galla, G.; Conesa, A.; Bachem, C.W.B.; Ramina, A.; Barcaccia, G.

    2008-01-01

    Background: After 10-year-use of AFLP (Amplified Fragment Length Polymorphism) technology for DNA fingerprinting and mRNA profiling, large repertories of genome- and transcriptome-derived sequences are available in public databases for model, crop and tree species. AFLP marker systems have been and

  9. Large-scale transcriptome data reveals transcriptional activity of fission yeast LTR retrotransposons

    DEFF Research Database (Denmark)

    Mourier, Tobias; Willerslev, Eske

    2010-01-01

    BACKGROUND: Retrotransposons are transposable elements that proliferate within eukaryotic genomes through a process involving reverse transcription. The numbers of retrotransposons within genomes and differences between closely related species may yield insight into the evolutionary history...... of transcriptional activity are observed from both strands of solitary LTR sequences. Transcriptome data collected during meiosis suggests that transcription of solitary LTRs is correlated with the transcription of nearby protein-coding genes. CONCLUSIONS: Presumably, the host organism negatively regulates...

  10. Genome-wide transcriptome and expression profile analysis of Phalaenopsis during explant browning.

    Directory of Open Access Journals (Sweden)

    Chuanjun Xu

    Full Text Available Explant browning presents a major problem for in vitro culture, and can lead to the death of the explant and failure of regeneration. Considerable work has examined the physiological mechanisms underlying Phalaenopsis leaf explant browning, but the molecular mechanisms of browning remain elusive. In this study, we used whole genome RNA sequencing to examine Phalaenopsis leaf explant browning at genome-wide level. We first used Illumina high-throughput technology to sequence the transcriptome of Phalaenopsis and then performed de novo transcriptome assembly. We assembled 79,434,350 clean reads into 31,708 isogenes and generated 26,565 annotated unigenes. We assigned Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, and potential Pfam domains to each transcript. Using the transcriptome data as a reference, we next analyzed the differential gene expression of explants cultured for 0, 3, and 6 d, respectively. We then identified differentially expressed genes (DEGs) before and after Phalaenopsis explant browning. We also performed GO, KEGG functional enrichment and Pfam analysis of all DEGs. Finally, we selected 11 genes for quantitative real-time PCR (qPCR) analysis to confirm the expression profile analysis. Here, we report the first comprehensive analysis of transcriptome and expression profiles during Phalaenopsis explant browning. Our results suggest that Phalaenopsis explant browning may be due in part to gene expression changes that affect the secondary metabolism, such as: phenylpropanoid pathway and flavonoid biosynthesis. Genes involved in photosynthesis and ATPase activity have been found to be changed at transcription level; these changes may perturb energy metabolism and thus lead to the decay of plant cells and tissues. This study provides comprehensive gene expression data for Phalaenopsis browning. Our data constitute an important resource for further functional studies to prevent explant browning.

  11. Genome-wide transcriptome and expression profile analysis of Phalaenopsis during explant browning.

    Science.gov (United States)

    Xu, Chuanjun; Zeng, Biyu; Huang, Junmei; Huang, Wen; Liu, Yumei

    2015-01-01

    Explant browning presents a major problem for in vitro culture, and can lead to the death of the explant and failure of regeneration. Considerable work has examined the physiological mechanisms underlying Phalaenopsis leaf explant browning, but the molecular mechanisms of browning remain elusive. In this study, we used whole genome RNA sequencing to examine Phalaenopsis leaf explant browning at genome-wide level. We first used Illumina high-throughput technology to sequence the transcriptome of Phalaenopsis and then performed de novo transcriptome assembly. We assembled 79,434,350 clean reads into 31,708 isogenes and generated 26,565 annotated unigenes. We assigned Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, and potential Pfam domains to each transcript. Using the transcriptome data as a reference, we next analyzed the differential gene expression of explants cultured for 0, 3, and 6 d, respectively. We then identified differentially expressed genes (DEGs) before and after Phalaenopsis explant browning. We also performed GO, KEGG functional enrichment and Pfam analysis of all DEGs. Finally, we selected 11 genes for quantitative real-time PCR (qPCR) analysis to confirm the expression profile analysis. Here, we report the first comprehensive analysis of transcriptome and expression profiles during Phalaenopsis explant browning. Our results suggest that Phalaenopsis explant browning may be due in part to gene expression changes that affect the secondary metabolism, such as: phenylpropanoid pathway and flavonoid biosynthesis. Genes involved in photosynthesis and ATPase activity have been found to be changed at transcription level; these changes may perturb energy metabolism and thus lead to the decay of plant cells and tissues. This study provides comprehensive gene expression data for Phalaenopsis browning. Our data constitute an important resource for further functional studies to prevent explant browning.

  12. The draft genome and transcriptome of Cannabis sativa

    OpenAIRE

    van Bakel, Harm; Stout, Jake M.; Cote, Atina G; Tallon, Carling M; Sharpe, Andrew G; Hughes, Timothy R.; Page, Jonathan E.

    2011-01-01

    Background Cannabis sativa has been cultivated throughout human history as a source of fiber, oil and food, and for its medicinal and intoxicating properties. Selective breeding has produced cannabis plants for specific uses, including high-potency marijuana strains and hemp cultivars for fiber and seed production. The molecular biology underlying cannabinoid biosynthesis and other traits of interest is largely unexplored. Results We sequenced genomic DNA and RNA from the marijuana strain Pur...

  13. Phenotypic, genomic, transcriptomic and proteomic changes in Bacillus cereus after a short-term space flight

    Science.gov (United States)

    Su, Longxiang; Zhou, Lisha; Liu, Jinwen; Cen, Zhong; Wu, Chunyan; Wang, Tong; Zhou, Tao; Chang, De; Guo, Yinghua; Fang, Xiangqun; Wang, Junfeng; Li, Tianzhi; Yin, Sanjun; Dai, Wenkui; Zhou, Yuping; Zhao, Jiao; Fang, Chengxiang; Yang, Ruifu; Liu, Changting

    2014-01-01

    The environment in space could affect microorganisms by changing a variety of features, including proliferation rate, cell physiology, cell metabolism, biofilm production, virulence, and drug resistance. However, the relevant mechanisms remain unclear. To explore the effect of a space environment on Bacillus cereus, a strain of B. cereus was sent to space for 398 h by ShenZhou VIII from November 1, 2011 to November 17, 2011. A ground simulation with similar temperature conditions was simultaneously performed as a control. After the flight, the flight and control strains were further analyzed using phenotypic, genomic, transcriptomic and proteomic techniques to explore the divergence of B. cereus in a space environment. The flight strains exhibited a significantly slower growth rate, a significantly higher amikacin resistance level, and changes in metabolism relative to the ground control strain. After the space flight, three polymorphic loci were found in the flight strains LCT-BC25 and LCT-BC235. A combined transcriptome and proteome analysis was performed, and this analysis revealed that the flight strains had changes in genes/proteins relevant to metabolism. In addition, certain genes/proteins that are relevant to structural function, gene expression modification and translation, and virulence were also altered. Our study represents the first documented analysis of the phenotypic, genomic, transcriptomic, and proteomic changes that occur in B. cereus during space flight, and our results could be beneficial to the field of space microbiology.

  14. Integrated genome and transcriptome sequencing identifies a novel form of hybrid and aggressive prostate cancer.

    Science.gov (United States)

    Wu, Chunxiao; Wyatt, Alexander W; Lapuk, Anna V; McPherson, Andrew; McConeghy, Brian J; Bell, Robert H; Anderson, Shawn; Haegert, Anne; Brahmbhatt, Sonal; Shukin, Robert; Mo, Fan; Li, Estelle; Fazli, Ladan; Hurtado-Coll, Antonio; Jones, Edward C; Butterfield, Yaron S; Hach, Faraz; Hormozdiari, Fereydoun; Hajirasouliha, Iman; Boutros, Paul C; Bristow, Robert G; Jones, Steven Jm; Hirst, Martin; Marra, Marco A; Maher, Christopher A; Chinnaiyan, Arul M; Sahinalp, S Cenk; Gleave, Martin E; Volik, Stanislav V; Collins, Colin C

    2012-05-01

    Next-generation sequencing is making sequence-based molecular pathology and personalized oncology viable. We selected an individual initially diagnosed with conventional but aggressive prostate adenocarcinoma and sequenced the genome and transcriptome from primary and metastatic tissues collected prior to hormone therapy. The histology-pathology and copy number profiles were remarkably homogeneous, yet it was possible to propose the quadrant of the prostate tumour that likely seeded the metastatic diaspora. Despite a homogeneous cell type, our transcriptome analysis revealed signatures of both luminal and neuroendocrine cell types. Remarkably, the repertoire of expressed but apparently private gene fusions, including C15orf21:MYC, recapitulated this biology. We hypothesize that the amplification and over-expression of the stem cell gene MSI2 may have contributed to the stable hybrid cellular identity. This hybrid luminal-neuroendocrine tumour appears to represent a novel and highly aggressive case of prostate cancer with unique biological features and, conceivably, a propensity for rapid progression to castrate-resistance. Overall, this work highlights the importance of integrated analyses of genome, exome and transcriptome sequences for basic tumour biology, sequence-based molecular pathology and personalized oncology.

  15. Transcriptome, methylome and genomic variations analysis of ectopic thyroid glands.

    Directory of Open Access Journals (Sweden)

    Rasha Abu-Khudir

    Full Text Available BACKGROUND: Congenital hypothyroidism from thyroid dysgenesis (CHTD) is predominantly a sporadic disease characterized by defects in the differentiation, migration or growth of thyroid tissue. Of these defects, incomplete migration resulting in ectopic thyroid tissue is the most common (up to 80%). Germinal mutations in the thyroid-related transcription factors NKX2.1, FOXE1, PAX-8, and NKX2.5 have been identified in only 3% of patients with sporadic CHTD. Moreover, a survey of monozygotic twins yielded a discordance rate of 92%, suggesting that somatic events, genetic or epigenetic, probably play an important role in the etiology of CHTD. METHODOLOGY/PRINCIPAL FINDINGS: To assess the role of somatic genetic or epigenetic processes in CHTD, we analyzed gene expression, genome-wide methylation, and structural genome variations in normal versus ectopic thyroid tissue. In total, 1011 genes were more than two-fold induced or repressed. Expression array was validated by quantitative real-time RT-PCR for 100 genes. After correction for differences in thyroid activation state, 19 genes were exclusively associated with thyroid ectopy, among which genes involved in embryonic development (e.g. TXNIP) and in the Wnt pathway (e.g. SFRP2 and FRZB) were observed. None of the thyroid related transcription factors (FOXE1, HHEX, NKX2.1, NKX2.5) showed decreased expression, whereas PAX8 expression was associated with thyroid activation state. Finally, the expression profile was independent of promoter and CpG island methylation and of structural genome variations. CONCLUSIONS/SIGNIFICANCE: This is the first integrative molecular analysis of ectopic thyroid tissue. Ectopic thyroids show a differential gene expression compared to that of normal thyroids, although molecular basis could not be defined. Replication of this pilot study on a larger cohort could lead to unraveling the elusive cause of defective thyroid migration during embryogenesis.
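
    The "more than two-fold induced or repressed" criterion in the abstract above can be illustrated with a minimal sketch. The gene labels and expression values below are hypothetical toy inputs, not data from the study, and the function name is an assumption made for illustration; the abstract does not describe the authors' actual filtering procedure.

```python
import math

def two_fold_changed(normal: float, ectopic: float) -> bool:
    """Return True when expression differs by more than two-fold in either direction."""
    # A two-fold change corresponds to |log2(ratio)| > 1.
    log2_fold_change = math.log2(ectopic / normal)
    return abs(log2_fold_change) > 1.0

# Hypothetical expression values (arbitrary units): (normal tissue, ectopic tissue).
toy_expression = {
    "geneA": (100.0, 250.0),  # 2.5-fold induced
    "geneB": (100.0, 150.0),  # 1.5-fold induced, below the threshold
    "geneC": (100.0, 40.0),   # 2.5-fold repressed
}

changed = sorted(
    gene for gene, (normal, ectopic) in toy_expression.items()
    if two_fold_changed(normal, ectopic)
)
print(changed)
```

    Working on the log2 scale treats induction and repression symmetrically, which is why a single absolute-value threshold covers both directions.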

  16. Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution.

    Science.gov (United States)

    Clarke, Thomas H; Garb, Jessica E; Hayashi, Cheryl Y; Arensburger, Peter; Ayoub, Nadia A

    2015-06-08

    The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  17. Comparative Genomics and Transcriptomic Analysis of Mycobacterium Kansasii

    KAUST Repository

    Alzahid, Yara

    2014-04-01

    The mycobacteria are among the most intensively studied bacterial taxa because they cause two historically important diseases known worldwide: leprosy and tuberculosis. Mycobacteria not assigned to the tuberculosis or leprosy complexes are referred to as ‘environmental mycobacteria’ or ‘nontuberculous mycobacteria’ (NTM). Mycobacterium kansasii (M. kansasii) is one of the most frequent NTM pathogens, as it causes pulmonary disease in immuno-competent patients and pulmonary and disseminated disease in patients with various immuno-deficiencies. Five subtypes of this bacterium have been documented by different molecular typing methods; type I causes tuberculosis-like disease in healthy individuals, and type II in immune-compromised individuals. The remaining types are said to be environmental and therefore not to cause disease. The aim of this project was to conduct a comparative genomic study of M. kansasii types I-V and to investigate the gene expression levels of those types. Comparative genomic analyses provided evidence for why M. kansasii type I is considered pathogenic, focusing on three key elements involved in mycobacterial virulence: the ESX secretion system, phospholipase C (plcB) and the mammalian cell entry (Mce) operons. The results showed the lack of the espA operon in types II-V, which renders the ESX-1 operon dysfunctional, as espA is one of the key factors that control this secretion system. However, gene expression analysis showed this operon to be deleted in types II, III and IV. Furthermore, plcB was found to be truncated in types III and IV. Analysis of the Mce operons (1-4) shows that the mce-1 operon is duplicated, mce-2 is absent, and mce-3 and mce-4 are each present in one copy in M. kansasii types I-V. Gene expression profiles of types I-IV showed that the secreted proteins of ESX-1 were slightly upregulated in types II-IV when compared to type I and the secreted forms of ESX-5 were highly down

  18. RSIADB, a collective resource for genome and transcriptome analyses in Rhizoctonia solani AG1 IA.

    Science.gov (United States)

    Chen, Lei; Ai, Peng; Zhang, Jinfeng; Deng, Qiming; Wang, Shiquan; Li, Shuangcheng; Zhu, Jun; Li, Ping; Zheng, Aiping

    2016-01-01

    Rice [Oryza sativa (L.)] feeds more than half of the world's population. Rhizoctonia solani is a major fungal pathogen of rice causing extreme crop losses in all rice-growing regions of the world. R. solani AG1 IA is a major cause of sheath blight in rice. In this study, we constructed a comprehensive and user-friendly web-based database, RSIADB, to analyse its draft genome and transcriptome. The database was built using the genome sequence (10,489 genes) and annotation information for R. solani AG1 IA. A total of six RNAseq samples of R. solani AG1 IA were also analysed, corresponding to 10, 18, 24, 32, 48 and 72 h after infection of rice leaves. The RSIADB database enables users to search, browse, and download gene sequences for R. solani AG1 IA, and mine the data using BLAST, Sequence Extractor, Browse and Construction Diagram tools that were integrated into the database. RSIADB is an important genomic resource for scientists working with R. solani AG1 IA and will assist researchers in analysing the annotated genome and transcriptome of this pathogen. This resource will facilitate studies on gene function, pathogenesis factors and secreted proteins, as well as provide an avenue for comparative analyses of genes expressed during different stages of infection. Database URL: http://genedenovoweb.ticp.net:81/rsia/index.php.

  19. Role of genomics and transcriptomics in selection of reintroduction source populations.

    Science.gov (United States)

    He, Xiaoping; Johansson, Mattias L; Heath, Daniel D

    2016-10-01

    The use and importance of reintroduction as a conservation tool to return a species to its historical range from which it has been extirpated will increase as climate change and human development accelerate habitat loss and population extinctions. Although the number of reintroduction attempts has increased rapidly over the past 2 decades, the success rate is generally low. As a result of population differences in fitness-related traits and divergent responses to environmental stresses, population performance upon reintroduction is highly variable, and it is generally agreed that selecting an appropriate source population is a critical component of a successful reintroduction. Conservation genomics is an emerging field that addresses long-standing challenges in conservation, and the potential for using novel molecular genetic approaches to inform and improve conservation efforts is high. Because the successful establishment and persistence of reintroduced populations is highly dependent on the functional genetic variation and environmental stress tolerance of the source population, we propose the application of conservation genomics and transcriptomics to guide reintroduction practices. Specifically, we propose using genome-wide functional loci to estimate genetic variation of source populations. This estimate can then be used to predict the potential for adaptation. We also propose using transcriptional profiling to measure the expression response of fitness-related genes to environmental stresses as a proxy for acclimation (tolerance) capacity. Appropriate application of conservation genomics and transcriptomics has the potential to dramatically enhance reintroduction success in a time of rapidly declining biodiversity and accelerating environmental change. © 2016 Society for Conservation Biology.

  20. The genome and development-dependent transcriptomes of Pyronema confluens: a window into fungal evolution.

    Directory of Open Access Journals (Sweden)

    Stefanie Traeger

    Full Text Available Fungi are a large group of eukaryotes found in nearly all ecosystems. More than 250 fungal genomes have already been sequenced, greatly improving our understanding of fungal evolution, physiology, and development. However, for the Pezizomycetes, an early-diverging lineage of filamentous ascomycetes, there is so far only one genome available, namely that of the black truffle, Tuber melanosporum, a mycorrhizal species with unusual subterranean fruiting bodies. To help close the sequence gap among basal filamentous ascomycetes, and to allow conclusions about the evolution of fungal development, we sequenced the genome and assayed transcriptomes during development of Pyronema confluens, a saprobic Pezizomycete with a typical apothecium as fruiting body. With a size of 50 Mb and ~13,400 protein-coding genes, the genome is more characteristic of higher filamentous ascomycetes than the large, repeat-rich truffle genome; however, some typical features are different in the P. confluens lineage, e.g. the genomic environment of the mating type genes that is conserved in higher filamentous ascomycetes, but only partly conserved in P. confluens. On the other hand, P. confluens has a full complement of fungal photoreceptors, and expression studies indicate that light perception might be similar to distantly related ascomycetes and, thus, represent a basic feature of filamentous ascomycetes. Analysis of spliced RNA-seq sequence reads allowed the detection of natural antisense transcripts for 281 genes. The P. confluens genome contains an unusually high number of predicted orphan genes, many of which are upregulated during sexual development, consistent with the idea of rapid evolution of sex-associated genes. Comparative transcriptomics identified the transcription factor gene pro44 that is upregulated during development in P. confluens and the Sordariomycete Sordaria macrospora. The P. confluens pro44 gene (PCON_06721) was used to complement the S. macrospora

  1. Deep analysis of wild Vitis flower transcriptome reveals unexplored genome regions associated with sex specification.

    Science.gov (United States)

    Ramos, Miguel Jesus Nunes; Coito, João Lucas; Fino, Joana; Cunha, Jorge; Silva, Helena; de Almeida, Patrícia Gomes; Costa, Maria Manuela Ribeiro; Amâncio, Sara; Paulo, Octávio S; Rocheta, Margarida

    2017-01-01

    RNA-seq of Vitis during early stages of bud development, in male, female and hermaphrodite flowers, identified new loci outside of annotated gene models, suggesting their involvement in sex establishment. The molecular mechanisms responsible for flower sex specification remain unclear for most plant species. In the case of V. vinifera ssp. vinifera, it is not fully understood what determines hermaphroditism in the domesticated subspecies and male or female flowers in wild dioecious relatives (Vitis vinifera ssp. sylvestris). Here, we describe a de novo assembly of the transcriptome of three flower developmental stages from the three Vitis vinifera flower types. The validation of de novo assembly showed a correlation of 0.825. The main goals of this work were the identification of V. v. sylvestris exclusive transcripts and the characterization of differential gene expression during flower development. RNA from several flower developmental stages was used previously to generate Illumina sequence reads. Through a sequential de novo assembly strategy, one comprehensive transcriptome comprising 95,516 non-redundant transcripts was assembled. From this dataset, 81,064 transcripts were annotated against the V. v. vinifera reference transcriptome and 11,084 were annotated against the V. v. vinifera reference genome. Moreover, we found 3368 transcripts that could not be mapped to the Vitis reference genome. From all the non-redundant transcripts that were assembled, bioinformatics analysis identified 133 transcripts specific to V. v. sylvestris and 516 transcripts differentially expressed among the three flower types. The detection of transcription from areas of the genome not currently annotated suggests active transcription of previously unannotated genomic loci during early stages of bud development.

  2. Whole-genome resequencing and transcriptomic analysis to identify genes involved in leaf-color diversity in ornamental rice plants.

    Directory of Open Access Journals (Sweden)

    Chang-Kug Kim

    Full Text Available Rice field art is a large-scale art form in which people design rice fields using various kinds of ornamental rice plants with different leaf colors. Leaf color-related genes play an important role in the study of chlorophyll biosynthesis, chloroplast structure and function, and anthocyanin biosynthesis. Despite the role of different metabolites in the traditional relationship between leaf and color, comprehensive color-specific metabolite studies of ornamental rice have been limited. We performed whole-genome resequencing and transcriptomic analysis of regulatory patterns and genetic diversity among different rice cultivars to discover new genetic mechanisms that promote enhanced levels of various leaf colors. We resequenced the genomes of 10 rice leaf-color accessions to an average of 40× reads depth and >95% coverage and performed 30 RNA-seq experiments using the 10 rice accessions sampled at three developmental stages. The sequencing results yielded a total of 1,814 × 10^6 reads and identified an average of 713,114 SNPs per rice accession. Based on our analysis of the DNA variation and gene expression, we selected 47 candidate genes. We used an integrated analysis of the whole-genome resequencing data and the RNA-seq data to divide the candidate genes into two groups: genes related to macronutrient (i.e., magnesium and sulfur) transport and genes related to flavonoid pathways, including anthocyanidin biosynthesis. We verified the candidate genes with quantitative real-time PCR using transgenic T-DNA insertion mutants. Our study demonstrates the potential of integrated screening methods combined with genetic-variation and transcriptomic data to isolate genes involved in complex biosynthetic networks and pathways.

  3. Whole-Genome Resequencing and Transcriptomic Analysis to Identify Genes Involved in Leaf-Color Diversity in Ornamental Rice Plants

    Science.gov (United States)

    Shin, Younhee; Lim, Hye-Min; Lee, Gang-Seob; Kim, A-Ram; Lee, Tae-Ho; Lee, Jae-Hee; Park, Dong-Suk; Yoo, Seungil; Kim, Yong-Hwan; Kim, Yong-Kab

    2015-01-01

    Rice field art is a large-scale art form in which people design rice fields using various kinds of ornamental rice plants with different leaf colors. Leaf color-related genes play an important role in the study of chlorophyll biosynthesis, chloroplast structure and function, and anthocyanin biosynthesis. Despite the role of different metabolites in the traditional relationship between leaf and color, comprehensive color-specific metabolite studies of ornamental rice have been limited. We performed whole-genome resequencing and transcriptomic analysis of regulatory patterns and genetic diversity among different rice cultivars to discover new genetic mechanisms that promote enhanced levels of various leaf colors. We resequenced the genomes of 10 rice leaf-color accessions to an average of 40× reads depth and >95% coverage and performed 30 RNA-seq experiments using the 10 rice accessions sampled at three developmental stages. The sequencing results yielded a total of 1,814 × 10^6 reads and identified an average of 713,114 SNPs per rice accession. Based on our analysis of the DNA variation and gene expression, we selected 47 candidate genes. We used an integrated analysis of the whole-genome resequencing data and the RNA-seq data to divide the candidate genes into two groups: genes related to macronutrient (i.e., magnesium and sulfur) transport and genes related to flavonoid pathways, including anthocyanidin biosynthesis. We verified the candidate genes with quantitative real-time PCR using transgenic T-DNA insertion mutants. Our study demonstrates the potential of integrated screening methods combined with genetic-variation and transcriptomic data to isolate genes involved in complex biosynthetic networks and pathways. PMID:25897514

  4. High density linkage mapping of genomic and transcriptomic SNPs for synteny analysis and anchoring the genome sequence of chickpea

    Science.gov (United States)

    Gaur, Rashmi; Jeena, Ganga; Shah, Niraj; Gupta, Shefali; Pradhan, Seema; Tyagi, Akhilesh K; Jain, Mukesh; Chattopadhyay, Debasis; Bhatia, Sabhyata

    2015-01-01

    This study presents genome-wide discovery of SNPs through next generation sequencing of the genome of Cicer reticulatum. Mapping of the C. reticulatum sequenced reads onto the draft genome assembly of C. arietinum (desi chickpea) resulted in identification of 842,104 genomic SNPs, which were utilized along with an additional 36,446 genic SNPs identified from transcriptome sequences of the aforementioned varieties. Two new chickpea Oligo Pool All (OPAs), each having 3,072 SNPs, were designed and utilized for SNP genotyping of 129 Recombinant Inbred Lines (RILs). Using Illumina GoldenGate technology, genotyping data for 5,041 SNPs were generated and combined with the 1,673 marker data from previously published studies to generate a high resolution linkage map. The map comprised 6,698 markers distributed on eight linkage groups spanning 1083.93 cM, with an average inter-marker distance of 0.16 cM. Utility of the present map was demonstrated for improving the anchoring of the earlier reported draft genome sequence of desi chickpea by ~30% and that of kabuli chickpea by 18%. The genetic map reported in this study represents the densest linkage map of chickpea, with the potential to facilitate efficient anchoring of the draft genome sequences of desi as well as kabuli chickpea varieties. PMID:26303721
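
    As a quick check of the map-density figure quoted above, the average inter-marker distance can be reproduced from the reported map length and marker count. This is a minimal sketch that assumes the average is simply total length divided by number of markers; the abstract does not state how the authors computed it, and the function name is hypothetical.

```python
# Figures quoted in the abstract above.
TOTAL_MAP_LENGTH_CM = 1083.93  # summed length of the eight linkage groups, in cM
MARKER_COUNT = 6698            # markers placed on the map

def average_marker_spacing(map_length_cm: float, markers: int) -> float:
    """Return map length divided by marker count, in cM per marker."""
    if markers <= 0:
        raise ValueError("markers must be positive")
    return map_length_cm / markers

spacing = average_marker_spacing(TOTAL_MAP_LENGTH_CM, MARKER_COUNT)
print(f"{spacing:.2f} cM")
```

    Dividing by the number of intervals instead (markers minus one per linkage group) changes the result only in the third decimal place, so either convention is consistent with the reported 0.16 cM.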

  5. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    Energy Technology Data Exchange (ETDEWEB)

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. A coordinated sequencing effort of cultured organisms is an appropriate place to begin

  6. Lifestyle transitions in plant pathogenic Colletotrichum fungi deciphered by genome and transcriptome analyses.

    Science.gov (United States)

    O'Connell, Richard J; Thon, Michael R; Hacquard, Stéphane; Amyotte, Stefan G; Kleemann, Jochen; Torres, Maria F; Damm, Ulrike; Buiate, Ester A; Epstein, Lynn; Alkan, Noam; Altmüller, Janine; Alvarado-Balderrama, Lucia; Bauser, Christopher A; Becker, Christian; Birren, Bruce W; Chen, Zehua; Choi, Jaeyoung; Crouch, Jo Anne; Duvick, Jonathan P; Farman, Mark A; Gan, Pamela; Heiman, David; Henrissat, Bernard; Howard, Richard J; Kabbage, Mehdi; Koch, Christian; Kracher, Barbara; Kubo, Yasuyuki; Law, Audrey D; Lebrun, Marc-Henri; Lee, Yong-Hwan; Miyara, Itay; Moore, Neil; Neumann, Ulla; Nordström, Karl; Panaccione, Daniel G; Panstruga, Ralph; Place, Michael; Proctor, Robert H; Prusky, Dov; Rech, Gabriel; Reinhardt, Richard; Rollins, Jeffrey A; Rounsley, Steve; Schardl, Christopher L; Schwartz, David C; Shenoy, Narmada; Shirasu, Ken; Sikhakolli, Usha R; Stüber, Kurt; Sukno, Serenella A; Sweigard, James A; Takano, Yoshitaka; Takahara, Hiroyuki; Trail, Frances; van der Does, H Charlotte; Voll, Lars M; Will, Isa; Young, Sarah; Zeng, Qiandong; Zhang, Jingze; Zhou, Shiguo; Dickman, Martin B; Schulze-Lefert, Paul; Ver Loren van Themaat, Emiel; Ma, Li-Jun; Vaillancourt, Lisa J

    2012-09-01

    Colletotrichum species are fungal pathogens that devastate crop plants worldwide. Host infection involves the differentiation of specialized cell types that are associated with penetration, growth inside living host cells (biotrophy) and tissue destruction (necrotrophy). We report here genome and transcriptome analyses of Colletotrichum higginsianum infecting Arabidopsis thaliana and Colletotrichum graminicola infecting maize. Comparative genomics showed that both fungi have large sets of pathogenicity-related genes, but families of genes encoding secreted effectors, pectin-degrading enzymes, secondary metabolism enzymes, transporters and peptidases are expanded in C. higginsianum. Genome-wide expression profiling revealed that these genes are transcribed in successive waves that are linked to pathogenic transitions: effectors and secondary metabolism enzymes are induced before penetration and during biotrophy, whereas most hydrolases and transporters are upregulated later, at the switch to necrotrophy. Our findings show that preinvasion perception of plant-derived signals substantially reprograms fungal gene expression and indicate previously unknown functions for particular fungal cell types.

  7. Genome-Wide Transcriptome Analysis of Cadmium Stress in Rice

    Directory of Open Access Journals (Sweden)

    Youko Oono

    2016-01-01

    Full Text Available Rice growth is severely affected by toxic concentrations of the nonessential heavy metal cadmium (Cd). To elucidate the molecular basis of the response to Cd stress, we performed mRNA sequencing of rice following our previous study on exposure to high concentrations of Cd (Oono et al., 2014). In this study, rice plants were hydroponically treated with low concentrations of Cd and approximately 211 million sequence reads were mapped onto the IRGSP-1.0 reference rice genome sequence. Many genes, including some identified under high Cd concentration exposure in our previous study, were found to be responsive to low Cd exposure, with an average of about 11,000 transcripts from each condition. However, genes expressed constitutively across the developmental course responded only slightly to low Cd concentrations, in contrast to their clear response to high Cd concentration, which causes fatal damage to rice seedlings according to phenotypic changes. The expression of metal ion transporter genes tended to correlate with Cd concentration, suggesting the potential of the RNA-Seq strategy to reveal novel Cd-responsive transporters by analyzing gene expression under different Cd concentrations. This study could help to develop novel strategies for improving tolerance to Cd exposure in rice and other cereal crops.

  8. Integration of transcriptome and whole genomic resequencing data to identify key genes affecting swine fat deposition.

    Directory of Open Access Journals (Sweden)

    Kai Xing

    Full Text Available Fat deposition is highly correlated with the growth, meat quality, reproductive performance and immunity of pigs. Fatty acid synthesis takes place mainly in the adipose tissue of pigs; therefore, in this study, a high-throughput massively parallel sequencing approach was used to generate adipose tissue transcriptomes from two groups of Songliao black pigs that had opposite backfat thickness phenotypes. The total number of paired-end reads produced for each sample was in the range of 39.29-49.36 million. Approximately 188 genes were differentially expressed in adipose tissue and were enriched for metabolic processes, such as fatty acid biosynthesis, lipid synthesis, metabolism of fatty acids, retinol, caffeine and arachidonic acid, and immunity. Additionally, many genetic variations were detected between the two groups through pooled whole-genome resequencing. Integration of transcriptome and whole-genome resequencing data revealed important genomic variations among the differentially expressed genes for fat deposition, for example, the lipogenic genes. Further studies are required to investigate the roles of candidate genes in fat deposition to improve pig breeding programs.

  9. Integrated clinical, whole-genome, and transcriptome analysis of multisampled lethal metastatic prostate cancer

    Science.gov (United States)

    Bova, G. Steven; Kallio, Heini M.L.; Annala, Matti; Kivinummi, Kati; Högnäs, Gunilla; Häyrynen, Sergei; Rantapero, Tommi; Kivinen, Virpi; Isaacs, William B.; Tolonen, Teemu; Nykter, Matti; Visakorpi, Tapio

    2016-01-01

    We report the first combined analysis of whole-genome sequence, detailed clinical history, and transcriptome sequence of multiple prostate cancer metastases in a single patient (A21). Whole-genome and transcriptome sequence was obtained from nine anatomically separate metastases, and targeted DNA sequencing was performed in cancerous and noncancerous foci within the primary tumor specimen removed 5 yr before death. Transcriptome analysis revealed increased expression of androgen receptor (AR)-regulated genes in liver metastases that harbored an AR p.L702H mutation, suggesting a dominant effect by the mutation despite being present in only one of an estimated 16 copies per cell. The metastases harbored several alterations to the PI3K/AKT pathway, including a clonal truncal mutation in PIK3CG and present in all metastatic sites studied. The list of truncal genomic alterations shared by all metastases included homozygous deletion of TP53, hemizygous deletion of RB1 and CHD1, and amplification of FGFR1. If the patient were treated today, given this knowledge, the use of second-generation androgen-directed therapies, cessation of glucocorticoid administration, and therapeutic inhibition of the PI3K/AKT pathway or FGFR1 receptor could provide personalized benefit. Three previously unreported truncal clonal missense mutations (ABCC4 p.R891L, ALDH9A1 p.W89R, and ASNA1 p.P75R) were expressed at the RNA level and assessed as druggable. The truncal status of mutations may be critical for effective actionability and merit further study. Our findings suggest that a large set of deeply analyzed cases could serve as a powerful guide to more effective prostate cancer basic science and personalized cancer medicine clinical trials. PMID:27148588

  10. Tools to covisualize and coanalyze proteomic data with genomes and transcriptomes: validation of genes and alternative mRNA splicing

    DEFF Research Database (Denmark)

    Pang, Chi; Tay, Aidan; Aya, Carlos

    2014-01-01

    Direct links between proteomic and genomic/transcriptomic data are not frequently made, partly because of lack of appropriate bioinformatics tools. To help address this, we have developed the PG Nexus pipeline. The PG Nexus allows users to covisualize peptides in the context of genomes or genomic...... is open-source and available from https://github.com/IntersectAustralia/ap11_Samifier . It has been integrated into Galaxy and made available in the Galaxy tool shed....

  11. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome.

    Directory of Open Access Journals (Sweden)

    Yongjun Fang

    Full Text Available Based on next-generation sequencing data, we assembled the mitochondrial (mt) genome of date palm (Phoenix dactylifera L.) into a circular molecule of 715,001 bp in length. The mt genome of P. dactylifera encodes 38 proteins, 30 tRNAs, and 3 ribosomal RNAs, which constitute a gene content of 6.5% (46,770 bp) over the full length. The rest, 93.5% of the genome sequence, is comprised of cp (chloroplast)-derived (10.3% with respect to the whole genome length) and non-coding sequences. In the non-coding regions, there are 0.33% tandem and 2.3% long repeats. Our transcriptomic data from eight tissues (root, seed, bud, fruit, green leaf, yellow leaf, female flower, and male flower) showed higher gene expression levels in male flower, root, bud, and female flower, as compared to four other tissues. We identified 120 potential SNPs among three date palm cultivars (Khalas, Fahal, and Sukry), and successfully found seven SNPs in the coding sequences. A phylogenetic analysis, based on 22 conserved genes of 15 representative plant mitochondria, showed that P. dactylifera positions at the root of all sequenced monocot mt genomes. In addition, consistent with previous discoveries, there are three co-transcribed gene clusters (18S-5S rRNA, rps3-rpl16 and nad3-rps12) in P. dactylifera, which are highly conserved among all known mitochondrial genomes of angiosperms.

  12. Breaking the 1000-gene barrier for Mimivirus using ultra-deep genome and transcriptome sequencing

    Directory of Open Access Journals (Sweden)

    Claverie Jean-Michel

    2011-03-01

    Full Text Available Abstract Background: Mimivirus, a giant dsDNA virus infecting Acanthamoeba, is the prototype of the Mimiviridae family, the latest addition to the family of the nucleocytoplasmic large DNA viruses (NCLDVs). Its 1.2 Mb-genome was initially predicted to encode 917 genes. A subsequent RNA-Seq analysis precisely mapped many transcript boundaries and identified 75 new genes. Findings: We now report a much deeper analysis using the SOLiD™ technology combining RNA-Seq of the Mimivirus transcriptome during the infectious cycle (202.4 Million reads), and a complete genome re-sequencing (45.3 Million reads). This study corrected the genome sequence and identified several single nucleotide polymorphisms. Our results also provided clear evidence of previously overlooked transcription units, including an important RNA polymerase subunit distantly related to Euryarchea homologues. The total Mimivirus gene count is now 1018, 11% greater than the original annotation. Conclusions: This study highlights the huge progress brought about by ultra-deep sequencing for the comprehensive annotation of virus genomes, opening the door to a complete one-nucleotide resolution level description of their transcriptional activity, and to the realistic modeling of the viral genome expression at the ultimate molecular level. This work also illustrates the need to go beyond bioinformatics-only approaches for the annotation of short protein and non-coding genes in viral genomes.

  13. Optimizing de novo transcriptome assembly and extending genomic resources for striped catfish (Pangasianodon hypophthalmus).

    Science.gov (United States)

    Thanh, Nguyen Minh; Jung, Hyungtaek; Lyons, Russell E; Njaci, Isaac; Yoon, Byoung-Ha; Chand, Vincent; Tuan, Nguyen Viet; Thu, Vo Thi Minh; Mather, Peter

    2015-10-01

    Striped catfish (Pangasianodon hypophthalmus) is a commercially important freshwater fish used in inland aquaculture in the Mekong Delta, Vietnam. The culture industry is facing a significant challenge however from saltwater intrusion into many low topographical coastal provinces across the Mekong Delta as a result of predicted climate change impacts. Developing genomic resources for this species can facilitate the production of improved culture lines that can withstand raised salinity conditions, and so we have applied high-throughput Ion Torrent sequencing of transcriptome libraries from six target osmoregulatory organs from striped catfish as a genomic resource for use in future selection strategies. We obtained 12,177,770 reads after trimming and processing with an average length of 97bp. De novo assemblies were generated using CLC Genomic Workbench, Trinity and Velvet/Oases with the best overall contig performance resulting from the CLC assembly. De novo assembly using CLC yielded 66,451 contigs with an average length of 478bp and N50 length of 506bp. A total of 37,969 contigs (57%) possessed significant similarity with proteins in the non-redundant database. Comparative analyses revealed that a significant number of contigs matched sequences reported in other teleost fishes, ranging in similarity from 45.2% with Atlantic cod to 52% with zebrafish. In addition, 28,879 simple sequence repeats (SSRs) and 55,721 single nucleotide polymorphisms (SNPs) were detected in the striped catfish transcriptome. The sequence collection generated in the current study represents the most comprehensive genomic resource for P. hypophthalmus available to date. Our results illustrate the utility of next-generation sequencing as an efficient tool for constructing a large genomic database for marker development in non-model species.
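
    The abstract above reports an N50 without defining it. As a minimal sketch, assuming the usual definition (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled bases), the statistic can be computed as follows. The function name and the toy contig lengths are illustrative assumptions, not data or code from the study; only the contig counts in the final line come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Assumes the common definition: sort contigs from longest to shortest
    and return the length of the contig at which the running total first
    reaches at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy lengths, not taken from the catfish assembly.
toy_contigs = [100, 200, 300, 400, 500]
print(n50(toy_contigs))

# Share of contigs with a significant protein hit, using the counts
# reported in the abstract (37,969 of 66,451 contigs).
print(round(100 * 37969 / 66451, 1))
```

    Because the N50 depends on the full length distribution, it cannot be recovered from the summary figures in the abstract alone; the sketch only shows how the reported value would be derived from a list of contig lengths.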

  14. Genome and transcriptome analysis of the food-yeast Candida utilis.

    Directory of Open Access Journals (Sweden)

    Yasuyuki Tomita

    Full Text Available The industrially important food-yeast Candida utilis is a Crabtree effect-negative yeast used to produce valuable chemicals and recombinant proteins. In the present study, we conducted whole genome sequencing and phylogenetic analysis of C. utilis, which showed that this yeast diverged long before the formation of the CUG and Saccharomyces/Kluyveromyces clades. In addition, we performed comparative genome and transcriptome analyses using next-generation sequencing, which resulted in the identification of genes important for characteristic phenotypes of C. utilis such as those involved in nitrate assimilation, in addition to the gene encoding the functional hexose transporter. We also found that an antisense transcript of the alcohol dehydrogenase gene, which in silico analysis did not predict to be a functional gene, was transcribed in the stationary-phase, suggesting a novel system of repression of ethanol production. These findings should facilitate the development of more sophisticated systems for the production of useful reagents using C. utilis.

  15. Augmenting transcriptome assembly by combining de novo and genome-guided tools.

    Science.gov (United States)

    Jain, Prachi; Krishnan, Neeraja M; Panda, Binay

    2013-01-01

    Researchers interested in studying and constructing transcriptomes, especially for non-model species, face the conundrum of choosing from a number of available de novo and genome-guided assemblers. None of the popular assembly tools in use today achieve requisite sensitivity, specificity or recovery of full-length transcripts on their own. Here, we present a comprehensive comparative study of the performance of various assemblers. Additionally, we present an approach to combinatorially augment transcriptome assembly by using both de novo and genome-guided tools. In our study, we obtained the best recovery and most full-length transcripts with Trinity and TopHat1-Cufflinks, respectively. The sensitivity of the assembly and isoform recovery was superior, without compromising much on the specificity, when transcripts from Trinity were augmented with those from TopHat1-Cufflinks.

  16. De Novo Genome and Transcriptome Assembly of the Canadian Beaver (Castor canadensis)

    Science.gov (United States)

    Lok, Si; Paton, Tara A.; Wang, Zhuozhi; Kaur, Gaganjot; Walker, Susan; Yuen, Ryan K. C.; Sung, Wilson W. L.; Whitney, Joseph; Buchanan, Janet A.; Trost, Brett; Singh, Naina; Apresto, Beverly; Chen, Nan; Coole, Matthew; Dawson, Travis J.; Ho, Karen; Hu, Zhizhou; Pullenayegum, Sanjeev; Samler, Kozue; Shipstone, Arun; Tsoi, Fiona; Wang, Ting; Pereira, Sergio L.; Rostami, Pirooz; Ryan, Carol Ann; Tong, Amy Hin Yan; Ng, Karen; Sundaravadanam, Yogi; Simpson, Jared T.; Lim, Burton K.; Engstrom, Mark D.; Dutton, Christopher J.; Kerr, Kevin C. R.; Franke, Maria; Rapley, William; Wintle, Richard F.; Scherer, Stephen W.

    2017-01-01

    The Canadian beaver (Castor canadensis) is the largest indigenous rodent in North America. We report a draft annotated assembly of the beaver genome, the first for a large rodent and the first mammalian genome assembled directly from uncorrected and moderate coverage (< 30 ×) long reads generated by single-molecule sequencing. The genome size is 2.7 Gb estimated by k-mer analysis. We assembled the beaver genome using the new Canu assembler optimized for noisy reads. The resulting assembly was refined using Pilon supported by short reads (80 ×) and checked for accuracy by congruency against an independent short read assembly. We scaffolded the assembly using the exon–gene models derived from 9805 full-length open reading frames (FL-ORFs) constructed from the beaver leukocyte and muscle transcriptomes. The final assembly comprised 22,515 contigs with an N50 of 278,680 bp and an N50-scaffold of 317,558 bp. Maximum contig and scaffold lengths were 3.3 and 4.2 Mb, respectively, with a combined scaffold length representing 92% of the estimated genome size. The completeness and accuracy of the scaffold assembly was demonstrated by the precise exon placement for 91.1% of the 9805 assembled FL-ORFs and 83.1% of the BUSCO (Benchmarking Universal Single-Copy Orthologs) gene set used to assess the quality of genome assemblies. Well-represented were genes involved in dentition and enamel deposition, defining characteristics of rodents with which the beaver is well-endowed. The study provides insights for genome assembly and an important genomics resource for Castoridae and rodent evolutionary biology. PMID:28087693

  17. De Novo Genome and Transcriptome Assembly of the Canadian Beaver (Castor canadensis)

    Directory of Open Access Journals (Sweden)

    Si Lok

    2017-02-01

    Full Text Available The Canadian beaver (Castor canadensis) is the largest indigenous rodent in North America. We report a draft annotated assembly of the beaver genome, the first for a large rodent and the first mammalian genome assembled directly from uncorrected and moderate coverage (< 30 ×) long reads generated by single-molecule sequencing. The genome size is 2.7 Gb estimated by k-mer analysis. We assembled the beaver genome using the new Canu assembler optimized for noisy reads. The resulting assembly was refined using Pilon supported by short reads (80 ×) and checked for accuracy by congruency against an independent short read assembly. We scaffolded the assembly using the exon–gene models derived from 9805 full-length open reading frames (FL-ORFs) constructed from the beaver leukocyte and muscle transcriptomes. The final assembly comprised 22,515 contigs with an N50 of 278,680 bp and an N50-scaffold of 317,558 bp. Maximum contig and scaffold lengths were 3.3 and 4.2 Mb, respectively, with a combined scaffold length representing 92% of the estimated genome size. The completeness and accuracy of the scaffold assembly was demonstrated by the precise exon placement for 91.1% of the 9805 assembled FL-ORFs and 83.1% of the BUSCO (Benchmarking Universal Single-Copy Orthologs) gene set used to assess the quality of genome assemblies. Well-represented were genes involved in dentition and enamel deposition, defining characteristics of rodents with which the beaver is well-endowed. The study provides insights for genome assembly and an important genomics resource for Castoridae and rodent evolutionary biology.

  18. Dictyocaulus viviparus genome, variome and transcriptome elucidate lungworm biology and support future intervention.

    Science.gov (United States)

    McNulty, Samantha N; Strübe, Christina; Rosa, Bruce A; Martin, John C; Tyagi, Rahul; Choi, Young-Jun; Wang, Qi; Hallsworth Pepin, Kymberlie; Zhang, Xu; Ozersky, Philip; Wilson, Richard K; Sternberg, Paul W; Gasser, Robin B; Mitreva, Makedonka

    2016-02-09

    The bovine lungworm, Dictyocaulus viviparus (order Strongylida), is an important parasite of livestock that causes substantial economic and production losses worldwide. Here we report the draft genome, variome, and developmental transcriptome of D. viviparus. The genome (161 Mb) is smaller than those of related bursate nematodes and encodes fewer proteins (14,171 total). In the first genome-wide assessment of genomic variation in any parasitic nematode, we found a high degree of sequence variability in proteins predicted to be involved in host-parasite interactions. Next, we used extensive RNA sequence data to track gene transcription across the life cycle of D. viviparus, and identified genes that might be important in nematode development and parasitism. Finally, we predicted genes that could be vital in host-parasite interactions, genes that could serve as drug targets, and putative RNAi effectors with a view to developing functional genomic tools. This extensive, well-curated dataset should provide a basis for developing new anthelmintics, vaccines, and improved diagnostic tests and serve as a platform for future investigations of drug resistance and epidemiology of the bovine lungworm and related nematodes.

  19. Transcriptome analysis in Concholepas concholepas (Gastropoda, Muricidae): mining and characterization of new genomic and molecular markers.

    Science.gov (United States)

    Cárdenas, Leyla; Sánchez, Roland; Gomez, Daniela; Fuenzalida, Gonzalo; Gallardo-Escárate, Cristián; Tanguy, Arnaud

    2011-09-01

    The marine gastropod Concholepas concholepas, locally known as the "loco", is the main target species of the benthonic Chilean fisheries. Genetic and genomic tools are necessary to study the genome of this species in order to understand the molecular basis of its development, growth, and other key traits to improve the management strategies and to identify local adaptation to prevent loss of biodiversity. Here, we use pyrosequencing technologies to generate the first transcriptomic database from adult specimens of the loco. After trimming, a total of 140,756 Expressed Sequence Tag sequences were achieved. Clustering and assembly analysis identified 19,219 contigs and 105,435 singleton sequences. BlastN analysis showed a significant identity with Expressed Sequence Tags of different gastropod species available in public databases. Similarly, BlastX results showed that only 895 out of the total 124,654 had significant hits and may represent novel genes for marine gastropods. From this database, simple sequence repeat motifs were also identified and a total of 38 primer pairs were designed and tested to assess their potential as informative markers and to investigate their cross-species amplification in different related gastropod species. This dataset represents the first publicly available 454 data for a marine gastropod endemic to the southeastern Pacific coast, providing a valuable transcriptomic resource for future efforts of gene discovery and development of functional markers in other marine gastropods.

  20. The capsicum transcriptome DB: a “hot” tool for genomic research

    Science.gov (United States)

    Góngora-Castillo, Elsa; Fajardo-Jaime, Rubén; Fernández-Cortes, Araceli; Jofre-Garfias, Alba E; Lozoya-Gloria, Edmundo; Martínez, Octavio; Ochoa-Alejo, Neftalí; Rivera-Bustamante, Rafael

    2012-01-01

    Chili pepper (Capsicum annuum) is an economically important crop with no available public genome sequence. We describe a genomic resource to facilitate Capsicum annuum research. A collection of Expressed Sequence Tags (ESTs) derived from five C. annuum organs (root, stem, leaf, flower and fruit) were sequenced using the Sanger method and multiple leaf transcriptomes were deeply sampled using GS-pyrosequencing. A hybrid assembly of 1,324,516 raw reads yielded 32,314 high quality contigs as validated by coverage and identity analysis with existing pepper sequences. Overall, 75.5% of the contigs had significant sequence similarity to entries in nucleic acid and protein databases; 23% of the sequences have not been previously reported for C. annuum and expand sequence resources for this species. A MySQL database and a user-friendly Web interface were constructed with search-tools that permit queries of the ESTs including sequence, functional annotation, Gene Ontology classification, metabolic pathways, and assembly information. The Capsicum Transcriptome DB is freely available from http://www.bioingenios.ira.cinvestav.mx:81/Joomla/ PMID:22359434

  1. Genome and transcriptome sequences reveal the specific parasitism of the nematophagous Purpureocillium lilacinum 36-1

    Directory of Open Access Journals (Sweden)

    Jialian Xie

    2016-07-01

    Full Text Available Purpureocillium lilacinum is a promising nematophagous ascomycete able to adapt to diverse environments and it is also an opportunistic fungus that infects humans. A microbial inoculant of P. lilacinum has been registered to control plant parasitic nematodes. However, the molecular mechanism of the toxicological processes is still unclear because of the relatively few reports on the subject. In this study, using Illumina paired-end sequencing, the draft genome sequence and the transcriptome of P. lilacinum strain 36-1 infecting nematode-eggs were determined. Whole genome alignment indicated that P. lilacinum 36-1 possessed a more dynamic genome in comparison with P. lilacinum India strain. Moreover, a phylogenetic analysis showed that the P. lilacinum 36-1 had a closer relation to entomophagous fungi. The protein-coding genes in P. lilacinum 36-1 occurred much more frequently than they did in other fungi, which was a result of the depletion of repeat-induced point mutations (RIP). Comparative genome and transcriptome analyses revealed the genes that were involved in pathogenicity, particularly in the recognition, adhesion of nematode-eggs, downstream signal transduction pathways and hydrolase genes. By contrast, certain numbers of cellulose and xylan degradation genes and a lack of polysaccharide lyase genes showed the potential of P. lilacinum 36-1 as an endophyte. Notably, the expression of appressorium-formation and antioxidants-related genes exhibited similar infection patterns in P. lilacinum strain 36-1 to those of the model entomophagous fungi Metarhizium spp. These results uncovered the specific parasitism of P. lilacinum and presented the genes responsible for the infection of nematode-eggs.

  2. Large-scale Gene Ontology analysis of plant transcriptome-derived sequences retrieved by AFLP technology

    Directory of Open Access Journals (Sweden)

    Ramina Angelo

    2008-07-01

    Full Text Available Abstract Background: After 10 years of use of AFLP (Amplified Fragment Length Polymorphism) technology for DNA fingerprinting and mRNA profiling, large repertories of genome- and transcriptome-derived sequences are available in public databases for model, crop and tree species. AFLP marker systems have been and are being extensively exploited for genome scanning and gene mapping, as well as cDNA-AFLP for transcriptome profiling and differentially expressed gene cloning. The evaluation, annotation and classification of genomic markers and expressed transcripts would be of great utility for both functional genomics and systems biology research in plants. This may be achieved by means of the Gene Ontology (GO), consisting of three structured vocabularies (i.e. ontologies) describing genes, transcripts and proteins of any organism in terms of their associated cellular component, biological process and molecular function in a species-independent manner. In this paper, the functional annotation of about 8,000 AFLP-derived ESTs retrieved in the NCBI databases was carried out by using GO terminology. Results: Descriptive statistics on the type, size and nature of gene sequences obtained by means of AFLP technology were calculated. The gene products associated with mRNA transcripts were then classified according to the three main GO vocabularies. A comparison of the functional content of cDNA-AFLP records was also performed by splitting the sequence dataset into monocots and dicots and by comparing them to all annotated ESTs of Arabidopsis and rice, respectively. On the whole, the statistical parameters adopted for the in silico AFLP-derived transcriptome-anchored sequence analysis proved to be critical for obtaining reliable GO results. Such an exhaustive annotation may offer a suitable platform for functional genomics, particularly useful in non-model species. Conclusion: Reliable GO annotations of AFLP-derived sequences can be gathered through the optimization

  3. Genomic and transcriptomic insights into the efficient entomopathogenicity of Bacillus thuringiensis

    Science.gov (United States)

    Zhu, Lei; Peng, Donghai; Wang, Yueying; Ye, Weixing; Zheng, Jinshui; Zhao, Changming; Han, Dongmei; Geng, Ce; Ruan, Lifang; He, Jin; Yu, Ziniu; Sun, Ming

    2015-01-01

    Bacillus thuringiensis has been globally used as a microbial pesticide for over 70 years. However, information regarding its various adaptions and virulence factors and their roles in the entomopathogenic process remains limited. In this work, we present the complete genomes of two industrially patented Bacillus thuringiensis strains (HD-1 and YBT-1520). A comparative genomic analysis showed a larger and more complicated genome constitution that included novel insecticidal toxicity-related genes (ITRGs). All of the putative ITRGs were summarized according to the steps of infection. A comparative genomic analysis showed that highly toxic strains contained significantly more ITRGs, thereby providing additional strategies for infection, immune evasion, and cadaver utilization. Furthermore, a comparative transcriptomic analysis suggested that a high expression of these ITRGs was a key factor in efficient entomopathogenicity. We identified an active extra urease synthesis system in the highly toxic strains that may aid B. thuringiensis survival in insects (similar to previous results with well-known pathogens). Taken together, these results explain the efficient entomopathogenicity of B. thuringiensis. It provides novel insights into the strategies used by B. thuringiensis to resist and overcome host immune defenses and helps identify novel toxicity factors. PMID:26411888

  4. Genomic and transcriptomic insights into the efficient entomopathogenicity of Bacillus thuringiensis.

    Science.gov (United States)

    Zhu, Lei; Peng, Donghai; Wang, Yueying; Ye, Weixing; Zheng, Jinshui; Zhao, Changming; Han, Dongmei; Geng, Ce; Ruan, Lifang; He, Jin; Yu, Ziniu; Sun, Ming

    2015-09-28

    Bacillus thuringiensis has been globally used as a microbial pesticide for over 70 years. However, information regarding its various adaptions and virulence factors and their roles in the entomopathogenic process remains limited. In this work, we present the complete genomes of two industrially patented Bacillus thuringiensis strains (HD-1 and YBT-1520). A comparative genomic analysis showed a larger and more complicated genome constitution that included novel insecticidal toxicity-related genes (ITRGs). All of the putative ITRGs were summarized according to the steps of infection. A comparative genomic analysis showed that highly toxic strains contained significantly more ITRGs, thereby providing additional strategies for infection, immune evasion, and cadaver utilization. Furthermore, a comparative transcriptomic analysis suggested that a high expression of these ITRGs was a key factor in efficient entomopathogenicity. We identified an active extra urease synthesis system in the highly toxic strains that may aid B. thuringiensis survival in insects (similar to previous results with well-known pathogens). Taken together, these results explain the efficient entomopathogenicity of B. thuringiensis. It provides novel insights into the strategies used by B. thuringiensis to resist and overcome host immune defenses and helps identify novel toxicity factors.

  5. Genome-wide binding and transcriptome analysis of human farnesoid X receptor in primary human hepatocytes.

    Directory of Open Access Journals (Sweden)

    Le Zhan

    Full Text Available Farnesoid X receptor (FXR, NR1H4) is a ligand-activated transcription factor belonging to the nuclear receptor superfamily. FXR is highly expressed in the liver and is essential in regulating bile acid homeostasis. FXR deficiency is implicated in numerous liver diseases and mice with modulation of FXR have been used as animal models to study liver physiology and pathology. We have reported genome-wide binding of FXR in mice by chromatin immunoprecipitation - deep sequencing (ChIP-seq), with results indicating that FXR may be involved in regulating diverse pathways in liver. However, limited information exists for the functions of human FXR and the suitability of using murine models to study human FXR functions. In the current study, we performed ChIP-seq in primary human hepatocytes (PHHs) treated with a synthetic FXR agonist, GW4064, or DMSO control. In parallel, RNA deep sequencing (RNA-seq) and RNA microarray were performed for GW4064 or control treated PHHs and wild type mouse livers, respectively. ChIP-seq showed similar profiles of genome-wide FXR binding in humans and mice in terms of motif analysis and pathway prediction. However, RNA-seq and microarray showed more divergent transcriptome profiles between PHHs and mouse livers upon GW4064 treatment. In summary, we have established genome-wide human FXR binding and transcriptome profiles. These results will aid in determining the human FXR functions, as well as judging to what level the mouse models could be used to study human FXR functions.

  6. Draft assembly of elite inbred line PH207 provides insights into genomic and transcriptome diversity in maize

    Science.gov (United States)

    Intense artificial selection over the last 100 years has produced elite maize (Zea mays) inbred lines that combine to produce high-yielding hybrids. To further our understanding of how genome and transcriptome variation contribute to the production of high-yielding hybrids, we generated a draft geno...

  7. Analysis of the Legionella longbeachae genome and transcriptome uncovers unique strategies to cause Legionnaires' disease.

    Directory of Open Access Journals (Sweden)

    Christel Cazalet

    2010-02-01

    Full Text Available Legionella pneumophila and L. longbeachae are two species of a large genus of bacteria that are ubiquitous in nature. L. pneumophila is mainly found in natural and artificial water circuits while L. longbeachae is mainly present in soil. Under the appropriate conditions both species are human pathogens, capable of causing a severe form of pneumonia termed Legionnaires' disease. Here we report the sequencing and analysis of four L. longbeachae genomes, one complete genome sequence of L. longbeachae strain NSW150 serogroup (Sg) 1, and three draft genome sequences, another belonging to Sg1 and two to Sg2. The genome organization and gene content of the four L. longbeachae genomes are highly conserved, indicating strong pressure for niche adaptation. Analysis and comparison of L. longbeachae strain NSW150 with L. pneumophila revealed common but also unexpected features specific to this pathogen. The interaction with host cells shows distinct features from L. pneumophila, as L. longbeachae possesses a unique repertoire of putative Dot/Icm type IV secretion system substrates, eukaryotic-like and eukaryotic domain proteins, and encodes additional secretion systems. However, analysis of the ability of a dotA mutant of L. longbeachae NSW150 to replicate in Acanthamoeba castellanii and in a mouse lung infection model showed that the Dot/Icm type IV secretion system is also essential for the virulence of L. longbeachae. In contrast to L. pneumophila, L. longbeachae does not encode flagella, thereby providing a possible explanation for differences in mouse susceptibility to infection between the two pathogens. Furthermore, transcriptome analysis revealed that L. longbeachae has a less pronounced biphasic life cycle as compared to L. pneumophila, and genome analysis and electron microscopy suggested that L. longbeachae is encapsulated. These species-specific differences may account for the different environmental niches and disease epidemiology of these

  8. Genome scale metabolic modeling of cancer

    DEFF Research Database (Denmark)

    Nilsson, Avlant; Nielsen, Jens

    2016-01-01

    Cancer cells reprogram metabolism to support rapid proliferation and survival. Energy metabolism is particularly important for growth and genes encoding enzymes involved in energy metabolism are frequently altered in cancer cells. A genome scale metabolic model (GEM) is a mathematical formalization of metabolism which allows simulation and hypotheses testing of metabolic strategies. It has successfully been applied to many microorganisms and is now used to study cancer metabolism. Generic models of human metabolism have been reconstructed based on the existence of metabolic genes in the human genome... been used as scaffolds for analysis of high throughput data to allow mechanistic interpretation of changes in expression. Finally, GEMs allow quantitative flux predictions using flux balance analysis (FBA). Here we critically review the requirements for successful FBA simulations of cancer cells...

  9. Talaromyces marneffei Genomic, Transcriptomic, Proteomic and Metabolomic Studies Reveal Mechanisms for Environmental Adaptations and Virulence

    Directory of Open Access Journals (Sweden)

    Susanna K. P. Lau

    2017-06-01

    Full Text Available Talaromyces marneffei is a thermally dimorphic fungus causing systemic infections in patients positive for HIV or other immunocompromised statuses. Analysis of its ~28.9 Mb draft genome and additional transcriptomic, proteomic and metabolomic studies revealed mechanisms for environmental adaptations and virulence. Meiotic genes and genes for pheromone receptors, enzymes which process pheromones, and proteins involved in pheromone response pathway are present, suggesting that it may be a heterothallic fungus. Among the 14 Mp1p homologs, only Mp1p is a virulence factor binding a variety of host proteins, fatty acids and lipids. There are 23 polyketide synthase genes, one for melanin and two for mitorubrinic acid/mitorubrinol biosynthesis, which are virulence factors. Another polyketide synthase is for biogenesis of the diffusible red pigment, which consists of amino acid conjugates of monascorubin and rubropunctatin. Novel microRNA-like RNAs (milRNAs) and processing proteins are present. The dicer protein, dcl-2, is required for biogenesis of two milRNAs, PM-milR-M1 and PM-milR-M2, which are more highly expressed in hyphal cells. Comparative transcriptomics showed that tandem repeat-containing genes were overexpressed in yeast phase, generating protein polymorphism among cells and thereby evading host immunity. Comparative proteomics between yeast and hyphal cells revealed that glyceraldehyde-3-phosphate dehydrogenase, up-regulated in hyphal cells, is an adhesion factor for conidial attachment.

  10. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups.

    Science.gov (United States)

    Curtis, Christina; Shah, Sohrab P; Chin, Suet-Feung; Turashvili, Gulisa; Rueda, Oscar M; Dunning, Mark J; Speed, Doug; Lynch, Andy G; Samarajiwa, Shamith; Yuan, Yinyin; Gräf, Stefan; Ha, Gavin; Haffari, Gholamreza; Bashashati, Ali; Russell, Roslin; McKinney, Steven; Langerød, Anita; Green, Andrew; Provenzano, Elena; Wishart, Gordon; Pinder, Sarah; Watson, Peter; Markowetz, Florian; Murphy, Leigh; Ellis, Ian; Purushotham, Arnie; Børresen-Dale, Anne-Lise; Brenton, James D; Tavaré, Simon; Caldas, Carlos; Aparicio, Samuel

    2012-04-18

    The elucidation of breast cancer subgroups and their molecular drivers requires integrated views of the genome and transcriptome from representative numbers of patients. We present an integrated analysis of copy number and gene expression in a discovery and validation set of 997 and 995 primary breast tumours, respectively, with long-term clinical follow-up. Inherited variants (copy number variants and single nucleotide polymorphisms) and acquired somatic copy number aberrations (CNAs) were associated with expression in ~40% of genes, with the landscape dominated by cis- and trans-acting CNAs. By delineating expression outlier genes driven in cis by CNAs, we identified putative cancer genes, including deletions in PPP2R2A, MTAP and MAP2K4. Unsupervised analysis of paired DNA–RNA profiles revealed novel subgroups with distinct clinical outcomes, which reproduced in the validation cohort. These include a high-risk, oestrogen-receptor-positive 11q13/14 cis-acting subgroup and a favourable prognosis subgroup devoid of CNAs. Trans-acting aberration hotspots were found to modulate subgroup-specific gene networks, including a TCR deletion-mediated adaptive immune response in the ‘CNA-devoid’ subgroup and a basal-specific chromosome 5 deletion-associated mitotic network. Our results provide a novel molecular stratification of the breast cancer population, derived from the impact of somatic CNAs on the transcriptome.

  11. Genome-wide transcriptome analysis revealed organelle specific responses to temperature variations in algae

    Science.gov (United States)

    Shin, HyeonSeok; Hong, Seong-Joo; Yoo, Chan; Han, Mi-Ae; Lee, Hookeun; Choi, Hyung-Kyoon; Cho, Suhyung; Lee, Choul-Gyun; Cho, Byung-Kwan

    2016-01-01

    Temperature is a critical environmental factor that affects microalgal growth. However, microalgal coping mechanisms for temperature variations are unclear. Here, we determined changes in transcriptome, total carbohydrate, total fatty acid methyl ester, and fatty acid composition of Tetraselmis sp. KCTC12432BP, a strain with a broad temperature tolerance range, to elucidate the tolerance mechanisms in response to large temperature variations. Owing to unavailability of genome sequence information, de novo transcriptome assembly coupled with BLAST analysis was performed using strand specific RNA-seq data. This resulted in 26,245 protein-coding transcripts, of which 83.7% could be annotated to putative functions. We identified more than 681 genes differentially expressed, suggesting an organelle-specific response to temperature variation. Among these, the genes related to the photosynthetic electron transfer chain, which are localized in the plastid thylakoid membrane, were upregulated at low temperature. However, the transcripts related to the electron transport chain and biosynthesis of phosphatidylethanolamine localized in mitochondria were upregulated at high temperature. These results show that the low energy uptake by repressed photosynthesis under low and high temperature conditions is compensated by different mechanisms, including photosystem I and mitochondrial oxidative phosphorylation, respectively. This study illustrates that microalgae tolerate different temperature conditions through organelle specific mechanisms. PMID:27883062

  12. Genome-based analysis of the transcriptome from mature chickpea root nodules

    Directory of Open Access Journals (Sweden)

    Fabian Afonso-Grunz

    2014-07-01

    Full Text Available Symbiotic nitrogen fixation (SNF) in root nodules of grain legumes such as chickpea is a highly complex process that drastically affects the gene expression patterns of both the prokaryotic as well as eukaryotic interacting cells. A successfully established symbiotic relationship requires mutual signaling mechanisms and a continuous adaptation of the metabolism of the involved cells to varying environmental conditions. Although some of these processes are well understood today, many of the molecular mechanisms underlying SNF, especially in chickpea, remain unclear. Here, we reannotated our previously published transcriptome data generated by deepSuperSAGE (Serial Analysis of Gene Expression) to the recently published draft genome of chickpea to assess the root- and nodule-specific transcriptomes of the eukaryotic host cells. The identified gene expression patterns comprise up to 71 significantly differentially expressed genes and the expression of twenty of these was validated by quantitative real-time PCR with the tissues from five independent biological replicates. Many of the differentially expressed transcripts were found to encode proteins implicated in sugar metabolism, antioxidant defense as well as biotic and abiotic stress responses of the host cells, and some of them were already known to contribute to SNF in other legumes. The differentially expressed genes identified in this study represent candidates that can be used for further characterization of the complex molecular mechanisms underlying SNF in chickpea.

  13. Integration of expression data in genome-scale metabolic network reconstructions

    Directory of Open Access Journals (Sweden)

    Anna S. Blazier

    2012-08-01

    Full Text Available With the advent of high-throughput technologies, the field of systems biology has amassed an abundance of omics data, quantifying thousands of cellular components across a variety of scales, ranging from mRNA transcript levels to metabolite quantities. Methods are needed to not only integrate this omics data but to also use this data to heighten the predictive capabilities of computational models. Several recent studies have successfully demonstrated how flux balance analysis (FBA), a constraint-based modeling approach, can be used to integrate transcriptomic data into genome-scale metabolic network reconstructions to generate predictive computational models. In this review, we summarize such FBA-based methods for integrating expression data into genome-scale metabolic network reconstructions, highlighting their advantages as well as their limitations.

  14. Comprehensive Comparative Genomic and Transcriptomic Analyses of the Legume Genes Controlling the Nodulation Process.

    Science.gov (United States)

    Qiao, Zhenzhen; Pingault, Lise; Nourbakhsh-Rey, Mehrnoush; Libault, Marc

    2016-01-01

    Nitrogen is one of the most essential plant nutrients and one of the major factors limiting crop productivity. To make agriculture more sustainable, there is a need to maximize biological nitrogen fixation, a feature of legumes. To enhance our understanding of the molecular mechanisms controlling the interaction between legumes and rhizobia, the symbiotic partner fixing and assimilating the atmospheric nitrogen for the plant, researchers took advantage of genetic and genomic resources developed across different legume models (e.g., Medicago truncatula, Lotus japonicus, Glycine max, and Phaseolus vulgaris) to identify key regulatory protein coding genes of the nodulation process. In this study, we present the results of a comprehensive comparative genomic analysis to highlight orthologous and paralogous relationships between the legume genes controlling nodulation. Mining large transcriptomic datasets, we also identified several orthologous and paralogous genes characterized by the induction of their expression during nodulation across legume plant species. This comprehensive study prompts new insights into the evolution of the nodulation process in legume plants and will benefit the scientific community interested in the transfer of functional genomic information between species.

  15. Genome and Transcriptome of Clostridium phytofermentans, Catalyst for the Direct Conversion of Plant Feedstocks to Fuels

    Science.gov (United States)

    Petit, Elsa; Coppi, Maddalena V.; Hayes, James C.; Tolonen, Andrew C.; Warnick, Thomas; Latouf, William G.; Amisano, Danielle; Biddle, Amy; Mukherjee, Supratim; Ivanova, Natalia; Lykidis, Athanassios; Land, Miriam; Hauser, Loren; Kyrpides, Nikos; Henrissat, Bernard; Lau, Joanne; Schnell, Danny J.; Church, George M.; Leschine, Susan B.; Blanchard, Jeffrey L.

    2015-01-01

    Clostridium phytofermentans was isolated from forest soil and is distinguished by its capacity to directly ferment plant cell wall polysaccharides into ethanol as the primary product, suggesting that it possesses unusual catabolic pathways. The objective of the present study was to understand the molecular mechanisms of biomass conversion to ethanol in a single organism, Clostridium phytofermentans, by analyzing its complete genome and transcriptome during growth on plant carbohydrates. The saccharolytic versatility of C. phytofermentans is reflected in a diversity of genes encoding ATP-binding cassette sugar transporters and glycoside hydrolases, many of which may have been acquired through horizontal gene transfer. These genes are frequently organized as operons that may be controlled individually by the many transcriptional regulators identified in the genome. Preferential ethanol production may be due to high levels of expression of multiple ethanol dehydrogenases and additional pathways maximizing ethanol yield. The genome also encodes three different proteinaceous bacterial microcompartments with the capacity to compartmentalize pathways that divert fermentation intermediates to various products. These characteristics make C. phytofermentans an attractive resource for improving the efficiency and speed of biomass conversion to biofuels. PMID:26035711

  16. Genome and Transcriptome of Clostridium phytofermentans, Catalyst for the Direct Conversion of Plant Feedstocks to Fuels.

    Directory of Open Access Journals (Sweden)

    Elsa Petit

    Full Text Available Clostridium phytofermentans was isolated from forest soil and is distinguished by its capacity to directly ferment plant cell wall polysaccharides into ethanol as the primary product, suggesting that it possesses unusual catabolic pathways. The objective of the present study was to understand the molecular mechanisms of biomass conversion to ethanol in a single organism, Clostridium phytofermentans, by analyzing its complete genome and transcriptome during growth on plant carbohydrates. The saccharolytic versatility of C. phytofermentans is reflected in a diversity of genes encoding ATP-binding cassette sugar transporters and glycoside hydrolases, many of which may have been acquired through horizontal gene transfer. These genes are frequently organized as operons that may be controlled individually by the many transcriptional regulators identified in the genome. Preferential ethanol production may be due to high levels of expression of multiple ethanol dehydrogenases and additional pathways maximizing ethanol yield. The genome also encodes three different proteinaceous bacterial microcompartments with the capacity to compartmentalize pathways that divert fermentation intermediates to various products. These characteristics make C. phytofermentans an attractive resource for improving the efficiency and speed of biomass conversion to biofuels.

  17. Genome, Transcriptome, and Functional Analyses of Penicillium expansum Provide New Insights Into Secondary Metabolism and Pathogenicity.

    Science.gov (United States)

    Ballester, Ana-Rosa; Marcet-Houben, Marina; Levin, Elena; Sela, Noa; Selma-Lázaro, Cristina; Carmona, Lourdes; Wisniewski, Michael; Droby, Samir; González-Candelas, Luis; Gabaldón, Toni

    2015-03-01

    The relationship between secondary metabolism and infection in pathogenic fungi has remained largely elusive. The genus Penicillium comprises a group of plant pathogens with varying host specificities and with the ability to produce a wide array of secondary metabolites. The genomes of three Penicillium expansum strains, the main postharvest pathogen of pome fruit, and one Penicillium italicum strain, a postharvest pathogen of citrus fruit, were sequenced and compared with 24 other fungal species. A genomic analysis of gene clusters responsible for the production of secondary metabolites was performed. Putative virulence factors in P. expansum were identified by means of a transcriptomic analysis of apple fruits during the course of infection. Despite a major genome contraction, P. expansum is the Penicillium species with the largest potential for the production of secondary metabolites. Results using knockout mutants clearly demonstrated that neither patulin nor citrinin is required by P. expansum to successfully infect apples. Li et al. ( MPMI-12-14-0398-FI ) reported similar results and conclusions in their recently accepted paper.

  18. An integrated genomic and transcriptomic survey of mucormycosis-causing fungi

    Science.gov (United States)

    Chibucos, Marcus C.; Soliman, Sameh; Gebremariam, Teclegiorgis; Lee, Hongkyu; Daugherty, Sean; Orvis, Joshua; Shetty, Amol C.; Crabtree, Jonathan; Hazen, Tracy H.; Etienne, Kizee A.; Kumari, Priti; O'Connor, Timothy D.; Rasko, David A.; Filler, Scott G.; Fraser, Claire M.; Lockhart, Shawn R.; Skory, Christopher D.; Ibrahim, Ashraf S.; Bruno, Vincent M.

    2016-01-01

    Mucormycosis is a life-threatening infection caused by Mucorales fungi. Here we sequence 30 fungal genomes, and perform transcriptomics with three representative Rhizopus and Mucor strains and with human airway epithelial cells during fungal invasion, to reveal key host and fungal determinants contributing to pathogenesis. Analysis of the host transcriptional response to Mucorales reveals platelet-derived growth factor receptor B (PDGFRB) signaling as part of a core response to divergent pathogenic fungi; inhibition of PDGFRB reduces Mucorales-induced damage to host cells. The unique presence of CotH invasins in all invasive Mucorales, and the correlation between CotH gene copy number and clinical prevalence, are consistent with an important role for these proteins in mucormycosis pathogenesis. Our work provides insight into the evolution of this medically and economically important group of fungi, and identifies several molecular pathways that might be exploited as potential therapeutic targets. PMID:27447865

  19. Comprehensive transcriptome and improved genome annotation of Bacillus licheniformis WX-02.

    Science.gov (United States)

    Guo, Jing; Cheng, Gang; Gou, Xiang-Yong; Xing, Feng; Li, Sen; Han, Yi-Chao; Wang, Long; Song, Jia-Ming; Shu, Cheng-Cheng; Chen, Shou-Wen; Chen, Ling-Ling

    2015-08-19

    The updated genome of Bacillus licheniformis WX-02 comprises a circular chromosome of 4286821 base-pairs containing 4512 protein-coding genes. We applied strand-specific RNA-sequencing to explore the transcriptome profiles of B. licheniformis WX-02 under normal and high-salt conditions (NaCl 6%). We identified 2381 co-expressed gene pairs constituting 871 operon structures. In addition, 1169 antisense transcripts and 90 small RNAs were detected. Systematic comparison of differentially expressed genes under different conditions revealed that genes involved in multiple functions were significantly repressed in long-term high salt adaptation process. Genes related to promotion of glutamic acid synthesis were activated by 6% NaCl, potentially explaining the high yield of γ-PGA under salt condition. This study will be useful for the optimization of crucial metabolic activities in this bacterium.

  20. Genome-wide transcriptome and proteome analysis on different developmental stages of Cordyceps militaris.

    Directory of Open Access Journals (Sweden)

    Yalin Yin

    Full Text Available BACKGROUND: Cordyceps militaris, an ascomycete caterpillar fungus, has been used as a traditional Chinese medicine for many years owing to its anticancer and immunomodulatory activities. Currently, artificial culturing of this beneficial fungus has been widely used and can meet the market, but systematic molecular studies on the developmental stages of cultured C. militaris at transcriptional and translational levels have not been determined. METHODOLOGY/PRINCIPAL FINDINGS: We utilized high-throughput Illumina sequencing to obtain the transcriptomes of C. militaris mycelium and fruiting body. All clean reads were mapped to C. militaris genome and most of the reads showed perfect coverage. Alternative splicing and novel transcripts were predicted to enrich the database. Gene expression analysis revealed that 2,113 genes were up-regulated in mycelium and 599 in fruiting body. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis were performed to analyze the genes with expression differences. Moreover, the putative cordycepin metabolism difference between different developmental stages was studied. In addition, the proteome data of mycelium and fruiting body were obtained by one-dimensional gel electrophoresis (1-DGE) coupled with nano-electrospray ionization liquid chromatography tandem mass spectrometry (nESI-LC-MS/MS). 359 and 214 proteins were detected from mycelium and fruiting body respectively. GO, KEGG and Cluster of Orthologous Groups (COG) analysis were further conducted to better understand their difference. We analyzed the amounts of some noteworthy proteins in these two samples including lectin, superoxide dismutase, glycoside hydrolase and proteins involved in cordycepin metabolism, providing important information for further protein studies. CONCLUSIONS/SIGNIFICANCE: The results reveal the difference in gene expression between the mycelium and fruiting body of artificially cultivated C. militaris by transcriptome

  1. Genome-Wide Transcriptome and Proteome Analysis on Different Developmental Stages of Cordyceps militaris

    Science.gov (United States)

    Yin, Yalin; Yu, Guojun; Chen, Yijie; Jiang, Shuai; Wang, Man; Jin, Yanxia; Lan, Xianqing; Liang, Yi; Sun, Hui

    2012-01-01

    Background: Cordyceps militaris, an ascomycete caterpillar fungus, has been used as a traditional Chinese medicine for many years owing to its anticancer and immunomodulatory activities. Currently, artificial culturing of this beneficial fungus has been widely used and can meet the market, but systematic molecular studies on the developmental stages of cultured C. militaris at transcriptional and translational levels have not been determined. Methodology/Principal Findings: We utilized high-throughput Illumina sequencing to obtain the transcriptomes of C. militaris mycelium and fruiting body. All clean reads were mapped to C. militaris genome and most of the reads showed perfect coverage. Alternative splicing and novel transcripts were predicted to enrich the database. Gene expression analysis revealed that 2,113 genes were up-regulated in mycelium and 599 in fruiting body. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis were performed to analyze the genes with expression differences. Moreover, the putative cordycepin metabolism difference between different developmental stages was studied. In addition, the proteome data of mycelium and fruiting body were obtained by one-dimensional gel electrophoresis (1-DGE) coupled with nano-electrospray ionization liquid chromatography tandem mass spectrometry (nESI-LC-MS/MS). 359 and 214 proteins were detected from mycelium and fruiting body respectively. GO, KEGG and Cluster of Orthologous Groups (COG) analysis were further conducted to better understand their difference. We analyzed the amounts of some noteworthy proteins in these two samples including lectin, superoxide dismutase, glycoside hydrolase and proteins involved in cordycepin metabolism, providing important information for further protein studies. Conclusions/Significance: The results reveal the difference in gene expression between the mycelium and fruiting body of artificially cultivated C. militaris by transcriptome and proteome

  2. Improved Evidence-Based Genome-scale Metabolic Models for Maize Leaf, Embryo, and Endosperm

    Energy Technology Data Exchange (ETDEWEB)

    Seaver, Samuel M.D.; Frelin, Oceane; Bradbury, Louis M.T.; Zarecki, Raphy; Ruppin, Eytan; Hanson, Andrew D.; Henry, Christopher S.

    2015-03-10

    There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for analysis of transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.

  3. Improved Evidence-Based Genome-scale Metabolic Models for Maize Leaf, Embryo, and Endosperm.

    Directory of Open Access Journals (Sweden)

    Samuel Seaver

    2015-03-01

    Full Text Available There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for analysis of transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.

  4. Comparative study of de novo assembly and genome-guided assembly strategies for transcriptome reconstruction based on RNA-Seq.

    Science.gov (United States)

    Lu, Bingxin; Zeng, Zhenbing; Shi, Tieliu

    2013-02-01

    Transcriptome reconstruction is an important application of RNA-Seq, providing critical information for further analysis of the transcriptome. Although RNA-Seq offers the potential to reveal the whole picture of the transcriptome, it still presents special challenges. To handle these difficulties and reconstruct the transcriptome as completely as possible, current computational approaches mainly employ two strategies: de novo assembly and genome-guided assembly. In order to find the similarities and differences between them, we first chose five representative assemblers from the two classes, and then investigated and compared their algorithm features in theory and real performances in practice. We found that all the methods can be reduced to graph reduction problems, yet they have different conceptual and practical implementations; thus each assembly method has its specific advantages and disadvantages, performing worse than others in certain aspects while outperforming others in other aspects at the same time. Finally, we merged assemblies of the five assemblers and obtained a much better assembly. Additionally, we evaluated an assembler that uses a genome-guided de novo assembly approach, and achieved good performance. Based on these results, we suggest that to obtain a comprehensive set of recovered transcripts, it is better to use a combination of de novo assembly and genome-guided assembly.

  5. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    Energy Technology Data Exchange (ETDEWEB)

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. 
    A coordinated sequencing effort of cultured organisms is an appropriate place to begin

  6. Genome and transcriptome analysis of the basidiomycetous yeast Pseudozyma antarctica producing extracellular glycolipids, mannosylerythritol lipids.

    Directory of Open Access Journals (Sweden)

    Tomotake Morita

    Full Text Available Pseudozyma antarctica is a non-pathogenic phyllosphere yeast known as an excellent producer of mannosylerythritol lipids (MELs), multi-functional extracellular glycolipids, from vegetable oils. To clarify the genetic characteristics of P. antarctica, we analyzed the 18 Mb genome of P. antarctica T-34. On the basis of KOG analysis, the number of genes (219 genes) categorized into the lipid transport and metabolism classification in P. antarctica was one and a half times larger than that of the yeast Saccharomyces cerevisiae (140 genes). The gene encoding an ATP/citrate lyase (ACL) related to acetyl-CoA synthesis, conserved in oleaginous strains, was found in the P. antarctica genome: the single ACL gene possesses the four domains identical to those of the human gene, whereas the other oleaginous ascomycetous species have two genes covering the four domains. The P. antarctica genome exhibited a remarkable degree of synteny to the U. maydis genome; however, the comparison of the gene expression profiles under culture on the two carbon sources, glucose and soybean oil, by the DNA microarray method revealed that the transcriptomes of the two species were significantly different. In P. antarctica, expression of the gene sets related to fatty acid metabolism was markedly up-regulated under the oily conditions compared with glucose. Additionally, the MEL biosynthesis cluster of P. antarctica was highly expressed regardless of the carbon source as compared to U. maydis. These results strongly indicate that P. antarctica has an oleaginous nature which is relevant to its non-pathogenic and MEL-overproducing characteristics. The analysis and dataset help stimulate the development of improved strains with customized properties for high yield production of functional bio-based materials.

  7. Large-Scale Transcriptome Analysis of Cucumber and Botrytis cinerea during Infection.

    Directory of Open Access Journals (Sweden)

    Weiwen Kong

    Full Text Available Cucumber gray mold caused by Botrytis cinerea is considered one of the most serious cucumber diseases. With the advent of HiSeq technology, it is possible to study the plant-pathogen interaction at the transcriptome level. To the best of our knowledge, this is the first application of RNA-seq to identify cucumber and B. cinerea differentially expressed genes (DEGs) before and after the plant-pathogen interaction. In total, 248,908,688 raw reads were generated; after removing low-quality reads and those containing adapter and poly-N, 238,341,648 clean reads remained for mapping to the reference genome. There were 3,512 cucumber DEGs and 1,735 B. cinerea DEGs. GO enrichment and KEGG enrichment analysis were performed on these DEGs to study the interaction between cucumber and B. cinerea. To verify the reliability and accuracy of our transcriptome data, 5 cucumber DEGs and 5 B. cinerea DEGs were chosen for RT-PCR verification. This is the first systematic transcriptome analysis of components related to the B. cinerea-cucumber interaction. Functional genes and putative pathways identified herein will increase our understanding of the mechanism of the pathogen-host interaction.

  8. Genome-wide primary transcriptome analysis of H2-producing archaeon Thermococcus onnurineus NA1

    Science.gov (United States)

    Cho, Suhyung; Kim, Min-Sik; Jeong, Yujin; Lee, Bo-Rahm; Lee, Jung-Hyun; Kang, Sung Gyun; Cho, Byung-Kwan

    2017-01-01

    In spite of their pivotal roles in transcriptional and post-transcriptional processes, the regulatory elements of archaeal genomes are not yet fully understood. Here, we determine the primary transcriptome of the H2-producing archaeon Thermococcus onnurineus NA1. We identified 1,082 purine-rich transcription initiation sites along with well-conserved TATA box, A-rich B recognition element (BRE), and promoter proximal element (PPE) motif in promoter regions, a high pyrimidine nucleotide content (T/C) at the −1 position, and Shine-Dalgarno (SD) motifs (GGDGRD) in 5′ untranslated regions (5′ UTRs). Along with differential transcript levels, 117 leaderless genes and 86 non-coding RNAs (ncRNAs) were identified, representing diverse cellular functions and potential regulatory functions under the different growth conditions. Interestingly, we observed low GC content in ncRNAs for RNA-based regulation via unstructured forms or interaction with other cellular components. Further comparative analysis of T. onnurineus upstream regulatory sequences with those of closely related archaeal genomes demonstrated that transcription of orthologous genes is initiated by highly conserved promoter sequences; however, their upstream sequences for transcriptional and translational regulation are largely diverse. These results provide the genetic information of T. onnurineus for its future application in metabolic engineering. PMID:28216628

  9. Population genomic footprints of fine-scale differentiation between habitats in Mediterranean blue tits.

    Science.gov (United States)

    Szulkin, M; Gagnaire, P-A; Bierne, N; Charmantier, A

    2016-01-01

    Linking population genetic variation to the spatial heterogeneity of the environment is of fundamental interest to evolutionary biology and ecology, in particular when phenotypic differences between populations are observed at biologically small spatial scales. Here, we applied restriction-site associated DNA sequencing (RAD-Seq) to test whether phenotypically differentiated populations of wild blue tits (Cyanistes caeruleus) breeding in a highly heterogeneous environment exhibit genetic structure related to habitat type. Using 12 106 SNPs in 197 individuals from deciduous and evergreen oak woodlands, we applied complementary population genomic analyses, which revealed that genetic variation is influenced by both geographical distance and habitat type. A fine-scale genetic differentiation supported by genome- and transcriptome-wide analyses was found within Corsica, between two adjacent habitats where blue tits exhibit marked differences in breeding time while nesting < 6 km apart. Using redundancy analysis (RDA), we show that genomic variation remains associated with habitat type when controlling for spatial and temporal effects. Finally, our results suggest that the observed patterns of genomic differentiation were not driven by a small proportion of highly differentiated loci, but rather emerged through a process such as habitat choice, which reduces gene flow between habitats across the entire genome. The pattern of genomic isolation-by-environment closely matches differentiation observed at the phenotypic level, thereby offering significant potential for future inference of phenotype-genotype associations in a heterogeneous environment.

  10. Large-scale transcriptome analyses reveal new genetic marker candidates of head, neck, and thyroid cancer

    DEFF Research Database (Denmark)

    Reis, Eduardo M; Ojopi, Elida P B; Alberto, Fernando L

    2005-01-01

    A detailed genome mapping analysis of 213,636 expressed sequence tags (EST) derived from nontumor and tumor tissues of the oral cavity, larynx, pharynx, and thyroid was done. Transcripts matching known human genes were identified; potential new splice variants were flagged and subjected to manual...... amplification was selected by identifying transcripts that mapped to genomic regions previously known to be frequently amplified or deleted in head, neck, and thyroid tumors. Three of these markers were evaluated by quantitative reverse transcription-PCR in an independent set of individual samples. Along...... with detailed clinical data about tumor origin, the information reported here is now publicly available on a dedicated Web site as a resource for further biological investigation. This first in silico reconstruction of the head, neck, and thyroid transcriptomes points to a wealth of new candidate markers...

  11. Genomic resources for the brown planthopper, Nilaparvata lugens: Transcriptome pyrosequencing and microarray design

    Institute of Scientific and Technical Information of China (English)

    Chris Bass; Martin Bay Hebsgaard; Joseph Hughes

    2012-01-01

    The brown planthopper, Nilaparvata lugens, is a pest of cultivated rice throughout Asia and is controlled using insecticides and/or resistant rice varieties. This species has developed resistance to many classes of insecticide and biotypes have developed that are virulent against formerly resistant rice cultivars. Insects use a suite of detoxification enzymes, including cytochrome P450s, glutathione S-transferases and carboxyl/cholinesterases to defend themselves against plant secondary metabolites and pesticides. Pyrosequencing on the Roche 454-FLX platform was used to produce a substantial expressed sequence tag (EST) dataset to complement the existing Sanger sequenced ESTs in GenBank. A total of 78 959 reads were combined with the 37 392 publicly available Sanger ESTs; these assembled into 8 911 contigs and 10 620 singletons. Analysis of the distribution of tentative unique genes (TUGs) with the gene ontology for biological processes and molecular functions suggests that the 454 and Sanger EST assembly is broadly representative of the N. lugens transcriptome. The brown planthopper transcriptome was found to contain 31 TUGs encoding P450s, nine encoding glutathione S-transferases and 26 encoding carboxyl/cholinesterases and many of these are putatively involved in the detoxification of xenobiotics. The Agilent eArray platform was used to construct an oligonucleotide microarray populated with probes for ~19 000 unigene sequences, including all those known to encode detoxification enzymes. The genomic resources developed in this study will be useful to the community studying this crop pest and will help elucidate the molecular mechanism underlying insecticide resistance and planthopper adaptation to resistant rice cultivars.

  12. Coding SNPs as intrinsic markers for sample tracking in large-scale transcriptome studies.

    Science.gov (United States)

    Xu, Weihong; Gao, Hong; Seok, Junhee; Wilhelmy, Julie; Mindrinos, Michael N; Davis, Ronald W; Xiao, Wenzhong

    2012-06-01

    Large-scale transcriptome profiling in clinical studies often involves assaying multiple samples of a patient to monitor disease progression, treatment effect, and host response in multiple tissues. Such profiling is prone to human error, which often results in mislabeled samples. Here, we present a method to detect mislabeled sample outliers using coding single nucleotide polymorphisms (cSNPs) specifically designed on the microarray and demonstrate that the mislabeled samples can be efficiently identified by either simple clustering of allele-specific expression scores or Mahalanobis distance-based outlier detection method. Based on our results, we recommend the incorporation of cSNPs into future transcriptome array designs as intrinsic markers for sample tracking.
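    The Mahalanobis distance-based outlier detection mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each sample is summarized by a hypothetical pair of allele-specific expression scores, and the function names, the example data, and the threshold of 2.0 are all illustrative choices.

```python
from math import sqrt


def mahalanobis_distances(points):
    """Return the Mahalanobis distance of each 2-D point from the sample mean.

    `points` is a list of (x, y) score pairs; the two-score layout is an
    assumption made for illustration only.
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least three samples")
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Unbiased sample covariance entries.
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # Quadratic form with the closed-form inverse of a 2x2 covariance matrix.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(sqrt(d2))
    return distances


def flag_outliers(points, threshold):
    """Return indices of samples whose distance exceeds `threshold`."""
    return [i for i, d in enumerate(mahalanobis_distances(points)) if d > threshold]


# Hypothetical scores: eight consistent samples and one candidate mislabel.
example_scores = [(1, 0), (-1, 0), (0, 1), (0, -1),
                  (1, 0), (-1, 0), (0, 1), (0, -1), (6, 6)]
suspects = flag_outliers(example_scores, threshold=2.0)
```

    In this toy data only the last sample lies far from the others, so it is the one flagged; with real array data the threshold would need to be calibrated rather than fixed in advance.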

  13. Transcriptomics and molecular evolutionary rate analysis of the bladderwort (Utricularia), a carnivorous plant with a minimal genome

    Directory of Open Access Journals (Sweden)

    Herrera-Estrella Alfredo

    2011-06-01

    Full Text Available Abstract Background The carnivorous plant Utricularia gibba (bladderwort) is remarkable in having a minute genome, which at ca. 80 megabases is approximately half that of Arabidopsis. Bladderworts show an incredible diversity of forms surrounding a defined theme: tiny, bladder-like suction traps on terrestrial, epiphytic, or aquatic plants with a diversity of unusual vegetative forms. Utricularia plants, which are rootless, are also anomalous in physiological features (respiration and carbon distribution), and highly enhanced molecular evolutionary rates in chloroplast, mitochondrial and nuclear ribosomal sequences. Despite great interest in the genus, no genomic resources exist for Utricularia, and the substitution rate increase has received limited study. Results Here we describe the sequencing and analysis of the Utricularia gibba transcriptome. Three different organs were surveyed, the traps, the vegetative shoot bodies, and the inflorescence stems. We also examined the bladderwort transcriptome under diverse stress conditions. We detail aspects of functional classification, tissue similarity, nitrogen and phosphorus metabolism, respiration, DNA repair, and detoxification of reactive oxygen species (ROS). Long contigs of plastid and mitochondrial genomes, as well as sequences for 100 individual nuclear genes, were compared with those of other plants to better establish information on molecular evolutionary rates. Conclusion The Utricularia transcriptome provides a detailed genomic window into processes occurring in a carnivorous plant. It contains a deep representation of the complex metabolic pathways that characterize a putative minimal plant genome, permitting its use as a source of genomic information to explore the structural, functional, and evolutionary diversity of the genus. Vegetative shoots and traps are the most similar organs by functional classification of their transcriptome, the traps expressing hydrolytic enzymes for prey

  14. Complete Genome Sequence of Sporisorium scitamineum and Biotrophic Interaction Transcriptome with Sugarcane

    Science.gov (United States)

    Benevenuto, Juliana; Peters, Leila P.; Carvalho, Giselle; Palhares, Alessandra; Quecine, Maria C.; Nunes, Filipe R. S.; Kmit, Maria C. P.; Wai, Alvan; Hausner, Georg; Aitken, Karen S.; Berkman, Paul J.; Fraser, James A.; Moolhuijzen, Paula M.; Coutinho, Luiz L.; Creste, Silvana; Vieira, Maria L. C.; Kitajima, João P.; Monteiro-Vitorello, Claudia B.

    2015-01-01

    Sporisorium scitamineum is a biotrophic fungus responsible for the sugarcane smut, a worldwide spread disease. This study provides the complete sequence of individual chromosomes of S. scitamineum from telomere to telomere achieved by a combination of PacBio long reads and Illumina short reads sequence data, as well as a draft sequence of a second fungal strain. Comparative analysis to previous available sequences of another strain detected few polymorphisms among the three genomes. The novel complete sequence described herein allowed us to identify and annotate extended subtelomeric regions, repetitive elements and the mitochondrial DNA sequence. The genome comprises 19,979,571 bases, 6,677 genes encoding proteins, 111 tRNAs and 3 assembled copies of rDNA, out of our estimated number of copies as 130. Chromosomal reorganizations were detected when comparing to sequences of S. reilianum, the closest smut relative, potentially influenced by repeats of transposable elements. Repetitive elements may have also directed the linkage of the two mating-type loci. The fungal transcriptome profiling from in vitro and from interaction with sugarcane at two time points (early infection and whip emergence) revealed that 13.5% of the genes were differentially expressed in planta and particular to each developmental stage. Among them are plant cell wall degrading enzymes, proteases, lipases, chitin modification and lignin degradation enzymes, sugar transporters and transcriptional factors. The fungus also modulates transcription of genes related to surviving against reactive oxygen species and other toxic metabolites produced by the plant. Previously described effectors in smut/plant interactions were detected but some new candidates are proposed. Ten genomic islands harboring some of the candidate genes unique to S. scitamineum were expressed only in planta. RNAseq data was also used to reassure gene predictions. PMID:26065709

  15. From genes to milk: genomic organization and epigenetic regulation of the mammary transcriptome.

    Science.gov (United States)

    Lemay, Danielle G; Pollard, Katherine S; Martin, William F; Freeman Zadrowski, Courtneay; Hernandez, Joseph; Korf, Ian; German, J Bruce; Rijnkels, Monique

    2013-01-01

    Even in genomes lacking operons, a gene's position in the genome influences its potential for expression. The mechanisms by which adjacent genes are co-expressed are still not completely understood. Using lactation and the mammary gland as a model system, we explore the hypothesis that chromatin state contributes to the co-regulation of gene neighborhoods. The mammary gland represents a unique evolutionary model, due to its recent appearance, in the context of vertebrate genomes. An understanding of how the mammary gland is regulated to produce milk is also of biomedical and agricultural importance for human lactation and dairying. Here, we integrate epigenomic and transcriptomic data to develop a comprehensive regulatory model. Neighborhoods of mammary-expressed genes were determined using expression data derived from pregnant and lactating mice and a neighborhood scoring tool, G-NEST. Regions of open and closed chromatin were identified by ChIP-Seq of histone modifications H3K36me3, H3K4me2, and H3K27me3 in the mouse mammary gland and liver tissue during lactation. We found that neighborhoods of genes in regions of uniquely active chromatin in the lactating mammary gland, compared with liver tissue, were extremely rare. Rather, genes in most neighborhoods were suppressed during lactation as reflected in their expression levels and their location in regions of silenced chromatin. Chromatin silencing was largely shared between the liver and mammary gland during lactation, and what distinguished the mammary gland was mainly a small tissue-specific repertoire of isolated, expressed genes. These findings suggest that an advantage of the neighborhood organization is in the collective repression of groups of genes via a shared mechanism of chromatin repression. 
    Genes essential to the mammary gland's uniqueness are isolated from neighbors, and likely have less tolerance for variation in expression, properties they share with genes responsible for an organism's survival.

  16. Complete Genome Sequence of Sporisorium scitamineum and Biotrophic Interaction Transcriptome with Sugarcane.

    Directory of Open Access Journals (Sweden)

    Lucas M Taniguti

    Full Text Available Sporisorium scitamineum is a biotrophic fungus responsible for the sugarcane smut, a worldwide spread disease. This study provides the complete sequence of individual chromosomes of S. scitamineum from telomere to telomere achieved by a combination of PacBio long reads and Illumina short reads sequence data, as well as a draft sequence of a second fungal strain. Comparative analysis to previous available sequences of another strain detected few polymorphisms among the three genomes. The novel complete sequence described herein allowed us to identify and annotate extended subtelomeric regions, repetitive elements and the mitochondrial DNA sequence. The genome comprises 19,979,571 bases, 6,677 genes encoding proteins, 111 tRNAs and 3 assembled copies of rDNA, out of our estimated number of copies as 130. Chromosomal reorganizations were detected when comparing to sequences of S. reilianum, the closest smut relative, potentially influenced by repeats of transposable elements. Repetitive elements may have also directed the linkage of the two mating-type loci. The fungal transcriptome profiling from in vitro and from interaction with sugarcane at two time points (early infection and whip emergence) revealed that 13.5% of the genes were differentially expressed in planta and particular to each developmental stage. Among them are plant cell wall degrading enzymes, proteases, lipases, chitin modification and lignin degradation enzymes, sugar transporters and transcriptional factors. The fungus also modulates transcription of genes related to surviving against reactive oxygen species and other toxic metabolites produced by the plant. Previously described effectors in smut/plant interactions were detected but some new candidates are proposed. Ten genomic islands harboring some of the candidate genes unique to S. scitamineum were expressed only in planta. RNAseq data was also used to reassure gene predictions.

  17. Genomics and transcriptomics characterization of genes expressed during postharvest at 4°C by the edible basidiomycete Pleurotus ostreatus.

    Science.gov (United States)

    Ramírez, Lucía; Oguiza, José Antonio; Pérez, Gúmer; Lavín, José Luis; Omarini, Alejandra; Santoyo, Francisco; Alfaro, Manuel; Castanera, Raúl; Parenti, Alejandra; Muguerza, Elaia; Pisabarro, Antonio G

    2011-06-01

    Pleurotus ostreatus is an industrially cultivated basidiomycete with nutritional and environmental applications. Its genome, which was sequenced by the Joint Genome Institute, has become a model for lignin degradation and for fungal genomics and transcriptomics studies. The complete P. ostreatus genome contains 35 Mbp organized in 11 chromosomes, and two different haploid genomes have been individually sequenced. In this work, genomics and transcriptomics approaches were employed in the study of P. ostreatus under different physiological conditions. Specifically, we analyzed a collection of expressed sequence tags (EST) obtained from cut fruit bodies that had been stored at 4°C for 7 days (postharvest conditions). Studies of the 253 expressed clones that had been automatically and manually annotated provided a detailed picture of the life characteristics of the self-sustained fruit bodies. The results suggested a complex metabolism in which autophagy, RNA metabolism, and protein and carbohydrate turnover are increased. Genes involved in environment sensing and morphogenesis were expressed under these conditions. The data improve our understanding of the decay process in postharvest mushrooms and highlight the use of high-throughput techniques to construct models of living organisms subjected to different environmental conditions.

  18. The Carcinogenic Liver Fluke, Clonorchis sinensis: New Assembly, Reannotation and Analysis of the Genome and Characterization of Tissue Transcriptomes

    OpenAIRE

    Yan Huang; Wenjun Chen; Xiaoyun Wang; Hailiang Liu; Yangyi Chen; Lei Guo; Fang Luo; Jiufeng Sun; Qiang Mao; Pei Liang; Zhizhi Xie; Chenhui Zhou; Yanli Tian; Xiaoli Lv; Lisi Huang

    2013-01-01

    Clonorchis sinensis (C. sinensis), an important food-borne parasite that inhabits the intrahepatic bile duct and causes clonorchiasis, is of interest to both the public health field and the scientific research community. To learn more about the migration, parasitism and pathogenesis of C. sinensis at the molecular level, the present study developed an upgraded genomic assembly and annotation by sequencing paired-end and mate-paired libraries. We also performed transcriptome sequence analyses ...

  19. Identification of Candidate Adherent-Invasive E. coli Signature Transcripts by Genomic/Transcriptomic Analysis.

    Directory of Open Access Journals (Sweden)

    Yuanhao Zhang

    transcription quantitative polymerase chain reaction assays for 6 genes were conducted on fecal and ileal RNA samples from 22 inflammatory bowel disease (IBD), and 32 patients without IBD (non-IBD). The expression of Cas loci was detected in a higher proportion of CD than non-IBD fecal and ileal RNA samples (p < 0.05). These results support a comparative genomic/transcriptomic approach towards identifying candidate AIEC signature transcripts.

  20. Detection of driver protein complexes in breast cancer metastasis by large-scale transcriptome-interactome integration.

    Science.gov (United States)

    Garcia, Maxime; Finetti, Pascal; Bertucci, Francois; Birnbaum, Daniel; Bidaut, Ghislain

    2014-01-01

    With the development of high-throughput gene expression profiling technologies came the opportunity to define genomic signatures predicting clinical condition or cancer patient outcome. However, such signatures show dependency on training set, lack of generalization, and instability, partly due to microarray data topology. An additional issue for analyzing tumor gene expression is that subtle molecular perturbations in driver genes leading to cancer and metastasis (masked in typical differential expression analysis) may provoke expression changes of greater amplitude in downstream genes (easily detected). In this chapter, we describe an interactome-based algorithm, Interactome-Transcriptome Integration (ITI), that is used to find a generalizable signature for prediction of breast cancer relapse by superimposition of large-scale protein-protein interaction data (human interactome) over several gene expression datasets. ITI extracts regions in the interactome whose expression is discriminating for predicting relapse-free survival in cancer and allows detection of subnetworks that constitute a generalizable and stable genomic signature. We also describe the practical aspects of running the full ITI pipeline (subnetwork detection and classification) on six microarray datasets.

  1. Analyses of transcriptome sequences reveal multiple ancient large-scale duplication events in the ancestor of Sphagnopsida (Bryophyta).

    Science.gov (United States)

    Devos, Nicolas; Szövényi, Péter; Weston, David J; Rothfels, Carl J; Johnson, Matthew G; Shaw, A Jonathan

    2016-07-01

    The goal of this research was to investigate whether there has been a whole-genome duplication (WGD) in the ancestry of Sphagnum (peatmoss) or the class Sphagnopsida, and to determine if the timing of any such duplication(s) and patterns of paralog retention could help explain the rapid radiation and current ecological dominance of peatmosses. RNA sequencing (RNA-seq) data were generated for nine taxa in Sphagnopsida (Bryophyta). Analyses of frequency plots for synonymous substitutions per synonymous site (Ks) between paralogous gene pairs and reconciliation of 578 gene trees were conducted to assess evidence of large-scale or genome-wide duplication events in each transcriptome. Both Ks frequency plots and gene tree-based analyses indicate multiple duplication events in the history of the Sphagnopsida. The most recent WGD event predates divergence of Sphagnum from the two other genera of Sphagnopsida. Duplicate retention is highly variable across species, which might be best explained by local adaptation. Our analyses indicate that the last WGD could have been an important factor underlying the diversification of peatmosses and facilitated their rise to ecological dominance in peatlands. The timing of the duplication events and their significance in the evolutionary history of peat mosses are discussed.

  2. Comparative description of ten transcriptomes of newly sequenced invertebrates and efficiency estimation of genomic sampling in non-model taxa

    Directory of Open Access Journals (Sweden)

    Riesgo Ana

    2012-11-01

    Full Text Available Abstract Introduction Traditionally, genomic or transcriptomic data have been restricted to a few model or emerging model organisms, and to a handful of species of medical and/or environmental importance. Next-generation sequencing techniques have the capability of yielding massive amounts of gene sequence data for virtually any species at a modest cost. Here we provide a comparative analysis of de novo assembled transcriptomic data for ten non-model species of previously understudied animal taxa. Results cDNA libraries of ten species belonging to five animal phyla (2 Annelida [including Sipuncula], 2 Arthropoda, 2 Mollusca, 2 Nemertea, and 2 Porifera) were sequenced in different batches with an Illumina Genome Analyzer II (read length 100 or 150 bp), rendering between ca. 25 and 52 million reads per species. Read thinning, trimming, and de novo assembly were performed under different parameters to optimize output. Between 67,423 and 207,559 contigs were obtained across the ten species, post-optimization. Of those, 9,069 to 25,681 contigs retrieved blast hits against the NCBI non-redundant database, and approximately 50% of these were assigned with Gene Ontology terms, covering all major categories, and with similar percentages in all species. Local blasts against our datasets, using selected genes from major signaling pathways and housekeeping genes, revealed high efficiency in gene recovery compared to available genomes of closely related species. Intriguingly, our transcriptomic datasets detected multiple paralogues in all phyla and in nearly all gene pathways, including housekeeping genes that are traditionally used in phylogenetic applications for their purported single-copy nature. Conclusions We generated the first study of comparative transcriptomics across multiple animal phyla (comparing two species per phylum in most cases), established the first Illumina-based transcriptomic datasets for sponge, nemertean, and sipunculan species, and

  3. Comparative description of ten transcriptomes of newly sequenced invertebrates and efficiency estimation of genomic sampling in non-model taxa.

    Science.gov (United States)

    Riesgo, Ana; Andrade, Sónia C S; Sharma, Prashant P; Novo, Marta; Pérez-Porro, Alicia R; Vahtera, Varpu; González, Vanessa L; Kawauchi, Gisele Y; Giribet, Gonzalo

    2012-11-29

    Traditionally, genomic or transcriptomic data have been restricted to a few model or emerging model organisms, and to a handful of species of medical and/or environmental importance. Next-generation sequencing techniques have the capability of yielding massive amounts of gene sequence data for virtually any species at a modest cost. Here we provide a comparative analysis of de novo assembled transcriptomic data for ten non-model species of previously understudied animal taxa. cDNA libraries of ten species belonging to five animal phyla (2 Annelida [including Sipuncula], 2 Arthropoda, 2 Mollusca, 2 Nemertea, and 2 Porifera) were sequenced in different batches with an Illumina Genome Analyzer II (read length 100 or 150 bp), rendering between ca. 25 and 52 million reads per species. Read thinning, trimming, and de novo assembly were performed under different parameters to optimize output. Between 67,423 and 207,559 contigs were obtained across the ten species, post-optimization. Of those, 9,069 to 25,681 contigs retrieved blast hits against the NCBI non-redundant database, and approximately 50% of these were assigned with Gene Ontology terms, covering all major categories, and with similar percentages in all species. Local blasts against our datasets, using selected genes from major signaling pathways and housekeeping genes, revealed high efficiency in gene recovery compared to available genomes of closely related species. Intriguingly, our transcriptomic datasets detected multiple paralogues in all phyla and in nearly all gene pathways, including housekeeping genes that are traditionally used in phylogenetic applications for their purported single-copy nature. 
    We generated the first study of comparative transcriptomics across multiple animal phyla (comparing two species per phylum in most cases), established the first Illumina-based transcriptomic datasets for sponge, nemertean, and sipunculan species, and generated a tractable catalogue of annotated genes (or gene

  4. Genomic analysis of host - Peste des petits ruminants vaccine viral transcriptome uncovers transcription factors modulating immune regulatory pathways.

    Science.gov (United States)

    Manjunath, Siddappa; Kumar, Gandham Ravi; Mishra, Bishnu Prasad; Mishra, Bina; Sahoo, Aditya Prasad; Joshi, Chaitanya G; Tiwari, Ashok K; Rajak, Kaushal Kishore; Janga, Sarath Chandra

    2015-02-24

    Peste des petits ruminants (PPR) is an acute transboundary viral disease of economic importance, affecting goats and sheep. Mass vaccination programs around the world have resulted in a decline of PPR outbreaks. Sungri 96 is a live attenuated vaccine, widely used in Northern India against PPR. This vaccine virus, isolated from goat, works efficiently in both sheep and goats. Global gene expression changes under PPR vaccine virus infection are not yet well defined. Therefore, in this study we investigated the host-vaccine virus interactions by infecting peripheral blood mononuclear cells isolated from goat with PPRV (Sungri 96 vaccine virus), to quantify the global changes in the transcriptomic signature by RNA-sequencing. The viral genome of the Sungri 96 vaccine virus was assembled from the PPRV-infected transcriptome, confirming the infection and demonstrating the feasibility of building a complete non-host genome from the blood transcriptome. Comparison of the infected transcriptome with the control transcriptome revealed 985 differentially expressed genes. Functional analysis showed enrichment of immune regulatory pathways under PPRV infection. Key genes involved in immune system regulation, spliceosomal and apoptotic pathways were identified to be dysregulated. Network analysis revealed that the protein-protein interaction network among differentially expressed genes is significantly disrupted in the infected state. Several genes encoding transcription factors (TFs) that govern immune regulatory pathways were identified to co-regulate the differentially expressed genes. These data provide insights into the host-PPRV vaccine virus interactome for the first time. Our findings suggest dysregulation of immune regulatory pathways, and of the genes encoding the TFs that govern these pathways, in response to viral infection.

  5. Large-Scale Transcriptome Analysis in Faba Bean (Vicia faba L.) under Ascochyta fabae Infection.

    Directory of Open Access Journals (Sweden)

    Sara Ocaña

    Full Text Available Faba bean is an important food crop worldwide. However, progress in faba bean genomics lags far behind that of model systems due to limited availability of genetic and genomic information. Using the Illumina platform, the faba bean transcriptome from leaves of two lines (29H and Vf136) subjected to Ascochyta fabae infection has been characterized. De novo transcriptome assembly provided a total of 39,185 different transcripts that were functionally annotated, and among these, 13,266 were assigned to gene ontology against Arabidopsis. Quality of the assembly was validated by RT-qPCR amplification of selected differentially expressed transcripts. Comparison of faba bean transcripts with those of better-characterized plant genomes such as Arabidopsis thaliana, Medicago truncatula and Cicer arietinum revealed a sequence similarity of 68.3%, 72.8% and 81.27%, respectively. Moreover, 39,060 single nucleotide polymorphisms (SNPs) and 3,669 InDels were identified for genotyping applications. Mapping of the sequence reads generated onto the assembled transcripts showed that 393 and 457 transcripts were overexpressed in the resistant (29H) and susceptible (Vf136) genotypes, respectively. Transcripts involved in plant-pathogen interactions, such as leucine rich proteins (LRR) or plant growth regulators involved in plant adaptation to abiotic and biotic stresses, were found to be differentially expressed in the resistant line. The results reported here represent the most comprehensive transcript database developed so far in faba bean, providing valuable information that could be used to gain insight into the pathways involved in the resistance mechanism against A. fabae and to identify potential resistance genes to be further used in marker assisted selection.

  6. Transcriptome analysis of root response to citrus blight based on the newly assembled Swingle citrumelo draft genome.

    Science.gov (United States)

    Zhang, Yunzeng; Barthe, Gary; Grosser, Jude W; Wang, Nian

    2016-07-08

    Citrus blight is an overall decline disease of citrus trees that causes serious losses in the citrus industry worldwide. Although it was described more than one hundred years ago, its causal agent remains unknown and its pathophysiology is not well determined, which hampers our understanding of the disease and the design of suitable disease management. In this study, we sequenced and assembled the draft genome of Swingle citrumelo, an important citrus rootstock. The draft genome is approximately 280 Mb, which covers 74% of the estimated Swingle citrumelo genome, with an average coverage of around 15X. The draft genome of Swingle citrumelo enabled us to conduct transcriptome analysis of roots of blight and healthy Swingle citrumelo using RNA-seq. The RNA-seq was reliable, as evidenced by the high consistency between RNA-seq analysis and quantitative reverse transcription PCR results (R² = 0.966). Comparison of the gene expression profiles between blight and healthy root samples revealed the molecular mechanism underlying the characteristic blight phenotypes, including decline, starch accumulation, and drought stress. The JA and ET biosynthesis and signaling pathways showed decreased transcript abundance, whereas SA-mediated defense-related genes showed increased transcript abundance in blight trees, suggesting that an unclassified biotrophic pathogen is involved in this disease. Overall, the Swingle citrumelo draft genome generated in this study will advance our understanding of plant biology and contribute to citrus breeding. Transcriptome analysis of blight and healthy trees deepened our understanding of the pathophysiology of citrus blight.
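
    The agreement statistic quoted in the abstract above can be illustrated with a short calculation. The abstract does not say how the value was computed, so this is only a sketch that assumes a simple linear regression between per-gene log2 fold changes from the two methods, in which case R² equals the squared Pearson correlation. The function name and all numbers below are hypothetical and are not taken from the study.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when one variable is constant")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical log2 fold changes (blight vs. healthy) for five genes.
rnaseq_log2fc = [2.1, -1.3, 0.4, 3.0, -2.2]
qpcr_log2fc = [1.9, -1.0, 0.6, 2.7, -2.5]

r2 = r_squared(rnaseq_log2fc, qpcr_log2fc)
print(round(r2, 3))
```

    A value close to 1 indicates that the two methods rank and scale expression changes similarly; it does not by itself show that either method is unbiased.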

  7. Identification of the minimal connected network of transcription factors by transcriptomic and genomic data integration.

    Science.gov (United States)

    Essaghir, Ahmed

    2014-01-01

    Thanks to high-throughput experiments, biological conditions can be investigated at both the entire genomic and transcriptomic levels. In addition, protein-protein interaction (PPI) data are widely available for well-studied organisms, such as human. In this chapter, we will present an integrative approach that makes use of these data to find the PPI module involving the key regulated transcription factors (TFs) shared by a number of given conditions. These conditions could be, for instance, different cancer types. Briefly, for the studied conditions, we need to identify commonly affected chromosomal regions subjected to copy number alterations, together with the list of differentially expressed genes in each condition. Transcription factor activity will be inferred from these regulated gene lists. Then, we will define TFs for which the activity could be explained by an associative effect of both locus copy number alteration and the expression levels of their coding genes. PPI networks could be mined, afterwards, using appropriate algorithms to find the significant module that connects those TFs together. This module could be viewed as the minimal connected network of TFs, the regulation of which is shared between the investigated conditions.
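
    The last step described above, mining a PPI network for a small module that connects a set of TFs, can be sketched as follows. The abstract does not name the algorithm used in the chapter, so this example substitutes a generic shortest-path heuristic for the Steiner tree problem: starting from one TF, it repeatedly attaches the nearest unconnected TF through a breadth-first shortest path. The node names (`TF_A`, `P1`, and so on) and the edge list are hypothetical.

```python
from collections import deque


def shortest_path(adj, sources, targets):
    """Multi-source BFS: shortest path from any source node to any target node."""
    parent = {s: None for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        if node in targets:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in adj.get(node, ()):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


def connect_tfs(edges, tfs):
    """Greedy approximation of a small connected PPI module containing all TFs."""
    adj = {}
    for a, b in edges:  # undirected interaction graph
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    tfs = list(tfs)
    module = {tfs[0]}
    remaining = set(tfs[1:])
    while remaining:
        path = shortest_path(adj, sorted(module), remaining)
        if path is None:
            raise ValueError("some TFs are not connected in the PPI graph")
        module.update(path)      # add the linker proteins and the new TF
        remaining -= module
    return module


# Hypothetical PPI edges: a short route and a longer detour between TFs.
edges = [
    ("TF_A", "P1"), ("P1", "TF_B"), ("TF_B", "P2"), ("P2", "TF_C"),
    ("TF_A", "P3"), ("P3", "P4"), ("P4", "P5"), ("P5", "TF_C"),
]
module = connect_tfs(edges, ["TF_A", "TF_B", "TF_C"])
print(sorted(module))
```

    The heuristic does not guarantee a minimum-size module, and it ignores the statistical significance of the module, which the abstract indicates is part of the actual approach.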

  8. Stepwise Evolution of Coral Biomineralization Revealed with Genome-Wide Proteomics and Transcriptomics.

    Directory of Open Access Journals (Sweden)

    Takeshi Takeuchi

    Full Text Available Despite the importance of stony corals in many research fields related to global issues, such as marine ecology, climate change, paleoclimatology, and metazoan evolution, very little is known about the evolutionary origin of coral skeleton formation. In order to investigate the evolution of coral biomineralization, we have identified skeletal organic matrix proteins (SOMPs) in the skeletal proteome of the scleractinian coral, Acropora digitifera, for which large genomic and transcriptomic datasets are available. Scrupulous gene annotation was conducted based on comparisons of functional domain structures among metazoans. We found that SOMPs include not only coral-specific proteins, but also protein families that are widely conserved among cnidarians and other metazoans. We also identified several conserved transmembrane proteins in the skeletal proteome. Gene expression analysis revealed that expression of these conserved genes continues throughout development. Therefore, these genes are involved not only in skeleton formation, but also in basic cellular functions, such as cell-cell interaction and signaling. On the other hand, genes encoding coral-specific proteins, including extracellular matrix domain-containing proteins, galaxins, and acidic proteins, were prominently expressed in post-settlement stages, indicating their role in skeleton formation. Taken together, the process of coral skeleton formation is hypothesized as: 1) formation of initial extracellular matrix between epithelial cells and substrate, employing pre-existing transmembrane proteins; 2) additional extracellular matrix formation using novel proteins that have emerged by domain shuffling and rapid molecular evolution; and 3) calcification controlled by coral-specific SOMPs.

  9. Genome-wide transcriptome analysis of fluoroquinolone resistance in clinical isolates of Escherichia coli.

    Science.gov (United States)

    Yamane, Takashi; Enokida, Hideki; Hayami, Hiroshi; Kawahara, Motoshi; Nakagawa, Masayuki

    2012-04-01

    Coincident with their worldwide use, resistance to fluoroquinolones in Escherichia coli has increased. To identify the gene expression profiles underlying fluoroquinolone resistance, we carried out genome-wide transcriptome analysis of fluoroquinolone-sensitive E. coli. Four fluoroquinolone-sensitive E. coli and five fluoroquinolone-resistant E. coli clinical isolates were subjected to complementary deoxyribonucleic acid microarray analysis. Expression of some upregulated genes was verified by real-time polymerase chain reaction using 104 E. coli clinical isolates, and minimum inhibitory concentration tests were carried out by using their transformants. A total of 40 genes were significantly upregulated in fluoroquinolone-resistant E. coli isolates (P fluoroquinolone-resistant E. coli. One of the phage shock protein operons, pspC, was significantly upregulated in 50 fluoroquinolone-resistant E. coli isolates (P fluoroquinolone-resistant E. coli. Deoxyribonucleic acid adenine methyltransferase (dam), which represses type I fimbriae genes, was significantly upregulated in the clinical fluoroquinolone-resistant E. coli isolates (P = 0.007). We established pspC- and dam-expressing E. coli transformants from fluoroquinolone-sensitive E. coli, and the minimum inhibitory concentration tests showed that the transformants acquired fluoroquinolone resistance, suggesting that upregulation of these genes contributes to acquiring fluoroquinolone resistance. Upregulation of psp operons and of dam, which underlies downregulation of pilus operons, might be associated with fluoroquinolone resistance in E. coli. © 2011 The Japanese Urological Association.

  10. Stepwise Evolution of Coral Biomineralization Revealed with Genome-Wide Proteomics and Transcriptomics.

    Science.gov (United States)

    Takeuchi, Takeshi; Yamada, Lixy; Shinzato, Chuya; Sawada, Hitoshi; Satoh, Noriyuki

    2016-01-01

    Despite the importance of stony corals in many research fields related to global issues, such as marine ecology, climate change, paleoclimatology, and metazoan evolution, very little is known about the evolutionary origin of coral skeleton formation. In order to investigate the evolution of coral biomineralization, we have identified skeletal organic matrix proteins (SOMPs) in the skeletal proteome of the scleractinian coral, Acropora digitifera, for which large genomic and transcriptomic datasets are available. Scrupulous gene annotation was conducted based on comparisons of functional domain structures among metazoans. We found that SOMPs include not only coral-specific proteins, but also protein families that are widely conserved among cnidarians and other metazoans. We also identified several conserved transmembrane proteins in the skeletal proteome. Gene expression analysis revealed that expression of these conserved genes continues throughout development. Therefore, these genes are involved not only in skeleton formation, but also in basic cellular functions, such as cell-cell interaction and signaling. On the other hand, genes encoding coral-specific proteins, including extracellular matrix domain-containing proteins, galaxins, and acidic proteins, were prominently expressed in post-settlement stages, indicating their role in skeleton formation. Taken together, the process of coral skeleton formation is hypothesized as: 1) formation of initial extracellular matrix between epithelial cells and substrate, employing pre-existing transmembrane proteins; 2) additional extracellular matrix formation using novel proteins that have emerged by domain shuffling and rapid molecular evolution; and 3) calcification controlled by coral-specific SOMPs.

  11. Genomic and transcriptomic analyses of the tangerine pathotype of Alternaria alternata in response to oxidative stress

    Science.gov (United States)

    Wang, Mingshuang; Sun, Xuepeng; Yu, Dongliang; Xu, Jianping; Chung, Kuangren; Li, Hongye

    2016-01-01

    The tangerine pathotype of Alternaria alternata produces the A. citri toxin (ACT) and is the causal agent of citrus brown spot, which results in significant yield losses worldwide. Both the production of ACT and the ability to detoxify reactive oxygen species (ROS) are required for A. alternata pathogenicity in citrus. In this study, we report the 34.41 Mb genome sequence of strain Z7 of the tangerine pathotype of A. alternata. The host-selective ACT gene cluster in strain Z7 was identified, which included 25 genes, 19 of which had not been reported previously. Of these, 10 genes were present only in the tangerine pathotype, representing the most likely candidate genes for this pathotype specialization. A transcriptome analysis of the global effects of H2O2 on gene expression revealed 1108 up-regulated and 498 down-regulated genes. Expression of the genes encoding catalase, peroxiredoxin, thioredoxin and glutathione was highly induced. Genes encoding several protein families, including kinases, transcription factors, transporters, cytochrome P450, ubiquitin and heat shock proteins, were found to be associated with adaptation to oxidative stress. Our data not only revealed the molecular basis of ACT biosynthesis but also provided new insights into the potential pathways by which the phytopathogen A. alternata copes with oxidative stress. PMID:27582273

  12. Breast cancer genome and transcriptome integration implicates specific mutational signatures with immune cell infiltration.

    Science.gov (United States)

    Smid, Marcel; Rodríguez-González, F Germán; Sieuwerts, Anieta M; Salgado, Roberto; Prager-Van der Smissen, Wendy J C; Vlugt-Daane, Michelle van der; van Galen, Anne; Nik-Zainal, Serena; Staaf, Johan; Brinkman, Arie B; van de Vijver, Marc J; Richardson, Andrea L; Fatima, Aquila; Berentsen, Kim; Butler, Adam; Martin, Sancha; Davies, Helen R; Debets, Reno; Gelder, Marion E Meijer-Van; van Deurzen, Carolien H M; MacGrogan, Gaëtan; Van den Eynden, Gert G G M; Purdie, Colin; Thompson, Alastair M; Caldas, Carlos; Span, Paul N; Simpson, Peter T; Lakhani, Sunil R; Van Laere, Steven; Desmedt, Christine; Ringnér, Markus; Tommasi, Stefania; Eyford, Jorunn; Broeks, Annegien; Vincent-Salomon, Anne; Futreal, P Andrew; Knappskog, Stian; King, Tari; Thomas, Gilles; Viari, Alain; Langerød, Anita; Børresen-Dale, Anne-Lise; Birney, Ewan; Stunnenberg, Hendrik G; Stratton, Mike; Foekens, John A; Martens, John W M

    2016-09-26

    A recent comprehensive whole genome analysis of a large breast cancer cohort was used to link known and novel drivers and substitution signatures to the transcriptome of 266 cases. Here, we validate that subtype-specific aberrations show concordant expression changes for, among others, TP53, PIK3CA, PTEN, CCND1 and CDH1. We find that CCND3 expression levels do not correlate with amplification, while increased GATA3 expression in mutant GATA3 cancers suggests GATA3 is an oncogene. In luminal cases the total number of substitutions, irrespective of type, associates with cell cycle gene expression and adverse outcome, whereas the number of mutations of signatures 3 and 13 associates with immune-response specific gene expression, increased numbers of tumour-infiltrating lymphocytes and better outcome. Thus, while earlier reports imply that the sheer number of somatic aberrations could trigger an immune response, our data suggest that substitutions of a particular type are more effective in doing so than others.

  13. A genome-wide longitudinal transcriptome analysis of the aging model Podospora anserina.

    Directory of Open Access Journals (Sweden)

    Oliver Philipp

    Full Text Available Aging of biological systems is controlled by various processes which have a potential impact on gene expression. Here we report a genome-wide transcriptome analysis of the fungal aging model Podospora anserina. Total RNA of three individuals of defined age was pooled and analyzed by SuperSAGE (serial analysis of gene expression). A bioinformatics analysis identified different molecular pathways to be affected during aging. While the abundance of transcripts linked to ribosomes and to the proteasome quality control system was found to decrease during aging, that of transcripts associated with autophagy increased, suggesting that autophagy may act as a compensatory quality control pathway. Transcript profiles associated with energy metabolism, including mitochondrial functions, were found to fluctuate during aging. Comparison of wild-type transcripts, which are continuously down-regulated during aging, with those down-regulated in the long-lived, copper-uptake mutant grisea, validated the relevance of age-related changes in cellular copper metabolism. Overall, we (i) present a unique age-related data set of a longitudinal study of the experimental aging model P. anserina which represents a reference resource for future investigations in a variety of organisms, (ii) suggest autophagy to be a key quality control pathway that becomes active once other pathways fail, and (iii) present testable predictions for subsequent experimental investigations.

  14. Genome-wide survey of ds exonization to enrich transcriptomes and proteomes in plants.

    Science.gov (United States)

    Liu, Li-Yu Daisy; Charng, Yuh-Chyang

    2012-01-01

    Insertion of transposable elements (TEs) into introns can lead to their activation as alternatively spliced cassette exons, an event called exonization which can enrich the complexity of transcriptomes and proteomes. Previously, we performed the first experimental assessment of TE exonization by inserting a Ds element into each intron of the rice epsps gene. Exonization of Ds in plants was biased toward providing splice donor sites from the beginning of the inserted Ds sequence. Additionally, Ds inserted in the reverse direction resulted in a continuous splice donor consensus region by offering 4 donor sites in the same intron. The current study involved genome-wide computational analysis of Ds exonization events in the dicot Arabidopsis thaliana and the monocot Oryza sativa (rice). Up to 71% of the exonized transcripts were putative targets for the nonsense-mediated decay (NMD) pathway. The insertion patterns of Ds and the polymorphic splice donor sites increased the transcripts and subsequent protein isoforms. Protein isoforms contain protein sequence due to unspliced intron-TE region and/or a shift of the reading frame. The number of interior protein isoforms would be twice that of C-terminal isoforms, on average. TE exonization provides a promising way for functional expansion of the plant proteome.

  15. Comparative and Transcriptome Analyses Uncover Key Aspects of Coding- and Long Noncoding RNAs in Flatworm Mitochondrial Genomes

    Directory of Open Access Journals (Sweden)

    Eric Ross

    2016-05-01

    Full Text Available Exploiting the conservation of various features of mitochondrial genomes has been instrumental in resolving phylogenetic relationships. Despite extensive sequence evidence, it has not previously been possible to conclusively resolve some key aspects of flatworm mitochondrial genomes, including generally conserved traits, such as start codons, noncoding regions, the full complement of tRNAs, and whether ATP8 is, or is not, encoded by this extranuclear genome. In an effort to address these difficulties, we sought to determine the mitochondrial transcriptomes and genomes of sexual and asexual taxa of freshwater triclads, a group previously poorly represented in flatworm mitogenomic studies. We have discovered evidence for an alternative start codon, an extended cox1 gene, a previously undescribed conserved open reading frame, long noncoding RNAs, and a highly conserved gene order across the large evolutionary distances represented within the triclads. Our findings contribute to the expansion and refinement of mitogenomics to address evolutionary issues in this diverse group of animals.

  16. Separation and parallel sequencing of the genomes and transcriptomes of single cells using G&T-seq.

    Science.gov (United States)

    Macaulay, Iain C; Teng, Mabel J; Haerty, Wilfried; Kumar, Parveen; Ponting, Chris P; Voet, Thierry

    2016-11-01

    Parallel sequencing of a single cell's genome and transcriptome provides a powerful tool for dissecting genetic variation and its relationship with gene expression. Here we present a detailed protocol for G&T-seq, a method for separation and parallel sequencing of genomic DNA and full-length polyA(+) mRNA from single cells. We provide step-by-step instructions for the isolation and lysis of single cells; the physical separation of polyA(+) mRNA from genomic DNA using a modified oligo-dT bead capture and the respective whole-transcriptome and whole-genome amplifications; and library preparation and sequence analyses of these amplification products. The method allows the detection of thousands of transcripts in parallel with the genetic variants captured by the DNA-seq data from the same single cell. G&T-seq differs from other currently available methods for parallel DNA and RNA sequencing from single cells, as it involves physical separation of the DNA and RNA and does not require bespoke microfluidics platforms. The process can be implemented manually or through automation. When performed manually, paired genome and transcriptome sequencing libraries from eight single cells can be produced in ∼3 d by researchers experienced in molecular laboratory work. For users with experience in the programming and operation of liquid-handling robots, paired DNA and RNA libraries from 96 single cells can be produced in the same time frame. Sequence analysis and integration of single-cell G&T-seq DNA and RNA data requires a high level of bioinformatics expertise and familiarity with a wide range of informatics tools.

  17. Modeling cancer metabolism on a genome scale

    Science.gov (United States)

    Yizhak, Keren; Chaneton, Barbara; Gottlieb, Eyal; Ruppin, Eytan

    2015-01-01

    Cancer cells have fundamentally altered cellular metabolism that is associated with their tumorigenicity and malignancy. In addition to the widely studied Warburg effect, several new key metabolic alterations in cancer have been established over the last decade, leading to the recognition that altered tumor metabolism is one of the hallmarks of cancer. Deciphering the full scope and functional implications of the dysregulated metabolism in cancer requires both the advancement of a variety of omics measurements and the advancement of computational approaches for the analysis and contextualization of the accumulated data. Encouragingly, while the metabolic network is highly interconnected and complex, it is at the same time probably the best characterized cellular network. Accordingly, this review discusses the challenges that genome-scale modeling of cancer metabolism has been facing. We survey several recent studies demonstrating the first strides that have been made, testifying to the value of this approach in portraying a network-level view of cancer metabolism and in identifying novel drug targets and biomarkers. Finally, we outline a few new steps that may further advance this field. PMID:26130389

  18. Genome wide transcriptome analysis of dendritic cells identifies genes with altered expression in psoriasis.

    Directory of Open Access Journals (Sweden)

    Kata Filkor

    Full Text Available Activation of dendritic cells by different pathogens induces the secretion of proinflammatory mediators, resulting in local inflammation. Importantly, innate immunity must be properly controlled, as its continuous activation leads to the development of chronic inflammatory diseases such as psoriasis. Lipopolysaccharide (LPS)- or peptidoglycan (PGN)-induced tolerance, a phenomenon of transient unresponsiveness of cells to repeated or prolonged stimulation, has proved a valuable model for the study of chronic inflammation. Thus, the aim of this study was the identification of the transcriptional diversity of primary human immature dendritic cells (iDCs) upon PGN-induced tolerance. Using the SAGE-Seq approach, a tag-based transcriptome sequencing method, we investigated gene expression changes of primary human iDCs upon stimulation or restimulation with Staphylococcus aureus derived PGN, a widely used TLR2 ligand. Based on the expression pattern of the altered genes, we identified non-tolerizeable and tolerizeable genes. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis showed marked enrichment of immune-, cell cycle- and apoptosis-related genes. In parallel to the marked induction of proinflammatory mediators, negative feedback regulators of innate immunity, such as TNFAIP3, TNFAIP8, Tyro3 and Mer, are markedly downregulated in tolerant cells. We also demonstrate that the expression pattern of TNFAIP3 and TNFAIP8 is altered in both lesional and non-lesional skin of psoriatic patients. Finally, we show that pretreatment of immature dendritic cells with anti-TNF-α inhibits the expression of IL-6 and CCL1 in tolerant iDCs and partially releases the suppression of TNFAIP8. Our findings suggest that after PGN stimulation/restimulation the host cell utilizes different mechanisms in order to maintain a critical balance between inflammation and tolerance. Importantly, the transcriptome sequencing of stimulated/restimulated iDCs identified

  19. A large set of 26 new reference transcriptomes dedicated to comparative population genomics in crops and wild relatives.

    Science.gov (United States)

    Sarah, Gautier; Homa, Felix; Pointet, Stéphanie; Contreras, Sandy; Sabot, François; Nabholz, Benoit; Santoni, Sylvain; Sauné, Laure; Ardisson, Morgane; Chantret, Nathalie; Sauvage, Christopher; Tregear, James; Jourda, Cyril; Pot, David; Vigouroux, Yves; Chair, Hana; Scarcelli, Nora; Billot, Claire; Yahiaoui, Nabila; Bacilieri, Roberto; Khadari, Bouchaib; Boccara, Michel; Barnaud, Adéline; Péros, Jean-Pierre; Labouisse, Jean-Pierre; Pham, Jean-Louis; David, Jacques; Glémin, Sylvain; Ruiz, Manuel

    2016-08-04

    We produced a unique large data set of reference transcriptomes to obtain new knowledge about the evolution of plant genomes and crop domestication. For this purpose, we validated an RNA-Seq data assembly protocol to perform comparative population genomics. For the validation, we assessed and compared the quality of de novo Illumina short-read assemblies using data from two crops for which an annotated reference genome was available, namely grapevine and sorghum. We used the same protocol for the release of 26 new transcriptomes of crop plants and wild relatives, including still understudied crops such as yam, pearl millet and fonio. The species list has a wide taxonomic representation with the inclusion of 15 monocots and 11 eudicots. All contigs were annotated using BLAST, prot4EST and Blast2GO. A distinctive feature of the data set is that each crop is associated with closely related species, which will permit whole-genome comparative evolutionary studies between crops and their wild relatives. This large resource will thus serve research communities working on both crops and model organisms. All the data are available at http://arcad-bioinformatics.southgreen.fr/.

  20. Ensembl Genomes 2013: scaling up access to genome-wide data

    Science.gov (United States)

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provi...

  1. Local adaptation at the transcriptome level in brown trout: Evidence from early life history temperature genomic reaction norms

    DEFF Research Database (Denmark)

    Meier, Kristian; Hansen, Michael Møller; Normandeau, Eric;

    2014-01-01

    Local adaptation and its underlying molecular basis has long been a key focus in evolutionary biology. There has recently been increased interest in the evolutionary role of plasticity and the molecular mechanisms underlying local adaptation. Using transcriptome analysis, we assessed differences...... reaction norms and significantly higher QST than FST among populations for two early life-history traits. In the present study we investigated if genomic reaction norm patterns were also present at the transcriptome level. Eggs from the three populations were incubated at two temperatures (5 and 8 degrees......, the latter indicating locally adapted reaction norms. Moreover, the reaction norms paralleled those observed previously at early life-history traits. We identified 90 cDNA clones among the genes with an interaction effect that were differently expressed between the ecologically divergent populations...

  2. Novel tools for conservation genomics: comparing two high-throughput approaches for SNP discovery in the transcriptome of the European hake.

    Directory of Open Access Journals (Sweden)

    Ilaria Milano

    Full Text Available The growing accessibility to genomic resources using next-generation sequencing (NGS) technologies has revolutionized the application of molecular genetic tools to ecology and evolutionary studies in non-model organisms. Here we present the case study of the European hake (Merluccius merluccius), one of the most important demersal resources of European fisheries. Two sequencing platforms, the Roche 454 FLX (454) and the Illumina Genome Analyzer (GAII), were used for Single Nucleotide Polymorphism (SNP) discovery in the hake muscle transcriptome. De novo transcriptome assembly into unique contigs, annotation, and in silico SNP detection were carried out in parallel for 454 and GAII sequence data. High-throughput genotyping using the Illumina GoldenGate assay was performed for validating 1,536 putative SNPs. Validation results were analysed to compare the performances of 454 and GAII methods and to evaluate the role of several variables (e.g. sequencing depth, intron-exon structure, sequence quality and annotation). Despite well-known differences in sequence length and throughput, the two approaches showed similar assay conversion rates (approximately 43%) and percentages of polymorphic loci (67.5% and 63.3% for GAII and 454, respectively). Both NGS platforms therefore proved suitable for large-scale identification of SNPs in transcribed regions of non-model species, although the lack of a reference genome profoundly affects the genotyping success rate. The overall efficiency, however, can be improved using strict quality and filtering criteria for SNP selection (sequence quality, intron-exon structure, target region score).

  3. A genomic and transcriptomic approach to investigate the blue pigment phenotype in Pseudomonas fluorescens.

    Science.gov (United States)

    Andreani, Nadia Andrea; Carraro, Lisa; Martino, Maria Elena; Fondi, Marco; Fasolato, Luca; Miotto, Giovanni; Magro, Massimiliano; Vianello, Fabio; Cardazzo, Barbara

    2015-11-20

    Pseudomonas fluorescens is a well-known food spoiler, able to cause serious economic losses in the food industry due to its ability to produce many extracellular, and often thermostable, compounds. The most outstanding spoilage events involving P. fluorescens were blue discoloration of several food stuffs, mainly dairy products. The bacteria involved in such high-profile cases have been identified as belonging to a clearly distinct phylogenetic cluster of the P. fluorescens group. Although the blue pigment has recently been investigated in several studies, the biosynthetic pathway leading to the pigment formation, as well as its chemical nature, remain challenging and unsolved points. In the present paper, genomic and transcriptomic data of 4 P. fluorescens strains (2 blue-pigmenting strains and 2 non-pigmenting strains) were analyzed to evaluate the presence and the expression of blue strain-specific genes. In particular, the pangenome analysis showed the presence in the blue-pigmenting strains of two copies of genes involved in the tryptophan biosynthesis pathway (including trpABCDF). The global expression profiling of blue-pigmenting strains versus non-pigmenting strains showed a general up-regulation of genes involved in iron uptake and a down-regulation of genes involved in primary metabolism. Chromogenic reaction of the blue-pigmenting bacterial cells with Kovac's reagent indicated an indole-derivative as the precursor of the blue pigment. Finally, solubility tests and MALDI-TOF mass spectrometry analysis of the isolated pigment suggested that its molecular structure is very probably a hydrophobic indigo analog.

  4. Massive-scale RNA-Seq analysis of non ribosomal transcriptome in human trisomy 21.

    Directory of Open Access Journals (Sweden)

    Valerio Costa

    Full Text Available Hybridization- and tag-based technologies have been successfully used in Down syndrome to identify genes involved in various aspects of the pathogenesis. However, these technologies suffer from several limits and drawbacks and, to date, information about rare, even though relevant, RNA species such as long and small non-coding RNAs is completely missing. Indeed, no published work has yet described the whole transcriptional landscape of Down syndrome. Although the recent advances in high-throughput RNA sequencing have revealed the complexity of transcriptomes, most of them rely on polyA enrichment protocols, able to detect only a small fraction of total RNA content. By contrast, massive-scale RNA sequencing on rRNA-depleted samples allows the survey of the complete set of coding and non-coding RNA species, now emerging as novel contributors to pathogenic mechanisms. Hence, in this work we analysed for the first time the complete transcriptome of human trisomic endothelial progenitor cells to an unprecedented level of resolution and sensitivity by RNA-sequencing. Our analysis allowed us to detect differential expression of even lowly expressed genes crucial for the pathogenesis, to disclose novel regions of active transcription outside currently annotated loci, and to investigate a plethora of non-polyadenylated long as well as short non-coding RNAs. Novel splice isoforms for a large subset of crucial genes, and novel extended untranslated regions for known genes--possibly novel miRNA targets or regulatory sites for gene transcription--were also identified in this study. Coupling the rRNA depletion of samples, followed by high-throughput RNA-sequencing, to the easy availability of these cells renders this approach very feasible for transcriptome studies, offering the possibility of investigating in-depth blood-related pathological features of Down syndrome, as well as other genetic disorders.

  5. Transcriptomic and proteomic analyses on the supercooling ability and mining of antifreeze proteins of the Chinese white wax scale insect.

    Science.gov (United States)

    Yu, Shu-Hui; Yang, Pu; Sun, Tao; Qi, Qian; Wang, Xue-Qing; Chen, Xiao-Ming; Feng, Ying; Liu, Bo-Wen

    2016-06-01

    The Chinese white wax scale insect, Ericerus pela, can survive at extremely low temperatures, and some overwintering individuals exhibit supercooling at temperatures below -30°C. To investigate the deep supercooling ability of E. pela, transcriptomic and proteomic analyses were performed to delineate the major gene and protein families responsible for the deep supercooling ability of overwintering females. Gene Ontology (GO) classification and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis indicated that genes involved in the mitogen-activated protein kinase, calcium, and PI3K-Akt signaling pathways and pathways associated with the biosynthesis of soluble sugars, sugar alcohols and free amino acids were dominant. Proteins responsible for low-temperature stress, such as cold acclimation proteins, glycerol biosynthesis-related enzymes and heat shock proteins (HSPs) were identified. However, no antifreeze proteins (AFPs) were identified through sequence similarity search methods. A random forest approach identified 388 putative AFPs in the proteome. The AFP gene ep-afp was expressed in Escherichia coli, and the expressed protein exhibited a thermal hysteresis activity of 0.97°C, suggesting its potential role in the deep supercooling ability of E. pela.

  6. Genomic analysis and temperature-dependent transcriptome profiles of the rhizosphere originating strain Pseudomonas aeruginosa M18

    Directory of Open Access Journals (Sweden)

    He Ya-Wen

    2011-08-01

    Full Text Available Background: Our previously published reports described an effective biocontrol agent named Pseudomonas sp. M18, whose 16S rDNA sequence and several regulator genes share homology with those of P. aeruginosa, although the strain shows several unusual phenotypic features. This study aims to explore its strain-specific genomic features and gene expression patterns at different temperatures. Results: The complete M18 genome is composed of a single chromosome of 6,327,754 base pairs containing 5684 open reading frames. Seven genomic islands, including two novel prophages and five specific non-phage islands, were identified besides the conserved P. aeruginosa core genome. Each prophage contains a putative chitinase coding gene, and prophage II contains a capB gene encoding a putative cold stress protein. The non-phage genomic islands contain genes responsible for pyoluteorin biosynthesis, environmental substance degradation and type I and III restriction-modification systems. Compared with other P. aeruginosa strains, the smallest number (3) of insertion sequences and the largest number (3) of clustered regularly interspaced short palindromic repeats in the M18 genome may contribute to its relative genome stability. Although the M18 genome is most closely related to that of P. aeruginosa strain LESB58, strain M18 is more susceptible to several antimicrobial agents and more easily eliminated in a mouse acute lung infection model than strain LESB58. The whole M18 transcriptomic analysis indicated that 10.6% of the expressed genes are temperature-dependent, with 22 genes up-regulated at 28°C in three non-phage genomic islands and one prophage but none at 37°C. Conclusions: The P. aeruginosa strain M18 has evolved its specific genomic structures and temperature-dependent expression patterns to meet the requirement of its fitness and competitiveness under selective pressures imposed on the strain in the rhizosphere niche.

  7. A differential genome-wide transcriptome analysis: impact of cellular copper on complex biological processes like aging and development.

    Directory of Open Access Journals (Sweden)

    Jörg Servos

    Full Text Available The regulation of cellular copper homeostasis is crucial in biology. Impairments lead to severe dysfunctions and are known to affect aging and development. Previously, a loss-of-function mutation in the gene encoding the copper-sensing and copper-regulated transcription factor GRISEA of the filamentous fungus Podospora anserina was reported to lead to cellular copper depletion and a pleiotropic phenotype with hypopigmentation of the mycelium and the ascospores, affected fertility and a lifespan increased by approximately 60% when compared to the wild type. This phenotype is linked to a switch from a copper-dependent standard to an alternative respiration leading to a reduced generation of both reactive oxygen species (ROS) and adenosine triphosphate (ATP). We performed a genome-wide comparative transcriptome analysis of a wild-type strain and the copper-depleted grisea mutant. We unambiguously assigned 9,700 sequences of the transcriptome in both strains to the more than 10,600 predicted and annotated open reading frames of the P. anserina genome, indicating 90% coverage of the transcriptome. 4,752 of the transcripts differed significantly in abundance, with 1,156 transcripts differing at least 3-fold. Selected genes were investigated by qRT-PCR analyses. Apart from this general characterization we analyzed the data with special emphasis on molecular pathways related to the grisea mutation, taking advantage of the available complete genomic sequence of P. anserina. This analysis verified but also corrected conclusions from earlier data obtained by single gene analysis, identified new candidates of factors as part of the cellular copper homeostasis system including target genes of transcription factor GRISEA, and provides a rich reference source of quantitative data for further in-detail investigations. Overall, the present study demonstrates the importance of systems biology approaches also in cases where mutations in single genes are analyzed to

  8. Comparative genome and transcriptome analysis reveals distinctive surface characteristics and unique physiological potentials of Pseudomonas aeruginosa ATCC 27853

    KAUST Repository

    Cao, Huiluo

    2017-06-12

    Pseudomonas aeruginosa ATCC 27853 was isolated from a hospital blood specimen in 1971 and has been widely used as a model strain to survey antibiotic susceptibilities, biofilm development, and metabolic activities of Pseudomonas spp. Although four draft genomes of P. aeruginosa ATCC 27853 have been sequenced, the complete genome of this strain is still lacking, hindering a comprehensive understanding of its physiology and functional genome. Here we sequenced and assembled the complete genome of P. aeruginosa ATCC 27853 using the Pacific Biosciences SMRT (PacBio) technology and Illumina sequencing platform. We found that accessory genes of ATCC 27853 including prophages and genomic islands (GIs) mainly contribute to the difference between P. aeruginosa ATCC 27853 and other P. aeruginosa strains. Seven prophages were identified within the genome of P. aeruginosa ATCC 27853. Of the predicted 25 GIs, three contain genes that encode monooxygenases, dioxygenases and hydrolases that could be involved in the metabolism of aromatic compounds. Surveying virulence-related genes revealed that a series of genes that encode the B-band O-antigen of LPS are lacking in ATCC 27853. Distinctive SNPs in genes of cellular adhesion proteins such as type IV pili and flagella biosynthesis were also observed in this strain. Colony morphology analysis confirmed an enhanced biofilm formation capability of ATCC 27853 on solid agar surface compared to Pseudomonas aeruginosa PAO1. We then performed transcriptome analysis of ATCC 27853 and PAO1 using RNA-seq and compared the expression of orthologous genes to understand the functional genome and the genomic details underlying the distinctive colony morphogenesis. These analyses revealed an increased expression of genes involved in cellular adhesion and biofilm maturation such as type IV pili, exopolysaccharide and electron transport chain components in ATCC 27853 compared with PAO1. In addition, distinctive expression profiles of the

  9. Genome Sequence and Transcriptome Analyses of Chrysochromulina tobin: Metabolic Tools for Enhanced Algal Fitness in the Prominent Order Prymnesiales (Haptophyceae).

    Directory of Open Access Journals (Sweden)

    Blake T Hovde

    Full Text Available Haptophytes are recognized as seminal players in aquatic ecosystem function. These algae are important in global carbon sequestration, form destructive harmful blooms, and given their rich fatty acid content, serve as a highly nutritive food source to a broad range of eco-cohorts. Haptophyte dominance in both fresh and marine waters is supported by the mixotrophic nature of many taxa. Despite their importance the nuclear genome sequence of only one haptophyte, Emiliania huxleyi (Isochrysidales), is available. Here we report the draft genome sequence of Chrysochromulina tobin (Prymnesiales), and transcriptome data collected at seven time points over a 24-hour light/dark cycle. The nuclear genome of C. tobin is small (59 Mb), compact (∼40% of the genome is protein coding) and encodes approximately 16,777 genes. Genes important to fatty acid synthesis, modification, and catabolism show distinct patterns of expression when monitored over the circadian photoperiod. The C. tobin genome harbors the first hybrid polyketide synthase/non-ribosomal peptide synthase gene complex reported for an algal species, and encodes potential anti-microbial peptides and proteins involved in multidrug and toxic compound extrusion. A new haptophyte xanthorhodopsin was also identified, together with two "red" RuBisCO activases that are shared across many algal lineages. The Chrysochromulina tobin genome sequence provides new information on the evolutionary history, ecology and economic importance of haptophytes.

  10. Selecting Superior De Novo Transcriptome Assemblies: Lessons Learned by Leveraging the Best Plant Genome.

    Directory of Open Access Journals (Sweden)

    Loren A Honaas

    Full Text Available Whereas de novo assemblies of RNA-Seq data are being published for a growing number of species across the tree of life, there are currently no broadly accepted methods for evaluating such assemblies. Here we present a detailed comparison of 99 transcriptome assemblies, generated with 6 de novo assemblers including CLC, Trinity, SOAP, Oases, ABySS and NextGENe. Controlled analyses of de novo assemblies for Arabidopsis thaliana and Oryza sativa transcriptomes provide new insights into the strengths and limitations of transcriptome assembly strategies. We find that the leading assemblers generate reassuringly accurate assemblies for the majority of transcripts. At the same time, we find a propensity for assemblers to fail to fully assemble highly expressed genes. Surprisingly, the incidence of true chimeric assemblies is very low for all assemblers. Normalized libraries are reduced in highly abundant transcripts, but they also lack thousands of low abundance transcripts. We conclude that the quality of de novo transcriptome assemblies is best assessed through consideration of a combination of metrics: (1) proportion of reads mapping to an assembly, (2) recovery of conserved, widely expressed genes, (3) N50 length statistics, and (4) the total number of unigenes. We provide benchmark Illumina transcriptome data and introduce SCERNA, a broadly applicable modular protocol for de novo assembly improvement. Finally, our de novo assembly of the Arabidopsis leaf transcriptome revealed ~20 putative Arabidopsis genes lacking in the current annotation.
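
    Of the four metrics listed in the abstract above, the N50 length statistic is the easiest to reproduce. The following is a minimal sketch in Python, assuming the common definition of N50 (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length). The function name and the example contig lengths are hypothetical and are not taken from the cited study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Assumed definition: sort contigs from longest to shortest and return
    the length of the contig at which the running total first reaches
    half of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (in bases), for illustration only.
contigs = [100, 200, 300, 400, 500]
print(n50(contigs))  # 400: 500 + 400 = 900 covers at least half of 1500
```

    N50 alone rewards long contigs regardless of their correctness, which is consistent with the abstract's recommendation to read it alongside the other three metrics.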

  11. Genome and Transcriptome Analysis of the Fungal Pathogen Fusarium oxysporum f. sp. cubense Causing Banana Vascular Wilt Disease

    Science.gov (United States)

    Zeng, Huicai; Fan, Dingding; Zhu, Yabin; Feng, Yue; Wang, Guofen; Peng, Chunfang; Jiang, Xuanting; Zhou, Dajie; Ni, Peixiang; Liang, Changcong; Liu, Lei; Wang, Jun; Mao, Chao

    2014-01-01

    Background The asexual fungus Fusarium oxysporum f. sp. cubense (Foc) causing vascular wilt disease is one of the most devastating pathogens of banana (Musa spp.). To understand the molecular underpinning of pathogenicity in Foc, the genomes and transcriptomes of two Foc isolates were sequenced. Methodology/Principal Findings Genome analysis revealed that the genome structures of race 1 and race 4 isolates were highly syntenic with those of F. oxysporum f. sp. lycopersici strain Fol4287. A large number of putative virulence-associated genes were identified in both Foc genomes, including genes putatively involved in root attachment, cell degradation, detoxification of toxin, transport, secondary metabolite biosynthesis and signal transduction. Importantly, relative to the Foc race 1 isolate (Foc1), the Foc race 4 isolate (Foc4) has evolved with some expanded gene families of transporters and transcription factors for transport of toxins and nutrients that may facilitate its ability to adapt to host environments and contribute to pathogenicity to banana. Transcriptome analysis disclosed a significant difference in transcriptional responses between Foc1 and Foc4 at 48 h post inoculation to the banana ‘Brazil’ in comparison with the vegetative growth stage. Of particular note, more virulence-associated genes were up-regulated in Foc4 than in Foc1. Several signaling pathways like the mitogen-activated protein kinase Fmk1 mediated invasion growth pathway, the FGA1-mediated G protein signaling pathway and a pathogenicity-associated two-component system were activated in Foc4 rather than in Foc1. Together, these differences in gene content and transcription response between Foc1 and Foc4 might account for variation in their virulence during infection of the banana variety ‘Brazil’. Conclusions/Significance Foc genome sequences will help us identify the pathogenicity mechanisms involved in the development of banana vascular wilt disease. These will thus advance

  12. Genome and transcriptome analysis of the fungal pathogen Fusarium oxysporum f. sp. cubense causing banana vascular wilt disease.

    Directory of Open Access Journals (Sweden)

    Lijia Guo

    Full Text Available BACKGROUND: The asexual fungus Fusarium oxysporum f. sp. cubense (Foc) causing vascular wilt disease is one of the most devastating pathogens of banana (Musa spp.). To understand the molecular underpinning of pathogenicity in Foc, the genomes and transcriptomes of two Foc isolates were sequenced. METHODOLOGY/PRINCIPAL FINDINGS: Genome analysis revealed that the genome structures of race 1 and race 4 isolates were highly syntenic with those of F. oxysporum f. sp. lycopersici strain Fol4287. A large number of putative virulence-associated genes were identified in both Foc genomes, including genes putatively involved in root attachment, cell degradation, detoxification of toxin, transport, secondary metabolite biosynthesis and signal transduction. Importantly, relative to the Foc race 1 isolate (Foc1), the Foc race 4 isolate (Foc4) has evolved with some expanded gene families of transporters and transcription factors for transport of toxins and nutrients that may facilitate its ability to adapt to host environments and contribute to pathogenicity to banana. Transcriptome analysis disclosed a significant difference in transcriptional responses between Foc1 and Foc4 at 48 h post inoculation to the banana 'Brazil' in comparison with the vegetative growth stage. Of particular note, more virulence-associated genes were up-regulated in Foc4 than in Foc1. Several signaling pathways like the mitogen-activated protein kinase Fmk1 mediated invasion growth pathway, the FGA1-mediated G protein signaling pathway and a pathogenicity-associated two-component system were activated in Foc4 rather than in Foc1. Together, these differences in gene content and transcription response between Foc1 and Foc4 might account for variation in their virulence during infection of the banana variety 'Brazil'. CONCLUSIONS/SIGNIFICANCE: Foc genome sequences will help us identify the pathogenicity mechanisms involved in the development of banana vascular wilt disease. These will

  13. Exploring Networks at the genome scale

    NARCIS (Netherlands)

    Lam, M.C.; Puchalka, J.; Diez, M.S.; Martins Dos Santos, V.A.P.

    2010-01-01

    Systems biology is aimed at achieving a holistic understanding of living organisms, while synthetic biology seeks to design and construct new living organisms with targeted functionalities. Genome sequencing and the fields of ‘omics’ technology have proven a goldmine of information for scientists

  14. Scaling, crumpled wires, and genome packing in virions

    Science.gov (United States)

    de Holanda, V. H.; Gomes, M. A. F.

    2016-12-01

    The packing of a genome in virions is a topic of intense current interest in biology and biological physics. The area is dominated by allometric scaling relations that connect, e.g., the length of the encapsulated genome and the size of the corresponding virion capsid. Here we report scaling laws obtained from extensive experiments of packing of a macroscopic wire within rigid three-dimensional spherical and nonspherical cavities that can shed light on the details of the genome packing in virions. We show that these results obtained with crumpled wires are comparable to those from a large compilation of biological data from several classes of virions.
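
    The allometric scaling relations mentioned in the abstract above are power laws of the form y = a·x^b, and the exponent b is commonly estimated by a straight-line fit on log-transformed data. The following is a minimal sketch in Python under that assumption. The function name and the synthetic data are hypothetical, and the abstract does not state which fitting procedure the authors used.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by ordinary least squares on log-log data.

    Returns the tuple (a, b). All inputs must be strictly positive.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two or more paired observations")
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx                       # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)   # intercept in log-log space = log(a)
    return a, b


# Synthetic, noise-free data generated from y = 3 * x**0.5 (illustration only).
xs = [1.0, 2.0, 4.0, 8.0]
ys = [3.0 * x ** 0.5 for x in xs]
a, b = fit_power_law(xs, ys)
print(round(a, 6), round(b, 6))
```

    With real measurements, such as genome length against capsid size, the scatter around the fitted line would need to be reported alongside the exponent.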

  15. Using Genome-scale Models to Predict Biological Capabilities

    DEFF Research Database (Denmark)

    O’Brien, Edward J.; Monk, Jonathan M.; Palsson, Bernhard O.

    2015-01-01

    Constraint-based reconstruction and analysis (COBRA) methods at the genome scale have been under development since the first whole-genome sequences appeared in the mid-1990s. A few years ago, this approach began to demonstrate the ability to predict a range of cellular functions, including cellular...... growth capabilities on various substrates and the effect of gene knockouts at the genome scale. Thus, much interest has developed in understanding and applying these methods to areas such as metabolic engineering, antibiotic design, and organismal and enzyme evolution. This Primer will get you started....

  16. Use of genome-scale microbial models for metabolic engineering

    DEFF Research Database (Denmark)

    Patil, Kiran Raosaheb; Åkesson, M.; Nielsen, Jens

    2004-01-01

    Metabolic engineering serves as an integrated approach to design new cell factories by providing rational design procedures and valuable mathematical and experimental tools. Mathematical models have an important role for phenotypic analysis, but can also be used for the design of optimal metabolic network structures. The major challenge for metabolic engineering in the post-genomic era is to broaden its design methodologies to incorporate genome-scale biological data. Genome-scale stoichiometric models of microorganisms represent a first step in this direction.

  17. In the fast lane: large-scale bacterial genome engineering.

    Science.gov (United States)

    Fehér, Tamás; Burland, Valerie; Pósfai, György

    2012-07-31

    The last few years have witnessed rapid progress in bacterial genome engineering. The long-established, standard ways of DNA synthesis, modification, transfer into living cells, and incorporation into genomes have given way to more effective, large-scale, robust genome modification protocols. Expansion of these engineering capabilities is due to several factors. Key advances include: (i) progress in oligonucleotide synthesis and in vitro and in vivo assembly methods, (ii) optimization of recombineering techniques, (iii) introduction of parallel, large-scale, combinatorial, and automated genome modification procedures, and (iv) rapid identification of the modifications by barcode-based analysis and sequencing. Combination of the brute force of these techniques with sophisticated bioinformatic design and modeling opens up new avenues for the analysis of gene functions and cellular network interactions, but also in engineering more effective producer strains. This review presents a summary of recent technological advances in bacterial genome engineering.

  18. LEMONS – A Tool for the Identification of Splice Junctions in Transcriptomes of Organisms Lacking Reference Genomes

    Science.gov (United States)

    Bouskila, Amos; Chorev, Michal; Carmel, Liran; Mishmar, Dan

    2015-01-01

    RNA-seq is becoming a preferred tool for genomics studies of model and non-model organisms. However, DNA-based analysis of organisms lacking sequenced genomes cannot rely on RNA-seq data alone to isolate most genes of interest, as DNA codes both exons and introns. With this in mind, we designed a novel tool, LEMONS, that exploits the evolutionary conservation of both exon/intron boundary positions and splice junction recognition signals to produce high throughput splice-junction predictions in the absence of a reference genome. When tested on multiple annotated vertebrate mRNA data, LEMONS accurately identified 87% (average) of the splice-junctions. LEMONS was then applied to our updated Mediterranean chameleon transcriptome, which lacks a reference genome, and predicted a total of 90,820 exon-exon junctions. We experimentally verified these splice-junction predictions by amplifying and sequencing twenty randomly selected genes from chameleon DNA templates. Exons and introns were detected in 19 of 20 of the positions predicted by LEMONS. To the best of our knowledge, LEMONS is currently the only experimentally verified tool that can accurately predict splice-junctions in organisms that lack a reference genome. PMID:26606265

  19. Complete genome sequence and transcriptomics analyses reveal pigment biosynthesis and regulatory mechanisms in an industrial strain, Monascus purpureus YY-1.

    Science.gov (United States)

    Yang, Yue; Liu, Bin; Du, Xinjun; Li, Ping; Liang, Bin; Cheng, Xiaozhen; Du, Liangcheng; Huang, Di; Wang, Lei; Wang, Shuo

    2015-02-09

    Monascus has been used to produce natural colorants and food supplements for more than one thousand years, and more than one billion people eat Monascus-fermented products in their daily life. In this study, using next-generation sequencing and optical mapping approaches, a 24.1-Mb complete genome of an industrial strain, Monascus purpureus YY-1, was obtained. This genome consists of eight chromosomes and 7,491 genes. Phylogenetic analysis at the genome level provides convincing evidence for the evolutionary position of M. purpureus. We provide the first comprehensive prediction of the biosynthetic pathway for Monascus pigment. Comparative genomic analyses show that the genome of M. purpureus is 13.6-40% smaller than those of closely related filamentous fungi and has undergone significant gene losses, most of which likely occurred during its specialized adaptation to starch-based foods. Comparative transcriptome analysis reveals that carbon starvation stress, resulting from the use of relatively low-quality carbon sources, contributes to the high yield of pigments by repressing central carbon metabolism and augmenting the acetyl-CoA pool. Our work provides important insights into the evolution of this economically important fungus and lays a foundation for future genetic manipulation and engineering of this strain.

  20. Impact of a short-term exposure to spaceflight on the phenotype, genome, transcriptome and proteome of Escherichia coli

    Science.gov (United States)

    Li, Tianzhi; Chang, De; Xu, Huiwen; Chen, Jiapeng; Su, Longxiang; Guo, Yinghua; Chen, Zhenhong; Wang, Yajuan; Wang, Li; Wang, Junfeng; Fang, Xiangqun; Liu, Changting

    2015-07-01

    Escherichia coli (E. coli) is the most widely applied model organism in current biological science. As a widespread opportunistic pathogen, E. coli can survive not only in symbiosis with humans but also outside the host, which necessitates the evaluation of its response to the space environment. Therefore, to keep humans safe in space, it is necessary to understand how the bacteria respond to this environment. Despite extensive investigations over a few decades, the response of E. coli to the real space environment is still controversial. To better understand the mechanisms by which E. coli overcomes harsh environments such as microgravity in space and to investigate whether these factors may induce pathogenic changes in E. coli that are potentially detrimental to astronauts, we conducted detailed genomic, transcriptomic and proteomic studies on E. coli that experienced 17 days of spaceflight. By comparing two flight strains LCT-EC52 and LCT-EC59 to a control strain LCT-EC106 that was cultured under the same temperature conditions on the ground, we identified metabolism changes, polymorphism changes, and differentially expressed genes and proteins in the two flight strains. The flight strains differed from the control in the utilization of more than 30 carbon sources. Two single nucleotide polymorphisms (SNPs) and one deletion were identified in the flight strains. The expression level of more than 1000 genes was altered in the flight strains. Genes involved in chemotaxis, lipid metabolism and cell motility were expressed differently. Moreover, the two flight strains also differed extensively from each other in terms of metabolism, transcriptome and proteome, indicating that the impact of the space environment on individual cells is heterogeneous and probably genotype-dependent. This study presents the first systematic profile of the E. coli genome, transcriptome and proteome after spaceflight, which helps to elucidate the mechanism that controls the adaptation of microbes to the space

  1. Comparative sequence analyses of genome and transcriptome reveal novel transcripts and variants in the Asian elephant Elephas maximus

    Indian Academy of Sciences (India)

    Puli Chandramouli Reddy; Ishani Sinha; Ashwin Kelkar; Farhat Habib; Saurabh J Pradhan; Raman Sukumar; Sanjeev Galande

    2015-12-01

    The Asian elephant Elephas maximus and the African elephant Loxodonta africana, which diverged 5-7 million years ago, exhibit differences in their physiology, behaviour and morphology. A comparative genomics approach would be useful and necessary for evolutionary and functional genetic studies of elephants. We performed sequencing of E. maximus and mapped it to L. africana at ~15X coverage. Through comparative sequence analyses, we have identified Asian elephant-specific homozygous, non-synonymous single nucleotide variants (SNVs) that map to 1514 protein coding genes, many of which are involved in olfaction. We also present the first report of a high-coverage transcriptome sequence in E. maximus from peripheral blood lymphocytes. We have identified 103 novel protein coding transcripts and 66 long non-coding (lnc)RNAs. We also report the presence of 181 protein domains unique to elephants when compared to other Afrotheria species. Each of these findings can be further investigated to gain a better understanding of functional differences unique to elephant species, as well as those unique to elephantids in comparison with other mammals. This work therefore provides a valuable resource to explore the immense research potential of comparative analyses of transcriptome and genome sequences in the Asian elephant.

  2. Comparative sequence analyses of genome and transcriptome reveal novel transcripts and variants in the Asian elephant Elephas maximus.

    Science.gov (United States)

    Reddy, Puli Chandramouli; Sinha, Ishani; Kelkar, Ashwin; Habib, Farhat; Pradhan, Saurabh J; Sukumar, Raman; Galande, Sanjeev

    2015-12-01

    The Asian elephant Elephas maximus and the African elephant Loxodonta africana, which diverged 5-7 million years ago, exhibit differences in their physiology, behaviour and morphology. A comparative genomics approach would be useful and necessary for evolutionary and functional genetic studies of elephants. We performed sequencing of E. maximus and mapped it to L. africana at ~15X coverage. Through comparative sequence analyses, we have identified Asian elephant-specific homozygous, non-synonymous single nucleotide variants (SNVs) that map to 1514 protein coding genes, many of which are involved in olfaction. We also present the first report of a high-coverage transcriptome sequence in E. maximus from peripheral blood lymphocytes. We have identified 103 novel protein coding transcripts and 66 long non-coding (lnc)RNAs. We also report the presence of 181 protein domains unique to elephants when compared to other Afrotheria species. Each of these findings can be further investigated to gain a better understanding of functional differences unique to elephant species, as well as those unique to elephantids in comparison with other mammals. This work therefore provides a valuable resource to explore the immense research potential of comparative analyses of transcriptome and genome sequences in the Asian elephant.

  3. Comparative transcriptome assembly and genome-guided profiling for Brettanomyces bruxellensis LAMAP2480 during p-coumaric acid stress

    Science.gov (United States)

    Godoy, Liliana; Vera-Wolf, Patricia; Martinez, Claudio; Ugalde, Juan A.; Ganga, María Angélica

    2016-01-01

    Brettanomyces bruxellensis has been described as the main contaminant yeast in wine production, due to its ability to convert the hydroxycinnamic acids naturally present in the grape phenolic derivatives into volatile phenols. Currently, there are no studies in B. bruxellensis that explain the resistance mechanisms to hydroxycinnamic acids, and in particular to p-coumaric acid, which is directly involved in alterations to wine. In this work, we performed a transcriptome analysis of B. bruxellensis LAMAP2480 grown in the presence and absence of p-coumaric acid during lag phase. Because of reported genetic variability among B. bruxellensis strains, to complement de novo assembly of the transcripts, we used the high-quality genome of B. bruxellensis AWRI1499, as well as the draft genomes of strains CBS2499 and LAMAP2480. The results from the transcriptome analysis allowed us to propose a model in which the entrance of p-coumaric acid to the cell generates a generalized stress condition, in which the expression of proton pump and efflux of toxic compounds are induced. In addition, these mechanisms could be involved in the outflux of nitrogen compounds, such as amino acids, decreasing the overall concentration and triggering the expression of nitrogen metabolism genes. PMID:27678167

  4. Genomic and transcriptome profiling identified both human and HBV genetic variations and their interactions in Chinese hepatocellular carcinoma

    Directory of Open Access Journals (Sweden)

    Hua Dong

    2015-12-01

    Full Text Available Interaction between HBV and host genome integrations in hepatocellular carcinoma (HCC) development is a complex process and the mechanism is still unclear. Here we describe in detail the quality controls and data mining of aCGH and transcriptome sequencing data on 50 HCC samples from Chinese patients, published by Dong et al. (2015) (GEO#: GSE65486). In addition to the HBV-MLL4 integration discovered, we also investigated the genetic aberrations of HBV and host genes as well as their genetic interactions. We reported human genome copy number changes and frequent transcriptome variations (e.g. TP53, CTNNB1 mutation, especially MLL family mutations) in this cohort of patients. For HBV genotype C, we identified a novel linkage disequilibrium region covering HBV replication regulatory elements, including basal core promoter, DR1, epsilon and poly-A regions, which is associated with HBV core antigen over-expression and almost exclusive to HBV-MLL4 integration.

  5. De novo assembly of a genome-wide transcriptome map of Vicia faba (L.) for transfer cell research.

    Science.gov (United States)

    Arun-Chinnappa, Kiruba S; McCurdy, David W

    2015-01-01

    Vicia faba (L.) is an important cool-season grain legume species used widely in agriculture but also in plant physiology research, particularly as an experimental model to study transfer cell (TC) development. TCs are specialized nutrient transport cells in plants, characterized by invaginated wall ingrowths with amplified plasma membrane surface area enriched with transporter proteins that facilitate nutrient transfer. Many TCs are formed by trans-differentiation from differentiated cells at apoplasmic/symplasmic boundaries in nutrient transport. Adaxial epidermal cells of isolated cotyledons can be induced to form functional TCs, thus providing a valuable experimental system to investigate genetic regulation of TC trans-differentiation. The genome of V. faba is exceedingly large (ca. 13 Gb), however, and limited genomic information is available for this species. To provide a resource for future transcript profiling of epidermal TC differentiation, we have undertaken de novo assembly of a genome-wide transcriptome map for V. faba. Illumina paired-end sequencing of total RNA pooled from different tissues and different stages, including isolated cotyledons induced to form epidermal TCs, generated 69.5 M reads, of which 65.8 M were used for assembly following trimming and quality control. Assembly using a De-Bruijn graph-based approach generated 21,297 contigs, of which 80.6% were successfully annotated against GO terms. The assembly was validated against known V. faba cDNAs held in GenBank, including transcripts previously identified as being specifically expressed in epidermal cells across TC trans-differentiation. This genome-wide transcriptome map therefore provides a valuable tool for future transcript profiling of epidermal TC trans-differentiation, and also enriches the genetic resources available for this important legume crop species.
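
    The De Bruijn graph-based assembly mentioned in this abstract can be illustrated with a toy sketch. The abstract does not name the assembler or the k-mer size, so the function names, the value of k and the three reads below are hypothetical and chosen only to show the idea: reads are split into k-mers, each k-mer becomes an edge between its (k-1)-mer prefix and suffix, and a contig is read off by walking through nodes that have a single successor.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the (k-1)-mers that follow it in some read."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return graph


def walk_unambiguous(graph, start):
    """Extend a contig from `start` while each node has exactly one distinct successor."""
    contig = start
    node = start
    seen = {start}
    while True:
        successors = set(graph.get(node, []))
        if len(successors) != 1:
            break  # dead end or branch: stop extending
        node = successors.pop()
        if node in seen:
            break  # cycle: stop rather than loop forever
        seen.add(node)
        contig += node[-1]
    return contig


# Hypothetical overlapping reads, not data from the study.
reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]
graph = build_de_bruijn(reads, k=4)
contig = walk_unambiguous(graph, "ATG")
print(contig)
```

    Production assemblers additionally handle sequencing errors, reverse complements and coverage, none of which this sketch attempts.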

  6. Integration of gene expression data into genome-scale metabolic models

    DEFF Research Database (Denmark)

    Åkesson, M.; Förster, Jochen; Nielsen, Jens

    2004-01-01

    A framework for integration of transcriptome data into stoichiometric metabolic models to obtain improved flux predictions is presented. The key idea is to exploit the regulatory information in the expression data to give additional constraints on the metabolic fluxes in the model. Measurements of gene expression from chemostat and batch cultures of Saccharomyces cerevisiae were combined with a recently developed genome-scale model, and the computed metabolic flux distributions were compared to experimental values from carbon labeling experiments and metabolic network analysis. The integration of expression data resulted in improved predictions of metabolic behavior in batch cultures, enabling quantitative predictions of exchange fluxes as well as qualitative estimations of changes in intracellular fluxes. A critical discussion of correlation between gene expression and metabolic fluxes is given.
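
    The key idea described here, using expression data to place extra constraints on fluxes, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the reaction names, gene names and the rule "close a reaction when none of its associated genes is expressed" are assumptions made for the example, and real genome-scale models use richer gene-reaction rules.

```python
def constrain_bounds(bounds, reaction_genes, expressed):
    """Return new (lower, upper) flux bounds, closing reactions with no expressed gene.

    bounds:         dict reaction -> (lower, upper) flux bound
    reaction_genes: dict reaction -> list of genes assumed to catalyse it
    expressed:      set of genes called 'present' in the expression data
    """
    constrained = {}
    for rxn, (lower, upper) in bounds.items():
        genes = reaction_genes.get(rxn, [])
        if genes and not any(g in expressed for g in genes):
            # Every gene mapped to this reaction is absent: force zero flux.
            constrained[rxn] = (0.0, 0.0)
        else:
            # No gene annotation, or at least one gene present: keep the bounds.
            constrained[rxn] = (lower, upper)
    return constrained


# Hypothetical toy network, not taken from the yeast model in the abstract.
bounds = {"R1": (0.0, 10.0), "R2": (-5.0, 5.0), "R3": (0.0, 8.0)}
reaction_genes = {"R1": ["geneA"], "R2": ["geneB", "geneC"]}
expressed = {"geneC"}

new_bounds = constrain_bounds(bounds, reaction_genes, expressed)
print(new_bounds)
```

    The tightened bounds would then be passed to a flux balance solver in place of the original ones; that optimization step is outside this sketch.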

  7. Transcriptome population genomics reveals severe bottleneck and domestication cost in the African rice (Oryza glaberrima).

    Science.gov (United States)

    Nabholz, Benoit; Sarah, Gautier; Sabot, François; Ruiz, Manuel; Adam, Hélène; Nidelet, Sabine; Ghesquière, Alain; Santoni, Sylvain; David, Jacques; Glémin, Sylvain

    2014-05-01

    The African cultivated rice (Oryza glaberrima) was domesticated in West Africa 3000 years ago. Although less cultivated than the Asian rice (O. sativa), O. glaberrima landraces often display interesting adaptation to rustic environment (e.g. drought). Here, using RNA-seq technology, we were able to compare more than 12,000 transcripts between 9 O. glaberrima, 10 wild O. barthii and one O. meridionalis individuals. With a synonymous nucleotide diversity πs = 0.0006 per site, O. glaberrima appears as the least genetically diverse crop grass ever documented. Using approximate Bayesian computation, we estimated that O. glaberrima experienced a severe bottleneck during domestication. This demographic scenario almost fully accounts for the pattern of genetic diversity across the O. glaberrima genome, as we detected very few outlier regions where positive selection may have further impacted genetic diversity. Moreover, the large excess of derived nonsynonymous substitutions that we detected suggests that the O. glaberrima population suffered from the 'cost of domestication'. In addition, we used this genome-scale data set to demonstrate that (i) O. barthii genetic diversity is positively correlated with recombination rate and negatively with gene density, (ii) expression level is negatively correlated with evolutionary constraint, and (iii) one region on chromosome 5 (position 4-6 Mb) exhibits a clear signature of introgression with a yet unidentified Oryza species. This work represents the first genome-wide survey of the African rice genetic diversity and paves the way for further comparison between the African and the Asian rice, notably regarding the genetics underlying domestication traits.
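
    The diversity statistic quoted in this abstract is a per-site average of pairwise differences between sequences. A minimal sketch of that calculation is given below; the function name and the toy alignment are hypothetical, and the code counts differences over whatever aligned sites it is given, so obtaining πs as in the abstract would require passing only synonymous sites.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site among aligned sequences."""
    n_sites = len(seqs[0])
    if any(len(s) != n_sites for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every pair of sequences.
    total_diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diffs / (len(pairs) * n_sites)


# Hypothetical alignment of three sequences over eight sites.
alignment = ["ACGTACGT", "ACGTACGA", "ACGTACGT"]
pi = nucleotide_diversity(alignment)
print(pi)
```

    In this toy alignment two of the three pairs differ at one site each, so the result is 2 differences over 3 pairs and 8 sites.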

  8. The OME Framework for genome-scale systems biology

    Energy Technology Data Exchange (ETDEWEB)

    Palsson, Bernhard O. [Univ. of California, San Diego, CA (United States); Ebrahim, Ali [Univ. of California, San Diego, CA (United States); Federowicz, Steve [Univ. of California, San Diego, CA (United States)

    2014-12-19

    The life sciences are undergoing continuous and accelerating integration with computational and engineering sciences. The biology that many in the field have been trained on may be hardly recognizable in ten to twenty years. One of the major drivers for this transformation is the blistering pace of advancements in DNA sequencing and synthesis. These advances have resulted in unprecedented amounts of new data, information, and knowledge. Many software tools have been developed to deal with aspects of this transformation and each is sorely needed [1-3]. However, few of these tools have been forced to deal with the full complexity of genome-scale models along with high-throughput genome-scale data. This particular situation represents a unique challenge, as it is simultaneously necessary to deal with the vast breadth of genome-scale models and the dizzying depth of high-throughput datasets. It has been observed time and again that as the pace of data generation continues to accelerate, the pace of analysis significantly lags behind [4]. It is also evident that, given the plethora of databases and software efforts [5-12], it is still a significant challenge to work with genome-scale metabolic models, let alone next-generation whole cell models [13-15]. We work at the forefront of model creation and systems scale data generation [16-18]. The OME Framework was born out of a practical need to enable genome-scale modeling and data analysis under a unified framework to drive the next generation of genome-scale biological models. Here we present the OME Framework. It exists as a set of Python classes. However, we want to emphasize the importance of the underlying design as an addition to the discussions on specifications of a digital cell. A great deal of work and valuable progress has been made by a number of communities [13, 19-24] towards interchange formats and implementations designed to achieve similar goals. While many software tools exist for handling genome-scale

  9. The mitochondrial genome and transcriptome of the basal dinoflagellate Hematodinium sp.: character evolution within the highly derived mitochondrial genomes of dinoflagellates.

    Science.gov (United States)

    Jackson, C J; Gornik, S G; Waller, R F

    2012-01-01

    The sister phyla dinoflagellates and apicomplexans inherited a drastically reduced mitochondrial genome (mitochondrial DNA, mtDNA) containing only three protein-coding (cob, cox1, and cox3) genes and two ribosomal RNA (rRNA) genes. In apicomplexans, single copies of these genes are encoded on the smallest known mtDNA chromosome (6 kb). In dinoflagellates, however, the genome has undergone further substantial modifications, including massive genome amplification and recombination resulting in multiple copies of each gene and gene fragments linked in numerous combinations. Furthermore, protein-encoding genes have lost standard stop codons, trans-splicing of messenger RNAs (mRNAs) is required to generate complete cox3 transcripts, and extensive RNA editing recodes most genes. From taxa investigated to date, it is unclear when many of these unusual dinoflagellate mtDNA characters evolved. To address this question, we investigated the mitochondrial genome and transcriptome character states of the deep branching dinoflagellate Hematodinium sp. Genomic data show that like later-branching dinoflagellates Hematodinium sp. also contains an inflated, heavily recombined genome of multicopy genes and gene fragments. Although stop codons are also lacking for cox1 and cob, cox3 still encodes a conventional stop codon. Extensive editing of mRNAs also occurs in Hematodinium sp. The mtDNA of basal dinoflagellate Hematodinium sp. indicates that much of the mtDNA modification in dinoflagellates occurred early in this lineage, including genome amplification and recombination, and decreased use of standard stop codons. Trans-splicing, on the other hand, occurred after Hematodinium sp. diverged. Only RNA editing presents a nonlinear pattern of evolution in dinoflagellates as this process occurs in Hematodinium sp. but is absent in some later-branching taxa indicating that this process was either lost in some lineages or developed more than once during the evolution of the highly unusual

  10. Traumatic Brain Injury Induces Genome-Wide Transcriptomic, Methylomic, and Network Perturbations in Brain and Blood Predicting Neurological Disorders

    Directory of Open Access Journals (Sweden)

    Qingying Meng

    2017-02-01

    Full Text Available The complexity of the traumatic brain injury (TBI) pathology, particularly concussive injury, is a serious obstacle for diagnosis, treatment, and long-term prognosis. Here we utilize modern systems biology in a rodent model of concussive injury to gain a thorough view of the impact of TBI on fundamental aspects of gene regulation, which have the potential to drive or alter the course of the TBI pathology. TBI perturbed epigenomic programming, transcriptional activities (expression level and alternative splicing), and the organization of genes in networks centered around genes such as Anax2, Ogn, and Fmod. Transcriptomic signatures in the hippocampus are involved in neuronal signaling, metabolism, inflammation, and blood function, and they overlap with those in leukocytes from peripheral blood. The homology between genomic signatures from blood and brain elicited by TBI provides proof of concept information for development of biomarkers of TBI based on composite genomic patterns. By intersecting with human genome-wide association studies, many TBI signature genes and network regulators identified in our rodent model were causally associated with brain disorders with relevant link to TBI. The overall results show that concussive brain injury reprograms genes which could lead to predisposition to neurological and psychiatric disorders, and that genomic information from peripheral leukocytes has the potential to predict TBI pathogenesis in the brain.

  11. Traumatic Brain Injury Induces Genome-Wide Transcriptomic, Methylomic, and Network Perturbations in Brain and Blood Predicting Neurological Disorders.

    Science.gov (United States)

    Meng, Qingying; Zhuang, Yumei; Ying, Zhe; Agrawal, Rahul; Yang, Xia; Gomez-Pinilla, Fernando

    2017-02-01

    The complexity of the traumatic brain injury (TBI) pathology, particularly concussive injury, is a serious obstacle for diagnosis, treatment, and long-term prognosis. Here we utilize modern systems biology in a rodent model of concussive injury to gain a thorough view of the impact of TBI on fundamental aspects of gene regulation, which have the potential to drive or alter the course of the TBI pathology. TBI perturbed epigenomic programming, transcriptional activities (expression level and alternative splicing), and the organization of genes in networks centered around genes such as Anax2, Ogn, and Fmod. Transcriptomic signatures in the hippocampus are involved in neuronal signaling, metabolism, inflammation, and blood function, and they overlap with those in leukocytes from peripheral blood. The homology between genomic signatures from blood and brain elicited by TBI provides proof of concept information for development of biomarkers of TBI based on composite genomic patterns. By intersecting with human genome-wide association studies, many TBI signature genes and network regulators identified in our rodent model were causally associated with brain disorders with relevant link to TBI. The overall results show that concussive brain injury reprograms genes which could lead to predisposition to neurological and psychiatric disorders, and that genomic information from peripheral leukocytes has the potential to predict TBI pathogenesis in the brain.

  12. Comparative genomic and transcriptomic analysis revealed genetic characteristics related to solvent formation and xylose utilization in Clostridium acetobutylicum EA 2018

    Directory of Open Access Journals (Sweden)

    Wang Shengyue

    2011-02-01

    Full Text Available Abstract Background Clostridium acetobutylicum, a gram-positive and spore-forming anaerobe, is a major strain for the fermentative production of acetone, butanol and ethanol. But a previously isolated hyper-butanol producing strain C. acetobutylicum EA 2018 does not produce spores and has greater capability of solvent production, especially for butanol, than the type strain C. acetobutylicum ATCC 824. Results Complete genome of C. acetobutylicum EA 2018 was sequenced using Roche 454 pyrosequencing. Genomic comparison with ATCC 824 identified many variations which may contribute to the hyper-butanol producing characteristics in the EA 2018 strain, including a total of 46 deletion sites and 26 insertion sites. In addition, transcriptomic profiling of gene expression in EA 2018 relative to that of ATCC824 revealed expression-level changes of several key genes related to solvent formation. For example, spo0A and adhEII have higher expression level, and most of the acid formation related genes have lower expression level in EA 2018. Interestingly, the results also showed that the variation in CEA_G2622 (CAC2613 in ATCC 824), a putative transcriptional regulator involved in xylose utilization, might accelerate utilization of substrate xylose. Conclusions Comparative analysis of C. acetobutylicum hyper-butanol producing strain EA 2018 and type strain ATCC 824 at both genomic and transcriptomic levels, for the first time, provides molecular-level understanding of non-sporulation, higher solvent production and enhanced xylose utilization in the mutant EA 2018. The information could be valuable for further genetic modification of C. acetobutylicum for more effective butanol production.

  13. Genome-wide comparative transcriptome analysis of CMS-D2 and its maintainer and restorer lines in upland cotton.

    Science.gov (United States)

    Wu, Jianyong; Zhang, Meng; Zhang, Bingbing; Zhang, Xuexian; Guo, Liping; Qi, Tingxiang; Wang, Hailin; Zhang, Jinfa; Xing, Chaozhu

    2017-06-08

    Cytoplasmic male sterility (CMS) conferred by the cytoplasm from Gossypium harknessii (D2) is an important system for hybrid seed production in Upland cotton (G. hirsutum). The male sterility of CMS-D2 (i.e., A line) can be restored to fertility by a restorer (i.e., R line) carrying the restorer gene Rf1 transferred from the D2 nuclear genome. However, the molecular mechanisms of CMS-D2 and its restoration are poorly understood. In this study, a genome-wide comparative transcriptome analysis was performed to identify differentially expressed genes (DEGs) in flower buds among the isogenic fertile R line and sterile A line derived from a backcross population (BC8F1) and the recurrent parent, i.e., the maintainer (B line). A total of 1464 DEGs were identified among the three isogenic lines, and the Rf1-carrying Chr_D05 and its homeologous Chr_A05 had more DEGs than other chromosomes. The results of GO and KEGG enrichment analysis showed differences in circadian rhythm between the fertile and sterile lines. Eleven DEGs were selected for validation using qRT-PCR, confirming the accuracy of the RNA-seq results. Through genome-wide comparative transcriptome analysis, the differential expression profiles of CMS-D2 and its maintainer and restorer lines in Upland cotton were identified. Our results provide an important foundation for further studies into the molecular mechanisms of the interactions between the restorer gene Rf1 and the CMS-D2 cytoplasm.

  14. Comparative genomic and transcriptomic analysis revealed genetic characteristics related to solvent formation and xylose utilization in Clostridium acetobutylicum EA 2018

    Science.gov (United States)

    2011-01-01

    Background Clostridium acetobutylicum, a gram-positive and spore-forming anaerobe, is a major strain for the fermentative production of acetone, butanol and ethanol. But a previously isolated hyper-butanol producing strain C. acetobutylicum EA 2018 does not produce spores and has greater capability of solvent production, especially for butanol, than the type strain C. acetobutylicum ATCC 824. Results Complete genome of C. acetobutylicum EA 2018 was sequenced using Roche 454 pyrosequencing. Genomic comparison with ATCC 824 identified many variations which may contribute to the hyper-butanol producing characteristics in the EA 2018 strain, including a total of 46 deletion sites and 26 insertion sites. In addition, transcriptomic profiling of gene expression in EA 2018 relative to that of ATCC824 revealed expression-level changes of several key genes related to solvent formation. For example, spo0A and adhEII have higher expression level, and most of the acid formation related genes have lower expression level in EA 2018. Interestingly, the results also showed that the variation in CEA_G2622 (CAC2613 in ATCC 824), a putative transcriptional regulator involved in xylose utilization, might accelerate utilization of substrate xylose. Conclusions Comparative analysis of C. acetobutylicum hyper-butanol producing strain EA 2018 and type strain ATCC 824 at both genomic and transcriptomic levels, for the first time, provides molecular-level understanding of non-sporulation, higher solvent production and enhanced xylose utilization in the mutant EA 2018. The information could be valuable for further genetic modification of C. acetobutylicum for more effective butanol production. PMID:21284892

  15. Genomic and Transcriptomic Studies of an RDX (Hexahydro-1,3,5-Trinitro-1,3,5-Triazine)-Degrading Actinobacterium

    Science.gov (United States)

    Chen, Hao-Ping; Zhu, Song-Hua; Casabon, Israël; Hallam, Steven J.; Crocker, Fiona H.; Mohn, William W.

    2012-01-01

    Whole-genome sequencing, transcriptomic analyses, and metabolic reconstruction were used to investigate Gordonia sp. strain KTR9's ability to catabolize a range of compounds, including explosives and steroids. Aspects of this mycolic acid-containing actinobacterium's catabolic potential were experimentally verified and compared with those of rhodococci and mycobacteria. PMID:22923396

  16. Discovery of genes related to insecticide resistance in Bactrocera dorsalis by functional genomic analysis of a de novo assembled transcriptome.

    Directory of Open Access Journals (Sweden)

    Ju-Chun Hsu

    Full Text Available Insecticide resistance has recently become a critical concern for control of many insect pest species. Genome sequencing and global quantization of gene expression through analysis of the transcriptome can provide useful information relevant to this challenging problem. The oriental fruit fly, Bactrocera dorsalis, is one of the world's most destructive agricultural pests, and recently it has been used as a target for studies of genetic mechanisms related to insecticide resistance. However, prior to this study, the molecular data available for this species was largely limited to genes identified through homology. To provide a broader pool of gene sequences of potential interest with regard to insecticide resistance, this study uses whole transcriptome analysis developed through de novo assembly of short reads generated by next-generation sequencing (NGS). The transcriptome of B. dorsalis was initially constructed using Illumina's Solexa sequencing technology. Qualified reads were assembled into contigs and potential splicing variants (isotigs). A total of 29,067 isotigs have putative homologues in the non-redundant (nr) protein database from NCBI, and 11,073 of these correspond to distinct D. melanogaster proteins in the RefSeq database. Approximately 5,546 isotigs contain coding sequences that are at least 80% complete and appear to represent B. dorsalis genes. We observed a strong correlation between the completeness of the assembled sequences and the expression intensity of the transcripts. The assembled sequences were also used to identify large numbers of genes potentially belonging to families related to insecticide resistance. A total of 90 P450-, 42 GST- and 37 COE-related genes, representing three major enzyme families involved in insecticide metabolism and resistance, were identified. In addition, 36 isotigs were discovered to contain target site sequences related to four classes of resistance genes. Identified sequence motifs were also

  17. Discovery of Genes Related to Insecticide Resistance in Bactrocera dorsalis by Functional Genomic Analysis of a De Novo Assembled Transcriptome

    Science.gov (United States)

    Hsu, Ju-Chun; Wu, Wen-Jer; Feng, Hai-Tung; Haymer, David S.; Chen, Chien-Yu

    2012-01-01

    Insecticide resistance has recently become a critical concern for control of many insect pest species. Genome sequencing and global quantization of gene expression through analysis of the transcriptome can provide useful information relevant to this challenging problem. The oriental fruit fly, Bactrocera dorsalis, is one of the world's most destructive agricultural pests, and recently it has been used as a target for studies of genetic mechanisms related to insecticide resistance. However, prior to this study, the molecular data available for this species was largely limited to genes identified through homology. To provide a broader pool of gene sequences of potential interest with regard to insecticide resistance, this study uses whole transcriptome analysis developed through de novo assembly of short reads generated by next-generation sequencing (NGS). The transcriptome of B. dorsalis was initially constructed using Illumina's Solexa sequencing technology. Qualified reads were assembled into contigs and potential splicing variants (isotigs). A total of 29,067 isotigs have putative homologues in the non-redundant (nr) protein database from NCBI, and 11,073 of these correspond to distinct D. melanogaster proteins in the RefSeq database. Approximately 5,546 isotigs contain coding sequences that are at least 80% complete and appear to represent B. dorsalis genes. We observed a strong correlation between the completeness of the assembled sequences and the expression intensity of the transcripts. The assembled sequences were also used to identify large numbers of genes potentially belonging to families related to insecticide resistance. A total of 90 P450-, 42 GST- and 37 COE-related genes, representing three major enzyme families involved in insecticide metabolism and resistance, were identified. In addition, 36 isotigs were discovered to contain target site sequences related to four classes of resistance genes. Identified sequence motifs were also analyzed to

  18. Genome-wide gene expression surveys and a transcriptome map in chicken

    NARCIS (Netherlands)

    Nie, H.

    2010-01-01

    The chicken (Gallus gallus) is an important model organism in genetics, developmental biology, immunology, evolutionary research, and agricultural science. The completeness of the draft chicken genome sequence provided new possibilities to study genomic changes during evolution by comparing the chic

  19. High-confidence coding and noncoding transcriptome maps

    Science.gov (United States)

    2017-01-01

    The advent of high-throughput RNA sequencing (RNA-seq) has led to the discovery of unprecedentedly immense transcriptomes encoded by eukaryotic genomes. However, the transcriptome maps are still incomplete partly because they were mostly reconstructed based on RNA-seq reads that lack their orientations (known as unstranded reads) and certain boundary information. Methods to expand the usability of unstranded RNA-seq data by predetermining the orientation of the reads and precisely determining the boundaries of assembled transcripts could significantly benefit the quality of the resulting transcriptome maps. Here, we present a high-performing transcriptome assembly pipeline, called CAFE, that significantly improves the original assemblies, respectively assembled with stranded and/or unstranded RNA-seq data, by orienting unstranded reads using the maximum likelihood estimation and by integrating information about transcription start sites and cleavage and polyadenylation sites. Applying large-scale transcriptomic data comprising 230 billion RNA-seq reads from the ENCODE, Human BodyMap 2.0, The Cancer Genome Atlas, and GTEx projects, CAFE enabled us to predict the directions of about 220 billion unstranded reads, which led to the construction of more accurate transcriptome maps, comparable to the manually curated map, and a comprehensive lncRNA catalog that includes thousands of novel lncRNAs. Our pipeline should not only help to build comprehensive, precise transcriptome maps from complex genomes but also to expand the universe of noncoding genomes. PMID:28396519

  20. A blow to the fly - Lucilia cuprina draft genome and transcriptome to support advances in biology and biotechnology.

    Science.gov (United States)

    Anstead, Clare A; Batterham, Philip; Korhonen, Pasi K; Young, Neil D; Hall, Ross S; Bowles, Vernon M; Richards, Stephen; Scott, Maxwell J; Gasser, Robin B

    2016-01-01

    The blow fly, Lucilia cuprina (Wiedemann, 1830) is a parasitic insect of major global economic importance. Maggots of this fly parasitize the skin of animal hosts, feed on excretions and tissues, and cause severe disease (flystrike or myiasis). Although there has been considerable research on L. cuprina over the years, little is understood about the molecular biology, biochemistry and genetics of this parasitic fly, as well as its relationship with its hosts and the disease that it causes. This situation might change with the recent report of the draft genome and transcriptome of this blow fly, which has given new and global insights into its biology, interactions with the host animal and aspects of insecticide resistance at the molecular level. This genomic resource will likely enable many fundamental and applied research areas in the future. The present article gives a background on L. cuprina and myiasis, a brief account of past and current treatment, prevention and control approaches, and provides a perspective on the impact that the L. cuprina genome should have on future research of this and related parasitic flies, and the design of new and improved interventions for myiasis.

  1. Genome-Wide Host-Pathogen Interaction Unveiled by Transcriptomic Response of Diamondback Moth to Fungal Infection.

    Directory of Open Access Journals (Sweden)

    Zhen-Jian Chu

    Full Text Available Genome-wide insight into insect pest response to infection by Beauveria bassiana (a fungal insect pathogen) is critical for genetic improvement of fungal insecticides but has been poorly explored. We constructed three pairs of transcriptomes of Plutella xylostella larvae at 24, 36 and 48 hours post treatment of infection (hptI) and of control (hptC) for insight into the host-pathogen interaction at the genomic level. There were 2143, 3200 and 2967 host genes differentially expressed at 24, 36 and 48 hptI/hptC, respectively. These infection-responsive genes (~15% of the host genome) were enriched in various immune processes, such as complement and coagulation cascades, protein digestion and absorption, and drug metabolism-cytochrome P450. Fungal penetration into the cuticle and the host defense reaction began at 24 hptI, followed by the most intensive host immune response at 36 hptI and attenuated immunity at 48 hptI. In contrast, 44% of fungal genes were differentially expressed in the course of infection and were enriched in several biological processes, such as antioxidant activity, peroxidase activity and proteolysis. There were 1636 fungal genes co-expressed during 24-48 hptI, including 116 encoding putative secretion proteins. Our results provide novel insights into the insect-pathogen interaction and help to probe the molecular mechanisms involved in fungal infection of this global pest.

  2. Ecological venomics: How genomics, transcriptomics and proteomics can shed new light on the ecology and evolution of venom.

    Science.gov (United States)

    Sunagar, Kartik; Morgenstern, David; Reitzel, Adam M; Moran, Yehu

    2016-03-01

    Animal venom is a complex cocktail of bioactive chemicals that traditionally drew interest mostly from biochemists and pharmacologists. However, in recent years the evolutionary and ecological importance of venom has been recognized, as this trait has a direct and strong influence on interactions between species. Moreover, venom content can be modulated by environmental factors. Like many other fields of biology, venom research has been revolutionized in recent years by the introduction of systems biology approaches, i.e., genomics, transcriptomics and proteomics. The employment of these methods in venom research is known as 'venomics'. In this review we describe the history and recent advancements of venomics and discuss how they are employed in studying venom in general and in particular in the context of evolutionary ecology. We also discuss the pitfalls and challenges of venomics and what the future may hold for this emerging scientific field.

  3. Genome-wide expression profiling of the transcriptomes of four Paulownia tomentosa accessions in response to drought.

    Science.gov (United States)

    Dong, Yanpeng; Fan, Guoqiang; Deng, Minjie; Xu, Enkai; Zhao, Zhenli

    2014-10-01

    Paulownia tomentosa is an important foundation forest tree species in semiarid areas. The lack of genetic information hinders research into the mechanisms involved in its response to abiotic stresses. Here, short-read sequencing technology (Illumina) was used to assemble the transcriptome of P. tomentosa de novo. A total of 99,218 unigenes with a mean length of 949 nucleotides were assembled. Of these, 68,295 unigenes were selected and the functions of their products were predicted using Clusters of Orthologous Groups, Gene Ontology, and Kyoto Encyclopedia of Genes and Genomes annotations. Afterwards, hundreds of genes involved in drought response were identified. Twelve putative drought response genes were analyzed by quantitative real-time polymerase chain reaction. This study provides a dataset of genes and inherent biochemical pathways, which will help in understanding the mechanisms of the water-deficit response in P. tomentosa. To our knowledge, this is the first study to highlight the genetic makeup of P. tomentosa.

  4. The Discovery of Novel Genomic, Transcriptomic, and Proteomic Biomarkers in Cardiovascular and Peripheral Vascular Disease: The State of the Art

    Directory of Open Access Journals (Sweden)

    Stefano de Franciscis

    2016-01-01

    Full Text Available Cardiovascular disease (CD) and peripheral vascular disease (PVD) are leading causes of mortality and morbidity in western countries and are also responsible for a huge burden in terms of disability, functional decline, and healthcare costs. Biomarkers are measurable biological elements that reflect particular physiological or pathological states or predisposition towards diseases, and they are currently widely studied in medicine and especially in CD. In this context, biomarkers can also be used to assess the severity or the evolution of several diseases, as well as the effectiveness of particular therapies. Genomics, transcriptomics, and proteomics have opened new windows on disease phenomena and may permit, in the near future, the effective development of novel diagnostic and prognostic medicine in order to better prevent or treat CD. This review will consider the current evidence of novel biomarkers with clear implications for the improvement of risk assessment, prevention strategies, and medical decision making in the field of CD.

  5. Transcriptome sequencing and genome-wide association analyses reveal lysosomal function and actin cytoskeleton remodeling in schizophrenia and bipolar disorder.

    Science.gov (United States)

    Zhao, Z; Xu, J; Chen, J; Kim, S; Reimers, M; Bacanu, S-A; Yu, H; Liu, C; Sun, J; Wang, Q; Jia, P; Xu, F; Zhang, Y; Kendler, K S; Peng, Z; Chen, X

    2015-05-01

    Schizophrenia (SCZ) and bipolar disorder (BPD) are severe mental disorders with high heritability. Clinicians have long noticed the similarities of clinical symptoms between these disorders. In recent years, accumulating evidence indicates some shared genetic liabilities. However, what is shared remains elusive. In this study, we conducted whole transcriptome analysis of post-mortem brain tissues (cingulate cortex) from SCZ, BPD and control subjects, and identified differentially expressed genes in these disorders. We found 105 and 153 genes differentially expressed in SCZ and BPD, respectively. By comparing the t-test scores, we found that many of the genes differentially expressed in SCZ and BPD are concordant in their expression level (q⩽0.01, 53 genes; q⩽0.05, 213 genes; q⩽0.1, 885 genes). Using genome-wide association data from the Psychiatric Genomics Consortium, we found that these differentially and concordantly expressed genes were enriched in association signals for both SCZ (P [...] genes show concordant expression and association for both SCZ and BPD. Pathway analyses of these genes indicated that they are involved in the lysosome, Fc gamma receptor-mediated phagocytosis, regulation of actin cytoskeleton pathways, along with several cancer pathways. Functional analyses of these genes revealed an interconnected pathway network centered on lysosomal function and the regulation of actin cytoskeleton. These pathways and their interacting network were principally confirmed by an independent transcriptome sequencing data set of the hippocampus. Dysregulation of lysosomal function and cytoskeleton remodeling has direct impacts on endocytosis, phagocytosis, exocytosis, vesicle trafficking, neuronal maturation and migration, neurite outgrowth and synaptic density and plasticity, and different aspects of these processes have been implicated in SCZ and BPD.

  6. Systematic Identification and Assessment of Therapeutic Targets for Breast Cancer Based on Genome-Wide RNA Interference Transcriptomes

    Directory of Open Access Journals (Sweden)

    Yang Liu

    2017-02-01

    Full Text Available With accumulating public omics data, great efforts have been made to characterize the genetic heterogeneity of breast cancer. However, identifying novel targets and selecting the best from the sizeable lists of candidate targets is still a key challenge for targeted therapy, largely owing to the lack of economical, efficient and systematic discovery and assessment to prioritize potential therapeutic targets. Here, we describe an approach that combines computational evaluation with objective, multifaceted assessment to systematically identify and prioritize targets for biological validation and therapeutic exploration. We first established the reference gene expression profiles from the breast cancer cell line MCF7 upon genome-wide RNA interference (RNAi) of a total of 3689 genes, and the breast cancer query signatures using RNA-seq data generated from tissue samples of clinical breast cancer patients in The Cancer Genome Atlas (TCGA). Based on gene set enrichment analysis, we identified a set of 510 genes that, when knocked down, could significantly reverse the transcriptome of the breast cancer state. We then performed a multifaceted assessment of the gene set to prioritize potential targets for gene therapy. We also propose drug repurposing opportunities and identify potentially druggable proteins that have been poorly explored with regard to the discovery of small-molecule modulators. Finally, we obtained a small list of candidate therapeutic targets for four major breast cancer subtypes, i.e., luminal A, luminal B, HER2+ and triple negative breast cancer. This RNAi transcriptome-based approach can be a helpful paradigm for related research to identify and prioritize candidate targets for experimental validation.

  7. Genome-scale engineering for systems and synthetic biology

    Science.gov (United States)

    Esvelt, Kevin M; Wang, Harris H

    2013-01-01

    Genome-modification technologies enable the rational engineering and perturbation of biological systems. Historically, these methods have been limited to gene insertions or mutations at random or at a few pre-defined locations across the genome. The handful of methods capable of targeted gene editing suffered from low efficiencies, significant labor costs, or both. Recent advances have dramatically expanded our ability to engineer cells in a directed and combinatorial manner. Here, we review current technologies and methodologies for genome-scale engineering, discuss the prospects for extending efficient genome modification to new hosts, and explore the implications of continued advances toward the development of flexibly programmable chasses, novel biochemistries, and safer organismal and ecological engineering. PMID:23340847

  8. Genome-scale engineering for systems and synthetic biology.

    Science.gov (United States)

    Esvelt, Kevin M; Wang, Harris H

    2013-01-01

    Genome-modification technologies enable the rational engineering and perturbation of biological systems. Historically, these methods have been limited to gene insertions or mutations at random or at a few pre-defined locations across the genome. The handful of methods capable of targeted gene editing suffered from low efficiencies, significant labor costs, or both. Recent advances have dramatically expanded our ability to engineer cells in a directed and combinatorial manner. Here, we review current technologies and methodologies for genome-scale engineering, discuss the prospects for extending efficient genome modification to new hosts, and explore the implications of continued advances toward the development of flexibly programmable chasses, novel biochemistries, and safer organismal and ecological engineering.

  9. Genome-scale validation of deep-sequencing libraries.

    Directory of Open Access Journals (Sweden)

    Dominic Schmidt

    Full Text Available Chromatin immunoprecipitation followed by high-throughput (HTP) sequencing (ChIP-seq) is a powerful tool to establish protein-DNA interactions genome-wide. The primary limitation of its broad application at present is the often-limited access to sequencers. Here we report a protocol, Mab-seq, that generates genome-scale quality evaluations for nucleic acid libraries intended for deep-sequencing. We show how commercially available genomic microarrays can be used to maximize the efficiency of library creation and quickly generate reliable preliminary data on a chromosomal scale in advance of deep sequencing. We also exploit this technique to compare enriched regions identified using microarrays with those identified by sequencing, demonstrating that they agree on a core set of clearly identified enriched regions, while characterizing the additional enriched regions identifiable using HTP sequencing.

  10. Genome resequencing in Populus: Revealing large-scale genome variation and implications on specialized-trait genomics

    Energy Technology Data Exchange (ETDEWEB)

    Muchero, Wellington [ORNL; Labbe, Jessy L [ORNL; Priya, Ranjan [University of Tennessee, Knoxville (UTK); DiFazio, Steven P [West Virginia University, Morgantown; Tuskan, Gerald A [ORNL

    2014-01-01

    To date, Populus ranks among a few plant species with a complete genome sequence and other highly developed genomic resources. With the first genome sequence among all tree species, Populus has been adopted as a suitable model organism for genomic studies in trees. However, far from being just a model species, Populus is a key renewable economic resource that plays a significant role in providing raw materials for the biofuel and pulp and paper industries. Therefore, aside from leading frontiers of basic tree molecular biology and ecological research, Populus leads frontiers in addressing global economic challenges related to fuel and fiber production. The latter fact suggests that research aimed at improving the quality and quantity of Populus as a raw material will likely drive the pursuit of more targeted and deeper research in order to unlock the economic potential tied to the molecular biology processes that drive this tree species. Advances in genome sequence-driven technologies, such as resequencing of individual genotypes, which in turn facilitates large-scale SNP discovery and identification of large-scale polymorphisms, are key determinants of future success in these initiatives. In this treatise we discuss implications of genome sequence-enabled technologies for Populus genomic and genetic studies of complex and specialized traits.

  11. Transcriptome Analysis of Male White Wax Scale Pupae

    Institute of Scientific and Technical Information of China (English)

    于淑惠; 亓倩; 孙涛; 王雪庆; 杨璞; 冯颖

    2016-01-01

    [Objective] The pupal stage is important for the development and mating of the male white wax scale insect, Ericerus pela, but gene transcription in its pupae has not yet been analyzed. This study aimed to obtain transcriptomic data and characterize the transcriptome of the male pupal stage. [Method] High-throughput sequencing was performed on an Illumina HiSeq 2000, and bioinformatic analysis was carried out subsequently. Three heat shock protein genes (hsps) were selected for expression profile analysis across different developmental stages using quantitative real-time PCR (qRT-PCR). [Result] Transcriptome sequencing generated 63,957 unigenes, 63,272 open reading frames (ORFs), and 9,561 simple sequence repeat (SSR) markers. The average length of the unigenes was 674 bp. A total of 14,327 unigenes were annotated in different databases. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses showed that the putative functions of many genes coincided with the physiological characteristics of the pupal stage. [Conclusion] The transcriptome data and analysis lay a foundation for further gene functional research and proteomic analysis of E. pela.

  12. A flexible whole-genome microarray for transcriptomics in three-spine stickleback (Gasterosteus aculeatus)

    Directory of Open Access Journals (Sweden)

    Primmer Craig R

    2009-09-01

    Full Text Available Background: The use of microarray technology for describing changes in mRNA expression to address ecological and evolutionary questions is becoming increasingly popular. Since three-spine stickleback are an important ecological and evolutionary model species as well as an emerging model for eco-toxicology, the ability to have a functional and flexible microarray platform for transcriptome studies will greatly enhance the research potential in these areas. Results: We designed 43,392 unique oligonucleotide probes representing 19,274 genes (93% of the estimated total gene number), and tested the hybridization performance of both DNA and RNA from different populations to determine the efficacy of probe design for transcriptome analysis using the Agilent array platform. The majority of probes were functional as evidenced by the DNA hybridization success, and 30,946 probes (14,615 genes) had a signal that was significantly above background for RNA isolated from liver tissue. Genes identified as being expressed in liver tissue were grouped into functional categories for each of the three Gene Ontology groups: biological process, molecular function, and cellular component. As expected, the highest proportions of functional categories belonged to those associated with metabolic functions: metabolic process, binding, catabolism, and organelles. Conclusion: The probe and microarray design presented here provides an important step facilitating transcriptomics research for this important research organism by providing a set of over 43,000 probes whose hybridization success and specificity to liver expression have been demonstrated. Probes can easily be added or removed from the current design to tailor the array to specific experiments, and additional flexibility lies in the ability to perform either one-color or two-color hybridizations.

  13. Symbiodinium transcriptomes: genome insights into the dinoflagellate symbionts of reef-building corals.

    KAUST Repository

    Bayer, Till

    2012-04-18

    Dinoflagellates are unicellular algae that are ubiquitously abundant in aquatic environments. Species of the genus Symbiodinium form symbiotic relationships with reef-building corals and other marine invertebrates. Despite their ecological importance, little is known about the genetics of dinoflagellates in general and Symbiodinium in particular. Here, we used 454 sequencing to generate transcriptome data from two Symbiodinium species from different clades (clade A and clade B). With more than 56,000 assembled sequences per species, these data represent the largest transcriptomic resource for dinoflagellates to date. Our results corroborate previous observations that dinoflagellates possess the complete nucleosome machinery. We found a complete set of core histones as well as several H3 variants and H2A.Z in one species. Furthermore, transcriptome analysis points toward a low number of transcription factors in Symbiodinium spp. that also differ in the distribution of DNA-binding domains relative to other eukaryotes. In particular, the cold shock domain was predominant among transcription factors. Additionally, we found a high number of antioxidative genes in comparison to non-symbiotic but evolutionarily related organisms. These findings might be of relevance in the context of the role that Symbiodinium spp. play as coral symbionts. Our data represent the most comprehensive dinoflagellate EST data set to date. This study provides a comprehensive resource to further analyze the genetic makeup, metabolic capacities, and gene repertoire of Symbiodinium and dinoflagellates. Overall, our findings indicate that Symbiodinium possesses some unique characteristics; in particular, the transcriptional regulation in Symbiodinium may differ from the currently known mechanisms of eukaryotic gene regulation.

  14. Genome-wide Annotation, Identification, and Global Transcriptomic Analysis of Regulatory or Small RNA Gene Expression in Staphylococcus aureus

    Directory of Open Access Journals (Sweden)

    Ronan K. Carroll

    2016-02-01

    Full Text Available In Staphylococcus aureus, hundreds of small regulatory or small RNAs (sRNAs) have been identified, yet this class of molecule remains poorly understood and severely understudied. sRNA genes are typically absent from genome annotation files, and as a consequence, their existence is often overlooked, particularly in global transcriptomic studies. To facilitate improved detection and analysis of sRNAs in S. aureus, we generated updated GenBank files for three commonly used S. aureus strains (MRSA252, NCTC 8325, and USA300), in which we added annotations for >260 previously identified sRNAs. These files, the first to include genome-wide annotation of sRNAs in S. aureus, were then used as a foundation to identify novel sRNAs in the community-associated methicillin-resistant strain USA300. This analysis led to the discovery of 39 previously unidentified sRNAs. Investigating the genomic loci of the newly identified sRNAs revealed a surprising degree of inconsistency in genome annotation in S. aureus, which may be hindering the analysis and functional exploration of these elements. Finally, using our newly created annotation files as a reference, we perform a global analysis of sRNA gene expression in S. aureus and demonstrate that the newly identified tsr25 is the most highly upregulated sRNA in human serum. This study provides an invaluable resource to the S. aureus research community in the form of our newly generated annotation files, while at the same time presenting the first examination of differential sRNA expression in pathophysiologically relevant conditions.

  15. From genes to milk: Genomic organization and epigenetic regulation of the mammary transcriptome

    Science.gov (United States)

    Even in genomes lacking operons, a gene's position in the genome influences its potential for expression. The mechanisms by which adjacent genes are co-expressed are still not completely understood. Using lactation and the mammary gland as a model system, we explore the hypothesis that chromatin sta...

  16. Comparative genomics and transcriptomics of Escherichia coli isolates carrying virulence factors of both enteropathogenic and enterotoxigenic E. coli.

    Science.gov (United States)

    Hazen, Tracy H; Michalski, Jane; Luo, Qingwei; Shetty, Amol C; Daugherty, Sean C; Fleckenstein, James M; Rasko, David A

    2017-06-14

    Escherichia coli that are capable of causing human disease are often classified into pathogenic variants (pathovars) based on their virulence gene content. However, disease-associated hybrid E. coli containing unique combinations of multiple canonical virulence factors have also been described. Such was the case of the E. coli O104:H4 outbreak in 2011, which caused significant morbidity and mortality. Among the pathovars of diarrheagenic E. coli that cause significant human disease are the enteropathogenic E. coli (EPEC) and enterotoxigenic E. coli (ETEC). In the current study we use comparative genomics, transcriptomics, and functional studies to characterize isolates that contain virulence factors of both EPEC and ETEC. Based on phylogenomic analysis, these hybrid isolates are more genomically related to EPEC, but appear to have acquired ETEC virulence genes. Global transcriptional analysis using RNA sequencing demonstrated that the EPEC and ETEC virulence genes of these hybrid isolates were differentially expressed under virulence-inducing laboratory conditions, similar to reference isolates. Immunoblot assays further verified that the virulence gene products were produced and that the T3SS effector EspB of EPEC and the heat-labile toxin of ETEC were secreted. These findings document the existence and virulence potential of an E. coli pathovar hybrid that blurs the distinction between E. coli pathovars.

  17. The transcriptome of the reference potato genome Solanum tuberosum Group Phureja clone DM1-3 516R44.

    Science.gov (United States)

    Massa, Alicia N; Childs, Kevin L; Lin, Haining; Bryan, Glenn J; Giuliano, Giovanni; Buell, C Robin

    2011-01-01

    Advances in molecular breeding in potato have been limited by its complex biological system, which includes vegetative propagation, autotetraploidy, and extreme heterozygosity. The availability of the potato genome and accompanying gene complement with corresponding gene structure, location, and functional annotation are powerful resources for understanding this complex plant and advancing molecular breeding efforts. Here, we report a reference for the potato transcriptome using 32 tissues and growth conditions from the doubled monoploid Solanum tuberosum Group Phureja clone DM1-3 516R44 for which a genome sequence is available. Analysis of greater than 550 million RNA-Seq reads permitted the detection and quantification of expression levels of over 22,000 genes. Hierarchical clustering and principal component analyses captured the biological variability that accounts for gene expression differences among tissues suggesting tissue-specific gene expression, and genes with tissue or condition restricted expression. Using gene co-expression network analysis, we identified 18 gene modules that represent tissue-specific transcriptional networks of major potato organs and developmental stages. This information provides a powerful resource for potato research as well as studies on other members of the Solanaceae family.

  18. The Transcriptome of the Reference Potato Genome Solanum tuberosum Group Phureja Clone DM1-3 516R44

    Science.gov (United States)

    Massa, Alicia N.; Childs, Kevin L.; Lin, Haining; Bryan, Glenn J.; Giuliano, Giovanni; Buell, C. Robin

    2011-01-01

    Advances in molecular breeding in potato have been limited by its complex biological system, which includes vegetative propagation, autotetraploidy, and extreme heterozygosity. The availability of the potato genome and accompanying gene complement with corresponding gene structure, location, and functional annotation are powerful resources for understanding this complex plant and advancing molecular breeding efforts. Here, we report a reference for the potato transcriptome using 32 tissues and growth conditions from the doubled monoploid Solanum tuberosum Group Phureja clone DM1-3 516R44 for which a genome sequence is available. Analysis of greater than 550 million RNA-Seq reads permitted the detection and quantification of expression levels of over 22,000 genes. Hierarchical clustering and principal component analyses captured the biological variability that accounts for gene expression differences among tissues suggesting tissue-specific gene expression, and genes with tissue or condition restricted expression. Using gene co-expression network analysis, we identified 18 gene modules that represent tissue-specific transcriptional networks of major potato organs and developmental stages. This information provides a powerful resource for potato research as well as studies on other members of the Solanaceae family. PMID:22046362

  19. Integration of transcriptomic and genomic data suggests candidate mechanisms for APOE4-mediated pathogenic action in Alzheimer’s disease

    Science.gov (United States)

    Caberlotto, Laura; Marchetti, Luca; Lauria, Mario; Scotti, Marco; Parolo, Silvia

    2016-09-01

    Among the genetic factors known to increase the risk of late onset Alzheimer’s disease (AD), the presence of the apolipoprotein e4 (APOE4) allele has been recognized as the one with the strongest effect. However, despite decades of research, the pathogenic role of APOE4 in Alzheimer’s disease has not yet been clearly elucidated. In order to investigate the pathogenic action of APOE4, we applied a systems biology approach to the analysis of transcriptomic and genomic data of APOE44 vs. APOE33 allele carriers affected by Alzheimer’s disease. Network analysis combined with a novel technique for biomarker computation allowed the identification of an alteration in aging-associated processes such as inflammation, oxidative stress and metabolic pathways, indicating that APOE4 possibly accelerates pathological processes physiologically induced by aging. Subsequent integration with genomic data indicates that the Notch pathway could be the nodal molecular mechanism altered in APOE44 allele carriers with Alzheimer’s disease. Interestingly, PSEN1 and APP, genes whose mutations are known to be linked to early onset Alzheimer’s disease, are closely linked to this pathway. In conclusion, the role of APOE4 in inflammation and oxidation through the Notch signaling pathway could be crucial in elucidating the risk factors of Alzheimer’s disease.

  20. Quantitative RNA-Seq analysis in non-model species: assessing transcriptome assemblies as a scaffold and the utility of evolutionary divergent genomic reference species

    Directory of Open Access Journals (Sweden)

    Hornett Emily A

    2012-08-01

    Full Text Available Background: How well does RNA-Seq data perform for quantitative whole gene expression analysis in the absence of a genome? This is one unanswered question facing the rapidly growing number of researchers studying non-model species. Using Homo sapiens data and resources, we compared the direct mapping of sequencing reads to predicted genes from the genome with mapping to de novo transcriptomes assembled from RNA-Seq data. Gene coverage and expression analysis was further investigated in the non-model context by using increasingly divergent genomic reference species to group assembled contigs by unique genes. Results: Eight transcriptome sets, composed of varying amounts of Illumina and 454 data, were assembled and assessed. Hybrid 454/Illumina assemblies had the highest transcriptome and individual gene coverage. Quantitative whole gene expression levels were highly similar between using a de novo hybrid assembly and the predicted genes as a scaffold, although mapping to the de novo transcriptome assembly provided data on fewer genes. Using non-target species as reference scaffolds does result in some loss of sequence and expression data, and bias and error increase with evolutionary distance. However, within a 100 million year window these effect sizes are relatively small. Conclusions: Predicted gene sets from sequenced genomes of related species can provide a powerful method for grouping RNA-Seq reads and annotating contigs. Gene expression results can be produced that are similar to results obtained using gene models derived from a high quality genome, though biased towards conserved genes. Our results demonstrate the power and limitations of conducting RNA-Seq in non-model species.

  1. Transcriptome kinetics is governed by a genome-wide coupling of mRNA production and degradation: a role for RNA Pol II.

    Directory of Open Access Journals (Sweden)

    Ophir Shalem

    2011-09-01

    Full Text Available Transcriptome dynamics is governed by two opposing processes, mRNA production and degradation. Recent studies found that changes in these processes are frequently coordinated and that the relationship between them shapes transcriptome kinetics. Specifically, when transcription changes are counteracted by changes in mRNA stability, transient fast-relaxing transcriptome kinetics is observed. A possible molecular mechanism underlying such coordinated regulation might lie in two RNA polymerase II (Pol II) subunits, Rpb4 and Rpb7, which are recruited to mRNAs during transcription and later affect their degradation in the cytoplasm. Here we used a yeast strain carrying a mutant Pol II which poorly recruits these subunits. We show that this mutant strain is impaired in its ability to modulate mRNA stability in response to stress. The normal negative coordinated regulation is lost in the mutant, resulting in abnormal transcriptome profiles both with respect to magnitude and kinetics of responses. These results reveal an important role for Pol II in the regulation of both mRNA synthesis and degradation, and also in coordinating between them. We propose a simple model for production-degradation coupling that accounts for our observations. The model shows how a simple manipulation of the rates of co-transcriptional mRNA imprinting by Pol II may govern genome-wide transcriptome kinetics in response to environmental changes.

  2. Genome-scale metabolic representation of Amycolatopsis balhimycina

    DEFF Research Database (Denmark)

    Vongsangnak, Wanwipa; Figueiredo, L. F.; Förster, Jochen

    2012-01-01

    EC numbers, 647 metabolites and 1,363 metabolic reactions. During the analysis of the metabolic model, linear, quadratic and evolutionary programming algorithms using flux balance analysis (FBA), minimization of metabolic adjustment (MOMA), and OptGene, respectively were applied as well as phenotypic...... biosynthesis in Amycolatopsis balhimycina. The balhimycin yield obtained by A. balhimycina is, however, low and there is therefore a need to improve balhimycin production. In this study, we performed genome sequencing, assembly and annotation analysis of A. balhimycina and further used these annotated data...... to reconstruct a genome‐scale metabolic model for the organism. Here we generated an almost complete A. balhimycina genome sequence comprising 10,562,587 base pairs assembled into 2,153 contigs. The high GC‐genome (∼69%) includes 8,585 open reading frames (ORFs). We used our integrative toolbox called SEQTOR...

  3. Unraveling Fungal Radiation Resistance Regulatory Networks through the Genome-Wide Transcriptome and Genetic Analyses of Cryptococcus neoformans

    Directory of Open Access Journals (Sweden)

    Kwang-Woo Jung

    2016-11-01

    Full Text Available The basidiomycetous fungus Cryptococcus neoformans has been known to be highly radiation resistant and has been found in fatal radioactive environments such as the damaged nuclear reactor at Chernobyl. To elucidate the mechanisms underlying the radiation resistance phenotype of C. neoformans, we identified genes affected by gamma radiation through genome-wide transcriptome analysis and characterized their functions. We found that genes involved in DNA damage repair systems were upregulated in response to gamma radiation. Particularly, deletion of recombinase RAD51 and two DNA-dependent ATPase genes, RAD54 and RDH54, increased cellular susceptibility to both gamma radiation and DNA-damaging agents. A variety of oxidative stress response genes were also upregulated. Among them, sulfiredoxin contributed to gamma radiation resistance in a peroxiredoxin/thioredoxin-independent manner. Furthermore, we found that genes involved in molecular chaperone expression, ubiquitination systems, and autophagy were induced, whereas genes involved in the biosynthesis of proteins and fatty acids/sterols were downregulated. Most importantly, we discovered a number of novel C. neoformans genes, the expression of which was modulated by gamma radiation exposure, and their deletion rendered cells susceptible to gamma radiation exposure, as well as DNA damage insults. Among these genes, we found that a unique transcription factor containing the basic leucine zipper domain, named Bdr1, served as a regulator of the gamma radiation resistance of C. neoformans by controlling expression of DNA repair genes, and its expression was regulated by the evolutionarily conserved DNA damage response protein kinase Rad53. Taken together, the current transcriptome and functional analyses contribute to the understanding of the unique molecular mechanism of the radiation-resistant fungus C. neoformans.

  4. Lifestyle transitions in plant pathogenic Colletotrichum fungi deciphered by genome and transcriptome analyses

    NARCIS (Netherlands)

    O'Connell, R.J.; Thon, M.R.; Hacquard, S.; Amyotte, S.G.; Kleemann, J.; Torres, M.F.; Damm, U.; Buiate, E.A.; Epstein, L.; Alkan, N.; Altmuller, J.; Alvarado-Balderrama, L.; Bauser, C.A.; Becker, C.; Birren, B.W.; Chen, Z.; Choi, J.; Crouch, J.A.; Duvick, J.P.; Farman, M.A.; Gan, P.; Heiman, D.; Henrissat, B.; Howard, R.J.; Kabbage, M.; Koch, C.; Kracher, B.; Kubo, Y.; Law, A.D.; Lebrun, M.-H.; Lee, Y.-H.; Miyara, I.; Moore, N.; Neumann, U.; Nordstrom, K.; Panaccione, D.G.; Panstruga, R.; Place, M.; Proctor, R.H.; Prusky, D.; Rech, G.; Reinhardt, R.; Rollins, J.A.; Rounsley, S.; Schardl, C.L.; Schwartz, D.C.; Shenoy, N.; Shirasu, K.; Sikhakolli, U.R.; Stuber, K.; Sukno, S.A.; Sweigard, J.A.; Takano, Y.; Takahara, H.; Trail, F.; Does, H.C.; Voll, L.M.; Will, I.; Young, S.; Zeng, Q.; Zhang, Jingze; Zhou, S.; Dickman, M.B.; Schulze-Lefert, P.; Verloren van Themaat, E.; Ma, L.-J.; Vaillancourt, L.J.

    2012-01-01

    Colletotrichum species are fungal pathogens that devastate crop plants worldwide. Host infection involves the differentiation of specialized cell types that are associated with penetration, growth inside living host cells (biotrophy) and tissue destruction (necrotrophy). We report here genome and

  6. Genome, transcriptome, and secretome analysis of wood decay fungus Postia placenta supports unique mechanisms of lignocellulose conversion

    Energy Technology Data Exchange (ETDEWEB)

    Martinez, Diego [Los Alamos National Laboratory; Challacombe, Jean F [Los Alamos National Laboratory; Misra, Monica [Los Alamos National Laboratory; Xie, Gary [Los Alamos National Laboratory; Brettin, Thomas [Los Alamos National Laboratory; Morgenstern, Ingo [CLARK UNIV; Hibbett, David [CLARK UNIV.; Schmoll, Monika [UNIV WIEN; Kubicek, Christian P [UNIV WIEN; Ferreira, Patricia [CIB, CSIC, MADRID; Ruiz - Duenase, Francisco J [CIB, CSIC, MADRID; Martinez, Angel T [CIB, CSIC, MADRID; Kersten, Phil [FOREST PRODUCTS LAB; Hammel, Kenneth E [FOREST PRODUCTS LAB; Vanden Wymelenberg, Amber [U. WISCONSIN; Gaskell, Jill [FOREST PRODUCTS LAB; Lindquist, Erika [DOE JGI; Sabati, Grzegorz [U. WISCONSIN; Bondurant, Sandra S [U. WISCONSIN; Larrondo, Luis F [U. CATHOLICA DE CHILE; Canessa, Paulo [U. CATHOLICA DE CHILE; Vicunna, Rafael [U. CATHOLICA DE CHILE; Yadavk, Jagiit [U. CINCINATTI; Doddapaneni, Harshavardhan [U. CINCINATTI; Subramaniank, Venkataramanan [U. CINCINATTI; Pisabarro, Antonio G [PUBLIC U. NAVARRE; Lavin, Jose L [PUBLIC U. NAVARRE; Oguiza, Jose A [PUBLIC U. NAVARRE; Master, Emma [U. TORONTO; Henrissat, Bernard [CNRS, MARSEILLE; Coutinho, Pedro M [CNRS, MARSEILLE; Harris, Paul [NOVOZYMES, INC.; Magnuson, Jon K [PNNL; Baker, Scott [PNNL; Bruno, Kenneth [PNNL; Kenealy, William [MASCOMA, INC.; Hoegger, Patrik J [GEORG-AUGUST-U.; Kues, Ursula [GEORG-AUGUST-U; Ramaiva, Preethi [NOVOZYMES, INC.; Lucas, Susan [DOE JGI; Salamov, Asaf [DOE JGI; Shapiro, Harris [DOE JGI; Tuh, Hank [DOE JGI; Chee, Christine L [UNM; Teter, Sarah [NOVOZYMES, INC.; Yaver, Debbie [NOVOZYMES, INC.; James, Tim [MCMASTER U.; Mokrejs, Martin [CHARLES U.; Pospisek, Martin [CHARLES U.; Grigoriev, Igor [DOE JGI; Rokhsar, Dan [DOE JGI; Berka, Randy [NOVOZYMES; Cullen, Dan [FOREST PRODUCTS LAB

    2008-01-01

    Brown-rot fungi such as Postia placenta are common inhabitants of forest ecosystems and are also largely responsible for the destructive decay of wooden structures. Rapid depolymerization of cellulose is a distinguishing feature of brown-rot, but the biochemical mechanisms and underlying genetics are poorly understood. Systematic examination of the P. placenta genome, transcriptome and secretome revealed unique extracellular enzyme systems, including an unusual repertoire of extracellular glycoside hydrolases. Genes encoding exocellobiohydrolases and cellulose-binding domains, typical of cellulolytic microbes, are absent in this efficient cellulose-degrading fungus. When P. placenta was grown in medium containing cellulose as sole carbon source, transcripts corresponding to many hemicellulases and to a single putative β-1-4 endoglucanase were expressed at high levels relative to glucose grown cultures. These transcript profiles were confirmed by direct identification of peptides by liquid chromatography-tandem mass spectrometry (LC·MS/MS). Also upregulated during growth on cellulose medium were putative iron reductases, quinone reductase, and structurally divergent oxidases potentially involved in extracellular generation of Fe(II) and H2O2. These observations are consistent with a biodegradative role for Fenton chemistry in which Fe(II) and H2O2 react to form hydroxyl radicals, highly reactive oxidants capable of depolymerizing cellulose. The P. placenta genome resources provide unparalleled opportunities for investigating such unusual mechanisms of cellulose conversion. More broadly, the genome offers insight into the diversification of lignocellulose degrading mechanisms in fungi. Comparisons to the closely related white-rot fungus Phanerochaete chrysosporium support an evolutionary shift from white-rot to brown-rot during which the capacity for efficient depolymerization of lignin was lost.

  7. Genome Sequence and Transcriptome Analysis of Meat-Spoilage-Associated Lactic Acid Bacterium Lactococcus piscium MKFS47.

    Science.gov (United States)

    Andreevskaya, Margarita; Johansson, Per; Laine, Pia; Smolander, Olli-Pekka; Sonck, Matti; Rahkila, Riitta; Jääskeläinen, Elina; Paulin, Lars; Auvinen, Petri; Björkroth, Johanna

    2015-06-01

    Lactococcus piscium is a psychrotrophic lactic acid bacterium and is known to be one of the predominant species within spoilage microbial communities in cold-stored packaged foods, particularly in meat products. Its presence in such products has been associated with the formation of buttery and sour off-odors. Nevertheless, the spoilage potential of L. piscium varies dramatically depending on the strain and growth conditions. Additional knowledge about the genome is required to explain such variation, understand its phylogeny, and study gene functions. Here, we present the complete and annotated genomic sequence of L. piscium MKFS47, combined with a time course analysis of the glucose catabolism-based transcriptome. In addition, a comparative analysis of gene contents was done for L. piscium MKFS47 and 29 other lactococci, revealing three distinct clades within the genus. The genome of L. piscium MKFS47 consists of one chromosome, carrying 2,289 genes, and two plasmids. A wide range of carbohydrates was predicted to be fermented, and growth on glycerol was observed. Both carbohydrate and glycerol catabolic pathways were significantly upregulated in the course of time as a result of glucose exhaustion. At the same time, differential expression of the pyruvate utilization pathways, implicated in the formation of spoilage substances, switched the metabolism toward a heterofermentative mode. In agreement with data from previous inoculation studies, L. piscium MKFS47 was identified as an efficient producer of buttery-odor compounds under aerobic conditions. Finally, genes and pathways that may contribute to increased survival in meat environments were considered. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  8. Exploring the shallow end; estimating information content in transcriptomics studies

    Directory of Open Access Journals (Sweden)

    Daniel J Kliebenstein

    2012-09-01

    Full Text Available Transcriptomics is a major platform to study organismal biology. The advent of new parallel sequencing technologies has opened up a new avenue of transcriptomics with ever deeper sequencing to identify and quantify each and every transcript in a sample. However, this may not be the best usage of the parallel sequencing technology for all transcriptomics experiments. I utilized the Shannon Entropy approach to estimate the information contained within a transcriptomics experiment and tested the ability of shallow RNAseq to capture the majority of this information. This analysis showed that it was possible to capture nearly all of the network or genomic information present in a variety of transcriptomics experiments using a subset of the most abundant 5000 transcripts or fewer within any given sample. Thus, it appears that it should be possible and affordable to conduct large scale factorial analysis with a high degree of replication using parallel sequencing technologies.
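The Shannon entropy idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the author's exact procedure, so everything here is an assumption: the function names `shannon_entropy` and `top_n_entropy_fraction` are hypothetical, the counts are made-up toy values, and "information captured" is taken to mean the share of the full-sample entropy contributed by the n most abundant transcripts.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy, in bits, of the abundance distribution given by counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive sum")
    entropy = 0.0
    for c in counts:
        if c > 0:  # zero-count transcripts contribute nothing
            p = c / total
            entropy -= p * math.log2(p)
    return entropy


def top_n_entropy_fraction(counts, n):
    """Share of the full entropy contributed by the n most abundant transcripts.

    Probabilities are computed against the full-sample total, so the
    result lies between 0 and 1 and never decreases as n grows.
    This is an illustrative definition, not the cited author's measure.
    """
    total = sum(counts)
    full = shannon_entropy(counts)
    if full == 0.0:
        return 1.0  # a single-transcript sample has no entropy to lose
    top = sorted(counts, reverse=True)[:n]
    partial = -sum((c / total) * math.log2(c / total) for c in top if c > 0)
    return partial / full


if __name__ == "__main__":
    toy_counts = [8, 4, 2, 1, 1]  # hypothetical read counts per transcript
    for n in range(1, len(toy_counts) + 1):
        print(n, round(top_n_entropy_fraction(toy_counts, n), 3))
```

On real data, `counts` would be per-transcript read counts from one sample, and n would be swept up to a cutoff such as the 5000 most abundant transcripts mentioned in the abstract.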

  9. An integrative genomic and transcriptomic analysis reveals potential targets associated with cell proliferation in uterine leiomyomas

    DEFF Research Database (Denmark)

    Cirilo, Priscila Daniele Ramos; Marchi, Fábio Albuquerque; Barros Filho, Mateus de Camargo

    2013-01-01

    BACKGROUND: Uterine Leiomyomas (ULs) are the most common benign tumours affecting women of reproductive age. ULs represent a major problem in public health, as they are the main indication for hysterectomy. Approximately 40-50% of ULs have non-random cytogenetic abnormalities, and half of ULs may......: The integrated analysis identified the top 30 significant genes (Pindicated a strong association between FANCA and BRCA1. Functional in silico analysis revealed target molecules for drugs involved in cell...... and transcriptomic approach indicated that FGFR1 and IGFBP5 amplification, as well as the consequent up-regulation of the protein products, plays an important role in the aetiology of ULs and thus provides data for potential drug therapies development to target genes associated with cellular proliferation in ULs....

  10. Autism spectrum disorders: Integration of the genome, transcriptome and the environment.

    Science.gov (United States)

    Vijayakumar, N Thushara; Judy, M V

    2016-05-15

    Autism spectrum disorders denote a series of lifelong neurodevelopmental conditions characterized by an impaired social communication profile and often repetitive, stereotyped behavior. Recent years have seen the complex genetic architecture of the disease being progressively unraveled with advancements in gene finding technology and next generation sequencing methods. However, a complete elucidation of the molecular mechanisms behind autism is necessary for potential diagnostic and therapeutic applications. A multidisciplinary approach should be adopted where the focus is not only on the 'genetics' of autism but also on the combinational roles of epigenetics, transcriptomics, immune system disruption and environmental factors that could all influence the etiopathogenesis of the disease. ASD is a clinically heterogeneous disorder with great genetic complexity; only through an integrated multidimensional effort can modern autism research progress further.

  11. Genome-wide transcriptomic alterations induced by ethanol treatment in human dental pulp stem cells (DPSCs)

    Directory of Open Access Journals (Sweden)

    Omar Khalid

    2014-12-01

    Full Text Available Human dental pulp stem cells (DPSCs) isolated from adult dental pulp are multipotent mesenchymal stem cells that can be directed to differentiate into osteogenic/odontogenic cells and also trans-differentiate into neuronal cells. The utility of DPSCs has been explored in odontogenic differentiation for tooth regeneration. Alcohol abuse appears to lead to periodontal disease, tooth decay and mouth sores that are potentially precancerous. Persons who abuse alcohol are at high risk of having seriously deteriorated teeth and gums and compromised oral health in general. It is currently unknown if alcohol exposure has any impact on adult stem cell maintenance, stem cell fate determination and plasticity, and the stem cell niche environment. Here we provide detailed experimental methods, analysis and information associated with our data deposited into Gene Expression Omnibus (GEO) under GSE57255. Our data capture the transcriptomic changes induced by EtOH treatment of DPSCs at the 24-hour and 48-hour time points.

  12. Power Laws, Scale-Free Networks and Genome Biology

    CERN Document Server

    Koonin, Eugene V; Karev, Georgy P

    2006-01-01

    Power Laws, Scale-free Networks and Genome Biology deals with crucial aspects of the theoretical foundations of systems biology, namely power law distributions and scale-free networks which have emerged as the hallmarks of biological organization in the post-genomic era. The chapters in the book not only describe the interesting mathematical properties of biological networks but also move beyond phenomenology, toward models of evolution capable of explaining the emergence of these features. The collection of chapters, contributed by both physicists and biologists, strives to address the problems in this field in a rigorous but not excessively mathematical manner and to represent different viewpoints, which is crucial in this emerging discipline. Each chapter includes, in addition to technical descriptions of properties of biological networks and evolutionary models, a more general and accessible introduction to the respective problems. Most chapters emphasize the potential of theoretical systems biology for disco...

  13. Deep transcriptome sequencing of Pecten maximus hemocytes: a genomic resource for bivalve immunology.

    Science.gov (United States)

    Pauletto, Marianna; Milan, Massimo; Moreira, Rebeca; Novoa, Beatriz; Figueras, Antonio; Babbucci, Massimiliano; Patarnello, Tomaso; Bargelloni, Luca

    2014-03-01

    Pecten maximus, the king scallop, is a bivalve species with important commercial value for both fisheries and aquaculture, traditionally consumed in several European countries. Major problems in larval rearing, however, still limit hatchery-based seed production. High mortalities during early larval stages, likely related to bacterial pathogens, represent the most relevant bottleneck. To address this issue, understanding host defense mechanisms against microbes is extremely important. In this study, next-generation RNA sequencing was carried out on scallop hemocytes. To enrich for immune-related transcripts, cDNA libraries from hemocytes challenged in vivo with inactivated Vibrio anguillarum and in vitro with pathogen-associated molecular patterns, as well as unchallenged controls, were sequenced, yielding 216,444,674 sequence reads. De novo assembly of the scallop hemocyte transcriptome consisted of 73,732 contigs (31% annotated). A total of 934 contigs encoded proteins with a known immune function, grouped into several functional categories. Particular attention was paid to Toll-like receptors (TLRs), a family of pattern recognition receptors (PRRs) involved in non-self recognition. Through mining the scallop hemocyte transcriptome, at least four TLRs could be identified. The organization of canonical TLR domains demonstrated that single cysteine cluster and multiple cysteine cluster TLRs co-exist in this species. In addition, preliminary data concerning their mRNA level following bacterial challenge suggested that different members of this family could exhibit opposite responses to pathogenic stimuli. Finally, a global analysis of differential expression comparing gene-expression levels in in vitro and in vivo stimulated hemocytes against controls provided evidence on a large set of transcripts involved in the great scallop immune response.

  14. Next-generation transcriptome assembly

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Jeffrey A.; Wang, Zhong

    2011-09-01

    Transcriptomics studies often rely on partial reference transcriptomes that fail to capture the full catalog of transcripts and their variations. Recent advances in sequencing technologies and assembly algorithms have facilitated the reconstruction of the entire transcriptome by deep RNA sequencing (RNA-seq), even without a reference genome. However, transcriptome assembly from billions of RNA-seq reads, which are often very short, poses a significant informatics challenge. This Review summarizes the recent developments in transcriptome assembly approaches (reference-based, de novo and combined strategies) along with some perspectives on transcriptome assembly in the near future.

  15. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network

    DEFF Research Database (Denmark)

    Förster, Jochen; Famili, I.; Fu, P.

    2003-01-01

    and the environment were included. A total of 708 structural open reading frames (ORFs) were accounted for in the reconstructed network, corresponding to 1035 metabolic reactions. Further, 140 reactions were included on the basis of biochemical evidence resulting in a genome-scale reconstructed metabolic network...... with Escherichia coli. The reconstructed metabolic network is the first comprehensive network for a eukaryotic organism, and it may be used as the basis for in silico analysis of phenotypic functions....

  16. Determination of sample size in genome-scale RNAi screens.

    Science.gov (United States)

    Zhang, Xiaohua Douglas; Heyse, Joseph F

    2009-04-01

    For genome-scale RNAi research, it is critical to investigate the sample size required to achieve a reasonably low false negative rate (FNR) and false positive rate. The analysis in this article reveals that the current design of sample size contributes to the occurrence of low signal-to-noise ratio in genome-scale RNAi projects. The analysis suggests that (i) an arrangement of 16 wells per plate is acceptable and an arrangement of 20-24 wells per plate is preferable for a negative control to be used for hit selection in a primary screen without replicates; (ii) in a confirmatory screen or a primary screen with replicates, a sample size of 3 is not large enough, and there is a large reduction in FNRs when sample size increases from 3 to 4. To strike a tradeoff between benefit and cost, any sample size between 4 and 11 is a reasonable choice. If the main focus is the selection of siRNAs with strong effects, a sample size of 4 or 5 is a good choice. If we want to have enough power to detect siRNAs with moderate effects, sample size needs to be 8, 9, 10 or 11. These discoveries about sample size bring insight to the design of a genome-scale RNAi screen experiment.

  17. Genomics, transcriptomics, and peptidomics of neuropeptides and protein hormones in the red flour beetle Tribolium castaneum

    DEFF Research Database (Denmark)

    Li, Bin; Predel, Reinhard; Neupert, Susanne;

    2008-01-01

    Neuropeptides and protein hormones are ancient molecules that mediate cell-to-cell communication. The whole genome sequence from the red flour beetle Tribolium castaneum, along with those from other insect species, provides an opportunity to study the evolution of the genes encoding neuropeptide...... and protein hormones. We identified 41 of these genes in the Tribolium genome by using a combination of bioinformatic and peptidomic approaches. These genes encode >80 mature neuropeptides and protein hormones, 49 peptides of which were experimentally identified by peptidomics of the central nervous system...... with a sequenced genome. The presence of many additional osmoregulatory peptides in Tribolium agrees well with its ability to live in very dry surroundings. In contrast to these extra genes, there are at least nine neuropeptide genes missing in Tribolium, including the genes encoding the prepropeptides...

  18. Genomic divergences among cattle, dog and human estimated from large-scale alignments of genomic sequences

    Directory of Open Access Journals (Sweden)

    Shade Larry L

    2006-06-01

    Full Text Available Background: Approximately 11 Mb of finished high quality genomic sequences were sampled from cattle, dog and human to estimate genomic divergences and their regional variation among these lineages. Results: Optimal three-way multi-species global sequence alignments for 84 cattle clones or loci (each >50 kb of genomic sequence) were constructed using the human and dog genome assemblies as references. Genomic divergences and substitution rates were examined for each clone and for various sequence classes under different functional constraints. Analysis of these alignments revealed that the overall genomic divergences are relatively constant (0.32–0.37 change/site) for pairwise comparisons among cattle, dog and human; however substitution rates vary across genomic regions and among different sequence classes. A neutral mutation rate (2.0–2.2 × 10⁻⁹ change/site/year) was derived from ancestral repetitive sequences, whereas the substitution rate in coding sequences (1.1 × 10⁻⁹ change/site/year) was approximately half of the overall rate (1.9–2.0 × 10⁻⁹ change/site/year). Relative rate tests also indicated that cattle have a significantly faster rate of substitution as compared to dog and that this difference is about 6%. Conclusion: This analysis provides a large-scale and unbiased assessment of genomic divergences and regional variation of substitution rates among cattle, dog and human. It is expected that these data will serve as a baseline for future mammalian molecular evolution studies.
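The divergence and rate figures in the abstract above can be related by a short worked calculation. This is a sketch under an assumption the abstract does not state: a simple molecular clock in which pairwise divergence K accumulates along both lineages, so K = 2rT and T = K / (2r). The function name `implied_divergence_time` is hypothetical; only the numeric inputs come from the abstract.

```python
def implied_divergence_time(divergence, rate):
    """Years since two lineages split, under the assumed clock K = 2 * r * T.

    divergence: pairwise changes per site between the two species (K)
    rate: changes per site per year along a single lineage (r)
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return divergence / (2.0 * rate)


if __name__ == "__main__":
    # Ranges quoted in the abstract: K = 0.32-0.37 change/site,
    # overall rate r = 1.9e-9 to 2.0e-9 change/site/year.
    shortest = implied_divergence_time(0.32, 2.0e-9)  # smallest K, fastest rate
    longest = implied_divergence_time(0.37, 1.9e-9)   # largest K, slowest rate
    print(f"implied split: {shortest / 1e6:.0f} to {longest / 1e6:.0f} million years")
```

The result is only as good as the clock assumption; the abstract itself reports that rates differ between cattle and dog by about 6%, so a single shared rate is a simplification.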

  19. Parallel analysis of Arabidopsis circadian clock mutants reveals different scales of transcriptome and proteome regulation

    Science.gov (United States)

    Graf, Alexander; Coman, Diana; Walsh, Sean; Flis, Anna; Stitt, Mark; Gruissem, Wilhelm

    2017-01-01

    The circadian clock regulates physiological processes central to growth and survival. To date, most plant circadian clock studies have relied on diurnal transcriptome changes to elucidate molecular connections between the circadian clock and observable phenotypes in wild-type plants. Here, we have integrated RNA-sequencing and protein mass spectrometry data to comparatively analyse the lhycca1, prr7prr9, gi and toc1 circadian clock mutant rosette at the end of day and end of night. Each mutant affects specific sets of genes and proteins, suggesting that the circadian clock regulation is modular. Furthermore, each circadian clock mutant maintains its own dynamically fluctuating transcriptome and proteome profile specific to subcellular compartments. Most of the measured protein levels do not correlate with changes in their corresponding transcripts. Transcripts and proteins that have coordinated changes in abundance are enriched for carbohydrate- and cold-responsive genes. Transcriptome changes in all four circadian clock mutants also affect genes encoding starch degradation enzymes, transcription factors and protein kinases. The comprehensive transcriptome and proteome datasets demonstrate that future system-driven research of the circadian clock requires multi-level experimental approaches. Our work also shows that further work is needed to elucidate the roles of post-translational modifications and protein degradation in the regulation of clock-related processes. PMID:28250106

  20. Parallel analysis of Arabidopsis circadian clock mutants reveals different scales of transcriptome and proteome regulation.

    Science.gov (United States)

    Graf, Alexander; Coman, Diana; Uhrig, R Glen; Walsh, Sean; Flis, Anna; Stitt, Mark; Gruissem, Wilhelm

    2017-03-01

    The circadian clock regulates physiological processes central to growth and survival. To date, most plant circadian clock studies have relied on diurnal transcriptome changes to elucidate molecular connections between the circadian clock and observable phenotypes in wild-type plants. Here, we have integrated RNA-sequencing and protein mass spectrometry data to comparatively analyse the lhycca1, prr7prr9, gi and toc1 circadian clock mutant rosette at the end of day and end of night. Each mutant affects specific sets of genes and proteins, suggesting that the circadian clock regulation is modular. Furthermore, each circadian clock mutant maintains its own dynamically fluctuating transcriptome and proteome profile specific to subcellular compartments. Most of the measured protein levels do not correlate with changes in their corresponding transcripts. Transcripts and proteins that have coordinated changes in abundance are enriched for carbohydrate- and cold-responsive genes. Transcriptome changes in all four circadian clock mutants also affect genes encoding starch degradation enzymes, transcription factors and protein kinases. The comprehensive transcriptome and proteome datasets demonstrate that future system-driven research of the circadian clock requires multi-level experimental approaches. Our work also shows that further work is needed to elucidate the roles of post-translational modifications and protein degradation in the regulation of clock-related processes. © 2017 The Authors.

  1. Characterizing acetogenic metabolism using a genome-scale metabolic reconstruction of Clostridium ljungdahlii

    Energy Technology Data Exchange (ETDEWEB)

    Nagarajan, H; Sahin, M; Nogales, J; Latif, H; Lovley, DR; Ebrahim, A; Zengler, K

    2013-11-25

    Background: The metabolic capabilities of acetogens to ferment a wide range of sugars, to grow autotrophically on H2/CO2, and more importantly on synthesis gas (H2/CO/CO2) make them very attractive candidates as production hosts for biofuels and biocommodities. Acetogenic metabolism is considered one of the earliest modes of bacterial metabolism. A thorough understanding of various factors governing the metabolism, in particular energy conservation mechanisms, is critical for metabolic engineering of acetogens for targeted production of desired chemicals. Results: Here, we present the genome-scale metabolic network of Clostridium ljungdahlii, the first such model for an acetogen. This genome-scale model (iHN637) consisting of 637 genes, 785 reactions, and 698 metabolites captures all the major central metabolic and biosynthetic pathways, in particular pathways involved in carbon fixation and energy conservation. A combination of metabolic modeling with physiological and transcriptomic data provided insights into autotrophic metabolism as well as aided the characterization of a nitrate reduction pathway in C. ljungdahlii. Analysis of the iHN637 metabolic model revealed that flavin-based electron bifurcation played a key role in energy conservation during autotrophic growth and helped identify genes for some of the critical steps in this mechanism. Conclusions: iHN637 represents a predictive model that recapitulates experimental data, and provides valuable insights into the metabolic response of C. ljungdahlii to genetic perturbations under various growth conditions. Thus, the model will be instrumental in guiding metabolic engineering of C. ljungdahlii for the industrial production of biocommodities and biofuels.

  2. Endophytic life strategies decoded by genome and transcriptome analyses of the mutualistic root symbiont Piriformospora indica.

    Directory of Open Access Journals (Sweden)

    Alga Zuccaro

    2011-10-01

    Full Text Available Recent sequencing projects have provided deep insight into fungal lifestyle-associated genomic adaptations. Here we report on the 25 Mb genome of the mutualistic root symbiont Piriformospora indica (Sebacinales, Basidiomycota) and provide a global characterization of fungal transcriptional responses associated with the colonization of living and dead barley roots. Extensive comparative analysis of the P. indica genome with other Basidiomycota and Ascomycota fungi that have diverse lifestyle strategies identified features typically associated with both biotrophism and saprotrophism. The tightly controlled expression of the lifestyle-associated gene sets during the onset of the symbiosis, revealed by microarray analysis, argues for a biphasic root colonization strategy of P. indica. This is supported by a cytological study that shows an early biotrophic growth followed by a cell death-associated phase. About 10% of the fungal genes induced during the biotrophic colonization encoded putative small secreted proteins (SSP), including several lectin-like proteins and members of a P. indica-specific gene family (DELD) with a conserved novel seven-amino-acid motif at the C-terminus. Similar to effectors found in other filamentous organisms, the occurrence of the DELDs correlated with the presence of transposable elements in gene-poor repeat-rich regions of the genome. This is the first in-depth genomic study describing a mutualistic symbiont with a biphasic lifestyle. Our findings provide a significant advance in understanding development of biotrophic plant symbionts and suggest a series of incremental shifts along the continuum from saprotrophy towards biotrophy in the evolution of mycorrhizal association from decomposer fungi.

  3. Genome-wide Mapping of Transcriptional Start Sites Defines an Extensive Leaderless Transcriptome in Mycobacterium tuberculosis

    Directory of Open Access Journals (Sweden)

    Teresa Cortes

    2013-11-01

    Full Text Available Deciphering physiological changes that mediate transition of Mycobacterium tuberculosis between replicating and nonreplicating states is essential to understanding how the pathogen can persist in an individual host for decades. We have combined RNA sequencing (RNA-seq) of 5′ triphosphate-enriched libraries with regular RNA-seq to characterize the architecture and expression of M. tuberculosis promoters. We identified over 4,000 transcriptional start sites (TSSs). Strikingly, for 26% of the genes with a primary TSS, the site of transcriptional initiation overlapped with the annotated start codon, generating leaderless transcripts lacking a 5′ UTR and, hence, the Shine-Dalgarno sequence commonly used to initiate ribosomal engagement in eubacteria. Genes encoding proteins with active growth functions were markedly depleted from the leaderless transcriptome, and there was a significant increase in the overall representation of leaderless mRNAs in a starvation model of growth arrest. The high percentage of leaderless genes may have particular importance in the physiology of nonreplicating M. tuberculosis.
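
    The leaderless classification described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Gene` record, the coordinates, and the `tolerance` parameter are hypothetical, and "overlaps the annotated start codon" is approximated here as a 5′ UTR length of zero (or at most `tolerance` bases).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical annotation record (not taken from the study)."""
    name: str
    start_codon: int  # 1-based coordinate of the first base of the start codon
    strand: str       # "+" or "-"


def utr_length(tss: int, gene: Gene) -> int:
    """Number of bases between the TSS and the first base of the start codon."""
    if gene.strand == "+":
        return gene.start_codon - tss
    # On the minus strand, upstream means a higher genomic coordinate.
    return tss - gene.start_codon


def is_leaderless(tss: int, gene: Gene, tolerance: int = 0) -> bool:
    """True when the TSS coincides with the start codon (within `tolerance`)."""
    return 0 <= utr_length(tss, gene) <= tolerance


def leaderless_fraction(assignments, tolerance: int = 0) -> float:
    """Fraction of (primary TSS, gene) pairs classified as leaderless."""
    if not assignments:
        return 0.0
    hits = sum(is_leaderless(tss, gene, tolerance) for tss, gene in assignments)
    return hits / len(assignments)


# Illustrative, made-up coordinates.
example = [
    (100, Gene("a", 100, "+")),    # TSS on the start codon: leaderless
    (540, Gene("b", 500, "-")),    # 40-base leader on the minus strand
    (870, Gene("c", 900, "+")),    # 30-base leader
    (1200, Gene("d", 1200, "-")),  # leaderless on the minus strand
]
```

    Applied to real data, `leaderless_fraction` would be computed over genes with an assigned primary TSS, which is the denominator the abstract uses for its 26% figure.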

  4. Hyperlipidemia-associated gene variations and expression patterns revealed by whole-genome and transcriptome sequencing of rabbit models

    Science.gov (United States)

    Wang, Zhen; Zhang, Jifeng; Li, Hong; Li, Junyi; Niimi, Manabu; Ding, Guohui; Chen, Haifeng; Xu, Jie; Zhang, Hongjiu; Xu, Ze; Dai, Yulin; Gui, Tuantuan; Li, Shengdi; Liu, Zhi; Wu, Sujuan; Cao, Mushui; Zhou, Lu; Lu, Xingyu; Wang, Junxia; Yang, Jing; Fu, Yunhe; Yang, Dongshan; Song, Jun; Zhu, Tianqing; Li, Shen; Ning, Bo; Wang, Ziyun; Koike, Tomonari; Shiomi, Masashi; Liu, Enqi; Chen, Luonan; Fan, Jianglin; Chen, Y. Eugene; Li, Yixue

    2016-01-01

    The rabbit (Oryctolagus cuniculus) is an important experimental animal for studying human diseases, such as hypercholesterolemia and atherosclerosis. Despite this, genetic information and RNA expression profiling of laboratory rabbits are lacking. Here, we characterized the whole-genome variants of three breeds of the most popular experimental rabbits, New Zealand White (NZW), Japanese White (JW) and Watanabe heritable hyperlipidemic (WHHL) rabbits. Although the genetic diversity of WHHL rabbits was relatively low, they accumulated a large proportion of high-frequency deleterious mutations due to the small population size. Some of the deleterious mutations were associated with the pathophysiology of WHHL rabbits in addition to the LDLR deficiency. Furthermore, we conducted transcriptome sequencing of different organs of both WHHL and cholesterol-rich diet (Chol)-fed NZW rabbits. We found that gene expression profiles of the two rabbit models were essentially similar in the aorta, even though they exhibited different types of hypercholesterolemia. In contrast, Chol-fed rabbits, but not WHHL rabbits, exhibited pronounced inflammatory responses and abnormal lipid metabolism in the liver. These results provide valuable insights into identifying therapeutic targets of hypercholesterolemia and atherosclerosis with rabbit models. PMID:27245873

  5. Carotenoid metabolic profiling and transcriptome-genome mining reveal functional equivalence among blue-pigmented copepods and appendicularia

    KAUST Repository

    Mojib, Nazia

    2014-06-01

    The tropical oligotrophic oceanic areas are characterized by high water transparency and annual solar radiation. Under these conditions, a large number of phylogenetically diverse mesozooplankton species living in the surface waters (neuston) are found to be blue pigmented. In the present study, we focused on understanding the metabolic and genetic basis of the observed blue phenotype functional equivalence between the blue-pigmented organisms from the phylum Arthropoda, subclass Copepoda (Acartia fossae) and the phylum Chordata, class Appendicularia (Oikopleura dioica) in the Red Sea. Previous studies have shown that carotenoid–protein complexes are responsible for blue coloration in crustaceans. Therefore, we performed carotenoid metabolic profiling using both targeted and nontargeted (high-resolution mass spectrometry) approaches in four different blue-pigmented genera of copepods and one blue-pigmented species of appendicularia. Astaxanthin was found to be the principal carotenoid in all the species. The pathway analysis showed that all the species can synthesize astaxanthin from β-carotene, ingested from dietary sources, via 3-hydroxyechinenone, canthaxanthin, zeaxanthin, adonirubin or adonixanthin. Further, using de novo assembled transcriptome of blue A. fossae (subclass Copepoda), we identified highly expressed homologous β-carotene hydroxylase enzymes and putative carotenoid-binding proteins responsible for astaxanthin formation and the blue phenotype. In blue O. dioica (class Appendicularia), corresponding putative genes were identified from the reference genome. Collectively, our data provide molecular evidences for the bioconversion and accumulation of blue astaxanthin–protein complexes underpinning the observed ecological functional equivalence and adaptive convergence among neustonic mesozooplankton.

  6. Complete genome sequence and transcriptomic analysis of a novel marine strain Bacillus weihaiensis reveals the mechanism of brown algae degradation.

    Science.gov (United States)

    Zhu, Yueming; Chen, Peng; Bao, Yunjuan; Men, Yan; Zeng, Yan; Yang, Jiangang; Sun, Jibin; Sun, Yuanxia

    2016-11-30

    A novel marine strain that efficiently degrades brown algae was isolated, identified, and assigned to Bacillus weihaiensis Alg07. Alga-associated marine bacteria promote the nutrient cycle and perform important functions in the marine ecosystem. De novo sequencing of the B. weihaiensis Alg07 genome was carried out. Results of gene annotation and carbohydrate-active enzyme analysis showed that the strain harbored enzymes that can completely degrade alginate and laminarin, which are the specific polysaccharides of brown algae. We also found genes for the utilization of mannitol, the major storage monosaccharide in the cell of brown algae. To understand the process of brown algae decomposition by B. weihaiensis Alg07, RNA-seq transcriptome analysis and qRT-PCR were performed. The genes involved in alginate metabolism were all up-regulated in the initial stage of kelp degradation, suggesting that the strain Alg07 first degrades alginate to disrupt the cell wall so that the laminarin and mannitol are released and subsequently decomposed. The key genes involved in alginate and laminarin degradation were expressed in Escherichia coli and characterized. Overall, the model of brown algae degradation by the marine strain Alg07 was established, and novel alginate lyases and laminarinase were discovered.

  7. Genome-Wide Association and Transcriptome Analyses Reveal Candidate Genes Underlying Yield-determining Traits in Brassica napus

    Science.gov (United States)

    Lu, Kun; Peng, Liu; Zhang, Chao; Lu, Junhua; Yang, Bo; Xiao, Zhongchun; Liang, Ying; Xu, Xingfu; Qu, Cunmin; Zhang, Kai; Liu, Liezhao; Zhu, Qinlong; Fu, Minglian; Yuan, Xiaoyan; Li, Jiana

    2017-01-01

    Yield is one of the most important yet complex crop traits. To improve our understanding of the genetic basis of yield establishment, and to identify candidate genes responsible for yield improvement in Brassica napus, we performed genome-wide association studies (GWAS) for seven yield-determining traits [main inflorescence pod number (MIPN), branch pod number (BPN), pod number per plant (PNP), seed number per pod (SPP), thousand seed weight, main inflorescence yield (MIY), and branch yield], using data from 520 diverse B. napus accessions from two different yield environments. In total, we detected 128 significant single nucleotide polymorphisms (SNPs), 93 of which were revealed as novel by integrative analysis. A combination of GWAS and transcriptome sequencing on 21 haplotype blocks from samples pooled by four extremely high-yielding or low-yielding accessions revealed the differential expression of 14 crucial candidate genes (such as Bna.MYB83, Bna.SPL5, and Bna.ROP3) associated with multiple traits or containing multiple SNPs associated with the same trait. Functional annotation and expression pattern analyses further demonstrated that these 14 candidate genes might be important in developmental processes and biomass accumulation, thus affecting the yield establishment of B. napus. These results provide valuable information for understanding the genetic mechanisms underlying the establishment of high yield in B. napus, and lay the foundation for developing high-yielding B. napus varieties. PMID:28261256

  8. Genome-wide transcriptome analysis of the plant pathogen Xanthomonas identifies sRNAs with putative virulence functions

    Science.gov (United States)

    Schmidtke, Cornelius; Findeiß, Sven; Sharma, Cynthia M.; Kuhfuß, Juliane; Hoffmann, Steve; Vogel, Jörg; Stadler, Peter F.; Bonas, Ulla

    2012-01-01

    The Gram-negative plant-pathogenic bacterium Xanthomonas campestris pv. vesicatoria (Xcv) is an important model to elucidate the mechanisms involved in the interaction with the host. To gain insight into the transcriptome of the Xcv strain 85–10, we took a differential RNA sequencing (dRNA-seq) approach. Using a novel method to automatically generate comprehensive transcription start site (TSS) maps we report 1421 putative TSSs in the Xcv genome. Genes in Xcv exhibit a poorly conserved −10 promoter element and no consensus Shine-Dalgarno sequence. Moreover, 14% of all mRNAs are leaderless and 13% of them have unusually long 5′-UTRs. Northern blot analyses confirmed 16 intergenic small RNAs and seven cis-encoded antisense RNAs in Xcv. Expression of eight intergenic transcripts was controlled by HrpG and HrpX, key regulators of the Xcv type III secretion system. More detailed characterization identified sX12 as a small RNA that controls virulence of Xcv by affecting the interaction of the pathogen and its host plants. The transcriptional landscape of Xcv is unexpectedly complex, featuring abundant antisense transcripts, alternative TSSs and clade-specific small RNAs. PMID:22080557

  9. Genomic, Transcriptomic, and Proteomic Analysis Provide Insights Into the Cold Adaptation Mechanism of the Obligate Psychrophilic Fungus Mrakia psychrophila

    Directory of Open Access Journals (Sweden)

    Yao Su

    2016-11-01

    Full Text Available Mrakia psychrophila is an obligate psychrophilic fungus. The cold adaptation mechanism of psychrophilic fungi remains unknown. Comparative genomics analysis indicated that M. psychrophila had a specific codon usage preference, especially for codons of Gly and Arg, and that its major facilitator superfamily (MFS) transporter gene family was expanded. Transcriptomic analysis revealed that genes involved in ribosome and energy metabolism were upregulated at 4°C, while genes involved in unfolded protein binding, protein processing in the endoplasmic reticulum, proteasome, spliceosome, and mRNA surveillance were upregulated at 20°C. In addition, genes related to unfolded protein binding were alternatively spliced. Consistent with other psychrophiles, desaturase and glycerol 3-phosphate dehydrogenase, which are involved in biosynthesis of unsaturated fatty acid and glycerol respectively, were upregulated at 4°C. Cold adaptation of M. psychrophila is mediated by synthesizing unsaturated fatty acids to maintain membrane fluidity and accumulating glycerol as a cryoprotectant. The proteomic analysis indicated that, for some pathways, the correlations between the dynamic patterns of transcript-level changes and protein-level changes were positive at 4°C but negative at 20°C. The death of M. psychrophila above 20°C might be caused by an unfolded protein response.

  10. 13C metabolic flux analysis at a genome-scale.

    Science.gov (United States)

    Gopalakrishnan, Saratram; Maranas, Costas D

    2015-11-01

    Metabolic models used in 13C metabolic flux analysis generally include a limited number of reactions primarily from central metabolism. They typically omit degradation pathways, complete cofactor balances, and atom transition contributions for reactions outside central metabolism. This study addresses the impact on prediction fidelity of scaling up mapping models to genome scale. The core mapping model employed in this study accounts for 75 reactions and 65 metabolites, primarily from central metabolism. The genome-scale metabolic mapping model (GSMM) (697 reactions and 595 metabolites) is constructed using as a basis the iAF1260 model upon eliminating reactions guaranteed not to carry flux based on growth and fermentation data for a minimal glucose growth medium. Labeling data for 17 amino acid fragments obtained from cells fed with glucose labeled at the second carbon were used to obtain fluxes and ranges. Metabolic fluxes and confidence intervals are estimated, for both core and genome-scale mapping models, by minimizing the sum of squared differences between predicted and experimentally measured labeling patterns using the EMU decomposition algorithm. Overall, we find that both topology and estimated values of the metabolic fluxes remain largely consistent between the core model and the GSMM. Stepping up to a genome-scale mapping model leads to wider flux inference ranges for 20 key reactions present in the core model. The glycolysis flux range doubles due to the possibility of active gluconeogenesis, the TCA flux range expands by 80% due to the availability of a bypass through arginine consistent with labeling data, and the transhydrogenase reaction flux is essentially unresolved due to the presence of as many as five routes for the inter-conversion of NADPH to NADH afforded by the genome-scale model. By globally accounting for ATP demands in the GSMM, the unused ATP decreased drastically with the lower bound matching the maintenance ATP requirement. A non
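
    The least-squares fitting step mentioned in this abstract can be illustrated with a deliberately tiny Python sketch. It is not the EMU decomposition algorithm and not the authors' code: the two-route mixing model, the labeling vectors, the measurement error `sigma`, and the grid search are all assumptions chosen only to show what "minimizing the sum of squared differences between predicted and measured labeling patterns" means.

```python
def ssr(predicted, measured, sigma=0.01):
    """Variance-weighted sum of squared residuals between labeling patterns.

    `sigma` is an assumed uniform measurement error, not a value from the study.
    """
    return sum(((p - m) / sigma) ** 2 for p, m in zip(predicted, measured))


def mix(fraction, route_a, route_b):
    """Toy forward model: labeling pattern when `fraction` of the flux uses route A."""
    return [fraction * a + (1.0 - fraction) * b for a, b in zip(route_a, route_b)]


def fit_fraction(route_a, route_b, measured, steps=1000):
    """Grid search for the flux split that minimizes the SSR."""
    best_fraction, best_ssr = 0.0, float("inf")
    for i in range(steps + 1):
        fraction = i / steps
        value = ssr(mix(fraction, route_a, route_b), measured)
        if value < best_ssr:
            best_fraction, best_ssr = fraction, value
    return best_fraction, best_ssr


# Hypothetical mass isotopomer distributions (M+0, M+1, M+2) for two routes.
ROUTE_A = [0.1, 0.9, 0.0]
ROUTE_B = [0.7, 0.2, 0.1]
# Synthetic "measurement" generated from a known split of 0.3.
MEASURED = mix(0.3, ROUTE_A, ROUTE_B)
```

    In a real analysis the forward model would be the atom-mapping simulation and the search would run over many free fluxes at once; when several flux combinations fit the data equally well, the inferred ranges widen, which is the effect the abstract reports for the genome-scale model.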

  11. The genome and transcriptomes of the anti-tumor agent Clostridium novyi-NT.

    Science.gov (United States)

    Bettegowda, Chetan; Huang, Xin; Lin, Jimmy; Cheong, Ian; Kohli, Manu; Szabo, Stephen A; Zhang, Xiaosong; Diaz, Luis A; Velculescu, Victor E; Parmigiani, Giovanni; Kinzler, Kenneth W; Vogelstein, Bert; Zhou, Shibin

    2006-12-01

    Bacteriolytic anti-cancer therapies employ attenuated bacterial strains that selectively proliferate within tumors. Clostridium novyi-NT spores represent one of the most promising of these agents, as they generate potent anti-tumor effects in experimental animals. We have determined the 2.55-Mb genomic sequence of C. novyi-NT, identifying a new type of transposition and 139 genes that do not have homologs in other bacteria. The genomic sequence was used to facilitate the detection of transcripts expressed at various stages of the life cycle of this bacterium in vitro as well as in infections of tumors in vivo. Through this analysis, we found that C. novyi-NT spores contained mRNA and that the spore transcripts were distinct from those in vegetative forms of the bacterium.

  12. Genomic and Transcriptomic Evidence for Carbohydrate Consumption among Microorganisms in a Cold Seep Brine Pool

    KAUST Repository

    Zhang, Weipeng

    2016-11-15

    The detailed lifestyle of microorganisms in deep-sea brine environments remains largely unexplored. Using a carefully calibrated genome binning approach, we reconstructed partial to nearly-complete genomes of 51 microorganisms in biofilms from the Thuwal cold seep brine pool of the Red Sea. The recovered metagenome-assembled genomes (MAGs) belong to six different phyla: Actinobacteria, Proteobacteria, Candidatus Cloacimonetes, Candidatus Marinimicrobia, Bathyarchaeota, and Thaumarchaeota. By comparison with close relatives of these microorganisms, we identified a number of unique genes associated with organic carbon metabolism and energy generation. These genes included various glycoside hydrolases, nitrate and sulfate reductases, putative bacterial microcompartment biosynthetic clusters (BMC), and F420H2 dehydrogenases. Phylogenetic analysis suggested that the acquisition of these genes probably occurred through horizontal gene transfer (HGT). Metatranscriptomics illustrated that glycoside hydrolases are among the most highly expressed genes. Our results suggest that the microbial inhabitants are well adapted to this brine environment, and anaerobic carbohydrate consumption mediated by glycoside hydrolases and electron transport systems (ETSs) is a dominant process performed by microorganisms from various phyla within this ecosystem.

  13. Combination analysis of genome-wide association and transcriptome sequencing of residual feed intake in quality chickens.

    Science.gov (United States)

    Xu, Zhenqiang; Ji, Congliang; Zhang, Yan; Zhang, Zhe; Nie, Qinghua; Xu, Jiguo; Zhang, Dexiang; Zhang, Xiquan

    2016-08-09

    Residual feed intake (RFI) is a powerful indicator for energy utilization efficiency and responds to selection. Low RFI selection enables a reduction in feed intake without affecting growth performance. However, the effective variants or major genes dedicated to phenotypic differences in RFI in quality chickens are unclear. Therefore, a genome-wide association study (GWAS) and RNA sequencing were performed on RFI to identify genetic variants and potential candidate genes associated with energy improvement. A lower average daily feed intake was found in low-RFI birds compared to high-RFI birds. The heritability of RFI measured from 44 to 83 d of age was 0.35. GWAS showed that 32 of the significant single nucleotide polymorphisms (SNPs) associated with the RFI (P < 10^-4) accounted for 53.01% of the additive genetic variance. More than half of the effective SNPs were located in a 1 Mb region (16.3-17.3 Mb) of chicken (Gallus gallus) chromosome (GGA) 12. Thus, focusing on this region should enable a deeper understanding of energy utilization. RNA sequencing was performed to profile the liver transcriptomes of four male chickens selected from the high and low tails of the RFI. One hundred and sixteen unique genes were identified as differentially expressed genes (DEGs). Some of these genes were relevant to appetite, cell activities, and fat metabolism, such as CCKAR, HSP90B1, and PCK1. Some potential genes within the 500 Kb flanking region of the significant RFI-related SNPs detected in GWAS (i.e., MGP, HIST1H110, HIST1H2A4L3, OC3, NR0B2, PER2, ST6GALNAC2, and G0S2) were also identified as DEGs in chickens with divergent RFIs. The GWAS findings showed that the 1 Mb narrow region of GGA12 should be important because it contained genes involved in energy-consuming processes, such as lipogenesis, social behavior, and immunity. Similar results were obtained in the transcriptome sequencing experiments. In general, low-RFI birds seemed to optimize energy employment

  14. Genome-scale constraint-based modeling of Geobacter metallireducens

    Directory of Open Access Journals (Sweden)

    Famili Iman

    2009-01-01

    Full Text Available Background: Geobacter metallireducens was the first organism that can be grown in pure culture to completely oxidize organic compounds with Fe(III) oxide serving as electron acceptor. Geobacter species, including G. sulfurreducens and G. metallireducens, are used for bioremediation and electricity generation from waste organic matter and renewable biomass. The constraint-based modeling approach enables the development of genome-scale in silico models that can predict the behavior of complex biological systems and their responses to the environments. Such a modeling approach was applied to provide physiological and ecological insights on the metabolism of G. metallireducens. Results: The genome-scale metabolic model of G. metallireducens was constructed to include 747 genes and 697 reactions. Compared to the G. sulfurreducens model, the G. metallireducens metabolic model contains 118 unique reactions that reflect many of G. metallireducens' specific metabolic capabilities. Detailed examination of the G. metallireducens model suggests that its central metabolism contains several energy-inefficient reactions that are not present in the G. sulfurreducens model. Experimental biomass yield of G. metallireducens growing on pyruvate was lower than the predicted optimal biomass yield. Microarray data of G. metallireducens growing with benzoate and acetate indicated that genes encoding these energy-inefficient reactions were up-regulated by benzoate. These results suggested that the energy-inefficient reactions were likely turned off during G. metallireducens growth with acetate for optimal biomass yield, but were up-regulated during growth with complex electron donors such as benzoate for rapid energy generation. Furthermore, several computational modeling approaches were applied to accelerate G. metallireducens research. For example, growth of G. metallireducens with different electron donors and electron acceptors were studied using the genome-scale
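
    The core idea behind the constraint-based approach named in this abstract is that a candidate flux distribution must balance every internal metabolite at steady state and respect per-reaction bounds. The following Python sketch illustrates only that feasibility check on a hypothetical three-reaction pathway; the reaction names, stoichiometry, and bounds are invented for illustration and have nothing to do with the actual G. metallireducens model.

```python
def net_production(stoich, fluxes):
    """Net production rate of each metabolite for the given reaction fluxes.

    `stoich` maps metabolite -> {reaction: coefficient}; positive coefficients
    mean the reaction produces the metabolite, negative that it consumes it.
    """
    return {
        metabolite: sum(coeff * fluxes[reaction] for reaction, coeff in row.items())
        for metabolite, row in stoich.items()
    }


def is_feasible(stoich, fluxes, bounds, tol=1e-9):
    """True when every metabolite is balanced and every flux is within bounds."""
    balanced = all(abs(rate) <= tol for rate in net_production(stoich, fluxes).values())
    within = all(
        bounds[reaction][0] - tol <= value <= bounds[reaction][1] + tol
        for reaction, value in fluxes.items()
    )
    return balanced and within


# Hypothetical toy pathway: uptake -> A, A -> B (convert), B -> export.
TOY_STOICH = {
    "A": {"uptake": 1, "convert": -1},
    "B": {"convert": 1, "export": -1},
}
TOY_BOUNDS = {"uptake": (0, 10), "convert": (0, 10), "export": (0, 10)}
```

    A full constraint-based analysis would then search the feasible set for the flux distribution that optimizes an objective such as biomass yield, typically with a linear-programming solver, which this sketch leaves out.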

  15. Transcriptome-scale similarities between mouse and human skeletal muscles with normal and myopathic phenotypes

    Directory of Open Access Journals (Sweden)

    Kang Peter B

    2006-03-01

    Full Text Available Background: Mouse and human skeletal muscle transcriptome profiles vary by muscle type, raising the question of which mouse muscle groups have the greatest molecular similarities to human skeletal muscle. Methods: Orthologous (whole, sub-) transcriptome profiles were compared among four mouse-human transcriptome datasets: (M) six muscle groups obtained from three mouse strains (wildtype, mdx, mdx5cv); (H1) biopsied human quadriceps from controls and Duchenne muscular dystrophy patients; (H2) four different control human muscle types obtained at autopsy; and (H3) 12 different control human tissues (ten non-muscle). Results: Of the six mouse muscles examined, mouse soleus bore the greatest molecular similarities to human skeletal muscles, independent of the latter's anatomic location/muscle type, disease state, age and sampling method (autopsy versus biopsy). Significant similarity to any one mouse muscle group was not observed for non-muscle human tissues (dataset H3), indicating this finding to be muscle specific. Conclusion: This observation may be partly explained by the higher type I fiber content of soleus relative to the other mouse muscles sampled.

  16. Genome sequencing and comparative transcriptomics of the model entomopathogenic fungi Metarhizium anisopliae and M. acridum.

    Science.gov (United States)

    Gao, Qiang; Jin, Kai; Ying, Sheng-Hua; Zhang, Yongjun; Xiao, Guohua; Shang, Yanfang; Duan, Zhibing; Hu, Xiao; Xie, Xue-Qin; Zhou, Gang; Peng, Guoxiong; Luo, Zhibing; Huang, Wei; Wang, Bing; Fang, Weiguo; Wang, Sibao; Zhong, Yi; Ma, Li-Jun; St Leger, Raymond J; Zhao, Guo-Ping; Pei, Yan; Feng, Ming-Guang; Xia, Yuxian; Wang, Chengshu

    2011-01-06

    Metarhizium spp. are being used as environmentally friendly alternatives to chemical insecticides, as model systems for studying insect-fungus interactions, and as a resource of genes for biotechnology. We present a comparative analysis of the genome sequences of the broad-spectrum insect pathogen Metarhizium anisopliae and the acridid-specific M. acridum. Whole-genome analyses indicate that the genome structures of these two species are highly syntenic and suggest that the genus Metarhizium evolved from plant endophytes or pathogens. Both M. anisopliae and M. acridum have a strikingly larger proportion of genes encoding secreted proteins than other fungi, while ~30% of these have no functionally characterized homologs, suggesting hitherto unsuspected interactions between fungal pathogens and insects. The analysis of transposase genes provided evidence of repeat-induced point mutations occurring in M. acridum but not in M. anisopliae. With the help of pathogen-host interaction gene database, ~16% of Metarhizium genes were identified that are similar to experimentally verified genes involved in pathogenicity in other fungi, particularly plant pathogens. However, relative to M. acridum, M. anisopliae has evolved with many expanded gene families of proteases, chitinases, cytochrome P450s, polyketide synthases, and nonribosomal peptide synthetases for cuticle-degradation, detoxification, and toxin biosynthesis that may facilitate its ability to adapt to heterogeneous environments. Transcriptional analysis of both fungi during early infection processes provided further insights into the genes and pathways involved in infectivity and specificity. Of particular note, M. acridum transcribed distinct G-protein coupled receptors on cuticles from locusts (the natural hosts) and cockroaches, whereas M. anisopliae transcribed the same receptor on both hosts. This study will facilitate the identification of virulence genes and the development of improved biocontrol strains with

  17. Genomics and transcriptomics of Xanthomonas campestris species challenge the concept of core type III effectome.

    Science.gov (United States)

    Roux, Brice; Bolot, Stéphanie; Guy, Endrick; Denancé, Nicolas; Lautier, Martine; Jardinaud, Marie-Françoise; Fischer-Le Saux, Marion; Portier, Perrine; Jacques, Marie-Agnès; Gagnevin, Lionel; Pruvost, Olivier; Lauber, Emmanuelle; Arlat, Matthieu; Carrère, Sébastien; Koebnik, Ralf; Noël, Laurent D

    2015-11-18

    The bacterial species Xanthomonas campestris infects a wide range of Brassicaceae. Specific pathovars of this species cause black rot (pv. campestris), bacterial blight of stock (pv. incanae) or bacterial leaf spot (pv. raphani). In this study, we extended the genomic coverage of the species by sequencing and annotating the genomes of strains from pathovar incanae (CFBP 1606R and CFBP 2527R), pathovar raphani (CFBP 5828R) and a pathovar formerly named barbareae (CFBP 5825R). While comparative analyses identified a large core ORFeome at the species level, the core type III effectome was limited to only three putative type III effectors (XopP, XopF1 and XopAL1). In Xanthomonas, these effector proteins are injected inside the plant cells by the type III secretion system and contribute collectively to virulence. A deep and strand-specific RNA sequencing strategy was adopted in order to experimentally refine genome annotation for strain CFBP 5828R. This approach also allowed the experimental definition of novel ORFs and non-coding RNA transcripts. Using a constitutively active allele of hrpG, a master regulator of the type III secretion system, a HrpG-dependent regulon of 141 genes co-regulated with the type III secretion system was identified. Importantly, all these genes but seven are positively regulated by HrpG and 56 of those encode components of the Hrp type III secretion system and putative effector proteins. This dataset is an important resource to mine for novel type III effector proteins as well as for bacterial genes which could contribute to pathogenicity of X. campestris.

  18. Genomic and transcriptomic analysis of the AP2/ERF superfamily in Vitis vinifera

    Science.gov (United States)

    2010-01-01

    Background: The AP2/ERF protein family contains transcription factors that play a crucial role in plant growth and development and in response to biotic and abiotic stress conditions in plants. Grapevine (Vitis vinifera) is the only woody crop whose genome has been fully sequenced. So far, no detailed expression profile of AP2/ERF-like genes is available for grapevine. Results: An exhaustive search for AP2/ERF genes was carried out on the Vitis vinifera genome and their expression profile was analyzed by Real-Time quantitative PCR (qRT-PCR) in different vegetative and reproductive tissues and under two different ripening stages. One hundred and forty-nine sequences, containing at least one ERF domain, were identified. Specific clusters within the AP2 and ERF families showed conserved expression patterns reminiscent of other species and grapevine-specific trends related to berry ripening. Moreover, putative targets of group IX ERFs were identified by co-expression and protein similarity comparisons. Conclusions: The grapevine genome contains a number of AP2/ERF genes comparable to that of other dicot species analyzed so far. We observed an increase in the size of specific groups within the ERF family, probably due to recent duplication events. Expression analyses in different aerial tissues display common features previously described in other plant systems and introduce possible new roles for members of some ERF groups during fruit ripening. The presented analysis of AP2/ERF genes in grapevine provides a basis for studying the molecular regulation of berry development and the ripening process. PMID:21171999

  19. A genomic scale map of genetic diversity in Trypanosoma cruzi

    Directory of Open Access Journals (Sweden)

    Ackermann Alejandro A

    2012-12-01

    Full Text Available Background: Trypanosoma cruzi, the causal agent of Chagas Disease, affects more than 16 million people in Latin America. The clinical outcome of the disease results from a complex interplay between environmental factors and the genetic background of both the human host and the parasite. However, knowledge of the genetic diversity of the parasite is currently limited to a number of highly studied loci. The availability of a number of genomes from different evolutionary lineages of T. cruzi provides an unprecedented opportunity to look at the genetic diversity of the parasite at a genomic scale. Results: Using a bioinformatic strategy, we have clustered T. cruzi sequence data available in the public domain and obtained multiple sequence alignments in which one or two alleles from the reference CL-Brener were included. These data cover 4 major evolutionary lineages (DTUs): TcI, TcII, TcIII, and the hybrid TcVI. Using this set of alignments we have identified 288,957 high quality single nucleotide polymorphisms and 1,480 indels. In a reduced re-sequencing study we were able to validate ~97% of high-quality SNPs identified in 47 loci. Analysis of how these changes affect encoded protein products showed a 0.77 ratio of synonymous to non-synonymous changes in the T. cruzi genome. We observed 113 changes that introduce or remove a stop codon, some causing significant functional changes, and a number of tri-allelic and tetra-allelic SNPs that could be exploited in strain typing assays. Based on an analysis of the observed nucleotide diversity we show that the T. cruzi genome contains a core set of genes that are under apparent purifying selection. Interestingly, orthologs of known druggable targets show statistically significant lower nucleotide diversity values. Conclusions: This study provides the first look at the genetic diversity of T. cruzi at a genomic scale. The analysis covers an estimated ~60% of the genetic diversity present in the
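    The synonymous/non-synonymous and stop-codon categories reported in this abstract can be illustrated with a minimal sketch. It assumes the standard nuclear genetic code and in-frame codon pairs; the function names `classify_snp` and `syn_to_nonsyn_ratio` are hypothetical, and counting stop-codon changes as non-synonymous is an assumption of this sketch, not necessarily the authors' convention.

```python
# Standard genetic code in the conventional TCAG ordering (assumption:
# the standard code applies to the nuclear genes being analysed).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_snp(ref_codon, alt_codon):
    """Classify a codon change by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"   # change introduces a stop codon
    if ref_aa == "*":
        return "stop_lost"     # change removes a stop codon
    return "non-synonymous"


def syn_to_nonsyn_ratio(codon_pairs):
    """Ratio of synonymous to all other changes (stop changes included)."""
    labels = [classify_snp(ref, alt) for ref, alt in codon_pairs]
    syn = labels.count("synonymous")
    nonsyn = len(labels) - syn
    return syn / nonsyn if nonsyn else float("inf")
```

    For example, `classify_snp("TGG", "TGA")` labels a Trp-to-stop change as `stop_gained`, the kind of event the abstract counts among its 113 stop-codon changes.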

  20. BFAST: an alignment tool for large scale genome resequencing.

    Directory of Open Access Journals (Sweden)

    Nils Homer

    Full Text Available BACKGROUND: The new generation of massively parallel DNA sequencers, combined with the challenge of whole human genome resequencing, results in the need for rapid and accurate alignment of billions of short DNA sequence reads to a large reference genome. Speed is obviously of great importance, but equally important is maintaining alignment accuracy of short reads, in the 25-100 base range, in the presence of errors and true biological variation. METHODOLOGY: We introduce a new algorithm specifically optimized for this task, as well as a freely available implementation, BFAST, which can align data produced by any of the current sequencing platforms, allows for user-customizable levels of speed and accuracy, supports paired end data, and provides for efficient parallel and multi-threaded computation on a computer cluster. The new method is based on creating flexible, efficient whole genome indexes to rapidly map reads to candidate alignment locations, with arbitrary multiple independent indexes allowed to achieve robustness against read errors and sequence variants. The final local alignment uses a Smith-Waterman method, with gaps to support the detection of small indels. CONCLUSIONS: We compare BFAST to a selection of large-scale alignment tools (BLAT, MAQ, SHRiMP, and SOAP) in terms of both speed and accuracy, using simulated and real-world datasets. We show BFAST can achieve substantially greater sensitivity of alignment in the context of errors and true variants, especially insertions and deletions, and minimize false mappings, while maintaining adequate speed compared to other current methods. We show BFAST can align the amount of data needed to fully resequence a human genome, one billion reads, with high sensitivity and accuracy, on a modest computer cluster in less than 24 hours. BFAST is available at http://bfast.sourceforge.net.
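    The gapped Smith-Waterman local alignment named in this abstract can be sketched as follows. This is a textbook version with a linear gap penalty, not BFAST's own implementation; the function name `smith_waterman` and the default scoring values are assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best_score, aligned_a, aligned_b) for a local alignment.

    Scoring values are illustrative assumptions; gaps use a linear penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; scores are floored at zero so that an
    # alignment can start anywhere in either sequence.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        score = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + score:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])   # gap in b
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")        # gap in a
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

    Called as `smith_waterman("TGTTACGG", "GGTTGACTA", match=3, mismatch=-3, gap=-2)`, the sketch aligns `GTT-AC` against `GTTGAC`, showing how a gap lets the alignment span a one-base indel.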

  1. Current state of genome-scale modeling in filamentous fungi.

    Science.gov (United States)

    Brandl, Julian; Andersen, Mikael R

    2015-06-01

    The group of filamentous fungi contains important species used in industrial biotechnology for acid, antibiotics and enzyme production. Their unique lifestyle turns these organisms into a valuable genetic reservoir of new natural products and biomass degrading enzymes that has not been used to full capacity. One of the major bottlenecks in the development of new strains into viable industrial hosts is the alteration of the metabolism towards optimal production. Genome-scale models promise a reduction in the time needed for metabolic engineering by predicting the most potent targets in silico before testing them in vivo. The increasing availability of high quality models and molecular biological tools for manipulating filamentous fungi renders the model-guided engineering of these fungal factories possible with comprehensive metabolic networks. A typical fungal model contains on average 1138 unique metabolic reactions and 1050 ORFs, making them a vast knowledge-base of fungal metabolism. In the present review we focus on the current state as well as potential future applications of genome-scale models in filamentous fungi.

  2. Modeling Lactococcus lactis using a genome-scale flux model

    Directory of Open Access Journals (Sweden)

    Nielsen Jens

    2005-06-01

    Full Text Available Background: Genome-scale flux models are useful tools to represent and analyze microbial metabolism. In this work we reconstructed the metabolic network of the lactic acid bacterium Lactococcus lactis and developed a genome-scale flux model able to simulate and analyze network capabilities and whole-cell function under aerobic and anaerobic continuous cultures. Flux balance analysis (FBA) and minimization of metabolic adjustment (MOMA) were used as modeling frameworks. Results: The metabolic network was reconstructed using the annotated genome sequence from L. lactis ssp. lactis IL1403 together with physiological and biochemical information. The established network comprised a total of 621 reactions and 509 metabolites, representing the overall metabolism of L. lactis. Experimental data reported in the literature was used to fit the model to phenotypic observations. Regulatory constraints had to be included to simulate certain metabolic features, such as the shift from homo to heterolactic fermentation. A minimal medium for in silico growth was identified, indicating the requirement of four amino acids in addition to a sugar. Remarkably, de novo biosynthesis of four other amino acids was observed even when all amino acids were supplied, which is in good agreement with experimental observations. Additionally, enhanced metabolic engineering strategies for improved diacetyl producing strains were designed. Conclusion: The L. lactis metabolic network can now be used for a better understanding of lactococcal metabolic capabilities and potential, for the design of enhanced metabolic engineering strategies and for integration with other types of 'omic' data, to assist in finding new information on cellular organization and function.

  3. Genome-wide analysis of primary CD4+ and CD8+ T cell transcriptomes shows evidence for a network of enriched pathways associated with HIV disease

    Directory of Open Access Journals (Sweden)

    Wang Bin

    2011-03-01

    Full Text Available Background: HIV preferentially infects CD4+ T cells, and the functional impairment and numerical decline of CD4+ and CD8+ T cells characterize HIV disease. The numerical decline of CD4+ and CD8+ T cells affects the optimal ratio between the two cell types necessary for immune regulation. Therefore, this work aimed to define the genomic basis of HIV interactions with the cellular transcriptome of both CD4+ and CD8+ T cells. Results: Genome-wide transcriptomes of primary CD4+ and CD8+ T cells from HIV+ patients were analyzed at different stages of HIV disease using Illumina microarray. For each cell subset, pairwise comparisons were performed and differentially expressed (DE) genes were identified (fold change >2 and B-statistic >0), followed by quantitative PCR validation. Gene ontology (GO) analysis of DE genes revealed enriched categories of complement activation, actin filament, proteasome core and proton-transporting ATPase complex. By gene set enrichment analysis (GSEA), a network of enriched pathways functionally connected by mitochondria was identified in both T cell subsets as a transcriptional signature of HIV disease progression. These pathways ranged from metabolism and energy production (TCA cycle and OXPHOS) to mitochondria-mediated cell apoptosis and cell cycle dysregulation. The most unique and significant feature of our work was that the non-progressing status in HIV+ long-term non-progressors was associated with MAPK, WNT, and AKT pathways contributing to cell survival and anti-viral responses. Conclusions: These data offer new comparative insights into HIV disease progression from the aspect of HIV-host interactions at the transcriptomic level, which will facilitate the understanding of the genetic basis of transcriptomic interaction of HIV in vivo and how HIV subverts the human gene machinery at the individual cell type level.

  4. Characterizing the developmental transcriptome of the oriental fruit fly, Bactrocera dorsalis (Diptera: Tephritidae) through comparative genomic analysis with Drosophila melanogaster utilizing modENCODE datasets.

    Science.gov (United States)

    Geib, Scott M; Calla, Bernarda; Hall, Brian; Hou, Shaobin; Manoukis, Nicholas C

    2014-10-28

    The oriental fruit fly, Bactrocera dorsalis, is an important pest of fruit and vegetable crops throughout Asia, and is considered a high risk pest for establishment in the mainland United States. It is a member of the family Tephritidae, which is the most agriculturally important family of flies, and can be considered an out-group to well-studied members of the family Drosophilidae. Despite their importance as pests and their relatedness to Drosophila, little information is present on B. dorsalis transcripts and proteins. The objective of this paper is to comprehensively characterize the transcripts present throughout the life history of B. dorsalis and functionally annotate and analyse these transcripts relative to the presence, expression, and function of orthologous sequences present in Drosophila melanogaster. We present a detailed transcriptome assembly of B. dorsalis from egg through adult stages containing 20,666 transcripts across 10,799 unigene components. Utilizing data available through Flybase and the modENCODE project, we compared expression patterns of these transcripts to putative orthologs in D. melanogaster in terms of timing, abundance, and function. In addition, temporal expression patterns in B. dorsalis were characterized between stages, to establish the constitutive or stage-specific expression patterns of particular transcripts. A fully annotated transcriptome assembly is made available through NCBI, in addition to corresponding expression data. Through characterizing the transcriptome of B. dorsalis through its life history and comparing the transcriptome of B. dorsalis to the model organism D. melanogaster, a database has been developed that can be used as the foundation to functional genomic research in Bactrocera flies and help identify orthologous genes between B. dorsalis and D. melanogaster. These data provide the foundation for future functional genomic research that will focus on improving our understanding of the physiology and

  5. Understanding PRRSV infection in porcine lung based on genome-wide transcriptome response identified by deep sequencing.

    Directory of Open Access Journals (Sweden)

    Shuqi Xiao

    Full Text Available Porcine reproductive and respiratory syndrome (PRRS) has been one of the most economically important diseases affecting the swine industry worldwide and causes great economic losses each year. PRRS virus (PRRSV) replicates mainly in porcine alveolar macrophages (PAMs) and dendritic cells (DCs) and develops persistent infections, antibody-dependent enhancement (ADE), interstitial pneumonia and immunosuppression. But the molecular mechanisms of PRRSV infection are still poorly understood. Here we report on the first genome-wide host transcriptional responses to classical North American type PRRSV (N-PRRSV) strain CH 1a infection using Solexa/Illumina's digital gene expression (DGE) system, a tag-based high-throughput transcriptome sequencing method, and analyse systematically the relationship between pulmonary gene expression profiles after N-PRRSV infection and infection pathology. Our results suggest that N-PRRSV appeared to utilize multiple strategies for its replication and spread in infected pigs, including subverting host innate immune response, inducing an anti-apoptotic and anti-inflammatory state as well as developing ADE. Upregulated expression of virus-induced pro-inflammatory cytokines, chemokines, adhesion molecules and inflammatory enzymes, together with inflammatory cells, antibodies and complement activation, was likely to result in the development of inflammatory responses during N-PRRSV infection processes. N-PRRSV-induced immunosuppression might be mediated by apoptosis of infected cells, which caused depletion of immune cells and induced an anti-inflammatory cytokine response in which they were unable to eradicate the primary infection. Our systems analysis will aid in better understanding the molecular pathogenesis of N-PRRSV infection, developing novel antiviral therapies and identifying genetic components for swine resistance/susceptibility to PRRS.

  6. RNA-Seq analysis of Cocos nucifera: transcriptome sequencing and de novo assembly for subsequent functional genomics approaches.

    Directory of Open Access Journals (Sweden)

    Haikuo Fan

    Full Text Available BACKGROUND: Cocos nucifera (coconut), a member of the Arecaceae family, is an economically important woody palm grown in tropical regions. Despite its agronomic importance, previous germplasm assessment studies have relied solely on morphological and agronomical traits. Molecular biology techniques have been scarcely used in assessment of genetic resources and for improvement of important agronomic and quality traits in Cocos nucifera, mostly due to the absence of available sequence information. METHODOLOGY/PRINCIPAL FINDINGS: To provide basic information for molecular breeding and further molecular biological analysis in Cocos nucifera, we applied RNA-seq technology and de novo assembly to gain a global overview of the Cocos nucifera transcriptome from mixed tissue samples. Using Illumina sequencing, we obtained 54.9 million short reads and conducted de novo assembly to obtain 57,304 unigenes with an average length of 752 base pairs. Sequence comparison between assembled unigenes and released cDNA sequences of Cocos nucifera and Elaeis guineensis indicated that the assembled sequences were of high quality. Approximately 99.9% of unigenes were novel compared to the released coconut EST sequences. Using BLASTX, 68.2% of unigenes were successfully annotated based on the Genbank non-redundant (Nr) protein database. The annotated unigenes were then further classified using the Gene Ontology (GO), Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. CONCLUSIONS/SIGNIFICANCE: Our study provides a large quantity of novel genetic information for Cocos nucifera. This information will act as a valuable resource for further molecular genetic studies and breeding in coconut, as well as for isolation and characterization of functional genes involved in different biochemical pathways in this important tropical crop species.

  7. Comparison of the Mitochondrial Genomes and Steady State Transcriptomes of Two Strains of the Trypanosomatid Parasite, Leishmania tarentolae.

    Directory of Open Access Journals (Sweden)

    Larry Simpson

    Full Text Available U-insertion/deletion RNA editing is a post-transcriptional mitochondrial RNA modification phenomenon required for viability of trypanosomatid parasites. Small guide RNAs encoded mainly by the thousands of catenated minicircles contain the information for this editing. We analyzed by NGS technology the mitochondrial genomes and transcriptomes of two strains, the old lab UC strain and the recently isolated LEM125 strain. PacBio sequencing provided complete minicircle sequences which avoided the assembly problem of short reads caused by the conserved regions. Minicircles were identified by a characteristic size, the presence of three short conserved sequences, a region of inherently bent DNA and the presence of single gRNA genes at a fairly defined location. The LEM125 strain contained over 114 minicircles encoding different gRNAs and the UC strain only ~24 minicircles. Some LEM125 minicircles contained no identifiable gRNAs. Approximate copy numbers of the different minicircle classes in the network were determined by the number of PacBio CCS reads that assembled to each class. Mitochondrial RNA libraries from both strains were mapped against the minicircle and maxicircle sequences. Small RNA reads mapped to the putative gRNA genes but also to multiple regions outside the genes on both strands and large RNA reads mapped in many cases over almost the entire minicircle on both strands. These data suggest that minicircle transcription is complete and bidirectional, with 3' processing yielding the mature gRNAs. Steady state RNAs in varying abundances are derived from all maxicircle genes, including portions of the repetitive divergent region. The relative extents of editing in both strains correlated with the presence of a cascade of cognate gRNAs. These data should provide the foundation for a deeper understanding of this dynamic genetic system as well as the evolutionary variation of editing in different strains.

  8. Transcriptomic and genomic evidence for Streptococcus agalactiae adaptation to the bovine environment

    Science.gov (United States)

    2013-01-01

    Background: Streptococcus agalactiae is a major cause of bovine mastitis, which is the dominant health disorder affecting milk production within the dairy industry and is responsible for substantial financial losses to the industry worldwide. However, there is considerable evidence for host adaptation (ecotypes) within S. agalactiae, with both bovine and human sourced isolates showing a high degree of distinctiveness, suggesting differing ability to cause mastitis. Here, we (i) generate RNAseq data from three S. agalactiae isolates (two putative bovine adapted and one human) and (ii) compare publicly available whole genome shotgun sequence data from an additional 202 isolates, obtained from six host species, to elucidate possible genetic factors/adaptations likely important for S. agalactiae growth and survival in the bovine mammary gland. Results: Tests for differential expression showed distinct expression profiles for the three isolates when grown in bovine milk. A key finding for the two putatively bovine adapted isolates was the up regulation of a lactose metabolism operon (Lac.2) that was strongly correlated with the bovine environment (all 36 bovine sourced isolates on GenBank possessed the operon, in contrast to only 8/151 human sourced isolates). Multi locus sequence typing of all genome sequences and phylogenetic analysis using conserved operon genes from 44 S. agalactiae isolates and 16 additional Streptococcus species provided strong evidence for acquisition of the operon via multiple lateral gene transfer events, with all Streptococcus species known to be major causes of mastitis identified as possible donors. Furthermore, lactose fermentation tests were only positive for isolates possessing Lac.2. Combined, these findings suggest that lactose metabolism is likely an important adaptation to the bovine environment. Additional up regulation in the bovine adapted isolates included genes involved in copper homeostasis, metabolism of purine, pyrimidine

  9. Transcriptomic and genomic analysis of cellulose fermentation by Clostridium thermocellum ATCC 27405

    Energy Technology Data Exchange (ETDEWEB)

    Raman, Babu [ORNL; McKeown, Catherine K [ORNL; Rodriguez, Jr., Miguel [ORNL; Brown, Steven D [ORNL; Mielenz, Jonathan R [ORNL

    2011-01-01

    The ability of Clostridium thermocellum ATCC 27405 wild-type strain to hydrolyze cellulose and ferment the degradation products directly to ethanol and other metabolic byproducts makes it an attractive candidate for consolidated bioprocessing of cellulosic biomass to biofuels. In this study, whole-genome microarrays were used to investigate the expression of C. thermocellum mRNA during growth on crystalline cellulose in controlled replicate batch fermentations. A time-series analysis of gene expression revealed changes in transcript levels of ~40% of genes (~1300 out of 3198 ORFs encoded in the genome) during transition from early-exponential to late-stationary phase. K-means clustering of genes with statistically significant changes in transcript levels identified six distinct clusters of temporal expression. Broadly, genes involved in energy production, translation, glycolysis and amino acid, nucleotide and coenzyme metabolism displayed a decreasing trend in gene expression as cells entered stationary phase. In comparison, genes involved in cell structure and motility, chemotaxis, signal transduction and transcription showed an increasing trend in gene expression. Hierarchical clustering of cellulosome-related genes highlighted temporal changes in composition of this multi-enzyme complex during batch growth on crystalline cellulose, with increased expression of several genes encoding hydrolytic enzymes involved in degradation of non-cellulosic substrates in stationary phase. Overall, the results suggest that under low substrate availability, growth slows due to decreased metabolic potential and C. thermocellum alters its gene expression to (i) modulate the composition of cellulosomes that are released into the environment with an increased proportion of enzymes that can efficiently degrade plant polysaccharides other than cellulose, (ii) enhance signal transduction and chemotaxis mechanisms perhaps to sense the oligosaccharide hydrolysis products

  10. Low level genome mistranslations deregulate the transcriptome and translatome and generate proteotoxic stress in yeast

    Directory of Open Access Journals (Sweden)

    Paredes João A

    2012-06-01

    Full Text Available Background: Organisms use highly accurate molecular processes to transcribe their genes and a variety of mRNA quality control and ribosome proofreading mechanisms to maintain the fidelity of genetic information flow. Despite this, low level gene translational errors induced by mutations and environmental factors cause neurodegeneration and premature death in mice and mitochondrial disorders in humans. Paradoxically, such errors can generate advantageous phenotypic diversity in fungi and bacteria through poorly understood molecular processes. Results: In order to clarify the biological relevance of gene translational errors we have engineered codon misreading in yeast and used profiling of total and polysome-associated mRNAs, molecular and biochemical tools to characterize the recombinant cells. We demonstrate here that gene translational errors, which have negligible impact on yeast growth rate, down-regulate protein synthesis, activate the unfolded protein response and environmental stress response pathways, and down-regulate chaperones linked to ribosomes. Conclusions: We provide the first global view of transcriptional and post-transcriptional responses to global gene translational errors and we postulate that they cause gradual cell degeneration through synergistic effects of overloading protein quality control systems and deregulation of protein synthesis, but generate adaptive phenotypes in unicellular organisms through activation of stress cross-protection. We conclude that these genome wide gene translational infidelities can be degenerative or adaptive depending on cellular context and physiological condition.

  11. Genome-wide transcriptome analysis of gametophyte development in Physcomitrella patens

    Directory of Open Access Journals (Sweden)

    Xiao Lihong

    2011-12-01

    Full Text Available Background: Regulation of gene expression plays a pivotal role in controlling the development of multicellular plants. To explore the molecular mechanism of plant developmental-stage transition and cell-fate determination, a genome-wide analysis was undertaken of sequential developmental time-points and individual tissue types in the model moss Physcomitrella patens because of the short life cycle and relative structural simplicity of this plant. Results: Gene expression was analyzed by digital gene expression tag profiling of samples taken from P. patens protonema at 3, 14 and 24 days, and from leafy shoot tissues at 30 days, after protoplast isolation, and from 14-day-old caulonemal and chloronemal tissues. In total, 4333 genes were identified as differentially displayed. Among these genes, 4129 were developmental-stage specific and 423 were preferentially expressed in either chloronemal or caulonemal tissues. Most of the differentially displayed genes were assigned to functions in organic substance and energy metabolism or macromolecule biosynthetic and catabolic processes based on gene ontology descriptions. In addition, some regulatory genes identified as candidates might be involved in controlling the developmental-stage transition and cell differentiation, namely MYB-like, HB-8, AL3, zinc finger family proteins, bHLH superfamily, GATA superfamily, GATA and bZIP transcription factors, protein kinases, genes related to protein/amino acid methylation, and auxin, ethylene, and cytokinin signaling pathways. Conclusions: These genes that show highly dynamic changes in expression during development in P. patens are potential targets for further functional characterization and evolutionary developmental biology studies.

  12. Dissection of the inflammatory bowel disease transcriptome using genome-wide cDNA microarrays.

    Directory of Open Access Journals (Sweden)

    Christine M Costello

    2005-08-01

    Full Text Available BACKGROUND: The differential pathophysiologic mechanisms that trigger and maintain the two forms of inflammatory bowel disease (IBD), Crohn disease (CD) and ulcerative colitis (UC), are only partially understood. cDNA microarrays can be used to decipher gene regulation events at a genome-wide level and to identify novel unknown genes that might be involved in perpetuating inflammatory disease progression. METHODS AND FINDINGS: High-density cDNA microarrays representing 33,792 UniGene clusters were prepared. Biopsies were taken from the sigmoid colon of normal controls (n = 11), CD patients (n = 10) and UC patients (n = 10). 33P-radiolabeled cDNA from purified poly(A)+ RNA extracted from biopsies (unpooled) was hybridized to the arrays. We identified 500 and 272 transcripts differentially regulated in CD and UC, respectively. Interesting hits were independently verified by real-time PCR in a second sample of 100 individuals, and immunohistochemistry was used for exemplary localization. The main findings point to novel molecules important in abnormal immune regulation and the highly disturbed cell biology of colonic epithelial cells in IBD pathogenesis, e.g., CYLD (cylindromatosis, turban tumor syndrome) and CDH11 (cadherin 11, type 2). By the nature of the array setup, many of the genes identified were to our knowledge previously uncharacterized, and prediction of the putative function of a subsection of these genes indicates that some could be involved in early events in disease pathophysiology. CONCLUSION: A comprehensive set of candidate genes not previously associated with IBD was revealed, which underlines the polygenic and complex nature of the disease. It points out substantial differences in pathophysiology between CD and UC. The multiple unknown genes identified may stimulate new research in the fields of barrier mechanisms and cell signalling in the context of IBD, and ultimately new therapeutic approaches.

  13. Current state of genome-scale modeling in filamentous fungi

    DEFF Research Database (Denmark)

    Brandl, Julian; Andersen, Mikael Rørdam

    2015-01-01

    The group of filamentous fungi contains important species used in industrial biotechnology for acid, antibiotics and enzyme production. Their unique lifestyle turns these organisms into a valuable genetic reservoir of new natural products and biomass degrading enzymes that has not been used to full...... testing them in vivo. The increasing availability of high quality models and molecular biological tools for manipulating filamentous fungi renders the model-guided engineering of these fungal factories possible with comprehensive metabolic networks. A typical fungal model contains on average 1138 unique...... metabolic reactions and 1050 ORFs, making them a vast knowledge-base of fungal metabolism. In the present review we focus on the current state as well as potential future applications of genome-scale models in filamentous fungi....

  14. Next-generation genome-scale models for metabolic engineering

    DEFF Research Database (Denmark)

    King, Zachary A.; Lloyd, Colton J.; Feist, Adam M.

    2015-01-01

    Constraint-based reconstruction and analysis (COBRA) methods have become widely used tools for metabolic engineering in both academic and industrial laboratories. By employing a genome-scale in silico representation of the metabolic network of a host organism, COBRA methods can be used to predict...... optimal genetic modifications that improve the rate and yield of chemical production. A new generation of COBRA models and methods is now being developed, encompassing many biological processes and simulation strategies, and next-generation models enable new types of predictions. Here, three key...... examples of applying COBRA methods to strain optimization are presented and discussed. Then, an outlook is provided on the next generation of COBRA models and the new types of predictions they will enable for systems metabolic engineering....

  15. Next-generation genome-scale models for metabolic engineering.

    Science.gov (United States)

    King, Zachary A; Lloyd, Colton J; Feist, Adam M; Palsson, Bernhard O

    2015-12-01

    Constraint-based reconstruction and analysis (COBRA) methods have become widely used tools for metabolic engineering in both academic and industrial laboratories. By employing a genome-scale in silico representation of the metabolic network of a host organism, COBRA methods can be used to predict optimal genetic modifications that improve the rate and yield of chemical production. A new generation of COBRA models and methods is now being developed, encompassing many biological processes and simulation strategies, and next-generation models enable new types of predictions. Here, three key examples of applying COBRA methods to strain optimization are presented and discussed. Then, an outlook is provided on the next generation of COBRA models and the new types of predictions they will enable for systems metabolic engineering.

  16. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    Energy Technology Data Exchange (ETDEWEB)

    Arneodo, Alain, E-mail: alain.arneodo@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Vaillant, Cedric, E-mail: cedric.vaillant@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Audit, Benjamin, E-mail: benjamin.audit@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Argoul, Francoise, E-mail: francoise.argoul@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); D' Aubenton-Carafa, Yves, E-mail: daubenton@cgm.cnrs-gif.f [Centre de Genetique Moleculaire, CNRS, Allee de la Terrasse, 91198 Gif-sur-Yvette (France); Thermes, Claude, E-mail: claude.thermes@cgm.cnrs-gif.f [Centre de Genetique Moleculaire, CNRS, Allee de la Terrasse, 91198 Gif-sur-Yvette (France)

    2011-02-15

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  17. Transcriptional Slippage and RNA Editing Increase the Diversity of Transcripts in Chloroplasts: Insight from Deep Sequencing of Vigna radiata Genome and Transcriptome.

    Directory of Open Access Journals (Sweden)

    Ching-Ping Lin

    Full Text Available We performed deep sequencing of the nuclear and organellar genomes of three mungbean genotypes: Vigna radiata ssp. sublobata TC1966, V. radiata var. radiata NM92 and the recombinant inbred line RIL59 derived from a cross between TC1966 and NM92. Moreover, we performed deep sequencing of the RIL59 transcriptome to investigate transcript variability. The mungbean chloroplast genome has a quadripartite structure including a pair of inverted repeats separated by two single copy regions. A total of 213 simple sequence repeats were identified in the chloroplast genomes of NM92 and RIL59; 78 single nucleotide variants and nine indels were discovered in comparing the chloroplast genomes of TC1966 and NM92. Analysis of the mungbean chloroplast transcriptome revealed mRNAs that were affected by transcriptional slippage and RNA editing. Transcriptional slippage frequency was positively correlated with the length of simple sequence repeats of the mungbean chloroplast genome (R² = 0.9911). In total, 41 C-to-U editing sites were found in 23 chloroplast genes and in one intergenic spacer. No editing site that swapped U to C was found. A combination of bioinformatics and experimental methods revealed that the plastid-encoded RNA polymerase-transcribed genes psbF and ndhA are affected by transcriptional slippage in mungbean and in main lineages of land plants, including three dicots (Glycine max, Brassica rapa, and Nicotiana tabacum), two monocots (Oryza sativa and Zea mays), two gymnosperms (Pinus taeda and Ginkgo biloba) and one moss (Physcomitrella patens). Transcript analysis of the rps2 gene showed that transcriptional slippage could affect transcripts at single sequence repeat regions with poly-A runs. It showed that transcriptional slippage together with incomplete RNA editing may cause sequence diversity of transcripts in chloroplasts of land plants.

  18. Large-scale transcriptome data reveals transcriptional activity of fission yeast LTR retrotransposons

    DEFF Research Database (Denmark)

    Mourier, Tobias; Willerslev, Eske

    2010-01-01

    transcriptional activity from Long Terminal Repeat (LTR) retrotransposons. LTR retrotransposons are normally flanked by two LTR sequences. However, the majority of LTR sequences in S. pombe exist as solitary LTRs, i.e. as single terminal repeat sequences not flanking a retrotransposon. Transcriptional activity...... of transcriptional activity are observed from both strands of solitary LTR sequences. Transcriptome data collected during meiosis suggests that transcription of solitary LTRs is correlated with the transcription of nearby protein-coding genes. CONCLUSIONS: Presumably, the host organism negatively regulates...... proliferation of LTR retrotransposons. The finding of considerable transcriptional activity of retrotransposons suggests that part of this regulation is likely to take place at a posttranscriptional level. Alternatively, the transcriptional activity may signify a hitherto unrecognized activity level...

  19. Large-scale parallel genome assembler over cloud computing environment.

    Science.gov (United States)

    Das, Arghya Kusum; Koppa, Praveen Kumar; Goswami, Sayan; Platania, Richard; Park, Seung-Jong

    2017-06-01

    The size of high-throughput DNA sequencing data has already reached the terabyte scale. To manage this huge volume of data, many downstream sequencing applications started using locality-based computing over different cloud infrastructures to take advantage of elastic (pay as you go) resources at a lower cost. However, the locality-based programming model (e.g. MapReduce) is relatively new. Consequently, developing scalable data-intensive bioinformatics applications using this model, and understanding the hardware environment that these applications require for good performance, both require further research. In this paper, we present a de Bruijn graph oriented Parallel Giraph-based Genome Assembler (GiGA), as well as the hardware platform required for its optimal performance. GiGA uses the power of Hadoop (MapReduce) and Giraph (large-scale graph analysis) to achieve high scalability over hundreds of compute nodes by collocating the computation and data. GiGA achieves significantly higher scalability with competitive assembly quality compared to contemporary parallel assemblers (e.g. ABySS and Contrail) over a traditional HPC cluster. Moreover, we show that the performance of GiGA is significantly improved by using an SSD-based private cloud infrastructure over a traditional HPC cluster. We observe that the performance of GiGA on 256 cores of this SSD-based cloud infrastructure closely matches that of 512 cores of a traditional HPC cluster.
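
    As a rough illustration of the de Bruijn graph that assemblers of this kind operate on, the sketch below builds one from a handful of reads on a single machine. It is not GiGA's implementation: the function name build_de_bruijn and the plain dictionary representation are assumptions made for this example, and no Hadoop or Giraph distribution is involved.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph from reads (hypothetical helper, not GiGA code).

    Nodes are (k-1)-mers; every k-mer found in a read adds one directed edge
    from its (k-1)-base prefix to its (k-1)-base suffix.
    """
    graph = defaultdict(list)
    for read in reads:
        # Slide a window of width k over the read; reads shorter than k add nothing.
        for start in range(len(read) - k + 1):
            kmer = read[start:start + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    # Two overlapping toy reads; real input would be millions of sequencer reads.
    for node, successors in build_de_bruijn(["ACGTAC", "GTACGG"], k=4).items():
        print(node, "->", successors)
```

    Repeated k-mers are kept as parallel edges here, so edge multiplicity reflects coverage; a distributed assembler would partition these nodes across workers rather than hold them in one dictionary.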

  20. Genome-wide transcriptomic analysis of the response to nitrogen limitation in Streptomyces coelicolor A3(2)

    Directory of Open Access Journals (Sweden)

    Efthimiou Georgios

    2011-03-01

    Full Text Available Abstract. Background: The present study represents a genome-wide transcriptomic analysis of the response of the model streptomycete Streptomyces coelicolor A3(2) M145 to fermentor culture in Modified Evans Media limited, respectively, for nitrogen, phosphate and carbon, undertaken as part of the ActinoGEN consortium to provide a publicly available reference microarray dataset. Findings: A microarray dataset using samples from two replicate cultures for each nutrient limitation was generated. In this report our analysis has focused on the genes which are significantly differentially expressed, as determined by Rank Products Analysis, between samples from matched time points correlated by growth phase for the three pairs of differently limited culture datasets. With a few exceptions, genes are only significantly differentially expressed between the N6/N7 time points and their corresponding time points in the C- and P-limited cultures, with the vast majority of the differentially expressed genes being more highly expressed in the N-limited cultures. Our analysis of these genes indicated that expression of several members of the GlnR regulon is induced upon nitrogen limitation, as assayed for by [NH4+] measurements, and we are able to identify several additional genes not present in the GlnR regulon whose expression is induced in response to nitrogen limitation. We also note that SCO3327, which encodes a small protein (32 amino acid residues) unusually rich in the basic amino acids lysine (31.25%) and arginine (25%), is significantly differentially expressed in the nitrogen-limited cultures. Additionally, we investigate the expression of known members of the GlnR regulon and the relationship between gene organization and expression for the SCO2486-SCO2487 and SCO5583-SCO5585 operons. Conclusions: We provide a list of genes that are differentially expressed in low nitrogen culture conditions, including a putative nitrogen storage protein encoded by SCO3327

  1. Genomic, Transcriptomic and Metabolomic Studies of Two Well-Characterized, Laboratory-Derived Vancomycin-Intermediate Staphylococcus aureus Strains Derived from the Same Parent Strain

    Directory of Open Access Journals (Sweden)

    Dipti S. Hattangady

    2015-02-01

    Full Text Available Complete genome comparisons, transcriptomic and metabolomic studies were performed on two laboratory-selected, well-characterized vancomycin-intermediate Staphylococcus aureus (VISA) derived from the same parent MRSA that have changes in cell wall composition and decreased autolysis. A variety of mutations were found in the VISA, with more in strain 13136p−m+V20 (vancomycin MIC = 16 µg/mL) than strain 13136p−m+V5 (MIC = 8 µg/mL). Most of the mutations have not previously been associated with the VISA phenotype; some were associated with cell wall metabolism and many with stress responses, notably relating to DNA damage. The genomes and transcriptomes of the two VISA support the importance of gene expression regulation to the VISA phenotype. Similarities in overall transcriptomic and metabolomic data indicated that the VISA physiologic state includes elements of the stringent response, such as downregulation of protein and nucleotide synthesis, the pentose phosphate pathway and nutrient transport systems. Gene expression for secreted virulence determinants was generally downregulated, but was more variable for surface-associated virulence determinants, although capsule formation was clearly inhibited. The importance of activated stress response elements could be seen across all three analyses, as in the accumulation of osmoprotectant metabolites such as proline and glutamate. Concentrations of potential cell wall precursor amino acids and glucosamine were increased in the VISA strains. Polyamines were decreased in the VISA, which may facilitate the accrual of mutations. Overall, the studies confirm the wide variability in mutations and gene expression patterns that can lead to the VISA phenotype.

  2. Optimizing Hybrid de Novo Transcriptome Assembly and Extending Genomic Resources for Giant Freshwater Prawns (Macrobrachium rosenbergii): The Identification of Genes and Markers Associated with Reproduction.

    Science.gov (United States)

    Jung, Hyungtaek; Yoon, Byung-Ha; Kim, Woo-Jin; Kim, Dong-Wook; Hurwood, David A; Lyons, Russell E; Salin, Krishna R; Kim, Heui-Soo; Baek, Ilseon; Chand, Vincent; Mather, Peter B

    2016-05-07

    The giant freshwater prawn, Macrobrachium rosenbergii, a sexually dimorphic decapod crustacean is currently the world's most economically important cultured freshwater crustacean species. Despite its economic importance, there is currently a lack of genomic resources available for this species, and this has limited exploration of the molecular mechanisms that control the M. rosenbergii sex-differentiation system more widely in freshwater prawns. Here, we present the first hybrid transcriptome from M. rosenbergii applying RNA-Seq technologies directed at identifying genes that have potential functional roles in reproductive-related traits. A total of 13,733,210 combined raw reads (1720 Mbp) were obtained from Ion-Torrent PGM and 454 FLX. Bioinformatic analyses based on three state-of-the-art assemblers, the CLC Genomic Workbench, Trans-ABySS, and Trinity, that use single and multiple k-mer methods respectively, were used to analyse the data. The influence of multiple k-mers on assembly performance was assessed to gain insight into transcriptome assembly from short reads. After optimisation, de novo assembly resulted in 44,407 contigs with a mean length of 437 bp, and the assembled transcripts were further functionally annotated to detect single nucleotide polymorphisms and simple sequence repeat motifs. Gene expression analysis was also used to compare expression patterns from ovary and testis tissue libraries to identify genes with potential roles in reproduction and sex differentiation. The large transcript set assembled here represents the most comprehensive set of transcriptomic resources ever developed for reproduction traits in M. rosenbergii, and the large number of genetic markers predicted should constitute an invaluable resource for future genetic research studies on M. rosenbergii and can be applied more widely on other freshwater prawn species in the genus Macrobrachium.

  3. Optimizing Hybrid de Novo Transcriptome Assembly and Extending Genomic Resources for Giant Freshwater Prawns (Macrobrachium rosenbergii): The Identification of Genes and Markers Associated with Reproduction

    Directory of Open Access Journals (Sweden)

    Hyungtaek Jung

    2016-05-01

    Full Text Available The giant freshwater prawn, Macrobrachium rosenbergii, a sexually dimorphic decapod crustacean is currently the world’s most economically important cultured freshwater crustacean species. Despite its economic importance, there is currently a lack of genomic resources available for this species, and this has limited exploration of the molecular mechanisms that control the M. rosenbergii sex-differentiation system more widely in freshwater prawns. Here, we present the first hybrid transcriptome from M. rosenbergii applying RNA-Seq technologies directed at identifying genes that have potential functional roles in reproductive-related traits. A total of 13,733,210 combined raw reads (1720 Mbp) were obtained from Ion-Torrent PGM and 454 FLX. Bioinformatic analyses based on three state-of-the-art assemblers, the CLC Genomic Workbench, Trans-ABySS, and Trinity, that use single and multiple k-mer methods respectively, were used to analyse the data. The influence of multiple k-mers on assembly performance was assessed to gain insight into transcriptome assembly from short reads. After optimisation, de novo assembly resulted in 44,407 contigs with a mean length of 437 bp, and the assembled transcripts were further functionally annotated to detect single nucleotide polymorphisms and simple sequence repeat motifs. Gene expression analysis was also used to compare expression patterns from ovary and testis tissue libraries to identify genes with potential roles in reproduction and sex differentiation. The large transcript set assembled here represents the most comprehensive set of transcriptomic resources ever developed for reproduction traits in M. rosenbergii, and the large number of genetic markers predicted should constitute an invaluable resource for future genetic research studies on M. rosenbergii and can be applied more widely on other freshwater prawn species in the genus Macrobrachium.

  4. Genome-wide transcriptome analyses of silicon metabolism in Phaeodactylum tricornutum reveal the multilevel regulation of silicic acid transporters.

    Directory of Open Access Journals (Sweden)

    Guillaume Sapriel

    Full Text Available BACKGROUND: Diatoms are largely responsible for production of biogenic silica in the global ocean. However, in surface seawater, Si(OH)4 can be a major limiting factor for diatom productivity. Analyzing at the global scale the gene networks involved in Si transport and metabolism is critical in order to elucidate Si biomineralization, and to understand diatoms' contribution to biogeochemical cycles. METHODOLOGY/PRINCIPAL FINDINGS: Using whole genome expression analyses we evaluated the transcriptional response to Si availability for the model species Phaeodactylum tricornutum. Among the differentially regulated genes we found genes involved in glutamine-nitrogen pathways, encoding putative extracellular matrix components, or involved in iron regulation. Some of these compounds may be good candidates for intracellular intermediates involved in silicic acid storage and/or intracellular transport, which are very important processes that remain mysterious in diatoms. Expression analyses and localization studies gave the first picture of the spatial distribution of a silicic acid transporter in a diatom model species, and support the existence of transcriptional and post-transcriptional regulations. CONCLUSIONS/SIGNIFICANCE: Our global analyses revealed that about one fourth of the differentially expressed genes are organized in clusters, underlining a possible evolution of the P. tricornutum genome, and perhaps those of other pennate diatoms, toward a better optimization of its response to variable environmental stimuli. High fitness and adaptation of diatoms to various Si levels in marine environments might arise in part by global regulations from gene (expression level) to genomic (organization in clusters, dosage compensation by gene duplication), and by post-transcriptional regulation and spatial distribution of SIT proteins.

  5. Advances in Swine Transcriptomics

    Directory of Open Access Journals (Sweden)

    Christopher K. Tuggle, Yanfang Wang, Oliver Couture

    2007-01-01

    Full Text Available The past five years have seen a tremendous rise in porcine transcriptomic data. Available porcine Expressed Sequence Tags (ESTs) have expanded greatly, with over 623,000 ESTs deposited in Genbank. ESTs have been used to expand the pig-human comparative maps, but such data has also been used in many ways to understand pig gene expression. Several methods have been used to identify genes differentially expressed (DE) in specific tissues or cell types under different treatments. These include open screening methods such as suppression subtractive hybridization, differential display, serial analysis of gene expression, and EST sequence frequency, as well as closed methods that measure expression of a defined set of sequences such as hybridization to membrane arrays and microarrays. The use of microarrays to begin large-scale transcriptome analysis has been recently reported, using either specialized or broad-coverage arrays. This review covers published results using the above techniques in the pig, as well as unpublished data provided by the research community, and reports on unpublished Affymetrix data from our group. Published and unpublished bioinformatics efforts are discussed, including recent work by our group to integrate two broad-coverage microarray platforms. We conclude by predicting experiments that will become possible with new anticipated tools and data, including the porcine genome sequence. We emphasize that the need for bioinformatics infrastructure to efficiently store and analyze the expanding amounts of gene expression data is critical, and that this deficit has emerged as a limiting factor for acceleration of genomic understanding in the pig.

  6. CFMDS: CUDA-based fast multidimensional scaling for genome-scale data.

    Science.gov (United States)

    Park, Sungin; Shin, Soo-Yong; Hwang, Kyu-Baek

    2012-01-01

    Multidimensional scaling (MDS) is a widely used approach to dimensionality reduction. It has been applied to feature selection and visualization in various areas. Among diverse MDS methods, the classical MDS is a simple and theoretically sound solution for projecting data objects onto a low dimensional space while preserving the original distances among them as much as possible. However, it is not trivial to apply it to genome-scale data (e.g., microarray gene expression profiles) on regular desktop computers, because of its high computational complexity. We implemented a highly-efficient software application, called CFMDS (CUDA-based Fast MultiDimensional Scaling), which produces an approximate solution of the classical MDS based on CUDA (compute unified device architecture) and the divide-and-conquer principle. CUDA is a parallel computing architecture exploiting the power of the GPU (graphics processing unit). The principle of divide-and-conquer was adopted for circumventing the small memory problem of usual graphics cards. Our application software has been tested on various benchmark datasets including microarrays and compared with the classical MDS algorithms implemented using C# and MATLAB. In our experiments, CFMDS was more than a hundred times faster for large data than such general solutions. Regarding the quality of dimensionality reduction, our approximate solutions were as good as those from the general solutions, as the Pearson's correlation coefficients between them were larger than 0.9. CFMDS is an expeditious solution for the data dimensionality reduction problem. It is especially useful for efficient processing of genome-scale data consisting of several thousands of objects in several minutes.
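
    For readers unfamiliar with the classical MDS that CFMDS approximates, a minimal CPU-only sketch follows. It is not the CFMDS code and uses neither CUDA nor divide-and-conquer: the function names are invented for this example, and plain power iteration with deflation stands in for a proper eigensolver, which is adequate only for small Euclidean distance matrices.

```python
import math


def double_center(sq):
    """Return B = -1/2 * J * sq * J, the centred Gram matrix of squared distances."""
    n = len(sq)
    row = [sum(r) / n for r in sq]
    col = [sum(sq[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[-0.5 * (sq[i][j] - row[i] - col[j] + grand) for j in range(n)]
            for i in range(n)]


def top_eigenpair(mat, iterations=500):
    """Dominant eigenpair by power iteration (assumes a symmetric PSD matrix)."""
    n = len(mat)
    vec = [float(i + 1) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        nxt = [sum(mat[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, [0.0] * n
        vec = [x / norm for x in nxt]
    mv = [sum(mat[i][j] * vec[j] for j in range(n)) for i in range(n)]
    return sum(vec[i] * mv[i] for i in range(n)), vec  # Rayleigh quotient


def classical_mds(dist, dims=2):
    """Embed points in `dims` dimensions from a symmetric distance matrix."""
    n = len(dist)
    gram = double_center([[d * d for d in row] for row in dist])
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        value, vec = top_eigenpair(gram)
        scale = math.sqrt(max(value, 0.0))
        for i in range(n):
            coords[i][k] = scale * vec[i]
        # Deflate: remove the component just extracted before the next pass.
        for i in range(n):
            for j in range(n):
                gram[i][j] -= value * vec[i] * vec[j]
    return coords


if __name__ == "__main__":
    # Pairwise distances of a 3-4-5 right triangle.
    for point in classical_mds([[0, 3, 4], [3, 0, 5], [4, 5, 0]], dims=2):
        print(point)
```

    Both building and decomposing the n-by-n Gram matrix grow quickly with the number of objects, which is the cost the abstract says CFMDS addresses with GPU parallelism and an approximate, partitioned solution.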

  7. Genome-scale mRNA and small RNA transcriptomic insights into initiation of citrus apomixis

    Science.gov (United States)

    Long, Jian-Mei; Liu, Zheng; Wu, Xiao-Meng; Fang, Yan-Ni; Jia, Hui-Hui; Xie, Zong-Zhou; Deng, Xiu-Xin; Guo, Wen-Wu

    2016-01-01

    Nucellar embryony (NE) is an adventitious form of apomixis common in citrus, wherein asexual embryos initiate directly from nucellar cells surrounding the embryo sac. NE enables the fixation of desirable agronomic traits and the production of clonal offspring of virus-free rootstock, but impedes progress in hybrid breeding. In spite of the great importance of NE in citrus breeding and commercial production, little is understood about the underlying molecular mechanisms. In this study, the stages of nucellar embryo initiation (NEI) were determined for two polyembryonic citrus cultivars via histological observation. To explore the genes and regulatory pathways involved in NEI, we performed mRNA-seq and sRNA-seq analyses of ovules immediately prior to and at stages during NEI in the two pairs of cultivars. A total of 305 differentially expressed genes (DEGs) were identified between the poly- and monoembryonic ovules. Gene ontology (GO) analysis revealed that several processes are significantly enriched based on DEGs. In particular, response to stress, and especially response to oxidative stress, was over-represented in polyembryonic ovules. Nearly 150 miRNAs, comprising ~90 conserved and ~60 novel miRNAs, were identified in the ovules of either cultivar pair. Only two differentially expressed miRNAs (DEMs) were identified, of which the novel miRN23-5p was repressed whereas the targets accumulated in the polyembryonic ovules. This integrated study on the transcriptional and post-transcriptional regulatory profiles between poly- and monoembryonic citrus ovules provides new insights into the mechanism of NE, which should contribute to revealing the regulatory mechanisms of plant apomixis. PMID:27619233

  8. The genome and transcriptome of Trichormus sp. NMC-1: insights into adaptation to extreme environments on the Qinghai-Tibet Plateau

    Science.gov (United States)

    Qiao, Qin; Huang, Yanyan; Qi, Ji; Qu, Mingzhi; Jiang, Chen; Lin, Pengcheng; Li, Renhui; Song, Lirong; Yonezawa, Takahiro; Hasegawa, Masami; Crabbe, M. James C.; Chen, Fan; Zhang, Ticao; Zhong, Yang

    2016-01-01

    The Qinghai-Tibet Plateau (QTP) has the highest biodiversity for an extreme environment worldwide, and provides an ideal natural laboratory to study adaptive evolution. In this study, we generated a draft genome sequence of cyanobacteria Trichormus sp. NMC-1 in the QTP and performed whole transcriptome sequencing under low temperature to investigate the genetic mechanism by which T. sp. NMC-1 adapted to the specific environment. Its genome sequence was 5.9 Mb with a G+C content of 39.2% and encompassed a total of 5362 CDS. A phylogenomic tree indicated that this strain belongs to the Trichormus and Anabaena cluster. Genome comparison between T. sp. NMC-1 and six relatives showed that functionally unknown genes occupied a much higher proportion (28.12%) of the T. sp. NMC-1 genome. In addition, functions of specific, significant positively selected, expanded orthogroups, and differentially expressed genes involved in signal transduction, cell wall/membrane biogenesis, secondary metabolite biosynthesis, and energy production and conversion were analyzed to elucidate specific adaptation traits. Further analyses showed that the CheY-like genes, extracellular polysaccharide and mycosporine-like amino acids might play major roles in adaptation to harsh environments. Our findings indicate that sophisticated genetic mechanisms are involved in cyanobacterial adaptation to the extreme environment of the QTP. PMID:27381465

  9. Genome-scale genetic engineering in Escherichia coli.

    Science.gov (United States)

    Jeong, Jaehwan; Cho, Namjin; Jung, Daehee; Bang, Duhee

    2013-11-01

    Genome engineering has been developed to create useful strains for biological studies and industrial uses. However, a continuous challenge remained in the field: technical limitations in high-throughput screening and precise manipulation of strains. Today, technical improvements have made genome engineering more rapid and efficient. This review introduces recent advances in genome engineering technologies applied to Escherichia coli as well as multiplex automated genome engineering (MAGE), a recent technique proposed as a powerful toolkit due to its straightforward process, rapid experimental procedures, and highly efficient properties.

  10. Progress in prokaryotic transcriptomics.

    Science.gov (United States)

    Filiatrault, Melanie J

    2011-10-01

    Genome-wide expression studies transformed the field of transcriptomics and made it feasible to study global gene expression in extraordinary detail. These new methods have revealed an enhanced view of the transcriptional landscape and have yielded many biological insights. It is increasingly clear that the prokaryotic transcriptome is much more complex than once thought. Recent advances in microbial transcriptome analyses are highlighted in this review. Areas of progress include the development of optimized techniques that minimize the abundance of ribosomal RNAs in RNA samples as well as the development of novel methods to create transcriptome libraries. Advances such as these have led to a new emphasis in areas such as metatranscriptomics and single cell gene expression studies. Published by Elsevier Ltd.

  11. Transcriptomics resources of human tissues and organs

    DEFF Research Database (Denmark)

    Uhlén, Mathias; Hallström, Björn M.; Lindskog, Cecilia

    2016-01-01

    a framework for defining the molecular constituents of the human body as well as for generating comprehensive lists of proteins expressed across tissues or in a tissue-restricted manner. Here, we review publicly available human transcriptome resources and discuss body-wide data from independent genome......Quantifying the differential expression of genes in various human organs, tissues, and cell types is vital to understand human physiology and disease. Recently, several large-scale transcriptomics studies have analyzed the expression of protein-coding genes across tissues. These datasets provide...

  12. Transcriptome profiling and comparative analysis of Panax ginseng adventitious roots

    Directory of Open Access Journals (Sweden)

    Murukarthick Jayakodi

    2014-10-01

    Conclusion: This study will provide a comprehensive insight into the transcriptome of ginseng adventitious roots, and a way for successful transcriptome analysis and profiling of resource plants with less genomic information. The transcriptome profiling data generated in this study are available in our newly created adventitious root transcriptome database (http://im-crop.snu.ac.kr/transdb/index.php) for public use.

  13. Combining different mRNA capture methods to analyze the transcriptome: analysis of the Xenopus laevis transcriptome.

    Directory of Open Access Journals (Sweden)

    Michael D Blower

    Full Text Available mRNA sequencing (mRNA-seq) is a commonly used technique to survey gene expression from organisms with fully sequenced genomes. Successful mRNA-seq requires purification of mRNA away from the much more abundant ribosomal RNA, which is typically accomplished by oligo-dT selection. However, mRNAs with short poly-A tails are captured poorly by oligo-dT based methods. We demonstrate that combining mRNA capture via oligo-dT with mRNA capture by the 5' 7-methyl guanosine cap provides a more complete view of the transcriptome and can be used to assay changes in mRNA poly-A tail length on a genome-wide scale. We also show that using mRNA-seq reads from both capture methods as input for de novo assemblers provides a more complete reconstruction of the transcriptome than either method used alone. We apply these methods of mRNA capture and de novo assembly to the transcriptome of Xenopus laevis, a well-studied frog that currently lacks a finished sequenced genome, to discover transcript sequences for thousands of mRNAs that are currently absent from public databases. The methods we describe here will be broadly applicable to many organisms and will provide insight into the transcriptomes of organisms with sequenced and unsequenced genomes.

  14. Using a genome-scale metabolic network model to elucidate the mechanism of chloroquine action in Plasmodium falciparum

    Directory of Open Access Journals (Sweden)

    Shivendra G. Tewari

    2017-08-01

    Full Text Available Chloroquine, long the default first-line treatment against malaria, is now abandoned in large parts of the world because of widespread drug-resistance in Plasmodium falciparum. In spite of its importance as a cost-effective and efficient drug, a coherent understanding of the cellular mechanisms affected by chloroquine and how they influence the fitness and survival of the parasite remains elusive. Here, we used a systems biology approach to integrate genome-scale transcriptomics to map out the effects of chloroquine, identify targeted metabolic pathways, and translate these findings into mechanistic insights. Specifically, we first developed a method that integrates transcriptomic and metabolomic data, which we independently validated against a recently published set of such data for Krebs-cycle mutants of P. falciparum. We then used the method to calculate the effect of chloroquine treatment on the metabolic flux profiles of P. falciparum during the intraerythrocytic developmental cycle. The model predicted dose-dependent inhibition of DNA replication, in agreement with earlier experimental results for both drug-sensitive and drug-resistant P. falciparum strains. Our simulations also corroborated experimental findings that suggest differences in chloroquine sensitivity between ring- and schizont-stage P. falciparum. Our analysis also suggests that metabolic fluxes that govern reduced thioredoxin and phosphoenolpyruvate synthesis are significantly decreased and are pivotal to chloroquine-based inhibition of P. falciparum DNA replication. The consequences of impaired phosphoenolpyruvate synthesis and redox metabolism are reduced carbon fixation and increased oxidative stress, respectively, both of which eventually facilitate killing of the parasite. Our analysis suggests that a combination of chloroquine (or an analogue) and another drug, which inhibits carbon fixation and/or increases oxidative stress, should increase the clearance of P

  15. An integrated transcriptomics-guided genome-wide promoter analysis and next-generation proteomics approach to mine factor(s) regulating cellular differentiation

    Science.gov (United States)

    Mandal, Kamal; Bader, Samuel L.; Kumar, Pankaj; Malakar, Dipankar; Campbell, David S.; Pradhan, Bhola Shankar; Sarkar, Rajesh K.; Wadhwa, Neerja; Sensharma, Souvik; Jain, Vaibhav; Moritz, Robert L.

    2017-01-01

    Abstract Differential next-generation-omics approaches aid in the visualization of biological processes and pave the way for divulging important events and/or interactions leading to a functional output at cellular or systems level. To this end, we undertook an integrated Nextgen transcriptomics and proteomics approach to divulge differential gene expression of infant and pubertal rat Sertoli cells (Sc). Unlike pubertal Sc, infant Sc are immature and fail to support spermatogenesis. We found exclusive association of 14 and 19 transcription factor binding sites to infantile and pubertal states of Sc, respectively, using differential transcriptomics-guided genome-wide computational analysis of relevant promoters employing 220 Positional Weight Matrices from the TRANSFAC database. Proteomic SWATH-MS analysis provided extensive quantification of nuclear and cytoplasmic protein fractions revealing 1,670 proteins differentially located between the nucleus and cytoplasm of infant Sc and 890 proteins differentially located within those of pubertal Sc. Based on our multi-omics approach, the transcription factor YY1 was identified as one of the lead candidates regulating differentiation of Sc. YY1 was found to have abundant binding sites on promoters of genes upregulated during puberty. To determine its significance, we generated transgenic rats with Sc-specific knockdown of YY1 that led to compromised spermatogenesis. PMID:28065881

  16. Large-scale genomic 2D visualization reveals extensive CG-AT skew correlation in bird genomes

    Directory of Open Access Journals (Sweden)

    Deng Xuemei

    2007-11-01

    Full Text Available Abstract Background Bird genomes have very different compositional structure compared with other warm-blooded animals. The variation in the base skew rules in the vertebrate genomes remains puzzling, but it must relate somehow to large-scale genome evolution. Current research is inclined to relate base skew with mutations and their fixation. Here we wish to explore base skew correlations in bird genomes, to develop methods for displaying and quantifying such correlations at different scales, and to discuss possible explanations for the peculiarities of the bird genomes in skew correlation. Results We have developed a method called Base Skew Double Triangle (BSDT) for exhibiting the genome-scale change of AT/CG skew as a two-dimensional square picture, showing base skews at many scales simultaneously in a single image. By this method we found that most chicken chromosomes have high AT/CG skew correlation (symmetry in 2D picture), except for some microchromosomes. No other organisms studied (18 species) show such high skew correlations. This visualized high correlation was validated by three kinds of quantitative calculations with overlapping and non-overlapping windows, all indicating that chicken and birds in general have a special genome structure. Similar features were also found in some of the mammal genomes, but clearly much weaker than in chickens. We presume that the skew correlation feature evolved near the time that birds separated from other vertebrate lineages. When we eliminated the repeat sequences from the genomes, the AT and CG skew correlation increased for some mammal genomes, but was still clearly lower than in chickens. Conclusion Our results suggest that BSDT is an expressive visualization method for AT and CG skew and enabled the discovery of the very high skew correlation in bird genomes; this peculiarity is worth further study. Computational analysis indicated that this correlation might be a compositional characteristic
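    The windowed skew correlation mentioned in this abstract can be illustrated with a minimal sketch. This is not the BSDT method itself, which the abstract does not specify in detail. The function names, the sign convention (A-T)/(A+T) and (C-G)/(C+G), and the use of non-overlapping windows with a Pearson correlation are all assumptions made for illustration.

```python
from math import sqrt

def skew(seq, x, y):
    """Return (x - y) / (x + y) for base counts in seq; 0.0 if neither base occurs."""
    nx, ny = seq.count(x), seq.count(y)
    return (nx - ny) / (nx + ny) if nx + ny else 0.0

def window_skews(seq, size):
    """AT and CG skew for consecutive non-overlapping windows of length `size`.

    The sign convention is an assumption; some tools report (G - C) / (G + C).
    A trailing partial window is ignored.
    """
    seq = seq.upper()
    windows = [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]
    at = [skew(w, "A", "T") for w in windows]
    cg = [skew(w, "C", "G") for w in windows]
    return at, cg

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; 0.0 if either has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx and vy else 0.0
```

    Calling `pearson(*window_skews(chromosome, 10000))` on a chromosome string would give one correlation value at one window size; repeating this over several window sizes would approximate the multi-scale view the abstract describes.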

  17. Whole genome and transcriptome analyses of environmental antibiotic sensitive and multi-resistant Pseudomonas aeruginosa isolates exposed to waste water and tap water.

    Science.gov (United States)

    Schwartz, Thomas; Armant, Olivier; Bretschneider, Nancy; Hahn, Alexander; Kirchen, Silke; Seifert, Martin; Dötsch, Andreas

    2015-01-01

    The fitness of sensitive and resistant Pseudomonas aeruginosa in different aquatic environments depends on genetic capacities and transcriptional regulation. Therefore, an antibiotic-sensitive isolate PA30 and a multi-resistant isolate PA49 originating from waste waters were compared via whole genome and transcriptome Illumina sequencing after exposure to municipal waste water and tap water. A number of different genomic islands (e.g. PAGIs, PAPIs) were identified in the two environmental isolates beside the highly conserved core genome. Exposure to tap water and waste water exhibited similar transcriptional impacts on several gene clusters (antibiotic and metal resistance, genetic mobile elements, efflux pumps) in both environmental P. aeruginosa isolates. The MexCD-OprJ efflux pump was overexpressed in PA49 in response to waste water. The expression of resistance genes, genetic mobile elements in PA49 was independent from the water matrix. Consistently, the antibiotic sensitive strain PA30 did not show any difference in expression of the intrinsic resistance determinants and genetic mobile elements. Thus, the exposure of both isolates to polluted waste water and oligotrophic tap water resulted in similar expression profiles of mentioned genes. However, changes in environmental milieus resulted in rather unspecific transcriptional responses than selected and stimuli-specific gene regulation. © 2014 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.

  18. Analysis of the genome and transcriptome of Cryptococcus neoformans var. grubii reveals complex RNA expression and microevolution leading to virulence attenuation.

    Directory of Open Access Journals (Sweden)

    Guilhem Janbon

    2014-04-01

    Full Text Available Cryptococcus neoformans is a pathogenic basidiomycetous yeast responsible for more than 600,000 deaths each year. It occurs as two serotypes (A and D) representing two varieties (i.e. grubii and neoformans, respectively). Here, we sequenced the genome and performed an RNA-Seq-based analysis of the C. neoformans var. grubii transcriptome structure. We determined the chromosomal locations, analyzed the sequence/structural features of the centromeres, and identified origins of replication. The genome was annotated based on automated and manual curation. More than 40,000 introns populating more than 99% of the expressed genes were identified. Although most of these introns are located in the coding DNA sequences (CDS), over 2,000 introns in the untranslated regions (UTRs) were also identified. Poly(A)-containing reads were employed to locate the polyadenylation sites of more than 80% of the genes. Examination of the sequences around these sites revealed a new poly(A)-site-associated motif (AUGHAH). In addition, 1,197 miscRNAs were identified. These miscRNAs can be spliced and/or polyadenylated, but do not appear to have obvious coding capacities. Finally, this genome sequence enabled a comparative analysis of strain H99 variants obtained after laboratory passage. The spectrum of mutations identified provides insights into the genetics underlying the micro-evolution of a laboratory strain, and identifies mutations involved in stress responses, mating efficiency, and virulence.

  19. Sex and parasites: genomic and transcriptomic analysis of Microbotryum lychnidis-dioicae, the biotrophic and plant-castrating anther smut fungus.

    Science.gov (United States)

    Perlin, Michael H; Amselem, Joelle; Fontanillas, Eric; Toh, Su San; Chen, Zehua; Goldberg, Jonathan; Duplessis, Sebastien; Henrissat, Bernard; Young, Sarah; Zeng, Qiandong; Aguileta, Gabriela; Petit, Elsa; Badouin, Helene; Andrews, Jared; Razeeq, Dominique; Gabaldón, Toni; Quesneville, Hadi; Giraud, Tatiana; Hood, Michael E; Schultz, David J; Cuomo, Christina A

    2015-06-16

    The genus Microbotryum includes plant pathogenic fungi afflicting a wide variety of hosts with anther smut disease. Microbotryum lychnidis-dioicae infects Silene latifolia and replaces host pollen with fungal spores, exhibiting biotrophy and necrosis associated with altering plant development. We determined the haploid genome sequence for M. lychnidis-dioicae and analyzed whole transcriptome data from plant infections and other stages of the fungal lifecycle, revealing the inventory and expression level of genes that facilitate pathogenic growth. Compared to related fungi, an expanded number of major facilitator superfamily transporters and secretory lipases were detected; lipase gene expression was found to be altered by exposure to lipid compounds, which signaled a switch to dikaryotic, pathogenic growth. In addition, while enzymes to digest cellulose, xylan, xyloglucan, and highly substituted forms of pectin were absent, along with depletion of peroxidases and superoxide dismutases that protect the fungus from oxidative stress, the repertoire of glycosyltransferases and of enzymes that could manipulate host development has expanded. A total of 14% of the genome was categorized as repetitive sequences. Transposable elements have accumulated in mating-type chromosomal regions and were also associated across the genome with gene clusters of small secreted proteins, which may mediate host interactions. The unique absence of enzyme classes for plant cell wall degradation and maintenance of enzymes that break down components of pollen tubes and flowers provides a striking example of biotrophic host adaptation.

  20. Transcriptome Analysis of Two Vicia sativa Subspecies: Mining Molecular Markers to Enhance Genomic Resources for Vetch Improvement

    Directory of Open Access Journals (Sweden)

    Tae-Sung Kim

    2015-11-01

    Full Text Available The vetch (Vicia sativa) is one of the most important annual forage legumes globally due to its multiple uses and high nutritional content. Despite these agronomical benefits, many drawbacks, including cyano-alanine toxin, have reduced the agronomic value of vetch varieties. Here, we used 454 technology to sequence the two V. sativa subspecies (ssp. sativa and ssp. nigra) to enrich functional information and genetic marker resources for the vetch research community. A total of 86,532 and 47,103 reads produced 35,202 and 18,808 unigenes with average lengths of 735 and 601 bp for V. sativa sativa and V. sativa nigra, respectively. Gene Ontology annotations and the cluster of orthologous gene classes were used to annotate the function of the Vicia transcriptomes. The Vicia transcriptome sequences were then mined for simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers. About 13% and 3% of the Vicia unigenes contained the putative SSR and SNP sequences, respectively. Among those SSRs, 100 were chosen for the validation and the polymorphism test using the Vicia germplasm set. Thus, our approach takes advantage of the utility of transcriptomic data to expedite a vetch breeding program.

  1. Incorporating Protein Biosynthesis into the Saccharomyces cerevisiae Genome-scale Metabolic Model

    DEFF Research Database (Denmark)

    Olivares Hernandez, Roberto

    Based on the stoichiometric biochemical reactions that occur in the cell, genome-scale metabolic models can quantify the metabolic fluxes, which are regarded as the final representation of the physiological state of the cell. For Saccharomyces cerevisiae the genome scale model has been......, translation initiation, translation elongation, translation termination, translation elongation, and mRNA decay. Considering this information from the mechanisms of transcription and translation, we will include these stoichiometric reactions in the genome scale model for S. cerevisiae to obtain the first...

  2. Rapid genome-scale mapping of chromatin accessibility in tissue

    DEFF Research Database (Denmark)

    Grøntved, Lars; Bandle, Russell; John, Sam;

    2012-01-01

    BACKGROUND: The challenge in extracting genome-wide chromatin features from limiting clinical samples poses a significant hurdle in identification of regulatory marks that impact the physiological or pathological state. Current methods that identify nuclease accessible chromatin are reliant...

  3. Transcriptomes of the desiccation-tolerant resurrection plant Craterostigma plantagineum.

    Science.gov (United States)

    Rodriguez, Maria C Suarez; Edsgärd, Daniel; Hussain, Syed S; Alquezar, David; Rasmussen, Morten; Gilbert, Thomas; Nielsen, Bjørn H; Bartels, Dorothea; Mundy, John

    2010-07-01

    Studies of the resurrection plant Craterostigma plantagineum have revealed some of the mechanisms which these desiccation-tolerant plants use to survive environments with extreme dehydration and restricted seasonal water. Most resurrection plants are polyploid with large genomes, which has hindered efforts to obtain whole genome sequences and perform mutational analysis. However, the application of deep sequencing technologies to transcriptomics now permits large-scale analyses of gene expression patterns despite the lack of a reference genome. Here we use pyro-sequencing to characterize the transcriptomes of C. plantagineum leaves at four stages of dehydration and rehydration. This reveals that genes involved in several pathways, such as those required for vitamin K and thiamin biosynthesis, are tightly regulated at the level of gene expression. Our analysis also provides a comprehensive picture of the array of cellular responses controlled by gene expression that allow resurrection plants to survive desiccation.

  4. De novo transcriptome of the Hemimetabolous German cockroach (Blattella germanica).

    Directory of Open Access Journals (Sweden)

    Xiaojie Zhou

    Full Text Available BACKGROUND: The German cockroach, Blattella germanica, is an important insect pest that transmits various pathogens mechanically and causes severe allergic diseases. This insect has long served as a model system for studies of insect biology, physiology and ecology. However, the lack of genome or transcriptome information heavily hinders our further understanding of the German cockroach in every aspect at a molecular level and on a genome-wide scale. To explore the transcriptome and identify unique sequences of interest, we subjected the B. germanica transcriptome to massively parallel pyrosequencing and generated the first reference transcriptome for B. germanica. METHODOLOGY/PRINCIPAL FINDINGS: A total of 1,365,609 raw reads with an average length of 529 bp were generated via pyrosequencing the mixed cDNA library from different life stages of German cockroach including maturing oothecae, nymphs, adult females and males. The raw reads were de novo assembled to 48,800 contigs and 3,961 singletons with high-quality unique sequences. These sequences were annotated and classified functionally in terms of BLAST, GO and KEGG, and the genes putatively coding detoxification enzyme systems, insecticide targets, key components in systematic RNA interference, immunity and chemoreception pathways were identified. A total of 3,601 SSR (Simple Sequence Repeat) loci were also predicted. CONCLUSIONS/SIGNIFICANCE: The whole transcriptome pyrosequencing data from this study provides a usable genetic resource for future identification of potential functional genes involved in various biological processes.
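    SSR prediction of the kind reported in this abstract is commonly done by scanning sequences for short tandem motifs. The sketch below is a simplified regex-based illustration, not the tool the authors used (the abstract does not name one). The motif lengths of 2 to 6 bases, the default minimum of four repeats, and the function name `find_ssrs` are assumptions.

```python
import re

def find_ssrs(seq, min_repeats=4):
    """Return (start, motif, repeat_count) for tandem repeats of 2-6 base motifs.

    Thresholds are illustrative assumptions. Homopolymer runs (e.g. AAAA...)
    are skipped because they would otherwise be reported as dinucleotide repeats.
    """
    # The lazy quantifier makes the regex prefer the shortest motif that repeats.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits
```

    Applied to each assembled contig, the returned tuples would give candidate loci that still need filtering and experimental validation before use as markers.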

  5. Genome-scale engineering for systems and synthetic biology

    OpenAIRE

    Esvelt, Kevin Michael; Wang, Harris H.

    2013-01-01

    Genome-modification technologies enable the rational engineering and perturbation of biological systems. Historically, these methods have been limited to gene insertions or mutations at random or at a few pre-defined locations across the genome. The handful of methods capable of targeted gene editing suffered from low efficiencies, significant labor costs, or both. Recent advances have dramatically expanded our ability to engineer cells in a directed and combinatorial manner. Here, we review ...

  6. The first whole genome and transcriptome of the cinereous vulture reveals adaptation in the gastric and immune defense systems and possible convergent evolution between the Old and New World vultures.

    Science.gov (United States)

    Chung, Oksung; Jin, Seondeok; Cho, Yun Sung; Lim, Jeongheui; Kim, Hyunho; Jho, Sungwoong; Kim, Hak-Min; Jun, JeHoon; Lee, HyeJin; Chon, Alvin; Ko, Junsu; Edwards, Jeremy; Weber, Jessica A; Han, Kyudong; O'Brien, Stephen J; Manica, Andrea; Bhak, Jong; Paek, Woon Kee

    2015-10-21

    The cinereous vulture, Aegypius monachus, is the largest bird of prey and plays a key role in the ecosystem by removing carcasses, thus preventing the spread of diseases. Its feeding habits force it to cope with constant exposure to pathogens, making this species an interesting target for discovering functionally selected genetic variants. Furthermore, the presence of two independently evolved vulture groups, Old World and New World vultures, provides a natural experiment in which to investigate convergent evolution due to obligate scavenging. We sequenced the genome of a cinereous vulture, and mapped it to the bald eagle reference genome, a close relative with a divergence time of 18 million years. By comparing the cinereous vulture to other avian genomes, we find positively selected genetic variations in this species associated with respiration, likely linked to their ability of immune defense responses and gastric acid secretion, consistent with their ability to digest carcasses. Comparisons between the Old World and New World vulture groups suggest convergent gene evolution. We assemble the cinereous vulture blood transcriptome from a second individual, and annotate genes. Finally, we infer the demographic history of the cinereous vulture which shows marked fluctuations in effective population size during the late Pleistocene. We present the first genome and transcriptome analyses of the cinereous vulture compared to other avian genomes and transcriptomes, revealing genetic signatures of dietary and environmental adaptations accompanied by possible convergent evolution between the Old World and New World vultures.

  7. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system

    NARCIS (Netherlands)

    Speth, D.; Zandt, M.H. In 't; Guerrero-Cruz, S.; Dutilh, B.E.; Jetten, M.S.M

    2016-01-01

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is use

  8. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system

    NARCIS (Netherlands)

    Speth, D.; Zandt, M.H. In 't; Guerrero-Cruz, S.; Dutilh, B.E.; Jetten, M.S.M

    2016-01-01

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is

  9. Adaptation and evolution of deep-sea scale worms (Annelida: Polynoidae): insights from transcriptome comparison with a shallow-water species

    Science.gov (United States)

    Zhang, Yanjie; Sun, Jin; Chen, Chong; Watanabe, Hiromi K.; Feng, Dong; Zhang, Yu; Chiu, Jill M. Y.; Qian, Pei-Yuan; Qiu, Jian-Wen

    2017-04-01

    Polynoid scale worms (Polynoidae, Annelida) invaded deep-sea chemosynthesis-based ecosystems approximately 60 million years ago, but little is known about their genetic adaptation to the extreme deep-sea environment. In this study, we reported the first two transcriptomes of deep-sea polynoids (Branchipolynoe pettiboneae, Lepidonotopodium sp.) and compared them with the transcriptome of a shallow-water polynoid (Harmothoe imbricata). We determined codon and amino acid usage, positive selected genes, highly expressed genes and putative duplicated genes. Transcriptome assembly produced 98,806 to 225,709 contigs in the three species. There were more positively charged amino acids (i.e., histidine and arginine) and less negatively charged amino acids (i.e., aspartic acid and glutamic acid) in the deep-sea species. There were 120 genes showing clear evidence of positive selection. Among the 10% most highly expressed genes, there were more hemoglobin genes with high expression levels in both deep-sea species. The duplicated genes related to DNA recombination and metabolism, and gene expression were only enriched in deep-sea species. Deep-sea scale worms adopted two strategies of adaptation to hypoxia in the chemosynthesis-based habitats (i.e., rapid evolution of tetra-domain hemoglobin in Branchipolynoe or high expression of single-domain hemoglobin in Lepidonotopodium sp.).

  10. Structural genomics of eukaryotic targets at a laboratory scale.

    Science.gov (United States)

    Busso, Didier; Poussin-Courmontagne, Pierre; Rosé, David; Ripp, Raymond; Litt, Alain; Thierry, Jean-Claude; Moras, Dino

    2005-01-01

    Structural genomics programs are distributed worldwide and funded by large institutions such as the NIH in the United States, the RIKEN in Japan or the European Commission through the SPINE network in Europe. Such initiatives, essentially managed by large consortia, led to technology and method developments at the different steps required to produce biological samples compatible with structural studies. Besides specific applications, method developments resulted mainly upon miniaturization and parallelization. The challenge that academic laboratories face to pursue structural genomics programs is to produce, at a higher rate, protein samples. The Structural Biology and Genomics Department (IGBMC - Illkirch - France) is implicated in a structural genomics program of high eukaryotes whose goal is solving crystal structures of proteins and their complexes (including large complexes) related to human health and biotechnology. To achieve such a challenging goal, the Department has established a medium-throughput pipeline for producing protein samples suitable for structural biology studies. Here, we describe the setting up of our initiative from cloning to crystallization and we demonstrate that structural genomics may be manageable by academic laboratories by strategic investments in robotics and by adapting classical bench protocols and new developments, in particular in the field of protein expression, to parallelization.

  11. GenoMetric Query Language: a novel approach to large-scale genomic data management.

    Science.gov (United States)

    Masseroli, Marco; Pinoli, Pietro; Venco, Francesco; Kaitoua, Abdulrahman; Jalili, Vahid; Palluzzi, Fernando; Muller, Heiko; Ceri, Stefano

    2015-06-15

    Improvement of sequencing technologies and data processing pipelines is rapidly providing sequencing data, with associated high-level features, of many individual genomes in multiple biological and clinical conditions. They allow for data-driven genomic, transcriptomic and epigenomic characterizations, but require state-of-the-art 'big data' computing strategies, with abstraction levels beyond available tool capabilities. We propose a high-level, declarative GenoMetric Query Language (GMQL) and a toolkit for its use. GMQL operates downstream of raw data preprocessing pipelines and supports queries over thousands of heterogeneous datasets and samples; as such it is key to genomic 'big data' analysis. GMQL leverages a simple data model that provides both abstractions of genomic region data and associated experimental, biological and clinical metadata and interoperability between many data formats. Based on Hadoop framework and Apache Pig platform, GMQL ensures high scalability, expressivity, flexibility and simplicity of use, as demonstrated by several biological query examples on ENCODE and TCGA datasets. The GMQL toolkit is freely available for non-commercial use at http://www.bioinformatics.deib.polimi.it/GMQL/. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  12. Analysis of Aspergillus nidulans metabolism at the genome-scale

    DEFF Research Database (Denmark)

    David, Helga; Ozcelik, İlknur Ş; Hofmann, Gerald

    2008-01-01

    Background: Aspergillus nidulans is a member of a diverse group of filamentous fungi, sharing many of the properties of its close relatives with significance in the fields of medicine, agriculture and industry. Furthermore, A. nidulans has been a classical model organism for studies of development...... biology and gene regulation, and thus it has become one of the best-characterized filamentous fungi. It was the first Aspergillus species to have its genome sequenced, and automated gene prediction tools predicted 9,451 open reading frames (ORFs) in the genome, of which less than 10% were assigned...

  13. Large-scale genomic analysis of ovarian carcinomas.

    Science.gov (United States)

    Gorringe, Kylie L; Campbell, Ian G

    2009-04-01

    Epithelial ovarian cancers are typified by frequent genomic aberrations that have been difficult to unravel. Recently, high-resolution array technologies have provided the first glimpse of the remarkable complexity of these aberrations with some ovarian cancers containing hundreds of copy number breakpoints, micro-deletions and amplifications. Many of these alterations contain cancer-related genes suggesting that the majority is disease-associated and not just the product of random genomic instability. Future developments such as next-generation sequencing and integrated analysis of data from multiple array platforms on large numbers of samples are poised to revolutionize our understanding of this complex disease.

  14. Developmental transcriptome of Aplysia californica

    KAUST Repository

    Heyland, Andreas

    2010-12-06

    Genome-wide transcriptional changes in development provide important insight into mechanisms underlying growth, differentiation, and patterning. However, such large-scale developmental studies have been limited to a few representatives of Ecdysozoans and Chordates. Here, we characterize transcriptomes of embryonic, larval, and metamorphic development in the marine mollusc Aplysia californica and reveal novel molecular components associated with life history transitions. Specifically, we identify more than 20 signal peptides, putative hormones, and transcription factors in association with early development and metamorphic stages, many of which seem to be evolutionarily conserved elements of signal transduction pathways. We also characterize genes related to biomineralization, a critical process of molluscan development. In summary, our experiment provides the first large-scale survey of gene expression in mollusc development, and complements previous studies on the regulatory mechanisms underlying body plan patterning and the formation of larval and juvenile structures. This study serves as a resource for further functional annotation of transcripts and genes in Aplysia specifically and molluscs in general. A comparison of the Aplysia developmental transcriptome with similar studies in the zebra fish Danio rerio, the fruit fly Drosophila melanogaster, the nematode Caenorhabditis elegans, and other studies on molluscs suggests an overall highly divergent pattern of gene regulatory mechanisms that are likely a consequence of the different developmental modes of these organisms. © 2010 Wiley-Liss, Inc., A Wiley Company.

  15. Recent advances in fruit crop genomics

    Directory of Open Access Journals (Sweden)

    Qiang XU, Chaoyang LIU, Manosh Kumar BISWAS, Zhiyong PAN, Xiuxin DENG

    2014-02-01

    Full Text Available In recent years, dramatic progress has been made in the genomics of fruit crops. The publication of a dozen fruit crop genomes represents a milestone for both functional genomics and breeding programs in fruit crops. Rapid advances in high-throughput sequencing technology have revolutionized the manner and scale of genomics in fruit crops. Research on fruit crops is encompassing a wide range of biological questions which are unique and cannot be addressed in a model plant such as Arabidopsis. This review summarizes recent achievements of research on the genome, transcriptome, proteome, miRNAs and epigenome of fruit crops.

  16. Mapping copy number variation by population-scale genome sequencing

    DEFF Research Database (Denmark)

    Mills, Ryan E.; Walter, Klaudia; Stewart, Chip;

    2011-01-01

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, ...

  17. Pore-scale simulation of microbial growth using a genome-scale metabolic model: Implications for Darcy-scale reactive transport

    Science.gov (United States)

    Tartakovsky, G. D.; Tartakovsky, A. M.; Scheibe, T. D.; Fang, Y.; Mahadevan, R.; Lovley, D. R.

    2013-09-01

    Recent advances in microbiology have enabled the quantitative simulation of microbial metabolism and growth based on genome-scale characterization of metabolic pathways and fluxes. We have incorporated a genome-scale metabolic model of the iron-reducing bacteria Geobacter sulfurreducens into a pore-scale simulation of microbial growth based on coupling of iron reduction to oxidation of a soluble electron donor (acetate). In our model, fluid flow and solute transport is governed by a combination of the Navier-Stokes and advection-diffusion-reaction equations. Microbial growth occurs only on the surface of soil grains where solid-phase mineral iron oxides are available. Mass fluxes of chemical species associated with microbial growth are described by the genome-scale microbial model, implemented using a constraint-based metabolic model, and provide the Robin-type boundary condition for the advection-diffusion equation at soil grain surfaces. Conventional models of microbially-mediated subsurface reactions use a lumped reaction model that does not consider individual microbial reaction pathways, and describe reactions rates using empirically-derived rate formulations such as the Monod-type kinetics. We have used our pore-scale model to explore the relationship between genome-scale metabolic models and Monod-type formulations, and to assess the manifestation of pore-scale variability (microenvironments) in terms of apparent Darcy-scale microbial reaction rates. The genome-scale model predicted lower biomass yield, and different stoichiometry for iron consumption, in comparison to prior Monod formulations based on energetics considerations. We were able to fit an equivalent Monod model, by modifying the reaction stoichiometry and biomass yield coefficient, that could effectively match results of the genome-scale simulation of microbial behaviors under excess nutrient conditions, but predictions of the fitted Monod model deviated from those of the genome-scale model
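    For readers unfamiliar with the Monod-type kinetics that this abstract contrasts with genome-scale models, the standard Monod form gives the specific growth rate as mu = mu_max * S / (K_s + S). The sketch below is a generic well-mixed batch illustration of that formula with explicit Euler steps. It is not the authors' pore-scale model, and all parameter names and the integration scheme are assumptions.

```python
def monod_rate(s, mu_max, k_s):
    """Specific growth rate from the standard Monod expression."""
    return mu_max * s / (k_s + s)

def simulate_batch(x0, s0, mu_max, k_s, yield_coeff, dt, steps):
    """Explicit Euler integration of biomass x and substrate s in a closed batch.

    yield_coeff is biomass formed per unit substrate consumed (an assumed,
    constant yield). Growth in a step is capped so substrate never goes negative.
    """
    x, s = x0, s0
    for _ in range(steps):
        growth = monod_rate(s, mu_max, k_s) * x * dt
        growth = min(growth, yield_coeff * s)  # cannot use more substrate than remains
        x += growth
        s -= growth / yield_coeff
    return x, s
```

    Fitting an "equivalent Monod model" as the abstract describes would amount to adjusting parameters such as `mu_max`, `k_s` and `yield_coeff` until a model of this kind reproduces the output of the genome-scale simulation.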

  18. Pore-scale simulation of microbial growth using a genome-scale metabolic model: Implications for Darcy-scale reactive transport

    Energy Technology Data Exchange (ETDEWEB)

    Tartakovsky, Guzel D.; Tartakovsky, Alexandre M.; Scheibe, Timothy D.; Fang, Yilin; Mahadevan, Radhakrishnan; Lovley, Derek R.

    2013-09-07

    Recent advances in microbiology have enabled the quantitative simulation of microbial metabolism and growth based on genome-scale characterization of metabolic pathways and fluxes. We have incorporated a genome-scale metabolic model of the iron-reducing bacteria Geobacter sulfurreducens into a pore-scale simulation of microbial growth based on coupling of iron reduction to oxidation of a soluble electron donor (acetate). In our model, fluid flow and solute transport is governed by a combination of the Navier-Stokes and advection-diffusion-reaction equations. Microbial growth occurs only on the surface of soil grains where solid-phase mineral iron oxides are available. Mass fluxes of chemical species associated with microbial growth are described by the genome-scale microbial model, implemented using a constraint-based metabolic model, and provide the Robin-type boundary condition for the advection-diffusion equation at soil grain surfaces. Conventional models of microbially-mediated subsurface reactions use a lumped reaction model that does not consider individual microbial reaction pathways, and describe reactions rates using empirically-derived rate formulations such as the Monod-type kinetics. We have used our pore-scale model to explore the relationship between genome-scale metabolic models and Monod-type formulations, and to assess the manifestation of pore-scale variability (microenvironments) in terms of apparent Darcy-scale microbial reaction rates. The genome-scale model predicted lower biomass yield, and different stoichiometry for iron consumption, in comparison to prior Monod formulations based on energetics considerations. We were able to fit an equivalent Monod model, by modifying the reaction stoichiometry and biomass yield coefficient, that could effectively match results of the genome-scale simulation of microbial behaviors under excess nutrient conditions, but predictions of the fitted Monod model deviated from those of the genome-scale model under

  19. Resolution of deep eudicot phylogeny and their temporal diversification using nuclear genes from transcriptomic and genomic datasets.

    Science.gov (United States)

    Zeng, Liping; Zhang, Ning; Zhang, Qiang; Endress, Peter K; Huang, Jie; Ma, Hong

    2017-05-01

    Explosive diversification is widespread in eukaryotes, making it difficult to resolve phylogenetic relationships. Eudicots contain c. 75% of extant flowering plants, are important for human livelihood and terrestrial ecosystems, and have probably experienced explosive diversifications. The eudicot phylogenetic relationships, especially among those of the Pentapetalae, remain unresolved. Here, we present a highly supported eudicot phylogeny and diversification rate shifts using 31 newly generated transcriptomes and 88 other datasets covering 70% of eudicot orders. A highly supported eudicot phylogeny divided Pentapetalae into two groups: one with rosids, Saxifragales, Vitales and Santalales; the other containing asterids, Caryophyllales and Dilleniaceae, with uncertainty for Berberidopsidales. Molecular clock analysis estimated that crown eudicots originated c. 146 Ma, considerably earlier than earliest tricolpate pollen fossils and most other molecular clock estimates, and Pentapetalae sequentially diverged into eight major lineages within c. 15 Myr. Two identified increases of diversification rate are located in the stems leading to Pentapetalae and asterids, and lagged behind the gamma hexaploidization. The nuclear genes from newly generated transcriptomes revealed a well-resolved eudicot phylogeny, sequential separation of major core eudicot lineages and temporal mode of diversifications, providing new insights into the evolutionary trend of morphologies and contributions to the diversification of eudicots. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.

  20. Genome-Wide Transcriptome Profiling of Mycobacterium smegmatis MC² 155 Cultivated in Minimal Media Supplemented with Cholesterol, Androstenedione or Glycerol.

    Science.gov (United States)

    Li, Qun; Ge, Fanglan; Tan, Yunya; Zhang, Guangxiang; Li, Wei

    2016-05-07

    Mycobacterium smegmatis strain MC² 155 is an attractive model organism for the study of M. tuberculosis and other mycobacterial pathogens, as it can grow well using cholesterol as a carbon source. However, its global transcriptomic response remains largely unrevealed. In this study, M. smegmatis MC² 155 cultivated in androstenedione, cholesterol and glycerol supplemented media were collected separately for an RNA-sequencing study. The results showed that 6004, 6681 and 6348 genes were expressed in androstenedione, cholesterol and glycerol supplemented media, and 5891 genes were expressed in all three conditions, with 237 specifically expressed in cholesterol-added medium. A total of 1852 and 454 genes were significantly up-regulated by cholesterol compared with the other two supplements. Only occasional changes were observed in basic carbon and nitrogen metabolism, while almost all of the genes involved in cholesterol catabolism and mammalian cell entry (MCE) were up-regulated by cholesterol, but not by androstenedione. Eleven and 16 gene clusters were induced by cholesterol when compared with glycerol or androstenedione, respectively. This study provides a comprehensive analysis of the cholesterol responsive transcriptome of M. smegmatis. Our results indicated that cholesterol induced many more genes and increased the expression of the majority of genes involved in cholesterol degradation and MCE in M. smegmatis, while androstenedione did not have the same effect.

  1. Unraveling the rat blood genome-wide transcriptome after oral administration of lavender oil by a two-color dye-swap DNA microarray approach

    Directory of Open Access Journals (Sweden)

    Motohide Hori

    2016-06-01

    Full Text Available Lavender oil (LO) is a commonly used essential oil in aromatherapy as non-traditional medicine. With an aim to demonstrate LO effects on the body, we have recently established an animal model investigating the influence of orally administered LO in rat tissues, genome-wide. In this brief, we investigate the effect of LO ingestion in rat blood. Rats were administered LO at the usual therapeutic dose (5 mg/kg in humans), and following collection of the venous blood from the heart and extraction of total RNA, the differentially expressed genes were screened using a 4 × 44-K whole-genome rat chip (Agilent microarray platform; Agilent Technologies, Palo Alto, CA, USA) in conjunction with a two-color dye-swap approach. A total of 834 differentially expressed genes in the blood were identified: 362 up-regulated and 472 down-regulated. These genes were functionally categorized using bioinformatics tools. The gene expression inventory of rat blood transcriptome under LO, a first report, has been deposited into the Gene Expression Omnibus (GEO: GSE67499). The data will be a valuable resource in examining the effects of natural products, and could also serve as a human model for further functional analysis and investigation.

  2. Transcriptome analysis of tetraploid cells identifies cyclin D2 as a facilitator of adaptation to genome doubling in the presence of p53.

    Science.gov (United States)

    Potapova, Tamara A; Seidel, Christopher W; Box, Andrew C; Rancati, Giulia; Li, Rong

    2016-10-15

    Tetraploidization, or genome doubling, is a prominent event in tumorigenesis, primarily because cell division in polyploid cells is error-prone and produces aneuploid cells. This study investigates changes in gene expression evoked in acute and adapted tetraploid cells and their effect on cell-cycle progression. Acute polyploidy was generated by knockdown of the essential regulator of cytokinesis anillin, which resulted in cytokinesis failure and formation of binucleate cells, or by chemical inhibition of Aurora kinases, causing abnormal mitotic exit with formation of single cells with aberrant nuclear morphology. Transcriptome analysis of these acute tetraploid cells revealed common signatures of activation of the tumor-suppressor protein p53. Suppression of proliferation in these cells was dependent on p53 and its transcriptional target, CDK inhibitor p21. Rare proliferating tetraploid cells can emerge from acute polyploid populations. Gene expression analysis of single cell-derived, adapted tetraploid clones showed up-regulation of several p53 target genes and cyclin D2, the activator of CDK4/6/2. Overexpression of cyclin D2 in diploid cells strongly potentiated the ability to proliferate with increased DNA content despite the presence of functional p53. These results indicate that p53-mediated suppression of proliferation of polyploid cells can be averted by increased levels of oncogenes such as cyclin D2, elucidating a possible route for tetraploidy-mediated genomic instability in carcinogenesis.

  3. Genome-Wide Identification and Transcriptome-Based Expression Profiling of the Sox Gene Family in the Nile Tilapia (Oreochromis niloticus).

    Science.gov (United States)

    Wei, Ling; Yang, Chao; Tao, Wenjing; Wang, Deshou

    2016-02-23

    The Sox transcription factor family is characterized by the presence of a Sry-related high-mobility group (HMG) box and plays important roles in various biological processes in animals, including sex determination and differentiation, and the development of multiple organs. In this study, 27 Sox genes were identified in the genome of the Nile tilapia (Oreochromis niloticus), and were classified into seven groups. The members of each group of the tilapia Sox genes exhibited a relatively conserved exon-intron structure. Comparative analysis showed that the Sox gene family has undergone an expansion in tilapia and other teleost fishes following their whole genome duplication, and group K only exists in teleosts. Transcriptome-based analysis demonstrated that most of the tilapia Sox genes presented stage-specific and/or sex-dimorphic expressions during gonadal development, and six of the group B Sox genes were specifically expressed in the adult brain. Our results provide a better understanding of gene structure and spatio-temporal expression of the Sox gene family in tilapia, and will be useful for further deciphering the roles of the Sox genes during sex determination and gonadal development in teleosts.

  4. Mining whole genomes and transcriptomes of Jatropha (Jatropha curcas) and Castor bean (Ricinus communis) for NBS-LRR genes and defense response associated transcription factors.

    Science.gov (United States)

    Sood, Archit; Jaiswal, Varun; Chanumolu, Sree Krishna; Malhotra, Nikhil; Pal, Tarun; Chauhan, Rajinder Singh

    2014-11-01

    Jatropha (Jatropha curcas L.) and Castor bean (Ricinus communis) are oilseed crops of family Euphorbiaceae with the potential of producing high quality biodiesel and having industrial value. Both the bioenergy plants are becoming susceptible to various biotic stresses directly affecting the oil quality and content. No report exists as of today on analysis of Nucleotide Binding Site-Leucine Rich Repeat (NBS-LRR) gene repertoire and defense response transcription factors in both the plant species. In silico analysis of whole genomes and transcriptomes identified 47 new NBS-LRR genes in both the species and 122 and 318 defense response related transcription factors in Jatropha and Castor bean, respectively. The identified NBS-LRR genes and defense response transcription factors were mapped onto the respective genomes. Common and unique NBS-LRR genes and defense related transcription factors were identified in both the plant species. All NBS-LRR genes in both the species were characterized into Toll/interleukin-1 receptor NBS-LRRs (TNLs) and coiled-coil NBS-LRRs (CNLs), position on contigs, gene clusters and motifs and domains distribution. Transcript abundance or expression values were measured for all NBS-LRR genes and defense response transcription factors, suggesting their functional role. The current study provides a repertoire of NBS-LRR genes and transcription factors which can be used in not only dissecting the molecular basis of disease resistance phenotype but also in developing disease resistant genotypes in Jatropha and Castor bean through transgenic or molecular breeding approaches.

  5. Integration of genome-scale modeling and transcript profiling reveals metabolic pathways underlying light and temperature acclimation in Arabidopsis.

    Science.gov (United States)

    Töpfer, Nadine; Caldana, Camila; Grimbs, Sergio; Willmitzer, Lothar; Fernie, Alisdair R; Nikoloski, Zoran

    2013-04-01

    Understanding metabolic acclimation of plants to challenging environmental conditions is essential for dissecting the role of metabolic pathways in growth and survival. As stresses involve simultaneous physiological alterations across all levels of cellular organization, a comprehensive characterization of the role of metabolic pathways in acclimation necessitates integration of genome-scale models with high-throughput data. Here, we present an integrative optimization-based approach, which, by coupling a plant metabolic network model and transcriptomics data, can predict the metabolic pathways affected in a single, carefully controlled experiment. Moreover, we propose three optimization-based indices that characterize different aspects of metabolic pathway behavior in the context of the entire metabolic network. We demonstrate that the proposed approach and indices facilitate quantitative comparisons and characterization of the plant metabolic response under eight different light and/or temperature conditions. The predictions of the metabolic functions involved in metabolic acclimation of Arabidopsis thaliana to the changing conditions are in line with experimental evidence and result in a hypothesis about the role of homocysteine-to-Cys interconversion and Asn biosynthesis. The approach can also be used to reveal the role of particular metabolic pathways in other scenarios, while taking into consideration the entirety of characterized plant metabolism.

  6. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system.

    Science.gov (United States)

    Speth, Daan R; In 't Zandt, Michiel H; Guerrero-Cruz, Simon; Dutilh, Bas E; Jetten, Mike S M

    2016-03-31

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is used to seed reactors in wastewater treatment plants around the world; however, the role of most of its microbial community in ammonium removal remains unknown. Our analysis yielded 23 near-complete draft genomes that together represent the majority of the microbial community. We assign these genomes to distinct anaerobic and aerobic microbial communities. In the aerobic community, nitrifying organisms and heterotrophs predominate. In the anaerobic community, widespread potential for partial denitrification suggests a nitrite loop increases treatment efficiency. Of our genomes, 19 have no previously cultivated or sequenced close relatives and six belong to bacterial phyla without any cultivated members, including the most complete Omnitrophica (formerly OP3) genome to date.

  7. Accomplishments in genome-scale in silico modeling for industrial and medical biotechnology.

    Science.gov (United States)

    Milne, Caroline B; Kim, Pan-Jun; Eddy, James A; Price, Nathan D

    2009-12-01

    Driven by advancements in high-throughput biological technologies and the growing number of sequenced genomes, the construction of in silico models at the genome scale has provided powerful tools to investigate a vast array of biological systems and applications. Here, we review comprehensively the uses of such models in industrial and medical biotechnology, including biofuel generation, food production, and drug development. While the use of in silico models is still in its early stages for delivering to industry, significant initial successes have been achieved. For the cases presented here, genome-scale models predict engineering strategies to enhance properties of interest in an organism or to inhibit harmful mechanisms of pathogens. Going forward, genome-scale in silico models promise to extend their application and analysis scope to become a transformative tool in biotechnology.

  8. Genome-Scale Metabolic Modeling in the Simulation of Field-Scale Uranium Bioremediation

    Science.gov (United States)

    Yabusaki, S.; Wilkins, M.; Fang, Y.; Williams, K. H.; Waichler, S.; Long, P. E.

    2015-12-01

    Coupled variably saturated flow and biogeochemical reactive transport modeling is used to improve understanding of the processes, properties, and conditions controlling uranium bio-immobilization in a field experiment where uranium-contaminated groundwater was amended with acetate and bicarbonate. The acetate stimulates indigenous microorganisms that catalyze metal reduction, including the conversion of aqueous U(VI) to solid-phase U(IV), which effectively removes uranium from solution. The initiation of the bicarbonate amendment prior to biostimulation was designed to promote U(VI) desorption that would increase the aqueous U(VI) available for bioreduction. The three-dimensional simulations were able to largely reproduce the timing and magnitude of the physical, chemical and biological responses to the acetate and bicarbonate amendment in the context of changing water table elevation and gradient. A time series of groundwater proteomic samples exhibited correlations between the most abundant Geobacter metallireducens proteins and the genome-scale metabolic model-predicted fluxes of intra-cellular reactions associated with each of those proteins. The desorption of U(VI) induced by the bicarbonate amendment led to initially higher rates of bioreduction compared to locations with minimal bicarbonate exposure. After bicarbonate amendment ceased, bioreduction continued at these locations whereas U(VI) sorption was the dominant removal mechanism at the bicarbonate-impacted sites.

  9. RGS2 expression predicts amyloid-β sensitivity, MCI and Alzheimer's disease: genome-wide transcriptomic profiling and bioinformatics data mining

    Science.gov (United States)

    Hadar, A; Milanesi, E; Squassina, A; Niola, P; Chillotti, C; Pasmanik-Chor, M; Yaron, O; Martásek, P; Rehavi, M; Weissglas-Volkov, D; Shomron, N; Gozes, I; Gurwitz, D

    2016-01-01

    Alzheimer's disease (AD) is the most frequent cause of dementia. Misfolded protein pathological hallmarks of AD are brain deposits of amyloid-β (Aβ) plaques and phosphorylated tau neurofibrillary tangles. However, doubts about the role of Aβ in AD pathology have been raised as Aβ is a common component of extracellular brain deposits found, also by in vivo imaging, in non-demented aged individuals. It has been suggested that some individuals are more prone to Aβ neurotoxicity and hence more likely to develop AD when aging brains start accumulating Aβ plaques. Here, we applied genome-wide transcriptomic profiling of lymphoblastoid cell lines (LCLs) from healthy individuals and AD patients for identifying genes that predict sensitivity to Aβ. Real-time PCR validation identified 3.78-fold lower expression of RGS2 (regulator of G-protein signaling 2; P=0.0085) in LCLs from healthy individuals exhibiting high vs low Aβ sensitivity. Furthermore, RGS2 showed 3.3-fold lower expression (P=0.0008) in AD LCLs compared with controls. Notably, RGS2 expression in AD LCLs correlated with the patients' cognitive function. Lower RGS2 expression levels were also discovered in published expression data sets from postmortem AD brain tissues as well as in mild cognitive impairment and AD blood samples compared with controls. In conclusion, Aβ sensitivity phenotyping followed by transcriptomic profiling and published patient data mining identified reduced peripheral and brain expression levels of RGS2, a key regulator of G-protein-coupled receptor signaling and neuronal plasticity. RGS2 is suggested as a novel AD biomarker (alongside other genes) toward early AD detection and future disease modifying therapeutics. PMID:27701409

  10. Genome-wide transcriptomic and proteomic analyses of bollworm-infested developing cotton bolls revealed the genes and pathways involved in the insect pest defence mechanism.

    Science.gov (United States)

    Kumar, Saravanan; Kanakachari, Mogilicherla; Gurusamy, Dhandapani; Kumar, Krishan; Narayanasamy, Prabhakaran; Kethireddy Venkata, Padmalatha; Solanke, Amolkumar; Gamanagatti, Savita; Hiremath, Vamadevaiah; Katageri, Ishwarappa S; Leelavathi, Sadhu; Kumar, Polumetla Ananda; Reddy, Vanga Siva

    2016-06-01

    Cotton bollworm, Helicoverpa armigera, is a major insect pest that feeds on cotton bolls causing extensive damage leading to crop and productivity loss. In spite of such a major impact, cotton plant response to bollworm infection is yet to be witnessed. In this context, we have studied the genome-wide response of cotton bolls infested with bollworm using transcriptomic and proteomic approaches. Further, we have validated this data using semi-quantitative real-time PCR. Comparative analyses have revealed that 39% of the transcriptome and 35% of the proteome were differentially regulated during bollworm infestation. Around 36% of significantly regulated transcripts and 45% of differentially expressed proteins were found to be involved in signalling followed by redox regulation. Further analysis showed that defence-related stress hormones and their lipid precursors, transcription factors, signalling molecules, etc. were stimulated, whereas the growth-related counterparts were suppressed during bollworm infestation. Around 26% of the significantly up-regulated proteins were defence molecules, while >50% of the significantly down-regulated were related to photosynthesis and growth. Interestingly, the biosynthesis genes for synergistically regulated jasmonate, ethylene and suppressors of the antagonistic factor salicylate were found to be up-regulated, suggesting a choice among stress-responsive phytohormone regulation. Manual curation of the enzymes and TFs highlighted the components of retrograde signalling pathways. Our data suggest that a selective regulatory mechanism directs the reallocation of metabolic resources favouring defence over growth under bollworm infestation and these insights could be exploited to develop bollworm-resistant cotton varieties.

  11. Scaling up genome annotation using MAKER and work queue.

    Science.gov (United States)

    Thrasher, Andrew; Musgrave, Zachary; Kachmarck, Brian; Thain, Douglas; Emrich, Scott

    2014-01-01

    Next generation sequencing technologies have enabled sequencing many genomes. Because of the overall increasing demand and the inherent parallelism available in many required analyses, these bioinformatics applications should ideally run on clusters, clouds and/or grids. We present a modified annotation framework that achieves a speed-up of 45x with 50 workers on a Caenorhabditis japonica test case. We also evaluate these modifications within the Amazon EC2 cloud framework. The underlying genome annotation (MAKER) is parallelised as an MPI application. Our framework enables it to now run without MPI while utilising a wide variety of distributed computing resources. This parallel framework also allows easy explicit data transfer, which helps overcome a major limitation of bioinformatics tools that often rely on shared file systems. Combined, our proposed framework can be used, even during early stages of development, to easily run sequence analysis tools on clusters, grids and clouds.
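
    As a rough illustration of what the reported figures imply, the sketch below computes parallel efficiency as speed-up divided by worker count. Only the two numbers (45x, 50 workers) come from the abstract; the function name is hypothetical, and the calculation is the standard textbook definition rather than anything taken from the paper.

```python
def parallel_efficiency(speedup: float, workers: int) -> float:
    """Return speed-up per worker (1.0 would be perfect linear scaling)."""
    if workers <= 0:
        raise ValueError("workers must be a positive integer")
    return speedup / workers


# Figures quoted in the abstract: 45x speed-up using 50 workers.
efficiency = parallel_efficiency(45.0, 50)
print(f"Parallel efficiency: {efficiency:.0%}")
```

    Under this definition, the quoted figures correspond to about 90% of ideal linear scaling.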

  12. Cycling Transcriptional Networks Optimize Energy Utilization on a Genome Scale.

    Science.gov (United States)

    Wang, Guang-Zhong; Hickey, Stephanie L; Shi, Lei; Huang, Hung-Chung; Nakashe, Prachi; Koike, Nobuya; Tu, Benjamin P; Takahashi, Joseph S; Konopka, Genevieve

    2015-12-01

    Genes expressing circadian RNA rhythms are enriched for metabolic pathways, but the adaptive significance of cyclic gene expression remains unclear. We estimated the genome-wide synthetic and degradative cost of transcription and translation in three organisms and found that the cost of cycling genes is strikingly higher compared to non-cycling genes. Cycling genes are expressed at high levels and constitute the most costly proteins to synthesize in the genome. We demonstrate that metabolic cycling is accelerated in yeast grown under higher nutrient flux and the number of cycling genes increases ∼40%, which are achieved by increasing the amplitude and not the mean level of gene expression. These results suggest that rhythmic gene expression optimizes the metabolic cost of global gene expression and that highly expressed genes have been selected to be downregulated in a cyclic manner for energy conservation.

  13. Direct-to-consumer genomics on the scales of autonomy.

    Science.gov (United States)

    Vayena, Effy

    2015-04-01

    Direct-to-consumer (DTC) genetic services have generated enormous controversy from their first emergence. A dramatic recent manifestation of this is the Food and Drug Administration's (FDA) cease and desist order against 23andMe, the leading provider in the market. Critics have argued for the restrictive regulation of such services, and even their prohibition, on the grounds of the harm they pose to consumers. Their advocates, by contrast, defend them as a means of enhancing the autonomy of those same consumers. Autonomy emerges as a key battle-field in this debate, because many of the 'harm' arguments can be interpreted as identifying threats to autonomy. This paper assesses whether DTC genomic services are a threat to, or instead, an enhancement of, personal autonomy. It deploys Joseph Raz's account of personal autonomy, with its emphasis on choice from a range of valuable options. It then seeks to counter claims that DTC genomics threatens autonomy because it involves manipulation in contravention of consumers' independence or because it does not generate valuable options which can be meaningfully engaged with by consumers. It is stressed that the value of the options generated by DTC genomics should not be judged exclusively from the perspective of medical actionability, but should take into consideration plural utilities. Finally, the paper ends by broaching policy recommendations, suggesting that there is a strong autonomy-based argument for permitting DTC genomic services, and that the key question is the nature of the regulatory conditions under which they should be permitted. The discussion of autonomy in this paper helps illuminate some of these conditions.

  14. Direct-to-consumer genomics on the scales of autonomy

    Science.gov (United States)

    Vayena, Effy

    2015-01-01

    Direct-to-consumer (DTC) genetic services have generated enormous controversy from their first emergence. A dramatic recent manifestation of this is the Food and Drug Administration's (FDA) cease and desist order against 23andMe, the leading provider in the market. Critics have argued for the restrictive regulation of such services, and even their prohibition, on the grounds of the harm they pose to consumers. Their advocates, by contrast, defend them as a means of enhancing the autonomy of those same consumers. Autonomy emerges as a key battle-field in this debate, because many of the ‘harm’ arguments can be interpreted as identifying threats to autonomy. This paper assesses whether DTC genomic services are a threat to, or instead, an enhancement of, personal autonomy. It deploys Joseph Raz's account of personal autonomy, with its emphasis on choice from a range of valuable options. It then seeks to counter claims that DTC genomics threatens autonomy because it involves manipulation in contravention of consumers’ independence or because it does not generate valuable options which can be meaningfully engaged with by consumers. It is stressed that the value of the options generated by DTC genomics should not be judged exclusively from the perspective of medical actionability, but should take into consideration plural utilities. Finally, the paper ends by broaching policy recommendations, suggesting that there is a strong autonomy-based argument for permitting DTC genomic services, and that the key question is the nature of the regulatory conditions under which they should be permitted. The discussion of autonomy in this paper helps illuminate some of these conditions. PMID:24797610

  15. Genome Assembly of the Fungus Cochliobolus miyabeanus, and Transcriptome Analysis during Early Stages of Infection on American Wildrice (Zizania palustris L.).

    Directory of Open Access Journals (Sweden)

    Claudia V Castell-Miller

    Full Text Available The fungus Cochliobolus miyabeanus causes severe leaf spot disease on rice (Oryza sativa) and two North American specialty crops, American wildrice (Zizania palustris) and switchgrass (Panicum virgatum). Despite the importance of C. miyabeanus as a disease-causing agent in wildrice, little is known about either the mechanisms of pathogenicity or host defense responses. To start bridging these gaps, the genome of C. miyabeanus strain TG12bL2 was shotgun sequenced using Illumina technology. The genome assembly consists of 31.79 Mbp in 2,378 scaffolds with an N50 = 74,921. It contains 11,000 predicted genes of which 94.5% were annotated. Approximately 10% of total gene number is expected to be secreted. The C. miyabeanus genome is rich in carbohydrate active enzymes, and harbors 187 small secreted peptides (SSPs) and some fungal effector homologs. Detoxification systems were represented by a variety of enzymes that could offer protection against plant defense compounds. The non-ribosomal peptide synthetases and polyketide synthases (PKS) present were common to other Cochliobolus species. Additionally, the fungal transcriptome was analyzed at 48 hours after inoculation in planta. A total of 10,674 genes were found to be expressed, some of which are known to be involved in pathogenicity or response to host defenses including hydrophobins, cutinase, cell wall degrading enzymes, enzymes related to reactive oxygen species scavenging, PKS, detoxification systems, SSPs, and a known fungal effector. This work will facilitate future research on C. miyabeanus pathogen-associated molecular patterns and effectors, and in the identification of their corresponding wildrice defense mechanisms.
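
    The N50 statistic quoted for this assembly is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. A minimal sketch of that calculation follows; the function name and the toy scaffold lengths are made up for illustration and are not data from the study.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)  # longest scaffolds first
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first scaffold that brings coverage to >= 50% of the total.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for a non-empty input")


# Toy example (hypothetical lengths): total is 290, so half is 145.
toy_scaffolds = [80, 70, 50, 40, 30, 20]
print(n50(toy_scaffolds))
```

    In the toy example, the two longest scaffolds (80 + 70 = 150) already cover half of the 290 total, so the N50 is 70.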

  16. The genome and life-stage specific transcriptomes of Globodera pallida elucidate key aspects of plant parasitism by a cyst nematode

    KAUST Repository

    Cotton, James A

    2014-03-03

    Background: Globodera pallida is a devastating pathogen of potato crops, making it one of the most economically important plant parasitic nematodes. It is also an important model for the biology of cyst nematodes. Cyst nematodes and root-knot nematodes are the two most important plant parasitic nematode groups and together represent a global threat to food security. Results: We present the complete genome sequence of G. pallida, together with transcriptomic data from most of the nematode life cycle, particularly focusing on the life cycle stages involved in root invasion and establishment of the biotrophic feeding site. Despite the relatively close phylogenetic relationship with root-knot nematodes, we describe a very different gene family content between the two groups and in particular extensive differences in the repertoire of effectors, including an enormous expansion of the SPRY domain protein family in G. pallida, which includes the SPRYSEC family of effectors. This highlights the distinct biology of cyst nematodes compared to the root-knot nematodes that were, until now, the only sedentary plant parasitic nematodes for which genome information was available. We also present in-depth descriptions of the repertoires of other genes likely to be important in understanding the unique biology of cyst nematodes and of potential drug targets and other targets for their control. Conclusions: The data and analyses we present will be central in exploiting post-genomic approaches in the development of much-needed novel strategies for the control of G. pallida and related pathogens. 2014 Cotton et al.; licensee BioMed Central Ltd.

  17. Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics.

    Science.gov (United States)

    Fagerberg, Linn; Hallström, Björn M; Oksvold, Per; Kampf, Caroline; Djureinovic, Dijana; Odeberg, Jacob; Habuka, Masato; Tahmasebpoor, Simin; Danielsson, Angelika; Edlund, Karolina; Asplund, Anna; Sjöstedt, Evelina; Lundberg, Emma; Szigyarto, Cristina Al-Khalili; Skogs, Marie; Takanen, Jenny Ottosson; Berling, Holger; Tegel, Hanna; Mulder, Jan; Nilsson, Peter; Schwenk, Jochen M; Lindskog, Cecilia; Danielsson, Frida; Mardinoglu, Adil; Sivertsson, Asa; von Feilitzen, Kalle; Forsberg, Mattias; Zwahlen, Martin; Olsson, IngMarie; Navani, Sanjay; Huss, Mikael; Nielsen, Jens; Ponten, Fredrik; Uhlén, Mathias

    2014-02-01

    Global classification of the human proteins with regards to spatial expression patterns across organs and tissues is important for studies of human biology and disease. Here, we used a quantitative transcriptomics analysis (RNA-Seq) to classify the tissue-specific expression of genes across a representative set of all major human organs and tissues and combined this analysis with antibody-based profiling of the same tissues. To present the data, we launch a new version of the Human Protein Atlas that integrates RNA and protein expression data corresponding to ∼80% of the human protein-coding genes with access to the primary data for both the RNA and the protein analysis on an individual gene level. We present a classification of all human protein-coding genes with regards to tissue-specificity and spatial expression pattern. The integrative human expression map can be used as a starting point to explore the molecular constituents of the human body.

  18. TIGER: Toolbox for integrating genome-scale metabolic models, expression data, and transcriptional regulatory networks

    Directory of Open Access Journals (Sweden)

    Jensen Paul A

    2011-09-01

    Background: Several methods have been developed for analyzing genome-scale models of metabolism and transcriptional regulation. Many of these methods, such as Flux Balance Analysis, use constrained optimization to predict relationships between metabolic flux and the genes that encode and regulate enzyme activity. Recently, mixed integer programming has been used to encode these gene-protein-reaction (GPR) relationships into a single optimization problem, but these techniques are often of limited generality and lack a tool for automating the conversion of rules to a coupled regulatory/metabolic model. Results: We present TIGER, a Toolbox for Integrating Genome-scale Metabolism, Expression, and Regulation. TIGER converts a series of generalized, Boolean or multilevel rules into a set of mixed integer inequalities. The package also includes implementations of existing algorithms to integrate high-throughput expression data with genome-scale models of metabolism and transcriptional regulation. We demonstrate how TIGER automates the coupling of a genome-scale metabolic model with GPR logic and models of transcriptional regulation, thereby serving as a platform for algorithm development and large-scale metabolic analysis. Additionally, we demonstrate how TIGER's algorithms can be used to identify inconsistencies and improve existing models of transcriptional regulation with examples from the reconstructed transcriptional regulatory network of Saccharomyces cerevisiae. Conclusion: The TIGER package provides a consistent platform for algorithm development and extending existing genome-scale metabolic models with regulatory networks and high-throughput data.
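    The conversion of Boolean rules into mixed integer inequalities described in this abstract can be illustrated with the textbook linearization of AND and OR over binary variables. The sketch below is an assumption about the general technique, not TIGER's own formulation or API; the function names are hypothetical. It enumerates every binary assignment and keeps those that satisfy the inequalities, which shows that the inequalities admit exactly the truth table of each rule.

```python
from itertools import product


def and_constraints(x, y, z):
    """Inequalities that force z == (x AND y) when x, y, z are binary."""
    return [z <= x, z <= y, z >= x + y - 1]


def or_constraints(x, y, z):
    """Inequalities that force z == (x OR y) when x, y, z are binary."""
    return [z >= x, z >= y, z <= x + y]


def feasible_points(constraints):
    """All binary (x, y, z) triples that satisfy every inequality."""
    return {
        (x, y, z)
        for x, y, z in product((0, 1), repeat=3)
        if all(constraints(x, y, z))
    }


# Hypothetical GPR-style rules: a reaction needs gene A AND gene B,
# or alternatively gene A OR gene B (isozymes).
and_points = feasible_points(and_constraints)
or_points = feasible_points(or_constraints)
```

    Because only inequalities over 0/1 variables are involved, rules of this kind can be handed to a mixed integer solver alongside the flux constraints of a metabolic model.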

  19. Genome Wide Transcriptome Analysis reveals ABA mediated response in Arabidopsis during Gold (AuCl4-) treatment

    Directory of Open Access Journals (Sweden)

    Devesh Shukla

    2014-11-01

    The unique physico-chemical properties of gold nanoparticles (AuNPs) find manifold applications in diagnostics, medicine and catalysis. Chemical synthesis produces reactive AuNPs and generates hazardous by-products. Alternatively, plants can be utilized to produce AuNPs in an eco-friendly manner. To better control the biosynthesis of AuNPs, we need to first understand the detailed molecular response induced by AuCl4-. In this study, we carried out global transcriptome analysis in root tissue of Arabidopsis grown for 12 hours in presence of gold solution (HAuCl4) using the novel unbiased Affymetrix exon array. Transcriptomics analysis revealed differential regulation of a total of 704 genes and 4900 exons. Of these, 492 and 212 genes were up- and downregulated, respectively. The validation of the expressed key genes, such as glutathione-S-transferases, auxin responsive genes, cytochrome P450 82C2, methyl transferases, transducin (G protein beta subunit), ERF transcription factor, ABC, and MATE transporters, was carried out through quantitative RT-PCR. These key genes demonstrated specific induction under AuCl4- treatment relative to other heavy metals, suggesting a unique plant-gold interaction. GO enrichment analysis reveals the upregulation of processes like oxidative stress, glutathione binding, metal binding, transport, and plant hormonal responses. Changes predicted in biochemical pathways indicated major modulation in glutathione mediated detoxification, flavones and derivatives, and plant hormone biosynthesis. Motif search analysis identified a highly significant enriched motif, ACGT, which is an abscisic acid responsive core element (ABRE), suggesting the possibility of ABA-mediated signaling. Identification of abscisic acid response element (ABRE) points to the operation of a predominant signaling mechanism in response to AuCl4- exposure. Overall, this study presents a useful picture of plant-gold interaction with an identification of

  20. On the road to synthetic life: the minimal cell and genome-scale engineering.

    Science.gov (United States)

    Juhas, Mario

    2016-01-01

    Synthetic biology employs rational engineering principles to build biological systems from the libraries of standard, well characterized biological parts. Biological systems designed and built by synthetic biologists fulfill a plethora of useful purposes, ranging from better healthcare and energy production to biomanufacturing. Recent advancements in the synthesis, assembly and "booting-up" of synthetic genomes and in low and high-throughput genome engineering have paved the way for engineering on the genome-wide scale. One of the key goals of genome engineering is the construction of minimal genomes consisting solely of essential genes (genes indispensable for survival of living organisms). Besides serving as a toolbox to understand the universal principles of life, the cell encoded by minimal genome could be used to build a stringently controlled "cell factory" with a desired phenotype. This review provides an update on recent advances in the genome-scale engineering with particular emphasis on the engineering of minimal genomes. Furthermore, it presents an ongoing discussion to the scientific community for better suitability of minimal or robust cells for industrial applications.

  1. In silico method for modelling metabolism and gene product expression at genome scale

    Energy Technology Data Exchange (ETDEWEB)

    Lerman, Joshua A.; Hyduke, Daniel R.; Latif, Haythem; Portnoy, Vasiliy A.; Lewis, Nathan E.; Orth, Jeffrey D.; Rutledge, Alexandra C.; Smith, Richard D.; Adkins, Joshua N.; Zengler, Karsten; Palsson, Bernhard O.

    2012-07-03

    Transcription and translation use raw materials and energy generated metabolically to create the macromolecular machinery responsible for all cellular functions, including metabolism. A biochemically accurate model of molecular biology and metabolism will facilitate comprehensive and quantitative computations of an organism's molecular constitution as a function of genetic and environmental parameters. Here we formulate a model of metabolism and macromolecular expression. Prototyping it using the simple microorganism Thermotoga maritima, we show our model accurately simulates variations in cellular composition and gene expression. Moreover, through in silico comparative transcriptomics, the model allows the discovery of new regulons and the improvement of genome and transcription unit annotations. Our method presents a framework for investigating molecular biology and cellular physiology in silico and may allow quantitative interpretation of multi-omics data sets in the context of an integrated biochemical description of an organism.

  2. Exploring the genetic architecture and improving genomic prediction accuracy for mastitis and milk production traits in dairy cattle by mapping variants to hepatic transcriptomic regions responsive to intra-mammary infection.

    Science.gov (United States)

    Fang, Lingzhao; Sahana, Goutam; Ma, Peipei; Su, Guosheng; Yu, Ying; Zhang, Shengli; Lund, Mogens Sandø; Sørensen, Peter

    2017-05-12

    A better understanding of the genetic architecture of complex traits can contribute to improving genomic prediction. We hypothesized that genomic variants associated with mastitis and milk production traits in dairy cattle are enriched in hepatic transcriptomic regions that are responsive to intra-mammary infection (IMI). Genomic markers [e.g. single nucleotide polymorphisms (SNPs)] from those regions, if included, may improve the predictive ability of a genomic model. We applied a genomic feature best linear unbiased prediction model (GFBLUP) to implement the above strategy by considering the hepatic transcriptomic regions responsive to IMI as genomic features. GFBLUP, an extension of GBLUP, includes a separate genomic effect of SNPs within a genomic feature, and allows differential weighting of the individual marker relationships in the prediction equation. Since GFBLUP is computationally intensive, we investigated whether a SNP set test could be a computationally fast way to preselect predictive genomic features. The SNP set test assesses the association between a genomic feature and a trait based on single-SNP genome-wide association studies. We applied these two approaches to mastitis and milk production traits (milk, fat and protein yield) in Holstein (HOL, n = 5056) and Jersey (JER, n = 1231) cattle. We observed that a majority of genomic features were enriched in genomic variants that were associated with mastitis and milk production traits. Compared to GBLUP, the accuracy of genomic prediction with GFBLUP was marginally improved (3.2 to 3.9%) in within-breed prediction. The highest increase (164.4%) in prediction accuracy was observed in across-breed prediction. The significance of genomic features based on the SNP set test was correlated with changes in prediction accuracy of GFBLUP (P < 0.05). GFBLUP provides a framework for integrating multiple layers of biological knowledge to provide novel insights into the biological basis of complex traits
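    The idea of a separate genomic effect for SNPs inside a genomic feature can be pictured as building two marker-based relationship matrices, one from the feature SNPs and one from the remaining SNPs. The sketch below is illustrative only: the centered cross-product scaled by marker count is an assumed, simplified construction, and the genotype data, marker indices and function name are hypothetical rather than taken from the paper.

```python
def relationship_matrix(genotypes, marker_idx):
    """Centered cross-product relationship matrix over the chosen markers.

    genotypes: one list of allele counts (0, 1 or 2) per animal.
    marker_idx: column indices of the markers to include.
    """
    n = len(genotypes)
    # Pull out each selected marker as a column across animals.
    columns = [[row[j] for row in genotypes] for j in marker_idx]
    # Center each marker column on its mean allele count.
    centered = [[v - sum(col) / n for v in col] for col in columns]
    m = len(marker_idx)
    return [
        [sum(col[i] * col[k] for col in centered) / m for k in range(n)]
        for i in range(n)
    ]


# Hypothetical genotypes: 3 animals x 4 SNPs.
genotypes = [
    [0, 1, 2, 0],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
]
feature_snps = [0, 2]   # assumed to lie in IMI-responsive regions
remaining_snps = [1, 3]

G_feature = relationship_matrix(genotypes, feature_snps)
G_rest = relationship_matrix(genotypes, remaining_snps)
```

    In a model of this kind each matrix would receive its own variance component, which is what lets the feature SNPs be weighted differently from the rest of the genome.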

  3. Systems biology as a foundation for genome-scale synthetic biology.

    Science.gov (United States)

    Barrett, Christian L; Kim, Tae Yong; Kim, Hyun Uk; Palsson, Bernhard Ø; Lee, Sang Yup

    2006-10-01

    As the ambitions of synthetic biology approach genome-scale engineering, comprehensive characterization of cellular systems is required, as well as a means to accurately model cell-scale molecular interactions. These requirements are coincident with the goals of systems biology and, thus, systems biology will become the foundation for genome-scale synthetic biology. Systems biology will form this foundation through its efforts to reconstruct and integrate cellular systems, develop the mathematics, theory and software tools for the accurate modeling of these integrated systems, and through evolutionary mechanisms. As genome-scale synthetic biology is so enabled, it will prove to be a positive feedback driver of systems biology by exposing and forcing researchers to confront those aspects of systems biology which are inadequately understood.

  4. Comparative transcriptomics in the Triticeae

    Directory of Open Access Journals (Sweden)

    Waugh Robbie

    2009-06-01

    Background: Barley and particularly wheat are two grass species of immense agricultural importance. In spite of polyploidization events within the latter, studies have shown that genotypically and phenotypically these species are very closely related and, indeed, fertile hybrids can be created by interbreeding. The advent of two genome-scale Affymetrix GeneChips now allows studies of the comparison of their transcriptomes. Results: We have used the Wheat GeneChip to create a "gene expression atlas" for the wheat transcriptome (cv. Chinese Spring). For this, we chose mRNA from a range of tissues and developmental stages closely mirroring a comparable study carried out for barley (cv. Morex) using the Barley1 GeneChip. This, together with large-scale clustering of the probesets from the two GeneChips into "homologous groups", has allowed us to perform a genomic-scale comparative study of expression patterns in these two species. We explore the influence of the polyploidy of wheat on the results obtained with the Wheat GeneChip and quantify the correlation between conservation in gene sequence and gene expression in wheat and barley. In addition, we show how the conservation of expression patterns can be used to elucidate, probeset by probeset, the reliability of the Wheat GeneChip. Conclusion: While there are many differences in expression on the level of individual genes and tissues, we demonstrate that the wheat and barley transcriptomes appear highly correlated. This finding is significant not only because, given the small evolutionary distance between the two species, it is widely expected, but also because it demonstrates that it is possible to use the two GeneChips for comparative studies. This is the case even though their probeset composition reflects rather different design principles as well as, of course, the present incomplete knowledge of the gene content of the two species. We also show that, in general, the Wheat GeneChip is not able

  5. Savant Genome Browser 2: visualization and analysis for population-scale genomics.

    Science.gov (United States)

    Fiume, Marc; Smith, Eric J M; Brook, Andrew; Strbenac, Dario; Turner, Brian; Mezlini, Aziz M; Robinson, Mark D; Wodak, Shoshana J; Brudno, Michael

    2012-07-01

    High-throughput sequencing (HTS) technologies are providing an unprecedented capacity for data generation, and there is a corresponding need for efficient data exploration and analysis capabilities. Although most existing tools for HTS data analysis are developed for either automated (e.g. genotyping) or visualization (e.g. genome browsing) purposes, such tools are most powerful when combined. For example, integration of visualization and computation allows users to iteratively refine their analyses by updating computational parameters within the visual framework in real-time. Here we introduce the second version of the Savant Genome Browser, a standalone program for visual and computational analysis of HTS data. Savant substantially improves upon its predecessor and existing tools by introducing innovative visualization modes and navigation interfaces for several genomic datatypes, and synergizing visual and automated analyses in a way that is powerful yet easy even for non-expert users. We also present a number of plugins that were developed by the Savant Community, which demonstrate the power of integrating visual and automated analyses using Savant. The Savant Genome Browser is freely available (open source) at www.savantbrowser.com.

  6. Nuclear RNA sequencing of the mouse erythroid cell transcriptome.

    Science.gov (United States)

    Mitchell, Jennifer A; Clay, Ieuan; Umlauf, David; Chen, Chih-Yu; Moir, Catherine A; Eskiw, Christopher H; Schoenfelder, Stefan; Chakalova, Lyubomira; Nagano, Takashi; Fraser, Peter

    2012-01-01

    In addition to protein coding genes a substantial proportion of mammalian genomes are transcribed. However, most transcriptome studies investigate steady-state mRNA levels, ignoring a considerable fraction of the transcribed genome. In addition, steady-state mRNA levels are influenced by both transcriptional and posttranscriptional mechanisms, and thus do not provide a clear picture of transcriptional output. Here, using deep sequencing of nuclear RNAs (nucRNA-Seq) in parallel with chromatin immunoprecipitation sequencing (ChIP-Seq) of active RNA polymerase II, we compared the nuclear transcriptome of mouse anemic spleen erythroid cells with polymerase occupancy on a genome-wide scale. We demonstrate that unspliced transcripts quantified by nucRNA-seq correlate with primary transcript frequencies measured by RNA FISH, but differ from steady-state mRNA levels measured by poly(A)-enriched RNA-seq. Highly expressed protein coding genes showed good correlation between RNAPII occupancy and transcriptional output; however, genome-wide we observed a poor correlation between transcriptional output and RNAPII association. This poor correlation is due to intergenic regions associated with RNAPII which correspond with transcription factor bound regulatory regions and a group of stable, nuclear-retained long non-coding transcripts. In conclusion, sequencing the nuclear transcriptome provides an opportunity to investigate the transcriptional landscape in a given cell type through quantification of unspliced primary transcripts and the identification of nuclear-retained long non-coding RNAs.

  7. Nuclear RNA sequencing of the mouse erythroid cell transcriptome.

    Directory of Open Access Journals (Sweden)

    Jennifer A Mitchell

    In addition to protein coding genes a substantial proportion of mammalian genomes are transcribed. However, most transcriptome studies investigate steady-state mRNA levels, ignoring a considerable fraction of the transcribed genome. In addition, steady-state mRNA levels are influenced by both transcriptional and posttranscriptional mechanisms, and thus do not provide a clear picture of transcriptional output. Here, using deep sequencing of nuclear RNAs (nucRNA-Seq) in parallel with chromatin immunoprecipitation sequencing (ChIP-Seq) of active RNA polymerase II, we compared the nuclear transcriptome of mouse anemic spleen erythroid cells with polymerase occupancy on a genome-wide scale. We demonstrate that unspliced transcripts quantified by nucRNA-seq correlate with primary transcript frequencies measured by RNA FISH, but differ from steady-state mRNA levels measured by poly(A)-enriched RNA-seq. Highly expressed protein coding genes showed good correlation between RNAPII occupancy and transcriptional output; however, genome-wide we observed a poor correlation between transcriptional output and RNAPII association. This poor correlation is due to intergenic regions associated with RNAPII which correspond with transcription factor bound regulatory regions and a group of stable, nuclear-retained long non-coding transcripts. In conclusion, sequencing the nuclear transcriptome provides an opportunity to investigate the transcriptional landscape in a given cell type through quantification of unspliced primary transcripts and the identification of nuclear-retained long non-coding RNAs.

  8. In Silico Genome-Scale Reconstruction and Validation of the Corynebacterium glutamicum Metabolic Network

    DEFF Research Database (Denmark)

    Kjeldsen, Kjeld Raunkjær; Nielsen, J.

    2009-01-01

    A genome-scale metabolic model of the Gram-positive bacterium Corynebacterium glutamicum ATCC 13032 was constructed comprising 446 reactions and 411 metabolites, based on the annotated genome and available biochemical information. The network was analyzed using constraint based methods. The model...... and lactate. Comparable flux values between in silico model and experimental values were seen, although some differences in the phenotypic behavior between the model and the experimental data were observed,...

  9. A versatile genome-scale PCR-based pipeline for high-definition DNA FISH.

    Science.gov (United States)

    Bienko, Magda; Crosetto, Nicola; Teytelman, Leonid; Klemm, Sandy; Itzkovitz, Shalev; van Oudenaarden, Alexander

    2013-02-01

    We developed a cost-effective genome-scale PCR-based method for high-definition DNA FISH (HD-FISH). We visualized gene loci with diffraction-limited resolution, chromosomes as spot clusters and single genes together with transcripts by combining HD-FISH with single-molecule RNA FISH. We provide a database of over 4.3 million primer pairs targeting the human and mouse genomes that is readily usable for rapid and flexible generation of probes.

  10. Micro-Scale Genomic DNA Copy Number Aberrations as Another Means of Mutagenesis in Breast Cancer

    Science.gov (United States)

    Chao, Hann-Hsiang; He, Xiaping; Parker, Joel S.; Zhao, Wei; Perou, Charles M.

    2012-01-01

    Introduction: In breast cancer, the basal-like subtype has high levels of genomic instability relative to other breast cancer subtypes with many basal-like-specific regions of aberration. There is evidence that this genomic instability extends to smaller scale genomic aberrations, as shown by a previously described micro-deletion event in the PTEN gene in the Basal-like SUM149 breast cancer cell line. Methods: We sought to identify if small regions of genomic DNA copy number changes exist by using a high density, gene-centric Comparative Genomic Hybridizations (CGH) array on cell lines and primary tumors. A custom tiling array for CGH (244,000 probes, 200 bp tiling resolution) was created to identify small regions of genomic change, which was focused on previously identified basal-like-specific, and general cancer genes. Tumor genomic DNA from 94 patients and 2 breast cancer cell lines was labeled and hybridized to these arrays. Aberrations were called using SWITCHdna and the smallest 25% of SWITCHdna-defined genomic segments were called micro-aberrations (micro-aberrations, most of which are undetectable using typical-density genome-wide aCGH arrays. The basal-like subtype exhibited the highest incidence of these events. These micro-aberrations sometimes altered expression of the involved gene. We confirmed the presence of the PTEN micro-amplification in SUM149 and by mRNA-seq showed that this resulted in loss of expression of all exons downstream of this event. Micro-aberrations disproportionately affected the 5′ regions of the affected genes, including the promoter region, and high frequency of micro-aberrations was associated with poor survival. Conclusion: Using a high-probe-density, gene-centric aCGH microarray, we present evidence of small-scale genomic aberrations that can contribute to gene inactivation. These events may contribute to tumor formation through mechanisms not detected using conventional DNA copy number analyses. PMID:23284754
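    The rule of calling the smallest 25% of segments micro-aberrations can be sketched as a simple length ranking. This is a minimal illustration under stated assumptions: the segment tuples are made-up data, the function name is hypothetical, the cutoff is taken as the first quarter of the length-sorted list, and the segmentation itself (done by SWITCHdna in the study) is not reproduced.

```python
def smallest_quartile(segments):
    """Return the shortest 25% of segments, ranked by length.

    segments: list of (chromosome, start, end) tuples in base pairs.
    """
    ranked = sorted(segments, key=lambda seg: seg[2] - seg[1])
    return ranked[: len(ranked) // 4]


# Hypothetical segment calls; coordinates are illustrative only.
segments = [
    ("chr10", 1_000, 1_400),
    ("chr10", 5_000, 95_000),
    ("chr17", 2_000, 2_250),
    ("chr17", 40_000, 400_000),
    ("chr1", 10_000, 60_000),
    ("chr1", 100_000, 130_000),
    ("chr8", 7_000, 19_000),
    ("chr8", 50_000, 58_000),
]

micro_aberrations = smallest_quartile(segments)
```

    With eight segments the two shortest are returned; a real analysis would apply the same ranking to all called segments across samples.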

  11. Transcriptomics resources of human tissues and organs.

    Science.gov (United States)

    Uhlén, Mathias; Hallström, Björn M; Lindskog, Cecilia; Mardinoglu, Adil; Pontén, Fredrik; Nielsen, Jens

    2016-04-04

    Quantifying the differential expression of genes in various human organs, tissues, and cell types is vital to understand human physiology and disease. Recently, several large-scale transcriptomics studies have analyzed the expression of protein-coding genes across tissues. These datasets provide a framework for defining the molecular constituents of the human body as well as for generating comprehensive lists of proteins expressed across tissues or in a tissue-restricted manner. Here, we review publicly available human transcriptome resources and discuss body-wide data from independent genome-wide transcriptome analyses of different tissues. Gene expression measurements from these independent datasets, generated using samples from fresh frozen surgical specimens and postmortem tissues, are consistent. Overall, the different genome-wide analyses support a distribution in which many proteins are found in all tissues and relatively few in a tissue-restricted manner. Moreover, we discuss the applications of publicly available omics data for building genome-scale metabolic models, used for analyzing cell and tissue functions both in physiological and in disease contexts. © 2016 The Authors. Published under the terms of the CC BY 4.0 license.

  12. Genome, transcriptome, and secretome analysis of wood decay fungus Postia placenta supports unique mechanisms of lignocellulose conversion

    Science.gov (United States)

    Diego Martinez; Jean Challacombe; Ingo Morgenstern; David Hibbett; Monika Schmoll; Christian P. Kubicek; Patricia Ferreira; Francisco J. Ruiz-Duenas; Angel T. Martinez; Philip J. Kersten; Kenneth E. Hammel; Jill A. Gaskell; Daniel Cullen

    2009-01-01

    Brown-rot fungi such as Postia placenta are common inhabitants of forest ecosystems and are also largely responsible for the destructive decay of wooden structures. Rapid depolymerization of cellulose is a distinguishing feature of brown-rot, but the biochemical mechanisms and underlying genetics are poorly understood. Systematic examination of the P. placenta genome,...

  13. Tools to covisualize and coanalyze proteomic data with genomes and transcriptomes: validation of genes and alternative mRNA splicing

    DEFF Research Database (Denmark)

    Pang, Chi; Tay, Aidan; Aya, Carlos

    2014-01-01

    contigs, along with RNA-seq reads. This is done in the Integrated Genome Viewer (IGV). A Results Analyzer reports the precise base position where LC-MS/MS-derived peptides cover genes or gene isoforms, on the chromosomes or contigs where this occurs. In prokaryotes, the PG Nexus pipeline facilitates...

  14. Integrative Transcriptome, Genome and Quantitative Trait Loci Resources Identify Single Nucleotide Polymorphisms in Candidate Genes for Growth Traits in Turbot

    Science.gov (United States)

    Robledo, Diego; Fernández, Carlos; Hermida, Miguel; Sciara, Andrés; Álvarez-Dios, José Antonio; Cabaleiro, Santiago; Caamaño, Rubén; Martínez, Paulino; Bouza, Carmen

    2016-01-01

    Growth traits represent a main goal in aquaculture breeding programs and may be related to adaptive variation in wild fisheries. Integrating quantitative trait loci (QTL) mapping and next generation sequencing can greatly help to identify variation in candidate genes, which can result in marker-assisted selection and better genetic structure information. Turbot is a commercially important flatfish in Europe and China, with available genomic information on QTLs and genome mapping. Muscle and liver RNA-seq from 18 individuals was carried out to obtain gene sequences and markers functionally related to growth, resulting in a total of 20,447 genes and 85,344 single nucleotide polymorphisms (SNPs). Many growth-related genes and SNPs were identified and placed in the turbot genome and genetic map to explore their co-localization with growth-QTL markers. Forty-five SNPs on growth-related genes were selected based on QTL co-localization and relevant function for growth traits. Forty-three SNPs were technically feasible and validated in a wild Atlantic population, where 91% were polymorphic. The integration of functional and structural genomic resources in turbot provides a practical approach for QTL mining in this species. Validated SNPs represent a useful set of growth-related gene markers for future association, functional and population studies in this flatfish species. PMID:26901189

  15. Integrative Transcriptome, Genome and Quantitative Trait Loci Resources Identify Single Nucleotide Polymorphisms in Candidate Genes for Growth Traits in Turbot

    Directory of Open Access Journals (Sweden)

    Diego Robledo

    2016-02-01

    Growth traits represent a main goal in aquaculture breeding programs and may be related to adaptive variation in wild fisheries. Integrating quantitative trait loci (QTL) mapping and next generation sequencing can greatly help to identify variation in candidate genes, which can result in marker-assisted selection and better genetic structure information. Turbot is a commercially important flatfish in Europe and China, with available genomic information on QTLs and genome mapping. Muscle and liver RNA-seq from 18 individuals was carried out to obtain gene sequences and markers functionally related to growth, resulting in a total of 20,447 genes and 85,344 single nucleotide polymorphisms (SNPs). Many growth-related genes and SNPs were identified and placed in the turbot genome and genetic map to explore their co-localization with growth-QTL markers. Forty-five SNPs on growth-related genes were selected based on QTL co-localization and relevant function for growth traits. Forty-three SNPs were technically feasible and validated in a wild Atlantic population, where 91% were polymorphic. The integration of functional and structural genomic resources in turbot provides a practical approach for QTL mining in this species. Validated SNPs represent a useful set of growth-related gene markers for future association, functional and population studies in this flatfish species.

  16. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions

    Science.gov (United States)

    Burton, Joshua N.; Adey, Andrew; Patwardhan, Rupali P.; Qiu, Ruolan; Kitzman, Jacob O.; Shendure, Jay

    2014-01-01

    Genomes assembled de novo from short reads are highly fragmented relative to the finished chromosomes of H. sapiens and key model organisms generated by the Human Genome Project. To address this, we need scalable, cost-effective methods enabling chromosome-scale contiguity. Here we show that genome-wide chromatin interaction datasets, such as those generated by Hi-C, are a rich source of long-range information for assigning, ordering and orienting genomic sequences to chromosomes, including across centromeres. To exploit this, we developed an algorithm that uses Hi-C data for ultra-long-range scaffolding of de novo genome assemblies. We demonstrate the approach by combining shotgun fragment and short jump mate-pair sequences with Hi-C data to generate chromosome-scale de novo assemblies of the human, mouse and Drosophila genomes, achieving – for human – 98% accuracy in assigning scaffolds to chromosome groups and 99% accuracy in ordering and orienting scaffolds within chromosome groups. Hi-C data can also be used to validate chromosomal translocations in cancer genomes. PMID:24185095

  17. Unidimensional nonnegative scaling for genome-wide linkage disequilibrium maps.

    Science.gov (United States)

    Liao, Haiyong; Ng, Michael; Fung, Eric; Sham, Pak C

    2008-01-01

    The main aim of this paper is to propose and develop a unidimensional nonnegative scaling model to construct Linkage Disequilibrium (LD) maps. The proposed constrained scaling model can be efficiently solved by transforming it to an unconstrained model. The method is implemented on PC clusters at Hong Kong Baptist University. The LD maps are constructed for four populations from HapMap data sets with chromosomes containing tens of thousands of single nucleotide polymorphisms (SNPs). The similarities and dissimilarities of the LD maps are studied and analysed. Computational results are also reported to show the effectiveness of the method using parallel computation.

  18. Ancient Duplications and Expression Divergence in the Globin Gene Superfamily of Vertebrates: Insights from the Elephant Shark Genome and Transcriptome.

    Science.gov (United States)

    Opazo, Juan C; Lee, Alison P; Hoffmann, Federico G; Toloza-Villalobos, Jessica; Burmester, Thorsten; Venkatesh, Byrappa; Storz, Jay F

    2015-07-01

    Comparative analyses of vertebrate genomes continue to uncover a surprising diversity of genes in the globin gene superfamily, some of which have very restricted phyletic distributions despite their antiquity. Genomic analysis of the globin gene repertoire of cartilaginous fish (Chondrichthyes) should be especially informative about the duplicative origins and ancestral functions of vertebrate globins, as divergence between Chondrichthyes and bony vertebrates represents the most basal split within the jawed vertebrates. Here, we report a comparative genomic analysis of the vertebrate globin gene family that includes the complete globin gene repertoire of the elephant shark (Callorhinchus milii). Using genomic sequence data from representatives of all major vertebrate classes, integrated analyses of conserved synteny and phylogenetic relationships revealed that the last common ancestor of vertebrates possessed a repertoire of at least seven globin genes: single copies of androglobin and neuroglobin, four paralogous copies of globin X, and the single-copy progenitor of the entire set of vertebrate-specific globins. Combined with expression data, the genomic inventory of elephant shark globins yielded four especially surprising findings: 1) there is no trace of the neuroglobin gene (a highly conserved gene that is present in all other jawed vertebrates that have been examined to date), 2) myoglobin is highly expressed in heart, but not in skeletal muscle (reflecting a possible ancestral condition in vertebrates with single-circuit circulatory systems), 3) elephant shark possesses two highly divergent globin X paralogs, one of which is preferentially expressed in gonads, and 4) elephant shark possesses two structurally distinct α-globin paralogs, one of which is preferentially expressed in the brain. Expression profiles of elephant shark globin genes reveal distinct specializations of function relative to orthologs in bony vertebrates and suggest hypotheses about

  19. Modeling Method for Increased Precision and Scope of Directly Measurable Fluxes at a Genome-Scale

    DEFF Research Database (Denmark)

    McCloskey, Douglas; Young, Jamey D.; Xu, Sibei

    2016-01-01

    Metabolic flux analysis (MFA) is considered to be the gold standard for determining the intracellular flux distribution of biological systems. The majority of work using MFA has been limited to core models of metabolism due to challenges in implementing genome-scale MFA and the undesirable trade-off between increased scope and decreased precision in flux estimations. This work presents a tunable workflow for expanding the scope of MFA to the genome-scale without trade-offs in flux precision. The genome-scale MFA model presented here, iDM2014, accounts for 537 net reactions, which includes the core... distributions (MIDs),(1) it was found that a total of 232 net fluxes of central and peripheral metabolism could be resolved in the E. coli network. The increase in scope was shown to cover the full biosynthetic route to an expanded set of bioproduction pathways, which should facilitate applications...

  20. Mapping condition-dependent regulation of metabolism in yeast through genome-scale modeling

    DEFF Research Database (Denmark)

    Österlund, Tobias; Nookaew, Intawat; Bordel, Sergio

    2013-01-01

    ABSTRACT: BACKGROUND: The genome-scale metabolic model of Saccharomyces cerevisiae, first presented in 2003, was the first genome-scale network reconstruction for a eukaryotic organism. Since then continuous efforts have been made in order to improve and expand the yeast metabolic network. RESULTS: Here we present iTO977, a comprehensive genome-scale metabolic model that contains more reactions, metabolites and genes than previous models. The model was constructed based on two earlier reconstructions, namely iIN800 and the consensus network, and then improved and expanded using gap-filling methods and by introducing new reactions and pathways based on studies of the literature and databases. The model was shown to perform well both for growth simulations in different media and gene essentiality analysis for single and double knock-outs. Further, the model was used as a scaffold...

  1. In-Depth Genomic and Transcriptomic Analysis of Five K+ Transporter Gene Families in Soybean Confirm Their Differential Expression for Nodulation

    Directory of Open Access Journals (Sweden)

    Hafiz M. Rehman

    2017-05-01

    Full Text Available Plants have evolved a sophisticated network of K+ transport systems to regulate growth and development. Limited K+ resources are now forcing us to investigate how plant demand can be satisfied. To answer this complex question, we must understand the genomic and transcriptomic portfolio of K+ transporters in plants. Here, we have identified 70 putative K+ transporter genes from soybean, including 29 HAK/KT/KUP genes, 16 genes encoding voltage-gated K+ channels, 9 TPK/KCO genes, 4 HKT genes, and 12 KEA genes. To clarify the molecular evolution of each family in soybean, we analyzed their phylogeny, mode of duplication, exon structures and splice sites, and paralogs. Additionally, ortholog clustering and syntenic analysis across five other dicots further explored the evolution of these gene families and indicated that the soybean data is suitable as a model for all other legumes. Available microarray data sets from Genevestigator about nodulation were evaluated and further confirmed with the RNA sequencing data available by a web server. For each family, expression models were designed based on Transcripts Per Kilobase Million (TPM) values; the outcomes indicated differential expression linked to nodulation and confirmed the genes' putative roles. In-depth studies such as ours provide the basis for understanding K+ inventories in all other plants.

  2. In-Depth Genomic and Transcriptomic Analysis of Five K(+) Transporter Gene Families in Soybean Confirm Their Differential Expression for Nodulation.

    Science.gov (United States)

    Rehman, Hafiz M; Nawaz, Muhammad A; Shah, Zahid Hussain; Daur, Ihsanullah; Khatoon, Sadia; Yang, Seung Hwan; Chung, Gyuhwa

    2017-01-01

    Plants have evolved a sophisticated network of K(+) transport systems to regulate growth and development. Limited K(+) resources are now forcing us to investigate how plant demand can be satisfied. To answer this complex question, we must understand the genomic and transcriptomic portfolio of K(+) transporters in plants. Here, we have identified 70 putative K(+) transporter genes from soybean, including 29 HAK/KT/KUP genes, 16 genes encoding voltage-gated K(+) channels, 9 TPK/KCO genes, 4 HKT genes, and 12 KEA genes. To clarify the molecular evolution of each family in soybean, we analyzed their phylogeny, mode of duplication, exon structures and splice sites, and paralogs. Additionally, ortholog clustering and syntenic analysis across five other dicots further explored the evolution of these gene families and indicated that the soybean data is suitable as a model for all other legumes. Available microarray data sets from Genevestigator about nodulation were evaluated and further confirmed with the RNA sequencing data available by a web server. For each family, expression models were designed based on Transcripts Per Kilobase Million (TPM) values; the outcomes indicated differential expression linked to nodulation and confirmed the genes' putative roles. In-depth studies such as ours provide the basis for understanding K(+) inventories in all other plants.

  3. In-Depth Genomic and Transcriptomic Analysis of Five K+ Transporter Gene Families in Soybean Confirm Their Differential Expression for Nodulation

    Science.gov (United States)

    Rehman, Hafiz M.; Nawaz, Muhammad A.; Shah, Zahid Hussain; Daur, Ihsanullah; Khatoon, Sadia; Yang, Seung Hwan; Chung, Gyuhwa

    2017-01-01

    Plants have evolved a sophisticated network of K+ transport systems to regulate growth and development. Limited K+ resources are now forcing us to investigate how plant demand can be satisfied. To answer this complex question, we must understand the genomic and transcriptomic portfolio of K+ transporters in plants. Here, we have identified 70 putative K+ transporter genes from soybean, including 29 HAK/KT/KUP genes, 16 genes encoding voltage-gated K+ channels, 9 TPK/KCO genes, 4 HKT genes, and 12 KEA genes. To clarify the molecular evolution of each family in soybean, we analyzed their phylogeny, mode of duplication, exon structures and splice sites, and paralogs. Additionally, ortholog clustering and syntenic analysis across five other dicots further explored the evolution of these gene families and indicated that the soybean data is suitable as a model for all other legumes. Available microarray data sets from Genevestigator about nodulation were evaluated and further confirmed with the RNA sequencing data available by a web server. For each family, expression models were designed based on Transcripts Per Kilobase Million (TPM) values; the outcomes indicated differential expression linked to nodulation and confirmed the genes' putative roles. In-depth studies such as ours provide the basis for understanding K+ inventories in all other plants. PMID:28588592
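
    The TPM values mentioned in the record above can be illustrated with a minimal sketch. It assumes the standard definition of TPM (length-normalized read rates rescaled to sum to one million); the function name `tpm`, the gene labels, and the counts are hypothetical and are not taken from the study.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to TPM.

    counts:     dict mapping gene id -> read count (hypothetical input)
    lengths_bp: dict mapping gene id -> transcript length in base pairs
    """
    # Reads per kilobase: normalize each count by transcript length.
    rates = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}
    total = sum(rates.values())
    if total == 0:
        return {g: 0.0 for g in counts}
    # Rescale so that the values of one sample sum to one million.
    return {g: r / total * 1e6 for g, r in rates.items()}


# Made-up example: two genes with equal coverage per kilobase.
example = tpm({"geneA": 100, "geneB": 200}, {"geneA": 1000, "geneB": 2000})
```

    Because TPM values sum to the same total in every sample, they can be compared across samples more directly than raw counts.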

  4. An ANOCEF genomic and transcriptomic microarray study of the response to radiotherapy or to alkylating first-line chemotherapy in glioblastoma patients

    Directory of Open Access Journals (Sweden)

    Ducray François

    2010-09-01

    Full Text Available Abstract Background The molecular characteristics associated with the response to treatment in glioblastomas (GBMs) remain largely unknown. We performed a retrospective study to assess the genomic characteristics associated with the response of GBMs to either first-line chemotherapy or radiation therapy. The gene expression (n = 56) and genomic profiles (n = 67) of responders and non-responders to first-line chemotherapy or radiation therapy alone were compared on Affymetrix Plus 2 gene expression arrays and BAC CGH arrays. Results According to Verhaak et al.'s classification system, mesenchymal GBMs were more likely to respond to radiotherapy than to first-line chemotherapy, whereas classical GBMs were more likely to respond to first-line chemotherapy than to radiotherapy. In patients treated with radiation therapy alone, the response was associated with differential expression of microenvironment-associated genes; the expression of hypoxia-related genes was associated with short-term progression-free survival ( 10 months. Consistently, infiltration of the tumor by both CD3 and CD68 cells was significantly more frequent in responders to radiotherapy than in non-responders. In patients treated with first-line chemotherapy, the expression of stem-cell genes was associated with resistance to chemotherapy, and there was a significant association between response to treatment and p16 locus deletions. Consistently, in an independent data set of patients treated with either radiotherapy alone or with both radiotherapy and adjuvant chemotherapy, we found that patients with the p16 deletion benefited from adjuvant chemotherapy regardless of their MGMT promoter methylation status, whereas in patients without the p16 deletion, this benefit was only observed in patients with a methylated MGMT promoter. Conclusion Differential expression of microenvironment genes and p16 locus deletion are associated with responses to radiation therapy and to first

  5. Insights into the physiology and ecology of the brackish-water-adapted Cyanobacterium Nodularia spumigena CCY9414 based on a genome-transcriptome analysis.

    Directory of Open Access Journals (Sweden)

    Björn Voss

    Full Text Available Nodularia spumigena is a filamentous diazotrophic cyanobacterium that dominates the annual late summer cyanobacterial blooms in the Baltic Sea. But N. spumigena also is common in brackish water bodies worldwide, suggesting special adaptation allowing it to thrive at moderate salinities. A draft genome analysis of N. spumigena sp. CCY9414 yielded a single scaffold of 5,462,271 nucleotides in length on which genes for 5,294 proteins were annotated. A subsequent strand-specific transcriptome analysis identified more than 6,000 putative transcriptional start sites (TSS). Orphan TSSs located in intergenic regions led us to predict 764 non-coding RNAs, among them 70 copies of a possible retrotransposon and several potential RNA regulators, some of which are also present in other N2-fixing cyanobacteria. Approximately 4% of the total coding capacity is devoted to the production of secondary metabolites, among them the potent hepatotoxin nodularin, the linear spumigin and the cyclic nodulapeptin. The transcriptional complexity associated with genes involved in nitrogen fixation and heterocyst differentiation is considerably smaller compared to other Nostocales. In contrast, sophisticated systems exist for the uptake and assimilation of iron and phosphorus compounds, for the synthesis of compatible solutes, and for the formation of gas vesicles, required for the active control of buoyancy. Hence, the annotation and interpretation of this sequence provides a vast array of clues into the genomic underpinnings of the physiology of this cyanobacterium and indicates in particular a competitive edge of N. spumigena in nutrient-limited brackish water ecosystems.

  6. Functional genomics of probiotic Escherichia coli Nissle 1917 and 83972, and UPEC strain CFT073: comparison of transcriptomes, growth and biofilm formation.

    Science.gov (United States)

    Hancock, Viktoria; Vejborg, Rebecca Munk; Klemm, Per

    2010-12-01

    Strain CFT073 is a bona fide uropathogen, whereas strains 83972 and Nissle 1917 are harmless probiotic strains of urinary tract and faecal origin, respectively. Despite their different environmental origins and dispositions the three strains are very closely related and the ancestors of 83972 and Nissle 1917 must have been very similar to CFT073. Here, we report the first functional genome profiling of Nissle 1917 and the first biofilm profiling of a uropathogen. Transcriptomic profiling revealed that Nissle 1917 expressed many UPEC-associated genes and showed that the active genomic profiles of the three strains are closely related. The data demonstrate that the distance from a pathogen to a probiotic strain can be surprisingly short. We demonstrate that Nissle 1917, in spite of its intestinal niche origin, grows well in urine, and is a good biofilm former in this medium in which it also out-competes CFT073 during planktonic growth. The role in biofilm formation of three up-regulated genes, yhaK, yhcN and ybiJ, was confirmed by knockout mutants in Nissle 1917 and CFT073. Two of these mutants CFT073∆yhcN and CFT073∆ybiJ had significantly reduced motility compared with the parent strain, arguably accounting for the impaired biofilm formation. Although the three strains have very different strategies vis-à-vis the human host their functional gene profiles are surprisingly similar. It is also interesting to note that the only two Escherichia coli strains used as probiotics are in fact deconstructed pathogens.

  7. Rapid Prototyping of Microbial Cell Factories via Genome-scale Engineering

    Science.gov (United States)

    Si, Tong; Xiao, Han; Zhao, Huimin

    2014-01-01

    Advances in reading, writing and editing genetic materials have greatly expanded our ability to reprogram biological systems at the resolution of a single nucleotide and on the scale of a whole genome. Such capacity has greatly accelerated the cycles of design, build and test to engineer microbes for efficient synthesis of fuels, chemicals and drugs. In this review, we summarize the emerging technologies that have been applied, or are potentially useful for genome-scale engineering in microbial systems. We will focus on the development of high-throughput methodologies, which may accelerate the prototyping of microbial cell factories. PMID:25450192

  8. New approach for phylogenetic tree recovery based on genome-scale metabolic networks.

    Science.gov (United States)

    Gamermann, Daniel; Montagud, Arnaud; Conejero, J Alberto; Urchueguía, Javier F; de Córdoba, Pedro Fernández

    2014-07-01

    A wide range of applications and research has been done with genome-scale metabolic models. In this work, we describe an innovative methodology for comparing metabolic networks constructed from genome-scale metabolic models and how to apply this comparison in order to infer evolutionary distances between different organisms. Our methodology allows a quantification of the metabolic differences between different species from a broad range of families and even kingdoms. This quantification is then applied in order to reconstruct phylogenetic trees for sets of various organisms.

  9. Rapid prototyping of microbial cell factories via genome-scale engineering.

    Science.gov (United States)

    Si, Tong; Xiao, Han; Zhao, Huimin

    2015-11-15

    Advances in reading, writing and editing genetic materials have greatly expanded our ability to reprogram biological systems at the resolution of a single nucleotide and on the scale of a whole genome. Such capacity has greatly accelerated the cycles of design, build and test to engineer microbes for efficient synthesis of fuels, chemicals and drugs. In this review, we summarize the emerging technologies that have been applied, or are potentially useful for genome-scale engineering in microbial systems. We will focus on the development of high-throughput methodologies, which may accelerate the prototyping of microbial cell factories.

  10. Multi-scaling hierarchical structure analysis on the sequence of E. coli complete genome

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    We have applied the newly developed hierarchical structure theory for complex systems to analyze the multi-scaling structures of the nucleotide density distribution along a linear DNA sequence from the complete Escherichia coli genome. The hierarchical symmetry in the nucleotide density distribution was demonstrated. In particular, we have shown that the G, C density distribution that represents a strong H-bonding between the two DNA chains is more coherent with smaller similarity parameter compared to that of A, T density distribution, indicating a better organized multi-scaling fluctuation field for G, C density distribution along the genome sequence. The biological significance of these findings is under investigation.

  11. Genome-wide DNA promoter methylation and transcriptome analysis in human adipose tissue unravels novel candidate genes for obesity

    OpenAIRE

    Maria Keller; Lydia Hopp; Xuanshi Liu; Tobias Wohland; Kerstin Rohde; Raffaella Cancello; Matthias Klös; Karl Bacos; Matthias Kern; Fabian Eichelmann; Arne Dietrich; Michael R Schön; Daniel Gärtner; Tobias Lohmann; Miriam Dreßler

    2017-01-01

    Objective/methods: DNA methylation plays an important role in obesity and related metabolic complications. We examined genome-wide DNA promoter methylation along with mRNA profiles in paired samples of human subcutaneous adipose tissue (SAT) and omental visceral adipose tissue (OVAT) from non-obese vs. obese individuals. Results: We identified negatively correlated methylation and expression of several obesity-associated genes in our discovery dataset and in silico replicated ETV6 in two i...

  12. Exploring massive, genome scale datasets with the genometricorr package

    KAUST Repository

    Favorov, Alexander

    2012-05-31

    We have created a statistically grounded tool for determining the correlation of genomewide data with other datasets or known biological features, intended to guide biological exploration of high-dimensional datasets, rather than providing immediate answers. The software enables several biologically motivated approaches to these data and here we describe the rationale and implementation for each approach. Our models and statistics are implemented in an R package that efficiently calculates the spatial correlation between two sets of genomic intervals (data and/or annotated features), for use as a metric of functional interaction. The software handles any type of pointwise or interval data and instead of running analyses with predefined metrics, it computes the significance and direction of several types of spatial association; this is intended to suggest potentially relevant relationships between the datasets. Availability and implementation: The package, GenometriCorr, can be freely downloaded at http://genometricorr.sourceforge.net/. Installation guidelines and examples are available from the sourceforge repository. The package is pending submission to Bioconductor. © 2012 Favorov et al.

  13. Integrating genomics and transcriptomics with geo-ethnicity and the environment for the resolution of complex cardiovascular diseases.

    Science.gov (United States)

    Seda, Ondrej; Tremblay, Johanne; Sedová, Lucie; Hamet, Pavel

    2005-12-01

    One of the crucial steps on the way to individualized medicine to treat cardiovascular disease (CVD) is to better understand the identities, roles, extent and at least the major patterns of interaction between influential genomic and environmental factors. It is clear that such a bold goal can hardly be achieved without a major upgrade of our conceptualization of the phenomena studied, taking advantage of recent developments of novel technological and computational tools. Firstly, the search for the genomic components of the most common multifactorial CVDs is no longer restricted to protein-coding genes; truly genome-wide investigations should replace them in both humans and animal models. Secondly, the 'environment' has also undergone semantic expansion, incorporating such remote constituents as developmental plasticity and epigenetics on one side, and socioeconomic status on the other. To elucidate and analyze the resulting complex picture, appropriate statistical models and approaches need to be designed to tackle issues such as population stratification and admixture, multiple testing, and multidimensionality reduction in models involving multiple genes and environmental factors. Eventually, an integrated platform bringing together all of the above will probably be necessary to secure relevant information specific to a particular combination of conditions and settings (age, geo-ethnicity and exposure), which may perhaps become visible only after a step back, through systems (network) biology.

  14. Genomes and transcriptomes of partners in plant-fungal-interactions between canola (Brassica napus) and two Leptosphaeria species.

    Directory of Open Access Journals (Sweden)

    Rohan G T Lowe

    Full Text Available Leptosphaeria maculans 'brassicae' is a damaging fungal pathogen of canola (Brassica napus), causing lesions on cotyledons and leaves, and cankers on the lower stem. A related species, L. biglobosa 'canadensis', colonises cotyledons but causes few stem cankers. We describe the complement of genes encoding carbohydrate-active enzymes (CAZys) and peptidases of these fungi, as well as of four related plant pathogens. We also report dual-organism RNA-seq transcriptomes of these two Leptosphaeria species and B. napus during disease. During the first seven days of infection L. biglobosa 'canadensis', a necrotroph, expressed more cell wall degrading genes than L. maculans 'brassicae', a hemi-biotroph. L. maculans 'brassicae' expressed many genes in the Carbohydrate Binding Module class of CAZy, particularly CBM50 genes, with potential roles in the evasion of basal innate immunity in the host plant. At this time, three avirulence genes were amongst the top 20 most highly upregulated L. maculans 'brassicae' genes in planta. The two fungi had a similar number of peptidase genes, and trypsin was transcribed at high levels by both fungi early in infection. L. biglobosa 'canadensis' infection activated the jasmonic acid and salicylic acid defence pathways in B. napus, consistent with defence against necrotrophs. L. maculans 'brassicae' triggered a high level of expression of isochorismate synthase 1, a reporter for salicylic acid signalling. L. biglobosa 'canadensis' infection triggered coordinated shutdown of photosynthesis genes, and a concomitant increase in transcription of cell wall remodelling genes of the host plant. Expression of particular classes of CAZy genes and the triggering of host defence and particular metabolic pathways are consistent with the necrotrophic lifestyle of L. biglobosa 'canadensis', and the hemibiotrophic life style of L. maculans 'brassicae'.

  15. Development and experimental verification of a genome-scale metabolic model for Corynebacterium glutamicum

    Directory of Open Access Journals (Sweden)

    Hirasawa Takashi

    2009-08-01

    Full Text Available Abstract Background In silico genome-scale metabolic models enable the analysis of the characteristics of metabolic systems of organisms. In this study, we reconstructed a genome-scale metabolic model of Corynebacterium glutamicum on the basis of genome sequence annotation and physiological data. The metabolic characteristics were analyzed using flux balance analysis (FBA), and the results of FBA were validated using data from culture experiments performed at different oxygen uptake rates. Results The reconstructed genome-scale metabolic model of C. glutamicum contains 502 reactions and 423 metabolites. We collected the reactions and biomass components from the database and literatures, and made the model available for the flux balance analysis by filling gaps in the reaction networks and removing inadequate loop reactions. Using the framework of FBA and our genome-scale metabolic model, we first simulated the changes in the metabolic flux profiles that occur on changing the oxygen uptake rate. The predicted production yields of carbon dioxide and organic acids agreed well with the experimental data. The metabolic profiles of amino acid production phases were also investigated. A comprehensive gene deletion study was performed in which the effects of gene deletions on metabolic fluxes were simulated; this helped in the identification of several genes whose deletion resulted in an improvement in organic acid production. Conclusion The genome-scale metabolic model provides useful information for the evaluation of the metabolic capabilities and prediction of the metabolic characteristics of C. glutamicum. This can form a basis for the in silico design of C. glutamicum metabolic networks for improved bioproduction of desirable metabolites.

  16. Symbolic flux analysis for genome-scale metabolic networks

    Directory of Open Access Journals (Sweden)

    Peterson Pearu

    2011-05-01

    Full Text Available Abstract Background With the advent of genomic technology, the size of metabolic networks that are subject to analysis is growing. A common task when analyzing metabolic networks is to find all possible steady state regimes. There are several technical issues that have to be addressed when analyzing large metabolic networks including accumulation of numerical errors and presentation of the solution to the researcher. One way to resolve those technical issues is to analyze the network using symbolic methods. The aim of this paper is to develop a routine that symbolically finds the steady state solutions of large metabolic networks. Results A symbolic Gauss-Jordan elimination routine was developed for analyzing large metabolic networks. This routine was tested by finding the steady state solutions for a number of curated stoichiometric matrices with the largest having about 4000 reactions. The routine was able to find the solution with a computational time similar to the time used by a numerical singular value decomposition routine. As an advantage of symbolic solution, a set of independent fluxes can be suggested by the researcher leading to the formation of a desired flux basis describing the steady state solution of the network. These independent fluxes can be constrained using experimental data. We demonstrate the application of constraints by calculating a flux distribution for the central metabolic and amino acid biosynthesis pathways of yeast. Conclusions We were able to find symbolic solutions for the steady state flux distribution of large metabolic networks. The ability to choose a flux basis was found to be useful in the constraint process and provides a strong argument for using symbolic Gauss-Jordan elimination in place of singular value decomposition.

  17. Symbolic flux analysis for genome-scale metabolic networks.

    Science.gov (United States)

    Schryer, David W; Vendelin, Marko; Peterson, Pearu

    2011-05-23

    With the advent of genomic technology, the size of metabolic networks that are subject to analysis is growing. A common task when analyzing metabolic networks is to find all possible steady state regimes. There are several technical issues that have to be addressed when analyzing large metabolic networks including accumulation of numerical errors and presentation of the solution to the researcher. One way to resolve those technical issues is to analyze the network using symbolic methods. The aim of this paper is to develop a routine that symbolically finds the steady state solutions of large metabolic networks. A symbolic Gauss-Jordan elimination routine was developed for analyzing large metabolic networks. This routine was tested by finding the steady state solutions for a number of curated stoichiometric matrices with the largest having about 4000 reactions. The routine was able to find the solution with a computational time similar to the time used by a numerical singular value decomposition routine. As an advantage of symbolic solution, a set of independent fluxes can be suggested by the researcher leading to the formation of a desired flux basis describing the steady state solution of the network. These independent fluxes can be constrained using experimental data. We demonstrate the application of constraints by calculating a flux distribution for the central metabolic and amino acid biosynthesis pathways of yeast. We were able to find symbolic solutions for the steady state flux distribution of large metabolic networks. The ability to choose a flux basis was found to be useful in the constraint process and provides a strong argument for using symbolic Gauss-Jordan elimination in place of singular value decomposition.
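
    The idea of solving the steady-state condition S·v = 0 exactly rather than numerically can be sketched as follows. This is only an illustration under stated assumptions, not the authors' routine: it uses exact rational arithmetic from the Python standard library in place of a symbolic algebra system, it picks the independent (free) fluxes automatically rather than letting the researcher choose them, and the function names and the three-reaction toy network are hypothetical.

```python
from fractions import Fraction


def rref(matrix):
    """Gauss-Jordan elimination with exact fractions.

    Returns the reduced row echelon form and the list of pivot columns.
    """
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivots = []
    r = 0
    for c in range(n_cols):
        if r >= n_rows:
            break
        # Find a row at or below r with a non-zero entry in column c.
        pivot_row = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        pivot_value = rows[r][c]
        rows[r] = [x / pivot_value for x in rows[r]]
        # Clear column c in every other row.
        for i in range(n_rows):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    return rows, pivots


def steady_state_basis(S):
    """Return (free flux indices, basis vectors) solving S v = 0."""
    reduced, pivots = rref(S)
    n_cols = len(S[0])
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n_cols
        v[f] = Fraction(1)
        # Each dependent flux is expressed through the free flux f.
        for row_idx, p in enumerate(pivots):
            v[p] = -reduced[row_idx][f]
        basis.append(v)
    return free, basis


# Toy linear pathway (hypothetical): uptake -> A -> B -> export.
# Rows are metabolites A and B; columns are fluxes v0, v1, v2.
S_toy = [[1, -1, 0],
         [0, 1, -1]]
free_fluxes, flux_basis = steady_state_basis(S_toy)
```

    In the toy pathway, one flux remains free and the other two are fixed relative to it, so a single measured flux would constrain the whole steady state. Exact arithmetic avoids the accumulation of rounding errors that the record above names as a concern for large networks.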

  18. Rapid genome-scale mapping of chromatin accessibility in tissue

    Science.gov (United States)

    2012-01-01

    Background The challenge in extracting genome-wide chromatin features from limiting clinical samples poses a significant hurdle in identification of regulatory marks that impact the physiological or pathological state. Current methods that identify nuclease accessible chromatin are reliant on large amounts of purified nuclei as starting material. This complicates analysis of trace clinical tissue samples that are often stored frozen. We have developed an alternative nuclease based procedure to bypass nuclear preparation to interrogate nuclease accessible regions in frozen tissue samples. Results Here we introduce a novel technique that specifically identifies Tissue Accessible Chromatin (TACh). The TACh method uses pulverized frozen tissue as starting material and employs one of the two robust endonucleases, Benzonase or Cyanase, which are fully active under a range of stringent conditions such as high levels of detergent and DTT. As a proof of principle we applied TACh to frozen mouse liver tissue. Combined with massive parallel sequencing TACh identifies accessible regions that are associated with euchromatic features and accessibility at transcriptional start sites correlates positively with levels of gene transcription. Accessible chromatin identified by TACh overlaps to a large extent with accessible chromatin identified by DNase I using nuclei purified from freshly isolated liver tissue as starting material. The similarities are most pronounced at highly accessible regions, whereas identification of less accessible regions tends to be more divergent between nucleases. Interestingly, we show that some of the differences between DNase I and Benzonase relate to their intrinsic sequence biases and accordingly accessibility of CpG islands is probed more efficiently using TACh. Conclusion The TACh methodology identifies accessible chromatin derived from frozen tissue samples. We propose that this simple, robust approach can be applied across a broad range of

  19. Rapid genome-scale mapping of chromatin accessibility in tissue

    Directory of Open Access Journals (Sweden)

    Grøntved Lars

    2012-06-01

    Full Text Available Abstract Background The challenge in extracting genome-wide chromatin features from limiting clinical samples poses a significant hurdle in identification of regulatory marks that impact the physiological or pathological state. Current methods that identify nuclease accessible chromatin are reliant on large amounts of purified nuclei as starting material. This complicates analysis of trace clinical tissue samples that are often stored frozen. We have developed an alternative nuclease based procedure to bypass nuclear preparation to interrogate nuclease accessible regions in frozen tissue samples. Results Here we introduce a novel technique that specifically identifies Tissue Accessible Chromatin (TACh). The TACh method uses pulverized frozen tissue as starting material and employs one of the two robust endonucleases, Benzonase or Cyanase, which are fully active under a range of stringent conditions such as high levels of detergent and DTT. As a proof of principle we applied TACh to frozen mouse liver tissue. Combined with massive parallel sequencing TACh identifies accessible regions that are associated with euchromatic features and accessibility at transcriptional start sites correlates positively with levels of gene transcription. Accessible chromatin identified by TACh overlaps to a large extent with accessible chromatin identified by DNase I using nuclei purified from freshly isolated liver tissue as starting material. The similarities are most pronounced at highly accessible regions, whereas identification of less accessible regions tends to be more divergent between nucleases. Interestingly, we show that some of the differences between DNase I and Benzonase relate to their intrinsic sequence biases and accordingly accessibility of CpG islands is probed more efficiently using TACh. Conclusion The TACh methodology identifies accessible chromatin derived from frozen tissue samples. We propose that this simple, robust approach can be applied

  20. Scaling laws in functional genome content across prokaryotic clades and lifestyles.

    Science.gov (United States)

    Molina, Nacho; van Nimwegen, Erik

    2009-06-01

    For high-level functional categories that are represented in almost all prokaryotic genomes, the numbers of genes in these categories scale as power-laws in the total number of genes. We present a comprehensive analysis of the variation in these scaling laws across prokaryotic clades and lifestyles. For the large majority of functional categories, including transcription regulators, the inferred scaling laws are statistically indistinguishable across clades and lifestyles, supporting the simple hypothesis that these scaling laws are universally shared by all prokaryotes.
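
    The scaling relation described in this abstract can be written as n_c ≈ a·N^b, where N is the total number of genes in a genome and n_c is the number of genes in one functional category. As a minimal sketch (not the authors' inference procedure, which the abstract does not describe), the exponent b can be estimated by ordinary least squares on log-transformed counts. The function name `fit_power_law` and the synthetic gene counts are illustrative assumptions, not data from the study.

```python
import math


def fit_power_law(total_genes, category_genes):
    """Fit log(n_c) = log(a) + b * log(N) by least squares; return (a, b)."""
    xs = [math.log(n) for n in total_genes]
    ys = [math.log(c) for c in category_genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Slope of the log-log regression line is the scaling exponent b.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    # Intercept gives log(a); exponentiate to recover the prefactor.
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic genomes generated from n_c = 1e-5 * N**2 (made-up values).
genome_sizes = [1000, 2000, 4000, 8000]
category_counts = [1e-5 * n ** 2 for n in genome_sizes]
prefactor, exponent = fit_power_law(genome_sizes, category_counts)
print(prefactor, exponent)
```

    Comparing clades or lifestyles would then amount to fitting each group separately and asking whether the fitted exponents differ by more than their uncertainty, which this sketch does not estimate.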

  1. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models.

    Science.gov (United States)

    King, Zachary A; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A; Ebrahim, Ali; Palsson, Bernhard O; Lewis, Nathan E

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data.

  2. Genome-wide transcriptomic analysis of cotton under drought stress reveal significant down-regulation of genes and pathways involved in fibre elongation and up-regulation of defense responsive genes.

    Science.gov (United States)

    Padmalatha, Kethireddy Venkata; Dhandapani, Gurusamy; Kanakachari, Mogilicherla; Kumar, Saravanan; Dass, Abhishek; Patil, Deepak Prabhakar; Rajamani, Vijayalakshmi; Kumar, Krishan; Pathak, Ranjana; Rawat, Bhupendra; Leelavathi, Sadhu; Reddy, Palakolanu Sudhakar; Jain, Neha; Powar, Kasu N; Hiremath, Vamadevaiah; Katageri, Ishwarappa S; Reddy, Malireddy K; Solanke, Amolkumar U; Reddy, Vanga Siva; Kumar, Polumetla Ananda

    2012-02-01

    Cotton is an important source of natural fibre used in the textile industry and the productivity of the crop is adversely affected by drought stress. High throughput transcriptomic analyses were used to identify genes involved in fibre development. However, not much information is available on cotton genome response in developing fibres under drought stress. In the present study a genome wide transcriptome analysis was carried out to identify differentially expressed genes at various stages of fibre growth under drought stress. Our study identified a number of genes differentially expressed during fibre elongation as compared to other stages. High level up-regulation of genes encoding enzymes involved in pectin modification and cytoskeleton proteins was observed at fibre initiation stage. A large number of genes encoding transcription factors (AP2-EREBP, WRKY, NAC and C2H2), osmoprotectants, ion transporters and heat shock proteins, and pathways involved in hormone (ABA, ethylene and JA) biosynthesis and signal transduction, were up-regulated, while genes involved in phenylpropanoid and flavonoid biosynthesis, pentose and glucuronate interconversions and starch and sucrose metabolism pathways were down-regulated during fibre elongation. This study showed that drought has relatively less impact on fibre initiation but has a profound effect on fibre elongation by down-regulating important genes involved in cell wall loosening and expansion process. The comprehensive transcriptome analysis under drought stress has provided valuable information on differentially expressed genes and pathways during fibre development that will be useful in developing drought tolerant cotton cultivars without compromising fibre quality.

  3. Large-scale profiling of microRNAs for The Cancer Genome Atlas.

    Science.gov (United States)

    Chu, Andy; Robertson, Gordon; Brooks, Denise; Mungall, Andrew J; Birol, Inanc; Coope, Robin; Ma, Yussanne; Jones, Steven; Marra, Marco A

    2016-01-01

    The comprehensive multiplatform genomics data generated by The Cancer Genome Atlas (TCGA) Research Network is an enabling resource for cancer research. It includes an unprecedented amount of microRNA sequence data: ~11 000 libraries across 33 cancer types. Combined with initiatives like the National Cancer Institute Genomics Cloud Pilots, such data resources will make intensive analysis of large-scale cancer genomics data widely accessible. To support such initiatives, and to enable comparison of TCGA microRNA data to data from other projects, we describe the process that we developed and used to generate the microRNA sequence data, from library construction through to submission of data to repositories. In the context of this process, we describe the computational pipeline that we used to characterize microRNA expression across large patient cohorts.

  4. Improved annotation through genome-scale metabolic modeling of Aspergillus oryzae

    DEFF Research Database (Denmark)

    Vongsangnak, Wanwipa; Olsen, Peter; Hansen, Kim;

    2008-01-01

    Background: Since ancient times the filamentous fungus Aspergillus oryzae has been used in the fermentation industry for the production of fermented sauces and the production of industrial enzymes. Recently, the genome sequence of A. oryzae with 12,074 annotated genes was released but the number of hypothetical proteins accounted for more than 50% of the annotated genes. Considering the industrial importance of this fungus, it is therefore valuable to improve the annotation and further integrate genomic information with biochemical and physiological information available for this microorganism and other... to a genome scale metabolic model of A. oryzae. Results: Using our assembled EST sequences we identified 1,046 newly predicted genes in the A. oryzae genome. Furthermore, it was possible to assign putative protein functions to 398 of the newly predicted genes. Noteworthy, our annotation strategy resulted...

  5. Genome-wide fine-scale recombination rate variation in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Andrew H Chan

    Full Text Available Estimating fine-scale recombination maps of Drosophila from population genomic data is a challenging problem, in particular because of the high background recombination rate. In this paper, a new computational method is developed to address this challenge. Through an extensive simulation study, it is demonstrated that the method allows more accurate inference, and exhibits greater robustness to the effects of natural selection and noise, compared to a well-used previous method developed for studying fine-scale recombination rate variation in the human genome. As an application, a genome-wide analysis of genetic variation data is performed for two Drosophila melanogaster populations, one from North America (Raleigh, USA) and the other from Africa (Gikongoro, Rwanda). It is shown that fine-scale recombination rate variation is widespread throughout the D. melanogaster genome, across all chromosomes and in both populations. At the fine-scale, a conservative, systematic search for evidence of recombination hotspots suggests the existence of a handful of putative hotspots each with at least a tenfold increase in intensity over the background rate. A wavelet analysis is carried out to compare the estimated recombination maps in the two populations and to quantify the extent to which recombination rates are conserved. In general, similarity is observed at very broad scales, but substantial differences are seen at fine scales. The average recombination rate of the X chromosome appears to be higher than that of the autosomes in both populations, and this pattern is much more pronounced in the African population than the North American population. The correlation between various genomic features, including recombination rates, diversity, divergence, GC content, gene content, and sequence quality, is examined using the wavelet analysis, and it is shown that the most notable difference between D. melanogaster and humans is in the correlation between

  6. Genome-scale reconstruction of the sigma factor network in Escherichia coli: topology and functional states

    DEFF Research Database (Denmark)

    Cho, Byung-Kwan; Kim, Donghyuk; Knight, Eric M.

    2014-01-01

    Background: At the beginning of the transcription process, the RNA polymerase (RNAP) core enzyme requires a sigma-factor to recognize the genomic location at which the process initiates. Although the crucial role of sigma-factors has long been appreciated and characterized for many individual promoters, we do not yet have a genome-scale assessment of their function. Results: Using multiple genome-scale measurements, we elucidated the network of sigma-factor and promoter interactions in Escherichia coli. The reconstructed network includes 4,724 sigma-factor-specific promoters corresponding to transcription units (TUs), representing an increase of more than 300% over what has been previously reported. The reconstructed network was used to investigate competition between alternative sigma-factors (the sigma(70) and sigma(38) regulons), confirming the competition model of sigma substitution...

  7. Generation and Evaluation of a Genome-Scale Metabolic Network Model of Synechococcus elongatus PCC7942

    Directory of Open Access Journals (Sweden)

    Julián Triana

    2014-08-01

    Full Text Available The reconstruction of genome-scale metabolic models and their applications represent a great advantage of systems biology. Through their use as metabolic flux simulation models, production of industrially-interesting metabolites can be predicted. Due to the growing number of studies of metabolic models driven by the increasing genomic sequencing projects, it is important to conceptualize steps of reconstruction and analysis. We have focused our work on the cyanobacterium Synechococcus elongatus PCC7942, for which several analyses and insights are unveiled. A comprehensive approach has been used, which can be of interest to lead the process of manual curation and genome-scale metabolic analysis. The final model, iSyf715, includes 851 reactions and 838 metabolites. A biomass equation, which encompasses elementary building blocks to allow cell growth, is also included. The applicability of the model is finally demonstrated by simulating autotrophic growth conditions of Synechococcus elongatus PCC7942.

  8. Generation and Evaluation of a Genome-Scale Metabolic Network Model of Synechococcus elongatus PCC7942

    Science.gov (United States)

    Triana, Julián; Montagud, Arnau; Siurana, Maria; Fuente, David; Urchueguía, Arantxa; Gamermann, Daniel; Torres, Javier; Tena, Jose; de Córdoba, Pedro Fernández; Urchueguía, Javier F.

    2014-01-01

    The reconstruction of genome-scale metabolic models and their applications represent a great advantage of systems biology. Through their use as metabolic flux simulation models, production of industrially-interesting metabolites can be predicted. Due to the growing number of studies of metabolic models driven by the increasing genomic sequencing projects, it is important to conceptualize steps of reconstruction and analysis. We have focused our work on the cyanobacterium Synechococcus elongatus PCC7942, for which several analyses and insights are unveiled. A comprehensive approach has been used, which can be of interest to lead the process of manual curation and genome-scale metabolic analysis. The final model, iSyf715, includes 851 reactions and 838 metabolites. A biomass equation, which encompasses elementary building blocks to allow cell growth, is also included. The applicability of the model is finally demonstrated by simulating autotrophic growth conditions of Synechococcus elongatus PCC7942. PMID:25141288

  9. Identification of molecular phenotypic descriptors of breast capsular contracture formation using informatics analysis of the whole genome transcriptome.

    Science.gov (United States)

    Kyle, Daniel J T; Harvey, Alison G; Shih, Barbara; Tan, Kian T; Chaudhry, Iskander H; Bayat, Ardeshir

    2013-01-01

    Breast capsular contracture formation following silicone implant augmentation/reconstruction is a common complication that remains poorly understood. The aim of this study was to identify potential biomarkers implicated in breast capsular contracture formation by using, for the first time, whole genome arrays. Biopsy samples were taken from 18 patients (23 breast capsules) with Baker Grade I-II (Control) and Baker Grade III-IV (Contracted). Whole genome microarrays were performed and six significantly dysregulated genes were selected for further validation with quantitative reverse transcriptase polymerase chain reaction and immunohistochemistry. Hematoxylin and eosin was also carried out to compare the histological characteristics of control and contracted samples. Microarray results showed that aggrecan, tissue inhibitor of metalloproteinase 4 (TIMP4), and tumor necrosis factor superfamily (ligand) member 11 were significantly down-regulated in contracted capsules; while matrix metallopeptidase 12, serum amyloid A 1, and interleukin 8 (IL8) were significantly up-regulated. The dysregulation of aggrecan, tumor necrosis factor superfamily (ligand) member 11, TIMP4, and IL8 was validated by quantitative reverse transcriptase polymerase chain reaction (p contracture formation. IL8 and TIMP4 may serve as potential key diagnostic, therapeutic, and prognostic biomarkers in capsular contracture formation. © 2013 by the Wound Healing Society.

  10. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    DEFF Research Database (Denmark)

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized...

  11. Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression

    DEFF Research Database (Denmark)

    Ma, Ding; Yang, Laurence; Fleming, Ronan M. T.

    2017-01-01

    Constraint-Based Reconstruction and Analysis (COBRA) is currently the only methodology that permits integrated modeling of Metabolism and macromolecular Expression (ME) at genome-scale. Linear optimization computes steady-state flux solutions to ME models, but flux values are spread over many...

  12. Interplay between Constraints, Objectives, and Optimality for Genome-Scale Stoichiometric Models

    NARCIS (Netherlands)

    Maarleveld, T.R.; Wortel, M.; Olivier, B.G.; Teusink, B.; Bruggeman, F.J.

    2015-01-01

    High-throughput data generation and genome-scale stoichiometric models have greatly facilitated the comprehensive study of metabolic networks. The computation of all feasible metabolic routes with these models, given stoichiometric, thermodynamic, and steady-state constraints, provides important ins

  13. MultiMetEval : Comparative and Multi-Objective Analysis of Genome-Scale Metabolic Models

    NARCIS (Netherlands)

    Zakrzewski, Piotr; Medema, Marnix H.; Gevorgyan, Albert; Kierzek, Andrzej M.; Breitling, Rainer; Takano, Eriko; Fong, Stephen S.

    2012-01-01

    Comparative metabolic modelling is emerging as a novel field, supported by the development of reliable and standardized approaches for constructing genome-scale metabolic models in high throughput. New software solutions are needed to allow efficient comparative analysis of multiple models in the co

  14. Challenges in experimental data integration within genome-scale metabolic models

    Directory of Open Access Journals (Sweden)

    Képès François

    2010-04-01

    Full Text Available Abstract A report of the meeting "Challenges in experimental data integration within genome-scale metabolic models", Institut Henri Poincaré, Paris, October 10-11 2009, organized by the CNRS-MPG joint program in Systems Biology.

  15. The architecture of ArgR-DNA complexes at the genome-scale in Escherichia coli

    DEFF Research Database (Denmark)

    Cho, Suhyung; Cho, Yoo-Bok; Kang, Taek Jin;

    2015-01-01

    DNA-binding motifs that are recognized by transcription factors (TFs) have been well studied; however, challenges remain in determining the in vivo architecture of TF-DNA complexes on a genome-scale. Here, we determined the in vivo architecture of Escherichia coli arginine repressor (ArgR)-DNA co...

  16. A versatile genome-scale PCR-based pipeline for high-definition DNA FISH

    NARCIS (Netherlands)

    Bienko, M.; Crosetto, N.; Teytelman, L.; Klemm, S.; Itzkovitz, S.; van Oudenaarden, A.

    2013-01-01

    We developed a cost-effective genome-scale PCR-based method for high-definition DNA FISH (HD-FISH). We visualized gene loci with diffraction-limited resolution, chromosomes as spot clusters and single genes together with transcripts by combining HD-FISH with single-molecule RNA FISH. We provide a da

  17. Comparative genome-scale metabolic modeling of actinomycetes : The topology of essential core metabolism

    NARCIS (Netherlands)

    Alam, Mohammad Tauqeer; Medema, Marnix H.; Takano, Eriko; Breitling, Rainer; Gojobori, Takashi

    2011-01-01

    Actinomycetes are highly important bacteria. On one hand, some of them cause severe human and plant diseases, on the other hand, many species are known for their ability to produce antibiotics. Here we report the results of a comparative analysis of genome-scale metabolic models of 37 species of act

  18. Comparative genome-scale metabolic modeling of actinomycetes: the topology of essential core metabolism.

    NARCIS (Netherlands)

    Alam, M.T.; Medema, M.H.; Takano, E.; Breitling, R.

    2011-01-01

    Actinomycetes are highly important bacteria. On one hand, some of them cause severe human and plant diseases, on the other hand, many species are known for their ability to produce antibiotics. Here we report the results of a comparative analysis of genome-scale metabolic models of 37 species of act

  1. GMATA: an integrated software package for genome-scale SSR mining, marker development and viewing

    Directory of Open Access Journals (Sweden)

    Xuewen Wang

    2016-09-01

    Full Text Available Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed a novel software package, Genome-wide Microsatellite Analyzing Tool Package (GMATA), integrating SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability, and enabling simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allow GMATA to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed input on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than the rich G/C content in DNA sequence. We also revealed that SSR count is linearly related to the chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar.

  2. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing

    Science.gov (United States)

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed a novel software package, Genome-wide Microsatellite Analyzing Tool Package (GMATA), integrating SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability, and enabling simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allow GMATA to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed input on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than the rich G/C content in DNA sequence. We also revealed that SSR count is linearly related to the chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar. PMID:27679641
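
    To make the idea of SSR mining concrete, the sketch below finds perfect tandem repeats with a regular expression. It is not GMATA's algorithm, which the abstract does not describe; the function name `find_ssrs`, the motif length range and the minimum repeat count are assumptions chosen for illustration, and the input sequence is made up.

```python
import re


def find_ssrs(sequence, min_motif=1, max_motif=3, min_repeats=5):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    sequence = sequence.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A motif of length k followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA"),
            # since those runs are already reported at the shorter motif length.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Made-up sequence: six GA dimers and a run of seven A monomers.
example = "TT" + "GA" * 6 + "CC" + "A" * 7 + "GT"
ssrs = find_ssrs(example)
print(ssrs)
```

    A genome-scale tool would also need to stream large sequences, handle imperfect repeats and report flanking regions for primer design; none of that is attempted here.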

  3. Genome-scale identification method applied to find cryptic aminoglycoside resistance genes in Pseudomonas aeruginosa.

    Directory of Open Access Journals (Sweden)

    Julie M Struble

    Full Text Available BACKGROUND: The ability of bacteria to rapidly evolve resistance to antibiotics is a critical public health problem. Resistance leads to increased disease severity and death rates, and imposes pressure towards the discovery and development of new antibiotic therapies. Improving understanding of the evolution and genetic basis of resistance is a fundamental goal in the field of microbiology. RESULTS: We have applied a new genomic method, Scalar Analysis of Library Enrichments (SCALEs), to identify genomic regions that, given increased copy number, may lead to aminoglycoside resistance in Pseudomonas aeruginosa at the genome scale. We report the result of selections on highly representative genomic libraries for three different aminoglycoside antibiotics (amikacin, gentamicin, and tobramycin). At the genome-scale, we show significant (p<0.05) overlap in genes identified for each aminoglycoside evaluated. Among the genomic segments identified, we confirmed increased resistance associated with an increased copy number of several genomic regions, including the ORF of PA5471, recently implicated in MexXY efflux pump related aminoglycoside resistance, PA4943-PA4946 (encoding a probable GTP-binding protein, a predicted host factor I protein, a delta 2-isopentenylpyrophosphate transferase, and DNA mismatch repair protein mutL), PA0960-PA0963 (encoding hypothetical proteins, a probable cold shock protein, a probable DNA-binding stress protein, and aspartyl-tRNA synthetase), a segment of PA4967 (encoding a topoisomerase IV subunit B), as well as a chimeric clone containing two inserts including the ORFs PA0547 and PA2326 (encoding a probable transcriptional regulator and a probable hypothetical protein, respectively). CONCLUSIONS: The studies reported here demonstrate the application of a new genomic method, SCALEs, which can be used to improve understanding of the evolution of antibiotic resistance in P. aeruginosa. In our demonstration studies, we

  4. Heat stress-responsive transcriptome analysis in heat susceptible and tolerant wheat (Triticum aestivum L.) by using Wheat Genome Array

    Directory of Open Access Journals (Sweden)

    Peng Huiru

    2008-09-01

    Full Text Available Background: Wheat is a major crop in the world, and high temperature stress can reduce the yield of wheat by as much as 15%. The molecular changes in response to heat stress are poorly understood. Using the GeneChip® Wheat Genome Array, we analyzed genome-wide gene expression profiles in the leaves of two wheat genotypes, namely, heat susceptible 'Chinese Spring' (CS) and heat tolerant 'TAM107' (TAM). Results: A total of 6560 (~10.7%) probe sets displayed 2-fold or more changes in expression in at least one heat treatment (false discovery rate, FDR, α = 0.001). Apart from heat shock protein (HSP) and heat shock factor (HSF) genes, these putative heat responsive genes encode transcription factors and proteins involved in phytohormone biosynthesis/signaling, calcium and sugar signal pathways, RNA metabolism, ribosomal proteins, primary and secondary metabolisms, as well as proteins related to other stresses. A total of 313 probe sets were differentially expressed between the two genotypes, which could be responsible for the difference in heat tolerance of the two genotypes. Moreover, 1314 were differentially expressed between the heat treatments with and without pre-acclimation, and 4533 were differentially expressed between short and prolonged heat treatments. Conclusion: The differences in heat tolerance in different wheat genotypes may be associated with multiple processes and mechanisms involving HSPs, transcription factors, and other stress related genes. Heat acclimation has little effect on gene expression under prolonged treatments but affects gene expression in wheat under short-term heat stress. The heat stress responsive genes identified in this study will facilitate our understanding of the molecular basis for heat tolerance in different wheat genotypes and future improvement of heat tolerance in wheat and other cereals.

  5. Genomic and Transcriptomic Analyses of Colistin-Resistant Clinical Isolates of Klebsiella pneumoniae Reveal Multiple Pathways of Resistance

    Science.gov (United States)

    Wright, Meredith S.; Suzuki, Yo; Jones, Marcus B.; Marshall, Steven H.; Rudin, Susan D.; van Duin, David; Kaye, Keith; Jacobs, Michael R.

    2014-01-01

    The emergence of multidrug-resistant (MDR) Klebsiella pneumoniae has resulted in a more frequent reliance on treatment using colistin. However, resistance to colistin (Colr) is increasingly reported from clinical settings. The genetic mechanisms that lead to Colr in K. pneumoniae are not fully characterized. Using a combination of genome sequencing and transcriptional profiling by RNA sequencing (RNA-Seq) analysis, distinct genetic mechanisms were found among nine Colr clinical isolates. Colr was related to mutations in three different genes in K. pneumoniae strains, with distinct impacts on gene expression. Upregulation of the pmrH operon encoding 4-amino-4-deoxy-l-arabinose (Ara4N) modification of lipid A was found in all Colr strains. Alteration of the mgrB gene was observed in six strains. One strain had a mutation in phoQ. Common among these seven strains was elevated expression of phoPQ and unaltered expression of pmrCAB, which is involved in phosphoethanolamine addition to lipopolysaccharide (LPS). In two strains, separate mutations were found in a previously uncharacterized histidine kinase gene that is part of a two-component regulatory system (TCRS) now designated crrAB. In these strains, expression of pmrCAB, crrAB, and an adjacent glycosyltransferase gene, but not that of phoPQ, was elevated. Complementation with the wild-type allele restored colistin susceptibility in both strains. The crrAB genes are present in most K. pneumoniae genomes, but not in Escherichia coli. Additional upregulated genes in all strains include those involved in cation transport and maintenance of membrane integrity. Because the crrAB genes are present in only some strains, Colr mechanisms may be dependent on the genetic background. PMID:25385117

  6. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    Energy Technology Data Exchange (ETDEWEB)

    Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric; Abernathy, Jason; Waldbieser, Geoff; Lindquist, Erika; Richardson, Paul; Lucas, Susan; Wang, Mei; Li, Ping; Thimmapuram, Jyothi; Liu, Lei; Vullaganti, Deepika; Kucuktas, Huseyin; Murdock, Christopher; Small, Brian C; Wilson, Melanie; Liu, Hong; Jiang, Yanliang; Lee, Yoona; Chen, Fei; Lu, Jianguo; Wang, Wenqi; Xu, Peng; Somridhivej, Benjaporn; Baoprasertkul, Puttharat; Quilang, Jonas; Sha, Zhenxia; Bao, Baolong; Wang, Yaping; Wang, Qun; Takano, Tomokazu; Nandi, Samiran; Liu, Shikai; Wong, Lilian; Kaltenboeck, Ludmilla; Quiniou, Sylvie; Bengten, Eva; Miller, Norman; Trant, John; Rokhsar, Daniel; Liu, Zhanjiang

    2010-03-23

    Background: Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results: A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35% of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions: This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from all catfish EST dataset assembly will greatly benefit the catfish introgression breeding program and whole genome association studies.
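
    The high-quality SNP criterion stated in this abstract (at least four sequences in the contig, with the minor allele present in at least two of them) can be expressed as a small filter. The sketch below is an illustration only, not the consortium's pipeline: the function name `is_high_quality_snp` is hypothetical, and it simplifies by treating the number of base calls at one alignment column as the contig depth.

```python
from collections import Counter


def is_high_quality_snp(bases_at_site, min_depth=4, min_minor=2):
    """Apply the depth and minor-allele thresholds to one alignment column.

    bases_at_site holds one base call per EST covering the position.
    """
    counts = Counter(bases_at_site)
    # Need enough sequences and at least two distinct alleles to call a SNP.
    if len(bases_at_site) < min_depth or len(counts) < 2:
        return False
    # The second most frequent allele is taken as the minor allele.
    minor_count = counts.most_common(2)[1][1]
    return minor_count >= min_minor


# Made-up alignment columns.
print(is_high_quality_snp("AAGG"))  # two alleles, each seen twice
print(is_high_quality_snp("AAAG"))  # minor allele seen only once
```

    Requiring the minor allele in two or more sequences guards against single-read sequencing errors being reported as polymorphisms.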

  7. Transcriptomic and proteomic responses of Serratia marcescens to spaceflight conditions involve large-scale changes in metabolic pathways

    Science.gov (United States)

    Wang, Yajuan; Yuan, Yanting; Liu, Jinwen; Su, Longxiang; Chang, De; Guo, Yinghua; Chen, Zhenhong; Fang, Xiangqun; Wang, Junfeng; Li, Tianzhi; Zhou, Lisha; Fang, Chengxiang; Yang, Ruifu; Liu, Changting

    2014-04-01

    The microgravity environment of spaceflight expeditions has been associated with altered microbial responses. This study explores the characterization of Serratia marcescens grown in a spaceflight environment at the phenotypic, transcriptomic and proteomic levels. From November 1, 2011 to November 17, 2011, a strain of S. marcescens was sent into space for 398 h on the Shenzhou VIII spacecraft, and ground simulation was performed as a control (LCT-SM213). After the flight, two mutant strains (LCT-SM166 and LCT-SM262) were selected for further analysis. Although no changes in the morphology, post-culture growth kinetics, hemolysis or antibiotic sensitivity were observed, the two mutant strains exhibited significant changes in their metabolic profiles after exposure to spaceflight. Enrichment analysis of the transcriptome showed that the differentially expressed genes of the two spaceflight strains and the ground control strain mainly included those involved in metabolism and degradation. The proteome revealed that changes at the protein level were also associated with metabolic functions, such as glycolysis/gluconeogenesis, pyruvate metabolism, arginine and proline metabolism and the degradation of valine, leucine and isoleucine. In summary, S. marcescens showed alterations primarily in genes and proteins that were associated with metabolism under spaceflight conditions, which gave us valuable clues for future research.

  8. Genome-Wide Transcriptome Analysis of Cotton (Gossypium hirsutum L.) Identifies Candidate Gene Signatures in Response to Aflatoxin Producing Fungus Aspergillus flavus.

    Science.gov (United States)

    Bedre, Renesh; Rajasekaran, Kanniah; Mangu, Venkata Ramanarao; Sanchez Timm, Luis Eduardo; Bhatnagar, Deepak; Baisakh, Niranjan

    2015-01-01

    Aflatoxins are toxic and potently carcinogenic metabolites produced by the fungi Aspergillus flavus and A. parasiticus. Aflatoxins can contaminate cottonseed under conducive preharvest and postharvest conditions. United States federal regulations restrict the use of aflatoxin contaminated cottonseed at >20 ppb for animal feed. Several strategies have been proposed for controlling aflatoxin contamination, and much success has been achieved by the application of an atoxigenic strain of A. flavus in cotton, peanut and maize fields. Development of cultivars resistant to aflatoxin through overexpression of resistance associated genes and/or knocking down aflatoxin biosynthesis of A. flavus will be an effective strategy for controlling aflatoxin contamination in cotton. In this study, genome-wide transcriptome profiling was performed to identify differentially expressed genes in response to infection with both toxigenic and atoxigenic strains of A. flavus on cotton (Gossypium hirsutum L.) pericarp and seed. The genes involved in antifungal response, oxidative burst, transcription factors, defense signaling pathways and stress response were highly differentially expressed in pericarp and seed tissues in response to A. flavus infection. The cell-wall modifying genes and genes involved in the production of antimicrobial substances were more active in pericarp as compared to seed. The genes involved in auxin and cytokinin signaling were also induced. Most of the genes involved in defense response in cotton were more highly induced in pericarp than in seed. The global gene expression analysis in response to fungal invasion in cotton will serve as a source for identifying biomarkers for breeding and potential candidate genes for transgenic manipulation, and will help in understanding complex plant-fungal interaction for future downstream research.

  9. Genome-Scale Analysis of Cell-Specific Regulatory Codes Using Nuclear Enzymes.

    Science.gov (United States)

    Baek, Songjoon; Sung, Myong-Hee

    2016-01-01

    High-throughput sequencing technologies have made it possible for biologists to generate genome-wide profiles of chromatin features at the nucleotide resolution. Enzymes such as nucleases or transposases have been instrumental as chromatin-probing agents due to their ability to target accessible chromatin for cleavage or insertion. On the scale of a few hundred base pairs, preferential action of the nuclear enzymes on accessible chromatin allows mapping of cell state-specific accessibility in vivo. Such accessible regions contain functionally important regulatory sites, including promoters and enhancers, which undergo active remodeling for cells adapting in a dynamic environment. DNase-seq and the more recent ATAC-seq are two assays that are gaining popularity. Deep sequencing of DNA libraries from these assays, termed genomic footprinting, has been proposed to enable the comprehensive construction of protein occupancy profiles over the genome at the nucleotide level. Recent studies have discovered limitations of genomic footprinting which reduce the scope of detectable proteins. In addition, the identification of putative factors that bind to the observed footprints remains challenging. Despite these caveats, the methodology still presents significant advantages over alternative techniques such as ChIP-seq or FAIRE-seq. Here we describe computational approaches and tools for analysis of chromatin accessibility and genomic footprinting. Proper experimental design and assay-specific data analysis ensure the detection sensitivity and maximize retrievable information. The enzyme-based chromatin profiling approaches represent a powerful and evolving methodology which facilitates our understanding of how the genome is regulated.

  10. Rare and common regulatory variation in population-scale sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Stephen B Montgomery

    2011-07-01

    Full Text Available Population-scale genome sequencing allows the characterization of functional effects of a broad spectrum of genetic variants underlying human phenotypic variation. Here, we investigate the influence of rare and common genetic variants on gene expression patterns, using variants identified from sequencing data from the 1000 genomes project in an African and European population sample and gene expression data from lymphoblastoid cell lines. We detect comparable numbers of expression quantitative trait loci (eQTLs) when compared to genotypes obtained from HapMap 3, but as many as 80% of the top expression quantitative trait variants (eQTVs) discovered from 1000 genomes data are novel. The properties of the newly discovered variants suggest that mapping common causal regulatory variants is challenging even with full resequencing data; however, we observe significant enrichment of regulatory effects in splice-site and nonsense variants. Using RNA sequencing data, we show that 46.2% of nonsynonymous variants are differentially expressed in at least one individual in our sample, creating widespread potential for interactions between functional protein-coding and regulatory variants. We also use allele-specific expression to identify putative rare causal regulatory variants. Furthermore, we demonstrate that outlier expression values can be due to rare variant effects, and we approximate the number of such effects harboured in an individual by effect size. Our results demonstrate that integration of genomic and RNA sequencing analyses allows for the joint assessment of genome sequence and genome function.

  11. Toward the automated generation of genome-scale metabolic networks in the SEED

    Directory of Open Access Journals (Sweden)

    Gould John

    2007-04-01

    Full Text Available Abstract Background Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. Results We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative

  12. The genome sequence of E. coli W (ATCC 9637): comparative genome analysis and an improved genome-scale reconstruction of E. coli

    Directory of Open Access Journals (Sweden)

    Lee Sang

    2011-01-01

    Full Text Available Abstract Background Escherichia coli is a model prokaryote, an important pathogen, and a key organism for industrial biotechnology. E. coli W (ATCC 9637), one of four strains designated as safe for laboratory purposes, has not been sequenced. E. coli W is a fast-growing strain and is the only safe strain that can utilize sucrose as a carbon source. Lifecycle analysis has demonstrated that sucrose from sugarcane is a preferred carbon source for industrial bioprocesses. Results We have sequenced and annotated the genome of E. coli W. The chromosome is 4,900,968 bp and encodes 4,764 ORFs. Two plasmids, pRK1 (102,536 bp) and pRK2 (5,360 bp), are also present. W has unique features relative to other sequenced laboratory strains (K-12, B and Crooks): it has a larger genome and belongs to phylogroup B1 rather than A. W also grows on a much broader range of carbon sources than does K-12. A genome-scale reconstruction was developed and validated in order to interrogate metabolic properties. Conclusions The genome of W is more similar to commensal and pathogenic B1 strains than phylogroup A strains, and therefore has greater utility for comparative analyses with these strains. W should therefore be the strain of choice, or 'type strain' for group B1 comparative analyses. The genome annotation and tools created here are expected to allow further utilization and development of E. coli W as an industrial organism for sucrose-based bioprocesses. Refinements in our E. coli metabolic reconstruction allow it to more accurately define E. coli metabolism relative to previous models.

  13. Gene network analyses of first service conception in Brangus heifers: use of genome and trait associations, hypothalamic-transcriptome information, and transcription factors.

    Science.gov (United States)

    Fortes, M R S; Snelling, W M; Reverter, A; Nagaraj, S H; Lehnert, S A; Hawken, R J; DeAtley, K L; Peters, S O; Silver, G A; Rincon, G; Medrano, J F; Islas-Trejo, A; Thomas, M G

    2012-09-01

    Measures of heifer fertility are economically relevant traits for beef production systems and knowledge of candidate genes could be incorporated into future genomic selection strategies. Ten traits related to growth and fertility were measured in 890 Brangus heifers (3/8 Brahman × 5/8 Angus, from 67 sires). These traits were: BW and hip height adjusted to 205 and 365 d of age, postweaning ADG, yearling assessment of carcass traits (i.e., back fat thickness, intramuscular fat, and LM area), as well as heifer pregnancy and first service conception (FSC). These fertility traits were collected from controlled breeding seasons initiated with estrous synchronization and AI targeting heifers to calve by 24 mo of age. The BovineSNP50 BeadChip was used to ascertain 53,692 SNP genotypes for ∼802 heifers. Associations of genotypes and phenotypes were performed and SNP effects were estimated for each trait. Minimally associated SNP (P < 0.05) and their effects across the 10 traits formed the basis for an association weight matrix and its derived gene network related to FSC (57.3% success and heritability = 0.06 ± 0.05). These analyses yielded 1,555 important SNP, which inferred genes linked by 113,873 correlations within a network. Specifically, 1,386 SNP were nodes and the 5,132 strongest correlations (|r| ≥ 0.90) were edges. The network was filtered with genes queried from a transcriptome resource created from deep sequencing of RNA (i.e., RNA-Seq) from the hypothalamus of a prepubertal and a postpubertal Brangus heifer. The remaining hypothalamic-influenced network contained 978 genes connected by 2,560 edges or predicted gene interactions. This hypothalamic gene network was enriched with genes involved in axon guidance, which is a pathway known to influence pulsatile release of LHRH. There were 5 transcription factors with 21 or more connections: ZMAT3, STAT6, RFX4, PLAGL1, and NR6A1 for FSC. The SNP that identified these genes were intragenic and were on chromosomes

  14. Systematic planning of genome-scale experiments in poorly studied species.

    Science.gov (United States)

    Guan, Yuanfang; Dunham, Maitreya; Caudy, Amy; Troyanskaya, Olga

    2010-03-05

    Genome-scale datasets have been used extensively in model organisms to screen for specific candidates or to predict functions for uncharacterized genes. However, despite the availability of extensive knowledge in model organisms, the planning of genome-scale experiments in poorly studied species is still based on the intuition of experts or heuristic trials. We propose that computational and systematic approaches can be applied to drive the experiment planning process in poorly studied species based on available data and knowledge in closely related model organisms. In this paper, we suggest a computational strategy for recommending genome-scale experiments based on their capability to interrogate diverse biological processes to enable protein function assignment. To this end, we use the data-rich functional genomics compendium of the model organism to quantify the accuracy of each dataset in predicting each specific biological process and the overlap in such coverage between different datasets. Our approach uses an optimized combination of these quantifications to recommend an ordered list of experiments for accurately annotating most proteins in the poorly studied related organisms to most biological processes, as well as a set of experiments that target each specific biological process. The effectiveness of this experiment-planning system is demonstrated for two related yeast species: the model organism Saccharomyces cerevisiae and the comparatively poorly studied Saccharomyces bayanus. Our system recommended a set of S. bayanus experiments based on an S. cerevisiae microarray data compendium. In silico evaluations estimate that less than 10% of the experiments could achieve similar functional coverage to the whole microarray compendium. This estimation was confirmed by performing the recommended experiments in S. bayanus, therefore significantly reducing the labor devoted to characterizing the poorly studied genome. This experiment-planning framework could

  15. Disturbance of gene expression in primary human hepatocytes by hepatotoxic pyrrolizidine alkaloids: A whole genome transcriptome analysis.

    Science.gov (United States)

    Luckert, Claudia; Hessel, Stefanie; Lenze, Dido; Lampen, Alfonso

    2015-10-01

    1,2-unsaturated pyrrolizidine alkaloids (PA) are plant metabolites predominantly occurring in the plant families Asteraceae and Boraginaceae. Acute and chronic PA poisoning causes severe hepatotoxicity. So far, the molecular mechanisms of PA toxicity are not well understood. To analyze its mode of action, primary human hepatocytes were exposed to a non-cytotoxic dose of 100 μM of four structurally different PA: echimidine, heliotrine, senecionine, senkirkine. Changes in mRNA expression were analyzed by a whole genome microarray. Employing cut-off values with a |fold change| of 2 and a q-value of 0.01, data analysis revealed numerous changes in gene expression. In total, 4556, 1806, 3406 and 8623 genes were regulated by echimidine, heliotrine, senecionine and senkirkine, respectively. 1304 genes were identified as commonly regulated. PA affected pathways related to cell cycle regulation, cell death and cancer development. The transcription factors TP53, MYC, NFκB and NUPR1 were predicted to be activated upon PA treatment. Furthermore, gene expression data showed a considerable interference with lipid metabolism and bile acid flow. The associated transcription factors FXR, LXR, SREBF1/2, and PPARα/γ/δ were predicted to be inhibited. In conclusion, though structurally different, all four PA significantly regulated a great number of genes in common. This suggests similar molecular mechanisms, although the extent seems to differ between the analyzed PA as reflected by the potential hepatotoxicity and individual PA structure.
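
    Illustrative sketch (not part of the source record): the cut-offs quoted in this abstract, a |fold change| of 2 and a q-value of 0.01, and the notion of "commonly regulated" genes can be expressed as a threshold filter followed by a set intersection. The function names, the data layout, the placeholder gene identifiers and the inclusive comparisons are assumptions for illustration; the abstract does not say whether the study's thresholds were inclusive or how fold changes were encoded.

```python
def is_regulated(fold_change, q_value, fc_cutoff=2.0, q_cutoff=0.01):
    """Apply both cut-offs; inclusive comparisons are an assumption."""
    return abs(fold_change) >= fc_cutoff and q_value <= q_cutoff


def commonly_regulated(results):
    """Return genes passing the cut-offs for every compound.

    results: hypothetical layout, {compound: {gene: (fold_change, q_value)}}.
    """
    per_compound = [
        {gene for gene, (fc, q) in genes.items() if is_regulated(fc, q)}
        for genes in results.values()
    ]
    return set.intersection(*per_compound) if per_compound else set()


# Toy values with placeholder gene names; these are not data from the study.
example = {
    "echimidine": {"GENE_A": (2.5, 0.001), "GENE_B": (-3.0, 0.005), "GENE_C": (1.5, 0.001)},
    "heliotrine": {"GENE_A": (4.0, 0.002), "GENE_B": (-2.1, 0.2), "GENE_C": (2.2, 0.001)},
}
```

    With these toy values only GENE_A passes both cut-offs for both compounds, so it is the only member of the common set.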

  16. An ANOCEF Genomic and Transcriptomic Microarray Study of the Response to Irinotecan and Bevacizumab in Recurrent Glioblastomas

    Directory of Open Access Journals (Sweden)

    Julien Laffaire

    2014-01-01

    Full Text Available Background. We performed a retrospective study to assess whether the initial molecular characteristics of glioblastomas (GBMs) were associated with the response to the bevacizumab/irinotecan chemotherapy regimen given at recurrence. Results. Comparison of the genomic and gene expression profiles of the responders (n=12) and nonresponders (n=13) demonstrated only slight differences and could not identify any robust biomarkers associated with the response. In contrast, a significant association was observed between GBM molecular subtypes and response rates. GBMs assigned to molecular subtype IGS-18 and to classical subtype had a lower response rate than those assigned to other subtypes. In an independent series of 33 patients, neither EGFR amplification nor CDKN2A deletion (which are frequent in IGS-18 and classical GBMs) was significantly associated with the response rate, suggesting that these two alterations are unlikely to explain the lower response rate of these GBM molecular subtypes. Conclusion. Despite its limited sample size, the present study suggests that comparing the initial molecular profiles of responders and nonresponders might not be an effective strategy to identify biomarkers of the response to bevacizumab given at recurrence. Yet it suggests that the response rate might differ among GBM molecular subtypes.

  17. Harvesting clues from genome wide transcriptome analysis for exploring thalidomide mediated anomalies in eye development of chick embryo: Nitric oxide rectifies the thalidomide mediated anomalies by swinging back the system to normal transcriptome pattern.

    Science.gov (United States)

    Kumar, Pavitra; Kasiviswanathan, Dharanibalan; Sundaresan, Lakshmikirupa; Kathirvel, Priyadarshan; Veeriah, Vimal; Dutta, Priya; Sankaranarayanan, Kavitha; Gupta, Ravi; Chatterjee, Suvro

    2016-02-01

    Thalidomide, the notorious teratogen, is known to cause various developmental abnormalities, among which a range of eye deformations are very common. From the clinical point of view, it is necessary to pinpoint the mechanisms of teratogens that tune the gene expression. However, to our knowledge, the molecular basis of eye deformities under thalidomide treatment has not been reported so far. The present study focuses on the possible mechanism by which thalidomide affects eye development and the role of Nitric Oxide in recovering thalidomide-mediated anomalies of eye development using chick embryo and zebrafish models with transcriptome analysis. Transcriptome analysis showed that 403 genes were up-regulated and 223 genes were down-regulated significantly in thalidomide pre-treated embryos. 8% of the significantly modulated genes have been implicated in eye development including Pax6, OTX2, Dkk1 and Shh. A wide range of biological processes and molecular functions was affected by thalidomide exposure. Biological Processes including structural constituent of eye lens and Molecular functions such as visual perception and retinal metabolic process formed strong annotation clusters indicating the adverse effects of thalidomide on eye development and function. Here, we have discussed the whole embryo transcriptome with the expression of PAX6, SOX2, and CRYAA genes from developing eyes. Our experimental data showing structural and functional aspects including eye size, lens transparency and optic nerve activity and bioinformatics analyses of transcriptome suggest that NO could partially protect thalidomide treated embryos from its devastating effects on eye development and function.

  18. Reciprocal genomic evolution in the ant-fungus agricultural symbiosis

    DEFF Research Database (Denmark)

    Nygaard, Sanne; Hu, Haofu; Li, Cai;

    2016-01-01

    The attine ant-fungus agricultural symbiosis evolved over tens of millions of years, producing complex societies with industrial-scale farming analogous to that of humans. Here we document reciprocal shifts in the genomes and transcriptomes of seven fungus-farming ant species and their fungal cul...

  19. Reciprocal genomic evolution in the ant-fungus agricultural symbiosis

    DEFF Research Database (Denmark)

    Nygaard, Sanne; Hu, Haofu; Li, Cai

    2016-01-01

    The attine ant-fungus agricultural symbiosis evolved over tens of millions of years, producing complex societies with industrial-scale farming analogous to that of humans. Here we document reciprocal shifts in the genomes and transcriptomes of seven fungus-farming ant species and their fungal...

  20. Reciprocal genomic evolution in the ant-fungus agricultural symbiosis

    DEFF Research Database (Denmark)

    Nygaard, Sanne; Hu, Haofu; Li, Cai;

    2016-01-01

    The attine ant-fungus agricultural symbiosis evolved over tens of millions of years, producing complex societies with industrial-scale farming analogous to that of humans. Here we document reciprocal shifts in the genomes and transcriptomes of seven fungus-farming ant species and their fungal...

  1. iAK692: A genome-scale metabolic model of Spirulina platensis C1

    Science.gov (United States)

    2012-01-01

    Background Spirulina (Arthrospira) platensis is a well-known filamentous cyanobacterium used in the production of many industrial products, including high value compounds, healthy food supplements, animal feeds, pharmaceuticals and cosmetics, for example. It has been increasingly studied around the world for scientific purposes, especially for its genome, biology, physiology, and also for the analysis of its small-scale metabolic network. However, the overall description of the metabolic and biotechnological capabilities of S. platensis requires the development of a whole cellular metabolism model. Recently, the S. platensis C1 (Arthrospira sp. PCC9438) genome sequence has become available, allowing systems-level studies of this commercial cyanobacterium. Results In this work, we present the genome-scale metabolic network analysis of S. platensis C1, iAK692, its topological properties, and its metabolic capabilities and functions. The network was reconstructed from the S. platensis C1 annotated genomic sequence using Pathway Tools software to generate a preliminary network. Then, manual curation was performed based on a collective knowledge base and a combination of genomic, biochemical, and physiological information. The genome-scale metabolic model consists of 692 genes, 837 metabolites, and 875 reactions. We validated iAK692 by conducting fermentation experiments and simulating the model under autotrophic, heterotrophic, and mixotrophic growth conditions using COBRA toolbox. The model predictions under these growth conditions were consistent with the experimental results. The iAK692 model was further used to predict the unique active reactions and essential genes for each growth condition. Additionally, the metabolic states of iAK692 during autotrophic and mixotrophic growths were described by phenotypic phase plane (PhPP) analysis. Conclusions This study proposes the first genome-scale model of S. platensis C1, iAK692, which is a predictive metabolic platform

  2. iAK692: A genome-scale metabolic model of Spirulina platensis C1

    Directory of Open Access Journals (Sweden)

    Klanchui Amornpan

    2012-06-01

    Full Text Available Abstract Background Spirulina (Arthrospira) platensis is a well-known filamentous cyanobacterium used in the production of many industrial products, including high value compounds, healthy food supplements, animal feeds, pharmaceuticals and cosmetics, for example. It has been increasingly studied around the world for scientific purposes, especially for its genome, biology, physiology, and also for the analysis of its small-scale metabolic network. However, the overall description of the metabolic and biotechnological capabilities of S. platensis requires the development of a whole cellular metabolism model. Recently, the S. platensis C1 (Arthrospira sp. PCC9438) genome sequence has become available, allowing systems-level studies of this commercial cyanobacterium. Results In this work, we present the genome-scale metabolic network analysis of S. platensis C1, iAK692, its topological properties, and its metabolic capabilities and functions. The network was reconstructed from the S. platensis C1 annotated genomic sequence using Pathway Tools software to generate a preliminary network. Then, manual curation was performed based on a collective knowledge base and a combination of genomic, biochemical, and physiological information. The genome-scale metabolic model consists of 692 genes, 837 metabolites, and 875 reactions. We validated iAK692 by conducting fermentation experiments and simulating the model under autotrophic, heterotrophic, and mixotrophic growth conditions using COBRA toolbox. The model predictions under these growth conditions were consistent with the experimental results. The iAK692 model was further used to predict the unique active reactions and essential genes for each growth condition. Additionally, the metabolic states of iAK692 during autotrophic and mixotrophic growths were described by phenotypic phase plane (PhPP) analysis. Conclusions This study proposes the first genome-scale model of S. platensis C1, iAK692, which is a

  3. Informed Consent in Genome-Scale Research: What Do Prospective Participants Think?

    Science.gov (United States)

    Trinidad, Susan Brown; Fullerton, Stephanie M.; Bares, Julie M.; Jarvik, Gail P.; Larson, Eric B.; Burke, Wylie

    2012-01-01

    Background To promote effective genome-scale research, genomic and clinical data for large population samples must be collected, stored, and shared. Methods We conducted focus groups with 45 members of a Seattle-based integrated healthcare delivery system to learn about their views and expectations for informed consent in genome-scale studies. Results Participants viewed information about study purpose, aims, and how and by whom study data could be used to be at least as important as information about risks and possible harms. They generally supported a tiered consent approach for specific issues, including research purpose, data sharing, and access to individual research results. Participants expressed a continuum of opinions with respect to the acceptability of broad consent, ranging from completely acceptable to completely unacceptable. Older participants were more likely to view the consent process in relational – rather than contractual – terms, compared with younger participants. The majority of participants endorsed seeking study subjects’ permission regarding material changes in study purpose and data sharing. Conclusions Although this study sample was limited in terms of racial and socioeconomic diversity, our results suggest a strong positive interest in genomic research on the part of at least some prospective participants and indicate a need for increased public engagement, as well as strategies for ongoing communication with study participants. PMID:23493836

  4. Investigating host-pathogen behavior and their interaction using genome-scale metabolic network models.

    Science.gov (United States)

    Sadhukhan, Priyanka P; Raghunathan, Anu

    2014-01-01

    Genome Scale Metabolic Modeling methods represent one way to compute whole cell function starting from the genome sequence of an organism and contribute towards understanding and predicting the genotype-phenotype relationship. About 80 models spanning all the kingdoms of life from archaea to eukaryotes have been built to date and used to interrogate cell phenotype under varying conditions. These models have been used not only to understand the flux distribution in evolutionarily conserved pathways like glycolysis and the Krebs cycle but also in applications ranging from value added product formation in Escherichia coli to predicting inborn errors of Homo sapiens metabolism. This chapter describes a protocol that delineates the process of genome scale metabolic modeling for analysing host-pathogen behavior and interaction using flux balance analysis (FBA). The steps discussed in the process include (1) reconstruction of a metabolic network from the genome sequence, (2) its representation in a precise mathematical framework, (3) its translation to a model, and (4) the analysis using linear algebra and optimization. The methods for biological interpretations of computed cell phenotypes in the context of individual host and pathogen models and their integration are also discussed.
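
    Illustrative note (not part of the source record): the "precise mathematical framework" and "optimization" steps of flux balance analysis are commonly written as a linear program. The symbols below are standard textbook notation rather than anything taken from this chapter's abstract: S is the stoichiometric matrix (metabolites by reactions), v is the vector of reaction fluxes, c holds the objective weights (for example, a single 1 on a biomass reaction), and v_min, v_max are the flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v_{\min} \le v \le v_{\max}
\end{aligned}
```

    The equality constraint states the steady-state assumption: for every metabolite, total production equals total consumption.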

  5. Genome-Wide Motif Statistics are Shaped by DNA Binding Proteins over Evolutionary Time Scales

    Science.gov (United States)

    Qian, Long; Kussell, Edo

    2016-10-01

    The composition of a genome with respect to all possible short DNA motifs impacts the ability of DNA binding proteins to locate and bind their target sites. Since nonfunctional DNA binding can be detrimental to cellular functions and ultimately to organismal fitness, organisms could benefit from reducing the number of nonfunctional DNA binding sites genome wide. Using in vitro measurements of binding affinities for a large collection of DNA binding proteins, in multiple species, we detect a significant global avoidance of weak binding sites in genomes. We demonstrate that the underlying evolutionary process leaves a distinct genomic hallmark in that similar words have correlated frequencies, a signal that we detect in all species across domains of life. We consider the possibility that natural selection against weak binding sites contributes to this process, and using an evolutionary model we show that the strength of selection needed to maintain global word compositions is on the order of point mutation rates. Likewise, we show that evolutionary mechanisms based on interference of protein-DNA binding with replication and mutational repair processes could yield similar results and operate with similar rates. On the basis of these modeling and bioinformatic results, we conclude that genome-wide word compositions have been molded by DNA binding proteins acting through tiny evolutionary steps over time scales spanning millions of generations.
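
    Illustrative sketch (not part of the source record): the "word composition" of a genome referred to in this abstract can be tabulated by counting every overlapping motif of a fixed length k. The helper below also lists the words that differ from a given word at exactly one position, which is one possible reading of "similar words"; the abstract does not define similarity, so that choice, like the function names, is an assumption made for illustration.

```python
from collections import Counter


def count_words(sequence, k):
    """Count all overlapping length-k words in a DNA sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def single_mismatch_neighbors(word, alphabet="ACGT"):
    """List words differing from `word` at exactly one position (assumed
    here as the notion of a 'similar word')."""
    return [
        word[:i] + base + word[i + 1:]
        for i in range(len(word))
        for base in alphabet
        if base != word[i]
    ]
```

    A word of length k over a four-letter alphabet has 3k such neighbors, so comparing a word's count with the counts of its neighbors stays cheap even for genome-wide tables.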

  6. A systems approach to predict oncometabolites via context-specific genome-scale metabolic networks.

    Directory of Open Access Journals (Sweden)

    Hojung Nam

    2014-09-01

    Altered metabolism in cancer cells has been viewed as a passive response required for a malignant transformation. However, this view has changed through the recently described metabolic oncogenic factors: mutated isocitrate dehydrogenases (IDH), succinate dehydrogenase (SDH), and fumarate hydratase (FH) that produce oncometabolites that competitively inhibit epigenetic regulation. In this study, we demonstrate in silico predictions of oncometabolites that have the potential to dysregulate epigenetic controls in nine types of cancer by incorporating massive scale genetic mutation information (collected from more than 1,700 cancer genomes), expression profiling data, and deploying Recon 2 to reconstruct context-specific genome-scale metabolic models. Our analysis predicted 15 compounds and 24 substructures of potential oncometabolites that could result from the loss-of-function and gain-of-function mutations of metabolic enzymes, respectively. These results suggest a substantial potential for discovering unidentified oncometabolites in various forms of cancers.

  7. Metingear: a development environment for annotating genome-scale metabolic models.

    Science.gov (United States)

    May, John W; James, A Gordon; Steinbeck, Christoph

    2013-09-01

    Genome-scale metabolic models often lack annotations that would allow them to be used for further analysis. Previous efforts have focused on associating metabolites in the model with a cross reference, but this can be problematic if the reference is not freely available, multiple resources are used or the metabolite is added from a literature review. Associating each metabolite with chemical structure provides unambiguous identification of the components and a more detailed view of the metabolism. We have developed an open-source desktop application that simplifies the process of adding database cross references and chemical structures to genome-scale metabolic models. Annotated models can be exported to the Systems Biology Markup Language open interchange format. Source code, binaries, documentation and tutorials are freely available at http://johnmay.github.com/metingear. The application is implemented in Java with bundles available for MS Windows and Macintosh OS X.

  8. Genome-scale metabolic model of Pichia pastoris with native and humanized glycosylation of recombinant proteins

    DEFF Research Database (Denmark)

    Irani, Zahra Azimzadeh; Kerkhoven, Eduard J.; Shojaosadati, Seyed Abbas;

    2016-01-01

    Pichia pastoris is used for commercial production of human therapeutic proteins, and genome-scale models of P. pastoris metabolism have been generated in the past to study the metabolism and associated protein production by this yeast. A major challenge with clinical usage of recombinant proteins...... produced by P. pastoris is the difference in N-glycosylation of proteins produced by humans and this yeast. However, through metabolic engineering, a P. pastoris strain capable of producing humanized N-glycosylated proteins was constructed. The current genome-scale models of P. pastoris do not address...... native nor humanized N-glycosylation, and we therefore developed ihGlycopastoris, an extension to the iLC915 model with both native and humanized N-glycosylation for recombinant protein production, but also an estimation of N-glycosylation of P. pastoris native proteins. This new model gives a better...

  9. Basic and applied uses of genome-scale metabolic network reconstructions of Escherichia coli

    DEFF Research Database (Denmark)

    McCloskey, Douglas; Palsson, Bernhard; Feist, Adam

    2013-01-01

    The genome-scale model (GEM) of metabolism in the bacterium Escherichia coli K-12 has been in development for over a decade and is now in wide use. GEM-enabled studies of E. coli have been primarily focused on six applications: (1) metabolic engineering, (2) model-driven discovery, (3) prediction...... of cellular phenotypes, (4) analysis of biological network properties, (5) studies of evolutionary processes, and (6) models of interspecies interactions. In this review, we provide an overview of these applications along with a critical assessment of their successes and limitations, and a perspective...... on likely future developments in the field. Taken together, the studies performed over the past decade have established a genome-scale mechanistic understanding of genotype-phenotype relationships in E. coli metabolism that forms the basis for similar efforts for other microbial species. Future challenges...

  10. The genome-scale metabolic extreme pathway structure in Haemophilus influenzae shows significant network redundancy.

    Science.gov (United States)

    Papin, Jason A; Price, Nathan D; Edwards, Jeremy S; Palsson, Bernhard Ø

    2002-03-07

    Genome-scale metabolic networks can be characterized by a set of systemically independent and unique extreme pathways. These extreme pathways span a convex, high-dimensional space that circumscribes all potential steady-state flux distributions achievable by the defined metabolic network. Genome-scale extreme pathways associated with the production of non-essential amino acids in Haemophilus influenzae were computed. They offer valuable insight into the functioning of its metabolic network. Three key results were obtained. First, there were multiple internal flux maps corresponding to externally indistinguishable states. It was shown that there was an average of 37 internal states per unique exchange flux vector in H. influenzae when the network was used to produce a single amino acid while allowing carbon dioxide and acetate as carbon sinks. With the inclusion of succinate as an additional output, this ratio increased to 52, a 40% increase. Second, an analysis of the carbon fates illustrated that the extreme pathways were non-uniformly distributed across the carbon fate spectrum. In the detailed case study, 45% of the distinct carbon fate values associated with lysine production represented 85% of the extreme pathways. Third, this distribution fell between distinct systemic constraints. For lysine production, the carbon fate values that represented 85% of the pathways described above corresponded to only 2 distinct ratios of 1:1 and 4:1 between carbon dioxide and acetate. The present study analysed single outputs from one organism, and provides a start to genome-scale extreme pathways studies. These emergent system-level characterizations show the significance of metabolic extreme pathway analysis at the genome-scale.
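
    The convex space described above can be written out as a short worked statement. The symbols S (stoichiometric matrix), v (flux vector), p_k (extreme pathways) and alpha_k (non-negative weights) are notation assumed here for illustration and are not taken from the abstract; the last line simply checks the reported 40% increase.

```latex
% Steady-state flux distributions satisfy the mass-balance constraint
S\,v = 0 .
% Every such flux distribution is a non-negative combination of the
% extreme pathways p_1, \dots, p_K, which is what makes the space a convex cone:
v = \sum_{k=1}^{K} \alpha_k\, p_k , \qquad \alpha_k \ge 0 .
% Reported ratio of internal states per unique exchange flux vector:
\frac{52 - 37}{37} \approx 0.405 \quad (\text{about a } 40\% \text{ increase}).
```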

  11. Genome-scale dynamic modeling of the competition between Rhodoferax and Geobacter in anoxic subsurface environments.

    Science.gov (United States)

    Zhuang, Kai; Izallalen, Mounir; Mouser, Paula; Richter, Hanno; Risso, Carla; Mahadevan, Radhakrishnan; Lovley, Derek R

    2011-02-01

    The advent of rapid complete genome sequencing, and the potential to capture this information in genome-scale metabolic models, provide the possibility of comprehensively modeling microbial community interactions. For example, Rhodoferax and Geobacter species are acetate-oxidizing Fe(III)-reducers that compete in anoxic subsurface environments and this competition may have an influence on the in situ bioremediation of uranium-contaminated groundwater. Therefore, genome-scale models of Geobacter sulfurreducens and Rhodoferax ferrireducens were used to evaluate how Geobacter and Rhodoferax species might compete under diverse conditions found in a uranium-contaminated aquifer in Rifle, CO. The model predicted that at the low rates of acetate flux expected under natural conditions at the site, Rhodoferax will outcompete Geobacter as long as sufficient ammonium is available. The model also predicted that when high concentrations of acetate are added during in situ bioremediation, Geobacter species would predominate, consistent with field-scale observations. This can be attributed to the higher expected growth yields of Rhodoferax and the ability of Geobacter to fix nitrogen. The modeling predicted relative proportions of Geobacter and Rhodoferax in geochemically distinct zones of the Rifle site that were comparable to those that were previously documented with molecular techniques. The model also predicted that under nitrogen fixation, higher carbon and electron fluxes would be diverted toward respiration rather than biomass formation in Geobacter, providing a potential explanation for enhanced in situ U(VI) reduction in low-ammonium zones. These results show that genome-scale modeling can be a useful tool for predicting microbial interactions in subsurface environments and show promise for designing bioremediation strategies.

  12. Comparative Genome-Scale Reconstruction of Gapless Metabolic Networks for Present and Ancestral Species

    Science.gov (United States)

    Pitkänen, Esa; Jouhten, Paula; Hou, Jian; Syed, Muhammad Fahad; Blomberg, Peter; Kludas, Jana; Oja, Merja; Holm, Liisa; Penttilä, Merja; Rousu, Juho; Arvas, Mikko

    2014-01-01

    We introduce a novel computational approach, CoReCo, for comparative metabolic reconstruction and provide genome-scale metabolic network models for 49 important fungal species. Leveraging the exponential growth in sequenced genome availability, our method reconstructs genome-scale gapless metabolic networks simultaneously for a large number of species by integrating sequence data in a probabilistic framework. High reconstruction accuracy is demonstrated by comparisons to the well-curated Saccharomyces cerevisiae consensus model and large-scale knock-out experiments. Our comparative approach is particularly useful in scenarios where the quality of available sequence data is lacking, and when reconstructing evolutionarily distant species. Moreover, the reconstructed networks are fully carbon mapped, allowing their use in 13C flux analysis. We demonstrate the functionality and usability of the reconstructed fungal models with a computational steady-state biomass production experiment, as these fungi include some of the most important production organisms in industrial biotechnology. In contrast to many existing reconstruction techniques, only minimal manual effort is required before the reconstructed models are usable in flux balance experiments. CoReCo is available at http://esaskar.github.io/CoReCo/. PMID:24516375

  13. Diagnostics for stochastic genome-scale modeling via model slicing and debugging.

    Directory of Open Access Journals (Sweden)

    Kevin J Tsai

    Modeling of biological behavior has evolved from simple gene expression plots represented by mathematical equations to genome-scale systems biology networks. However, due to obstacles in complexity and scalability of creating genome-scale models, several biological modelers have turned to programming or scripting languages and away from modeling fundamentals. In doing so, they have traded the ability to have exchangeable, standardized model representation formats, while those that remain true to standardized model representation are faced with challenges in model complexity and analysis. We have developed a model diagnostic methodology inspired by program slicing and debugging and demonstrate the effectiveness of the methodology on a genome-scale metabolic network model published in the BioModels database. The computer-aided identification revealed specific points of interest such as reversibility of reactions, initialization of species amounts, and parameter estimation that improved a candidate cell's adenosine triphosphate production. We then compared the advantages of our methodology over other modeling techniques such as model checking and model reduction. A software application that implements the methodology is available at http://gel.ym.edu.tw/gcs/.

  14. Genome-scale modeling of human metabolism - a systems biology approach.

    Science.gov (Unite