WorldWideScience

Sample records for genome scale transcriptome

  1. The Amaranth Genome: Genome, Transcriptome, and Physical Map Assembly

    Directory of Open Access Journals (Sweden)

    J. W. Clouse

    2016-03-01

    Amaranth is an emerging pseudocereal native to the New World that has garnered increased attention in recent years because of its nutritional quality, in particular its seed protein and more specifically its high levels of the essential amino acid lysine. It belongs to the Amaranthaceae family, is an ancient paleopolyploid that shows disomic inheritance (2n = 32), and has an estimated genome size of 466 Mb. Here we present a high-quality draft genome sequence of the grain amaranth. The genome assembly consisted of 377 Mb in 3518 scaffolds with an N50 of 371 kb. Repetitive element analysis predicted that 48% of the genome is comprised of repeat sequences, of which -like elements were the most commonly classified retrotransposon. A de novo transcriptome consisting of 66,370 contigs was assembled from eight different amaranth tissue and abiotic stress libraries. Annotation of the genome identified 23,059 protein-coding genes. Seven grain amaranths and their putative progenitor were resequenced. A single nucleotide polymorphism (SNP) phylogeny supported the classification of this putative progenitor as the progenitor species of the grain amaranths. Lastly, we generated a de novo physical map using the BioNano Genomics’ Genome Mapping platform. The physical map spanned 340 Mb and a hybrid assembly using the BioNano physical maps nearly doubled the N50 of the assembly to 697 kb. Moreover, we analyzed synteny between amaranth and sugar beet and estimated the age of the most recent polyploidization event in amaranth.

  2. Genome Annotation and Transcriptomics of Oil-Producing Algae

    Science.gov (United States)

    2015-03-16

    AFRL-OSR-VA-TR-2015-0103. Genome Annotation and Transcriptomics of Oil-Producing Algae. Sabeeha Merchant, University of California Los Angeles. Final...2010 to 12-31-2014. Contract number: FA9550-10-1-0095. Abstract: Most algae accumulate triacylglycerols (TAGs) when they are starved for essential nutrients like N, S, P (or Si in the case of some

  3. De novo Genome Assembly and Single Nucleotide Variations for Soybean Mosaic Virus Using Soybean Seed Transcriptome Data

    Directory of Open Access Journals (Sweden)

    Yeonhwa Jo

    2017-10-01

    Soybean is the most important legume crop in the world. Several diseases in soybean lead to serious yield losses in major soybean-producing countries. Moreover, soybean can be infected by diverse viruses. Recently, we carried out a large-scale screening to identify viruses infecting soybean using available soybean transcriptome data. Of the screened transcriptomes, a soybean transcriptome generated for seed development analysis contained several virus-associated sequences. In this study, we identified five viruses infecting soybean, including soybean mosaic virus (SMV), by de novo transcriptome assembly followed by BLAST search. We assembled a nearly complete consensus genome sequence of SMV China using transcriptome data. Based on phylogenetic analysis, the consensus genome sequence of SMV China was closely related to SMV isolates from South Korea. We examined single nucleotide variations (SNVs) for SMVs in the soybean seed transcriptome, revealing 780 SNVs, which were evenly distributed on the SMV genome. Four SNVs, C-U, U-C, A-G, and G-A, were frequently identified. This result demonstrated the quasispecies variation of the SMV genome. Taken together, this study carried out bioinformatics analyses to identify viruses using soybean transcriptome data. In addition, we demonstrated the application of soybean transcriptome data for virus genome assembly and SNV analysis.

  4. Characterization of mango (Mangifera indica L.) transcriptome and chloroplast genome.

    Science.gov (United States)

    Azim, M Kamran; Khan, Ishtaiq A; Zhang, Yong

    2014-05-01

    We characterized the mango leaf transcriptome and chloroplast genome using next generation DNA sequencing. The RNA-seq output of the mango transcriptome generated >12 million reads (total nucleotides sequenced >1 Gb). De novo transcriptome assembly generated 30,509 unigenes with lengths in the range of 300 to ≥3,000 nt and 67× depth of coverage. BLAST searching against nonredundant nucleotide databases and several Viridiplantae genomic datasets annotated 24,593 mango unigenes (80% of total) and identified Citrus sinensis as the closest neighbor of mango with 9,141 (37%) matched sequences. The annotation with gene ontology and Clusters of Orthologous Group terms categorized unigene sequences into 57 and 25 classes, respectively. More than 13,500 unigenes were assigned to 293 KEGG pathways. Besides major plant biology related pathways, KEGG based gene annotation pointed out active presence of an array of biochemical pathways involved in (a) biosynthesis of bioactive flavonoids, flavones and flavonols, (b) biosynthesis of terpenoids and lignins and (c) plant hormone signal transduction. The mango transcriptome sequences revealed 235 proteases belonging to five catalytic classes of proteolytic enzymes. The draft genome of the mango chloroplast (cp) was obtained by a combination of Sanger and next generation sequencing. The draft mango cp genome size is 151,173 bp with a pair of inverted repeats of 27,093 bp separated by small and large single copy regions. Out of 139 genes in the mango cp genome, 91 were found to be protein coding. Sequence analysis revealed the cp genome of C. sinensis as the closest neighbor of mango. We found 51 short repeats in the mango cp genome that are supposed to be associated with extensive rearrangements. This is the first report of transcriptome and chloroplast genome analysis of any Anacardiaceae family member.

  5. Reptilian Transcriptomes v2.0: An Extensive Resource for Sauropsida Genomics and Transcriptomics.

    Science.gov (United States)

    Tzika, Athanasia C; Ullate-Agote, Asier; Grbic, Djordje; Milinkovitch, Michel C

    2015-07-01

    Despite the availability of deep-sequencing techniques, genomic and transcriptomic data remain unevenly distributed across phylogenetic groups. For example, reptiles are poorly represented in sequence databases, hindering functional evolutionary and developmental studies in these lineages substantially more diverse than mammals. In addition, different studies use different assembly and annotation protocols, inhibiting meaningful comparisons. Here, we present the "Reptilian Transcriptomes Database 2.0," which provides extensive annotation of transcriptomes and genomes from species covering the major reptilian lineages. To this end, we sequenced normalized complementary DNA libraries of multiple adult tissues and various embryonic stages of the leopard gecko and the corn snake and gathered published reptilian sequence data sets from representatives of the four extant orders of reptiles: Squamata (snakes and lizards), the tuatara, crocodiles, and turtles. The LANE runner 2.0 software was implemented to annotate all assemblies within a single integrated pipeline. We show that this approach increases the annotation completeness of the assembled transcriptomes/genomes. We then built large concatenated protein alignments of single-copy genes and inferred phylogenetic trees that support the positions of turtles and the tuatara as sister groups of Archosauria and Squamata, respectively. The Reptilian Transcriptomes Database 2.0 resource will be updated to include selected new data sets as they become available, thus making it a reference for differential expression studies, comparative genomics and transcriptomics, linkage mapping, molecular ecology, and phylogenomic analyses involving reptiles. The database is available at www.reptilian-transcriptomes.org and can be enquired using a wwwblast server installed at the University of Geneva. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  6. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, the transcriptional consequences of these genomic alterations in cancer genomes remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on the affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with or without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate that genomic alterations in cancer genomes have diverse transcriptomic effects, and that integrated analysis of WGS and RNA-Seq can facilitate the interpretation of a large number of genomic alterations detected in cancer genomes.

  7. Genome interplay in the grain transcriptome of hexaploid bread wheat.

    Science.gov (United States)

    Pfeifer, Matthias; Kugler, Karl G; Sandve, Simen R; Zhan, Bujie; Rudi, Heidi; Hvidsten, Torgeir R; Mayer, Klaus F X; Olsen, Odd-Arne

    2014-07-18

    Allohexaploid bread wheat (Triticum aestivum L.) provides approximately 20% of calories consumed by humans. Lack of genome sequence for the three homeologous and highly similar bread wheat genomes (A, B, and D) has impeded expression analysis of the grain transcriptome. We used previously unknown genome information to analyze the cell type-specific expression of homeologous genes in the developing wheat grain and identified distinct co-expression clusters reflecting the spatiotemporal progression during endosperm development. We observed no global but cell type- and stage-dependent genome dominance, organization of the wheat genome into transcriptionally active chromosomal regions, and asymmetric expression in gene families related to baking quality. Our findings give insight into the transcriptional dynamics and genome interplay among individual grain cell types in a polyploid cereal genome. Copyright © 2014, American Association for the Advancement of Science.

  8. Pichia stipitis genomics, transcriptomics, and gene clusters

    Science.gov (United States)

    Thomas W. Jeffries; Jennifer R. Headman Van Vleet

    2009-01-01

    Genome sequencing and subsequent global gene expression studies have advanced our understanding of the lignocellulose-fermenting yeast Pichia stipitis. These studies have provided an insight into its central carbon metabolism, and analysis of its genome has revealed numerous functional gene clusters and tandem repeats. Specialized physiological traits are often the...

  9. Bacillus anthracis genome organization in light of whole transcriptome sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Jeffrey; Zhu, Wenhan; Passalacqua, Karla D.; Bergman, Nicholas; Borodovsky, Mark

    2010-03-22

    Emerging knowledge of whole prokaryotic transcriptomes could validate a number of theoretical concepts introduced in the early days of genomics. What are the rules connecting gene expression levels with sequence determinants such as quantitative scores of promoters and terminators? Are translation efficiency measures, e.g. codon adaptation index and RBS score related to gene expression? We used the whole transcriptome shotgun sequencing of a bacterial pathogen Bacillus anthracis to assess correlation of gene expression level with promoter, terminator and RBS scores, codon adaptation index, as well as with a new measure of gene translational efficiency, average translation speed. We compared computational predictions of operon topologies with the transcript borders inferred from RNA-Seq reads. Transcriptome mapping may also improve existing gene annotation. Upon assessment of accuracy of current annotation of protein-coding genes in the B. anthracis genome we have shown that the transcriptome data indicate existence of more than a hundred genes missing in the annotation though predicted by an ab initio gene finder. Interestingly, we observed that many pseudogenes possess not only a sequence with detectable coding potential but also promoters that maintain transcriptional activity.

  10. Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree

    Directory of Open Access Journals (Sweden)

    Nagesh A. Kuravadi

    2015-08-01

    Neem (Azadirachta indica A. Juss) is one of the most versatile tropical evergreen tree species known in India since the Vedic period (1500 BC–600 BC). Neem tree is a rich source of limonoids, having a wide spectrum of activity against insect pests and microbial pathogens. Complex tetranortriterpenoids such as azadirachtin, salanin and nimbin are the major active principles isolated from neem seed. Absolutely nothing is known about the biochemical pathways of these metabolites in neem tree. To identify genes and pathways in neem, we sequenced neem genomes and transcriptomes using next generation sequencing technologies. Assembly of Illumina and 454 sequencing reads resulted in 267 Mb, which accounts for 70% of the estimated size of the neem genome. We predicted 44,495 genes in the neem genome, of which 32,278 genes were expressed in neem tissues. The neem genome consists of about 32.5% (87 Mb) repetitive DNA elements. Neem tree is phylogenetically related to citrus, Citrus sinensis. Comparative analysis anchored 62% (161 Mb) of assembled neem genomic contigs onto citrus chromosomes. An ultrahigh performance liquid chromatography-mass spectrometry-selected reaction monitoring (UHPLC-MS/SRM) method was used to quantify azadirachtin, nimbin, and salanin from neem tissues. Weighted Gene Co-expression Network Analysis (WGCNA) of expressed genes and metabolites resulted in identification of possible candidate genes involved in the azadirachtin biosynthesis pathway. This study provides genomic and transcriptomic resources, along with quantities of the top three neem metabolites, which will accelerate basic research in neem to understand biochemical pathways.

  11. Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree

    Science.gov (United States)

    Rangiah, Kannan; Mahesh, HB; Rajamani, Anantharamanan; Shirke, Meghana D.; Russiachand, Heikham; Loganathan, Ramya Malarini; Shankara Lingu, Chandana; Siddappa, Shilpa; Ramamurthy, Aishwarya; Sathyanarayana, BN

    2015-01-01

    Neem (Azadirachta indica A. Juss) is one of the most versatile tropical evergreen tree species known in India since the Vedic period (1500 BC–600 BC). Neem tree is a rich source of limonoids, having a wide spectrum of activity against insect pests and microbial pathogens. Complex tetranortriterpenoids such as azadirachtin, salanin and nimbin are the major active principles isolated from neem seed. Absolutely nothing is known about the biochemical pathways of these metabolites in neem tree. To identify genes and pathways in neem, we sequenced neem genomes and transcriptomes using next generation sequencing technologies. Assembly of Illumina and 454 sequencing reads resulted in 267 Mb, which accounts for 70% of the estimated size of the neem genome. We predicted 44,495 genes in the neem genome, of which 32,278 genes were expressed in neem tissues. The neem genome consists of about 32.5% (87 Mb) repetitive DNA elements. Neem tree is phylogenetically related to citrus, Citrus sinensis. Comparative analysis anchored 62% (161 Mb) of assembled neem genomic contigs onto citrus chromosomes. An ultrahigh performance liquid chromatography-mass spectrometry-selected reaction monitoring (UHPLC-MS/SRM) method was used to quantify azadirachtin, nimbin, and salanin from neem tissues. Weighted Gene Co-expression Network Analysis (WGCNA) of expressed genes and metabolites resulted in identification of possible candidate genes involved in the azadirachtin biosynthesis pathway. This study provides genomic and transcriptomic resources, along with quantities of the top three neem metabolites, which will accelerate basic research in neem to understand biochemical pathways. PMID:26290780

  12. The past, present, and future of Leishmania genomics and transcriptomics

    Science.gov (United States)

    Cantacessi, Cinzia; Dantas-Torres, Filipe; Nolan, Matthew J.; Otranto, Domenico

    2015-01-01

    It has been nearly 10 years since the completion of the first entire genome sequence of a Leishmania parasite. Genomic and transcriptomic analyses have advanced our understanding of the biology of Leishmania, and shed new light on the complex interactions occurring within the parasite–host–vector triangle. Here, we review these advances and examine potential avenues for translation of these discoveries into treatment and control programs. In addition, we argue for a strong need to explore how disease in dogs relates to that in humans, and how an improved understanding in line with the ‘One Health’ concept may open new avenues for the control of these devastating diseases. PMID:25638444

  13. Single Cell Genomics and Transcriptomics for Unicellular Eukaryotes

    Energy Technology Data Exchange (ETDEWEB)

    Ciobanu, Doina; Clum, Alicia; Singh, Vasanth; Salamov, Asaf; Han, James; Copeland, Alex; Grigoriev, Igor; James, Timothy; Singer, Steven; Woyke, Tanja; Malmstrom, Rex; Cheng, Jan-Fang

    2014-03-14

    Despite their small size, unicellular eukaryotes have complex genomes with a high degree of plasticity that allow them to adapt quickly to environmental changes. Unicellular eukaryotes live with prokaryotes and higher eukaryotes, frequently in symbiotic or parasitic niches. To this day their contribution to the dynamics of the environmental communities remains to be understood. Unfortunately, the vast majority of eukaryotic microorganisms are either uncultured or unculturable, making genome sequencing impossible using traditional approaches. We have developed an approach to isolate unicellular eukaryotes of interest from environmental samples, and to sequence and analyze their genomes and transcriptomes. We have tested our methods with six species: an uncharacterized protist from cellulose-enriched compost identified as Platyophrya, a close relative of P. vorax; the fungus Metschnikowia bicuspidate, a parasite of water flea Daphnia; the mycoparasitic fungi Piptocephalis cylindrospora, a parasite of Cokeromyces and Mucor; Caulochytrium protosteloides, a parasite of Sordaria; Rozella allomycis, a parasite of the water mold Allomyces; and the microalgae Chlamydomonas reinhardtii. Here, we present the four components of our approach: pre-sequencing methods, sequence analysis for single cell genome assembly, sequence analysis of single cell transcriptomes, and genome annotation. This technology has the potential to uncover the complexity of single cell eukaryotes and their role in the environmental samples.

  14. Applied bioinformatics: Genome annotation and transcriptome analysis

    DEFF Research Database (Denmark)

    Gupta, Vikas

    agricultural and biological importance. Its capacity to form symbiotic relationships with rhizobia and mycorrhizal fungi has fascinated researchers for years. Lotus has a small genome of approximately 470 Mb and a short life cycle of 2 to 3 months, which has made Lotus a model legume plant for many molecular...

  15. Whole-genome and Transcriptome Sequencing of Prostate Cancer Identify New Genetic Alterations Driving Disease Progression

    DEFF Research Database (Denmark)

    Ren, Shancheng; Wei, Gong-Hong; Liu, Dongbing

    2018-01-01

    BACKGROUND: Global disparities in prostate cancer (PCa) incidence highlight the urgent need to identify genomic abnormalities in prostate tumors in different ethnic populations including Asian men. OBJECTIVE: To systematically explore the genomic complexity and define disease-driven genetic......-scale and comprehensive genomic data of prostate cancer from Asian population. Identification of these genetic alterations may help advance prostate cancer diagnosis, prognosis, and treatment....... alterations in PCa. DESIGN, SETTING, AND PARTICIPANTS: The study sequenced whole-genome and transcriptome of tumor-benign paired tissues from 65 treatment-naive Chinese PCa patients. Subsequent targeted deep sequencing of 293 PCa-relevant genes was performed in another cohort of 145 prostate tumors. OUTCOME...

  16. The draft genome and transcriptome of Cannabis sativa.

    Science.gov (United States)

    van Bakel, Harm; Stout, Jake M; Cote, Atina G; Tallon, Carling M; Sharpe, Andrew G; Hughes, Timothy R; Page, Jonathan E

    2011-10-20

    Cannabis sativa has been cultivated throughout human history as a source of fiber, oil and food, and for its medicinal and intoxicating properties. Selective breeding has produced cannabis plants for specific uses, including high-potency marijuana strains and hemp cultivars for fiber and seed production. The molecular biology underlying cannabinoid biosynthesis and other traits of interest is largely unexplored. We sequenced genomic DNA and RNA from the marijuana strain Purple Kush using short-read approaches. We report a draft haploid genome sequence of 534 Mb and a transcriptome of 30,000 genes. Comparison of the transcriptome of Purple Kush with that of the hemp cultivar 'Finola' revealed that many genes encoding proteins involved in cannabinoid and precursor pathways are more highly expressed in Purple Kush than in 'Finola'. The exclusive occurrence of Δ9-tetrahydrocannabinolic acid synthase in the Purple Kush transcriptome, and its replacement by cannabidiolic acid synthase in 'Finola', may explain why the psychoactive cannabinoid Δ9-tetrahydrocannabinol (THC) is produced in marijuana but not in hemp. Resequencing the hemp cultivars 'Finola' and 'USO-31' showed little difference in gene copy numbers of cannabinoid pathway enzymes. However, single nucleotide variant analysis uncovered a relatively high level of variation among four cannabis types, and supported a separation of marijuana and hemp. The availability of the Cannabis sativa genome enables the study of a multifunctional plant that occupies a unique role in human culture. Its availability will aid the development of therapeutic marijuana strains with tailored cannabinoid profiles and provide a basis for the breeding of hemp with improved agronomic characteristics.

  17. Comparative genomics and transcriptomics of trait-gene association

    Directory of Open Access Journals (Sweden)

    Pierlé Sebastián

    2012-11-01

    Background: The Order Rickettsiales includes important tick-borne pathogens, from Rickettsia rickettsii, which causes Rocky Mountain spotted fever, to Anaplasma marginale, the most prevalent vector-borne pathogen of cattle. Although most pathogens in this Order are transmitted by arthropod vectors, little is known about the microbial determinants of transmission. A. marginale provides unique tools for studying the determinants of transmission, with multiple strain sequences available that display distinct and reproducible transmission phenotypes. The closed core A. marginale genome suggests that any phenotypic differences are due to single nucleotide polymorphisms (SNPs). We combined DNA/RNA comparative genomic approaches using strains with different tick transmission phenotypes and identified genes that segregate with transmissibility. Results: Comparison of seven strains with different transmission phenotypes generated a list of SNPs affecting 18 genes and nine promoters. Transcriptional analysis found two candidate genes downstream from promoter SNPs that were differentially transcribed. To corroborate the comparative genomics approach we used three RNA-seq platforms to analyze the transcriptomes from two A. marginale strains with different transmission phenotypes. RNA-seq analysis confirmed the comparative genomics data and found 10 additional genes whose transcription between strains with distinct transmission efficiencies was significantly different. Six regions of the genome that contained no annotation were found to be transcriptionally active, and two of these newly identified transcripts were differentially transcribed. Conclusions: This approach identified 30 genes and two novel transcripts potentially involved in tick transmission. We describe the transcriptome of an obligate intracellular bacterium in depth, while employing massive parallel sequencing to dissect an important trait in bacterial pathogenesis.

  18. Exploration of the Germline Genome of the Ciliate Chilodonella uncinata through Single-Cell Omics (Transcriptomics and Genomics)

    Directory of Open Access Journals (Sweden)

    Xyrus X. Maurer-Alcalá

    2018-01-01

    Separate germline and somatic genomes are found in numerous lineages across the eukaryotic tree of life, often separated into distinct tissues (e.g., in plants, animals, and fungi) or distinct nuclei sharing a common cytoplasm (e.g., in ciliates and some foraminifera). In ciliates, germline-limited (i.e., micronuclear-specific) DNA is eliminated during the development of a new somatic (i.e., macronuclear) genome in a process that is tightly linked to large-scale genome rearrangements, such as deletions and reordering of protein-coding sequences. Most studies of germline genome architecture in ciliates have focused on the model ciliates Oxytricha trifallax, Paramecium tetraurelia, and Tetrahymena thermophila, for which the complete germline genome sequences are known. Outside of these model taxa, only a few dozen germline loci have been characterized from a limited number of cultivable species, which is likely due to difficulties in obtaining sufficient quantities of “purified” germline DNA in these taxa. Combining single-cell transcriptomics and genomics, we have overcome these limitations and provide the first insights into the structure of the germline genome of the ciliate Chilodonella uncinata, a member of the understudied class Phyllopharyngea. Our analyses reveal the following: (i) large gene families contain a disproportionate number of genes from scrambled germline loci; (ii) germline-soma boundaries in the germline genome are demarcated by substantial shifts in GC content; (iii) single-cell omics techniques provide large-scale quality germline genome data with limited effort, at least for ciliates with extensively fragmented somatic genomes. Our approach provides an efficient means to understand better the evolution of genome rearrangements between germline and soma in ciliates.

  19. Genome sequence and transcriptome analyses of the thermophilic zygomycete fungus Rhizomucor miehei.

    Science.gov (United States)

    Zhou, Peng; Zhang, Guoqiang; Chen, Shangwu; Jiang, Zhengqiang; Tang, Yanbin; Henrissat, Bernard; Yan, Qiaojuan; Yang, Shaoqing; Chen, Chin-Fu; Zhang, Bing; Du, Zhenglin

    2014-04-21

    Zygomycete fungi such as Rhizomucor miehei have been extensively exploited for the production of various enzymes. As a thermophilic fungus, R. miehei is capable of growing at temperatures that approach the upper limits for all eukaryotes. To date, hundreds of fungal genomes are publicly available. However, Zygomycetes have rarely been investigated either genetically or genomically. Here, we report the genome of R. miehei CAU432 to explore the thermostable enzymatic repertoire of this fungus. The assembled genome size is 27.6 million bases (Mb) with 10,345 predicted protein-coding genes. Although the fungus is thermophilic, the G + C contents of its whole genome (43.8%) and coding genes (47.4%) are less than 50%. Phylogenetically, R. miehei is more closely related to Phycomyces blakesleeanus than to Mucor circinelloides and Rhizopus oryzae. The genome of R. miehei harbors a large number of genes encoding secreted proteases, which is consistent with R. miehei being a rich producer of proteases. The transcriptome profile of R. miehei showed that the genes responsible for degrading starch, glucan, protein and lipid were highly expressed. The genome information of R. miehei will facilitate future studies to better understand the mechanisms of fungal thermophilic adaptation and to explore the potential of R. miehei in industrial-scale production of thermostable enzymes. Based on the existence of a large repertoire of amylolytic, proteolytic and lipolytic genes in the genome, R. miehei has potential in the production of a variety of such enzymes.

  20. Comparative Genomics and Transcriptomics Analyses Reveal Divergent Lifestyle Features of Nematode Endoparasitic Fungus Hirsutella minnesotensis

    Science.gov (United States)

    Lai, Yiling; Liu, Keke; Zhang, Xinyu; Zhang, Xiaoling; Li, Kuan; Wang, Niuniu; Shu, Chi; Wu, Yunpeng; Wang, Chengshu; Bushley, Kathryn E.; Xiang, Meichun; Liu, Xingzhong

    2014-01-01

    Hirsutella minnesotensis [Ophiocordycipitaceae (Hypocreales, Ascomycota)] is a dominant endoparasitic fungus whose conidia adhere to and penetrate the secondary stage juveniles of soybean cyst nematode. Its genome was de novo sequenced and compared with five entomopathogenic fungi in the Hypocreales and three nematode-trapping fungi in the Orbiliales (Ascomycota). The genome of H. minnesotensis is 51.4 Mb and encodes 12,702 genes enriched with transposable elements up to 32%. Phylogenomic analysis revealed that H. minnesotensis diverged from entomopathogenic fungi in Hypocreales. The genome of H. minnesotensis is similar to those of entomopathogenic fungi in having fewer genes encoding lectins for adhesion and glycoside hydrolases for cellulose degradation, but differs from those of nematode-trapping fungi in possessing more genes for protein degradation, signal transduction, and secondary metabolism. These results indicate that H. minnesotensis has evolved a different mechanism for nematode endoparasitism compared with nematode-trapping fungi. Transcriptomics analyses for the time-scale parasitism revealed the upregulation of lectins, secreted proteases and the genes for biosynthesis of secondary metabolites that could be putatively involved in host surface adhesion, cuticle degradation, and host manipulation. Genome and transcriptome analyses provided a comprehensive understanding of the evolution and lifestyle of nematode endoparasitism. PMID:25359922

  1. A Universal Genome Array and Transcriptome Atlas for Brachypodium Distachyon

    Energy Technology Data Exchange (ETDEWEB)

    Mockler, Todd [Oregon State Univ., Corvallis, OR (United States)]

    2017-04-17

    Brachypodium distachyon is the premier experimental model grass platform and is related to candidate feedstock crops for bioethanol production. Based on the DOE-JGI Brachypodium Bd21 genome sequence and annotation we designed a whole genome DNA microarray platform. The quality of this array platform is unprecedented due to the exceptional quality of the Brachypodium genome assembly and annotation and the stringent probe selection criteria employed in the design. We worked with members of the international community and the bioinformatics/design team at Affymetrix at all stages in the development of the array. We used the Brachypodium arrays to interrogate the transcriptomes of plants grown in a variety of environmental conditions, including diurnal and circadian light/temperature conditions. We examined the transcriptional responses of Brachypodium seedlings subjected to various abiotic stresses including heat, cold, salt, and high intensity light. We generated a gene expression atlas representing various organs and developmental stages. The results of these efforts, including all microarray datasets, are published and available at online public databases.

  2. Genome-Scale Reconstruction of the Human Astrocyte Metabolic Network

    OpenAIRE

    Martín-Jiménez, Cynthia A.; Salazar-Barreto, Diego; Barreto, George E.; González, Janneth

    2017-01-01

    Astrocytes are the most abundant cells of the central nervous system; they have a predominant role in maintaining brain metabolism. In this sense, abnormal metabolic states have been found in different neuropathological diseases. Determination of metabolic states of astrocytes is difficult to model using current experimental approaches given the high number of reactions and metabolites present. Thus, genome-scale metabolic networks derived from transcriptomic data can be used as a framework t...

  3. Transcriptome analysis reveals the time of the fourth round of genome duplication in common carp (Cyprinus carpio)

    Science.gov (United States)

    2012-01-01

    Background Common carp (Cyprinus carpio) is thought to have undergone one extra round of genome duplication compared to zebrafish. Transcriptome analysis has been used to study the existence and timing of genome duplication in species for which genome sequences are incomplete. Large-scale transcriptome data for the common carp genome should help reveal the timing of the additional duplication event. Results We have sequenced the transcriptome of common carp using 454 pyrosequencing. After assembling the 454 contigs and the published common carp sequences together, we obtained 49,669 contigs and identified genes using homology searches and an ab initio method. We identified 4,651 orthologous pairs between common carp and zebrafish and found 129,984 paralogous pairs within the common carp. An estimation of the synonymous substitution rate in the orthologous pairs indicated that common carp and zebrafish diverged 120 million years ago (MYA). We identified one round of genome duplication in common carp and estimated that it had occurred 5.6 to 11.3 MYA. In zebrafish, no genome duplication event after speciation was observed, suggesting that, compared to zebrafish, common carp had undergone an additional genome duplication event. We annotated the common carp contigs with Gene Ontology terms and KEGG pathways. Compared with zebrafish gene annotations, we found that a set of biological processes and pathways were enriched in common carp. Conclusions The assembled contigs helped us to estimate the time of the fourth-round of genome duplication in common carp. The resource that we have built as part of this study will help advance functional genomics and genome annotation studies in the future. PMID:22424280
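    The dating step in this abstract rests on a molecular-clock relation: if two lineages accumulate synonymous substitutions at a constant rate r per site per year, a synonymous distance Ks between a pair of sequences corresponds to a divergence time T = Ks / (2r). The following is a minimal Python sketch of that relation only. The function name and the example values of Ks and r are illustrative assumptions, not figures or methods taken from the paper.

```python
def divergence_time_mya(ks: float, rate_per_site_per_year: float) -> float:
    """Convert a synonymous distance Ks into a divergence time in million years.

    Assumes a strict molecular clock: both lineages accumulate synonymous
    substitutions at the same constant rate, so Ks = 2 * r * T.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    years = ks / (2.0 * rate_per_site_per_year)
    return years / 1e6


# Hypothetical inputs, chosen only to illustrate the arithmetic.
example_ks = 0.5        # synonymous substitutions per synonymous site
example_rate = 2.5e-9   # substitutions per site per year (assumed)
example_time = divergence_time_mya(example_ks, example_rate)
```

    The same function applies to orthologous pairs (speciation dating) and to paralogous pairs (duplication dating); only the set of sequence pairs supplying Ks differs.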

  4. Multiple reference genomes and transcriptomes for Arabidopsis thaliana

    KAUST Repository

    Gan, Xiangchao

    2011-08-28

    Genetic differences between Arabidopsis thaliana accessions underlie the plant's extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions. ©2011 Macmillan Publishers Limited. All rights reserved.

  5. Multiple reference genomes and transcriptomes for Arabidopsis thaliana

    KAUST Repository

    Gan, Xiangchao; Stegle, Oliver; Behr, Jonas; Steffen, Joshua G.; Drewe, Philipp; Hildebrand, Katie L.; Lyngsoe, Rune; Schultheiss, Sebastian J.; Osborne, Edward J.; Sreedharan, Vipin T.; Kahles, André; Bohnert, Regina; Jean, Géraldine; Derwent, Paul; Kersey, Paul; Belfield, Eric J.; Harberd, Nicholas P.; Kemen, Eric; Toomajian, Christopher; Kover, Paula X.; Clark, Richard M.; Rätsch, Gunnar; Mott, Richard

    2011-01-01

    Genetic differences between Arabidopsis thaliana accessions underlie the plant's extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions. ©2011 Macmillan Publishers Limited. All rights reserved.

  6. Consequences of normalizing transcriptomic and genomic libraries of plant genomes using a duplex-specific nuclease and tetramethylammonium chloride.

    Science.gov (United States)

    Matvienko, Marta; Kozik, Alexander; Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard

    2013-01-01

    Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce.

  7. Consequences of normalizing transcriptomic and genomic libraries of plant genomes using a duplex-specific nuclease and tetramethylammonium chloride.

    Directory of Open Access Journals (Sweden)

    Marta Matvienko

    Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce.

  8. Transcriptome

    Science.gov (United States)

    ... Also: Talking Glossary of Genetic Terms Definitions for genetic terms used on this page En Español: Transcriptoma Transcriptome What is a transcriptome? What can a transcriptome tell us? How can transcriptome data be used to explore gene function? What is ...

  9. Comparative genomics reveals conservative evolution of the xylem transcriptome in vascular plants.

    Science.gov (United States)

    Li, Xinguo; Wu, Harry X; Southerton, Simon G

    2010-06-21

    Wood is a valuable natural resource and a major carbon sink. Wood formation is an important developmental process in vascular plants which played a crucial role in plant evolution. Although genes involved in xylem formation have been investigated, the molecular mechanisms of xylem evolution are not well understood. We use comparative genomics to examine evolution of the xylem transcriptome to gain insights into xylem evolution. The xylem transcriptome is highly conserved in conifers, but considerably divergent in angiosperms. The functional domains of genes in the xylem transcriptome are moderately to highly conserved in vascular plants, suggesting the existence of a common ancestral xylem transcriptome. Compared to the total transcriptome derived from a range of tissues, the xylem transcriptome is relatively conserved in vascular plants. Of the xylem transcriptome, cell wall genes, ancestral xylem genes, known proteins and transcription factors are relatively more conserved in vascular plants. A total of 527 putative xylem orthologs were identified, which are unevenly distributed across the Arabidopsis chromosomes with eight hot spots observed. Phylogenetic analysis revealed that evolution of the xylem transcriptome has paralleled plant evolution. We also identified 274 conifer-specific xylem unigenes, all of which are of unknown function. These xylem orthologs and conifer-specific unigenes are likely to have played a crucial role in xylem evolution. Conifers have highly conserved xylem transcriptomes, while angiosperm xylem transcriptomes are relatively diversified. Vascular plants share a common ancestral xylem transcriptome. The xylem transcriptomes of vascular plants are more conserved than the total transcriptomes. Evolution of the xylem transcriptome has largely followed the trend of plant evolution.

  10. Transcriptomic and genomic features of invasive lobular breast cancer.

    Science.gov (United States)

    Desmedt, Christine; Zoppoli, Gabriele; Sotiriou, Christos; Salgado, Roberto

    2017-06-01

    Accounting for 10-15% of all breast neoplasms, invasive lobular breast cancer (ILC) is the second most common histological subtype of breast cancer after invasive ductal breast cancer (IDC). Understanding ILC biology, which differs from IDC in terms of clinical presentation, treatment response, relapse timing and patterns, is essential in order to adopt novel, disease-specific management strategies. While the contribution of the histological subtypes to tumour biology has been poorly investigated and acknowledged in the past, recently several major, independent efforts have led to the assembly and molecular characterization of well-annotated ILC case sets. In this review, we provide a critical overview of the literature exploring ILC through comprehensive and multiomic methods. The first part specifically focuses on ILC transcriptomic features by reviewing the intrinsic molecular subtypes, the application of gene expression scores for the prediction of recurrence, and the identification of gene expression subtypes. The second part describes the main research efforts that led to the identification of the genomic landscape of ILC, with a special focus on findings that differentiate ILC from IDC and carry potential clinical relevance. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. Plasmodium vivax Biology: Insights Provided by Genomics, Transcriptomics and Proteomics

    Science.gov (United States)

    Bourgard, Catarina; Albrecht, Letusa; Kayano, Ana C. A. V.; Sunnerhagen, Per; Costa, Fabio T. M.

    2018-01-01

    During the last decade, the vast omics field has revolutionized biological research, especially the genomics, transcriptomics and proteomics branches, as technological tools become available to the field researcher and allow difficult question-driven studies to be addressed. Parasitology has greatly benefited from next generation sequencing (NGS) projects, which have resulted in a broadened comprehension of basic parasite molecular biology, ecology and epidemiology. Malariology is one example where application of this technology has greatly contributed to a better understanding of Plasmodium spp. biology and host-parasite interactions. Among the several parasite species that cause human malaria, the neglected Plasmodium vivax presents great research challenges, as in vitro culturing is not yet feasible and functional assays are heavily limited. Therefore, there are gaps in our P. vivax biology knowledge that affect decisions for control policies aiming to eradicate vivax malaria in the near future. In this review, we provide a snapshot of key discoveries already achieved in P. vivax sequencing projects, focusing on developments, hurdles, and limitations currently faced by the research community, as well as perspectives on future vivax malaria research. PMID:29473024

  12. Development of genome- and transcriptome-derived microsatellites in related species of snapping shrimps with highly duplicated genomes.

    Science.gov (United States)

    Gaynor, Kaitlyn M; Solomon, Joseph W; Siller, Stefanie; Jessell, Linnet; Duffy, J Emmett; Rubenstein, Dustin R

    2017-11-01

    Molecular markers are powerful tools for studying patterns of relatedness and parentage within populations and for making inferences about social evolution. However, the development of molecular markers for simultaneous study of multiple species presents challenges, particularly when species exhibit genome duplication or polyploidy. We developed microsatellite markers for Synalpheus shrimp, a genus in which species exhibit not only great variation in social organization, but also interspecific variation in genome size and partial genome duplication. From the four primary clades within Synalpheus, we identified microsatellites in the genomes of four species and in the consensus transcriptome of two species. Ultimately, we designed and tested primers for 143 microsatellite markers across 25 species. Although the majority of markers were disomic, many markers were polysomic for certain species. Surprisingly, we found no relationship between genome size and the number of polysomic markers. As expected, markers developed for a given species amplified better for closely related species than for more distant relatives. Finally, the markers developed from the transcriptome were more likely to work successfully and to be disomic than those developed from the genome, suggesting that consensus transcriptomes are likely to be conserved across species. Our findings suggest that the transcriptome, particularly consensus sequences from multiple species, can be a valuable source of molecular markers for taxa with complex, duplicated genomes. © 2017 John Wiley & Sons Ltd.

  13. Tools for the Validation of Genomes and Transcriptomes with Proteomics data

    DEFF Research Database (Denmark)

    Pang, Chi Nam Ignatius; Aya, Carlos; Tay, Aidan

    data generated from protein mass spectrometry. We are developing a set of tools which allow users to: • Co-visualise genomics, transcriptomics, and proteomics data using the Integrative Genomics Viewer (IGV). • Validate the existence of genes and mRNAs using peptides identified from mass spectrometry...

  14. Transcriptome complexity in a genome-reduced bacterium

    DEFF Research Database (Denmark)

    Güell, Marc; van Noort, Vera; Yus, Eva

    2009-01-01

    To study basic principles of transcriptome organization in bacteria, we analyzed one of the smallest self-replicating organisms, Mycoplasma pneumoniae. We combined strand-specific tiling arrays, complemented by transcriptome sequencing, with more than 252 spotted arrays. We detected 117 previousl...

  15. Draft genomes and reference transcriptomes extend the coding potential of the fish pathogen Piscirickettsia salmonis

    Directory of Open Access Journals (Sweden)

    Angela D. Millar

    2018-05-01

    Background: Draft and complete genome sequences from bacteria are key tools to understand genetic determinants involved in pathogenesis in several disease models. Piscirickettsia salmonis is a Gram-negative bacterium responsible for the Salmon Rickettsial Syndrome (SRS), a bacterial disease that threatens the sustainability of the Chilean salmon industry. In previous reports, complete and draft genome sequences have been generated and annotated. However, the lack of transcriptome data leads to underestimation of the genetic potential, provides no information about transcriptional units, and contributes to the dissemination of annotation errors. Results: Here we present the draft genome and transcriptome sequences of four P. salmonis strains. We have identified the transcriptional architecture of previously characterized virulence factors and trait-specific genes associated with cation uptake, metal efflux, antibiotic resistance, secretion systems and other virulence factors. Conclusions: These data have provided a refined genome annotation and also new insights on the transcriptional structures and coding potential of this fish pathogen. How to cite: Millar AD, Tapia P, Gomez FA, et al. Draft genomes and reference transcriptomes extend the coding potential of the fish pathogen Piscirickettsia salmonis. Electron J Biotechnol 2018;33. https://doi.org/10.1016/j.ejbt.2018.04.002. Keywords: Bacterial genomes, Coding potential, Comparative analysis, Draft genome, Piscirickettsia salmonis, Reference transcriptome, Refined annotation, Salmon Rickettsial Syndrome, Salmonids

  16. CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics

    OpenAIRE

    Verma, Mohit; Kumar, Vinay; Patel, Ravi K.; Garg, Rohini; Jain, Mukesh

    2015-01-01

    Chickpea is an important grain legume used as a rich source of protein in human diet. The narrow genetic diversity and limited availability of genomic resources are the major constraints in implementing breeding strategies and biotechnological interventions for genetic enhancement of chickpea. We developed an integrated Chickpea Transcriptome Database (CTDB), which provides the comprehensive web interface for visualization and easy retrieval of transcriptome data in chickpea. The database fea...

  17. Marine Genomics: A clearing-house for genomic and transcriptomic data of marine organisms

    Directory of Open Access Journals (Sweden)

    Trent Harold F

    2005-03-01

    Abstract Background The Marine Genomics project is a functional genomics initiative developed to provide a pipeline for the curation of Expressed Sequence Tags (ESTs) and gene expression microarray data for marine organisms. It provides a unique clearing-house for marine specific EST and microarray data and is currently available at http://www.marinegenomics.org. Description The Marine Genomics pipeline automates the processing, maintenance, storage and analysis of EST and microarray data for an increasing number of marine species. It currently contains 19 species databases (over 46,000 EST sequences) that are maintained by registered users from local and remote locations in Europe and South America in addition to the USA. A collection of analysis tools is implemented. These include a pipeline upload tool for EST FASTA files, sequence trace files and microarray data, an annotative text search, automated sequence trimming, sequence quality control (QA/QC) editing, sequence BLAST capabilities and a tool for interactive submission to GenBank. Another feature of this resource is the integration with a scientific computing analysis environment implemented in MATLAB. Conclusion The conglomeration of multiple marine organisms with integrated analysis tools enables users to focus on the comprehensive descriptions of transcriptomic responses to typical marine stresses. This cross species data comparison and integration enables users to contain their research within a marine-oriented data management and analysis environment.

  18. De novo assembling and primary analysis of genome and transcriptome of gray whale Eschrichtius robustus.

    Science.gov (United States)

    Moskalev, Alexey A; Kudryavtseva, Anna V; Graphodatsky, Alexander S; Beklemisheva, Violetta R; Serdyukova, Natalya A; Krutovsky, Konstantin V; Sharov, Vadim V; Kulakovskiy, Ivan V; Lando, Andrey S; Kasianov, Artem S; Kuzmin, Dmitry A; Putintseva, Yuliya A; Feranchuk, Sergey I; Shaposhnikov, Mikhail V; Fraifeld, Vadim E; Toren, Dmitri; Snezhkina, Anastasia V; Sitnik, Vasily V

    2017-12-28

    The gray whale, Eschrichtius robustus (E. robustus), is the sole member of the family Eschrichtiidae, which is considered to be the most primitive in the class Cetacea. The gray whale is often described as a "living fossil". It is adapted to extreme marine conditions and has a high life expectancy (77 years). The assembly of a gray whale genome and transcriptome will enable further studies of whale evolution, longevity, and resistance to extreme environments. In this work, we report the first de novo assembly and primary analysis of the E. robustus genome and transcriptome based on kidney and liver samples. The presented draft genome assembly is 55% complete in terms of total genome length, but only 24% complete in terms of BUSCO complete gene groups, although 10,895 genes were identified. Transcriptome annotation and comparison with other whale species revealed robust expression of DNA repair and hypoxia-response genes, which is expected for whales. This preliminary study of the gray whale genome and transcriptome provides new data to better understand whale evolution and the mechanisms of adaptation to hypoxic conditions.

  19. A genome resource to address mechanisms of developmental programming: determination of the fetal sheep heart transcriptome.

    Science.gov (United States)

    Cox, Laura A; Glenn, Jeremy P; Spradling, Kimberly D; Nijland, Mark J; Garcia, Roy; Nathanielsz, Peter W; Ford, Stephen P

    2012-06-15

    The pregnant sheep has provided seminal insights into reproduction related to animal and human development (ovarian function, fertility, implantation, fetal growth, parturition and lactation). Fetal sheep physiology has been extensively studied since 1950, contributing significantly to the basis for our understanding of many aspects of fetal development and behaviour that remain in use in clinical practice today. Understanding mechanisms requires the combination of systems approaches uniquely available in fetal sheep with the power of genomic studies. Absence of the full range of sheep genomic resources has limited the full realization of the power of this model, impeding progress in emerging areas of pregnancy biology such as developmental programming. We have examined the expressed fetal sheep heart transcriptome using high-throughput sequencing technologies. In so doing we identified 36,737 novel transcripts and describe genes, gene variants and pathways relevant to fundamental developmental mechanisms. Genes with the highest expression levels and with novel exons in the fetal heart transcriptome are known to play central roles in muscle development. We show that high-throughput sequencing methods can generate extensive transcriptome information in the absence of an assembled and annotated genome for that species. The gene sequence data obtained provide a unique genomic resource for sheep specific genetic technology development and, combined with the polymorphism data, augment annotation and assembly of the sheep genome. In addition, identification and pathway analysis of novel fetal sheep heart transcriptome splice variants is a first step towards revealing mechanisms of genetic variation and gene environment interactions during fetal heart development.

  20. Improving amphibian genomic resources: a multitissue reference transcriptome of an iconic invader.

    Science.gov (United States)

    Richardson, Mark F; Sequeira, Fernando; Selechnik, Daniel; Carneiro, Miguel; Vallinoto, Marcelo; Reid, Jack G; West, Andrea J; Crossland, Michael R; Shine, Richard; Rollins, Lee A

    2018-01-01

    Cane toads (Rhinella marina) are an iconic invasive species introduced to 4 continents and well utilized for studies of rapid evolution in introduced environments. Despite the long introduction history of this species, its profound ecological impacts, and its utility for demonstrating evolutionary principles, genetic information is sparse. Here we produce a de novo transcriptome spanning multiple tissues and life stages to enable investigation of the genetic basis of previously identified rapid phenotypic change over the introduced range. Using approximately 1.9 billion reads from developing tadpoles and 6 adult tissue-specific cDNA libraries, as well as a transcriptome assembly pipeline encompassing 100 separate de novo assemblies, we constructed 62 202 transcripts, of which we functionally annotated ∼50%. Our transcriptome assembly exhibits 90% full-length completeness of the Benchmarking Universal Single-Copy Orthologs data set. Robust assembly metrics and comparisons with several available anuran transcriptomes and genomes indicate that our cane toad assembly is one of the most complete anuran genomic resources available. This comprehensive anuran transcriptome will provide a valuable resource for investigation of genes under selection during invasion in cane toads, but will also greatly expand our general knowledge of anuran genomes, which are underrepresented in the literature. The data set is publicly available in NCBI and GigaDB to serve as a resource for other researchers. © The Authors 2017. Published by Oxford University Press.

  1. [Genomics and transcriptomics of the Chinese liver fluke Clonorchis sinensis (Opisthorchiidae, Trematoda)].

    Science.gov (United States)

    Chelomina, G N

    2017-01-01

    The review summarizes the results of first genomic and transcriptomic investigations of the liver fluke Clonorchis sinensis (Opisthorchiidae, Trematoda). The studies mark the dawn of the genomic era for opisthorchiids, which cause severe hepatobiliary diseases in humans and animals. Their results aided in understanding the molecular mechanisms of adaptation to parasitism, parasite survival in mammalian biliary tracts, and genome dynamics in the individual development and the development of parasite-host relationships. Special attention is paid to the achievements in studying the codon usage bias and the roles of mobile genetic elements (MGEs) and small interfering RNAs (siRNAs). Interspecific comparisons at the genomic and transcriptomic levels revealed molecular differences, which may contribute to understanding the specialized niches and physiological needs of the respective species. The studies in C. sinensis provide a basis for further basic and applied research in liver flukes and, in particular, the development of efficient means to prevent, diagnose, and treat clonorchiasis.

  2. Comparative genomics and transcriptomics of lineages I, II, and III strains of Listeria monocytogenes

    Directory of Open Access Journals (Sweden)

    Hain Torsten

    2012-04-01

    Abstract Background Listeria monocytogenes is a food-borne pathogen that causes infections with a high mortality rate and has served as an invaluable model for intracellular parasitism. Here, we report complete genome sequences for two L. monocytogenes strains belonging to serotype 4a (L99) and 4b (CLIP80459), and transcriptomes of representative strains from lineages I, II, and III, thereby permitting in-depth comparison of genome- and transcriptome-based data from three lineages of L. monocytogenes. Lineage III, represented by the 4a L99 genome, is known to contain strains less virulent for humans. Results The genome analysis of the weakly pathogenic L99 serotype 4a provides extensive evidence of virulence gene decay, including loss of several important surface proteins. The 4b CLIP80459 genome, unlike the previously sequenced 4b F2365 genome, harbours an intact inlB invasion gene. These lineage I strains are characterized by the lack of prophage genes, as they share only a single prophage locus with other L. monocytogenes genomes 1/2a EGD-e and 4a L99. Comparative transcriptome analysis during intracellular growth uncovered adaptive expression level differences in lineages I, II and III of Listeria, notable amongst which was a strong intracellular induction of flagellar genes in strain 4a L99 compared to the other lineages. Furthermore, extensive differences between strains are manifest at levels of metabolic flux control and phosphorylated sugar uptake. Intriguingly, prophage gene expression was found to be a hallmark of intracellular gene expression. Deletion mutants in the single shared prophage locus of lineage II strain EGD-e 1/2a, the lma operon, revealed severe attenuation of virulence in a murine infection model. Conclusion Comparative genomics and transcriptome analysis of L. monocytogenes strains from three lineages implicate prophage genes in intracellular adaptation and indicate that gene loss and decay may have led to the emergence

  3. De novo Transcriptome Assemblies of Rana (Lithobates) catesbeiana and Xenopus laevis Tadpole Livers for Comparative Genomics without Reference Genomes.

    Directory of Open Access Journals (Sweden)

    Inanc Birol

    In this work we studied the liver transcriptomes of two frog species, the American bullfrog (Rana (Lithobates) catesbeiana) and the African clawed frog (Xenopus laevis). We used high throughput RNA sequencing (RNA-seq) data to assemble and annotate these transcriptomes, and compared how their baseline expression profiles change when tadpoles of the two species are exposed to thyroid hormone. We generated more than 1.5 billion RNA-seq reads in total for the two species under two conditions as treatment/control pairs. We de novo assembled these reads using Trans-ABySS to reconstruct reference transcriptomes, obtaining over 350,000 and 130,000 putative transcripts for R. catesbeiana and X. laevis, respectively. Using available genomics resources for X. laevis, we annotated over 97% of our X. laevis transcriptome contigs, demonstrating the utility and efficacy of our methodology. Leveraging this validated analysis pipeline, we also annotated the assembled R. catesbeiana transcriptome. We used the expression profiles of the annotated genes of the two species to examine the similarities and differences between the tadpole liver transcriptomes. We also compared the gene ontology terms of expressed genes to measure how the animals react to a challenge by thyroid hormone. Our study reports three main conclusions. First, de novo assembly of RNA-seq data is a powerful method for annotating and establishing transcriptomes of non-model organisms. Second, the liver transcriptomes of the two frog species, R. catesbeiana and X. laevis, show many common features, and the distributions of their gene ontology profiles are statistically indistinguishable. Third, although they broadly respond the same way to the presence of thyroid hormone in their environment, their receptor/signal transduction pathways display marked differences.

  4. Next generation transcriptomics and genomics elucidate biological complexity of microglia in health and disease

    NARCIS (Netherlands)

    Wes, Paul D; Holtman, Inge R; Boddeke, Erik W G M; Möller, Thomas; Eggen, Bart J L

    2015-01-01

    Genome-wide expression profiling technology has resulted in detailed transcriptome data for a wide range of tissues, conditions and diseases. In neuroscience, expression datasets were mostly generated using whole brain tissue samples, resulting in data from a mixture of cell types, including glial

  5. Transcriptome and genome size analysis of the venus flytrap

    DEFF Research Database (Denmark)

    Jensen, Michael Krogh; Vogt, Josef Korbinian; Bressendorff, Simon

    2015-01-01

    . muscipula flowers and traps. Using the Oases transcriptome assembler 79,165,657 quality trimmed reads were assembled into 80,806 cDNA contigs, with an average length of 679 bp and an N50 length of 1,051 bp. A total of 17,047 unique proteins were identified, and assigned to Gene Ontology (GO) and classified...
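
    The average length and N50 quoted for the assembled contigs are simple summary statistics over the contig lengths. The following is a minimal sketch of how such figures are commonly computed; the function name and the toy lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths, for illustration only.
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)
toy_mean = sum(toy_lengths) / len(toy_lengths)
```

    Because N50 is weighted by length, it is typically larger than the mean contig length, as it is for the figures reported above (1,051 bp versus 679 bp).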

  6. Genome-scale neurogenetics: methodology and meaning.

    Science.gov (United States)

    McCarroll, Steven A; Feng, Guoping; Hyman, Steven E

    2014-06-01

    Genetic analysis is currently offering glimpses into molecular mechanisms underlying such neuropsychiatric disorders as schizophrenia, bipolar disorder and autism. After years of frustration, success in identifying disease-associated DNA sequence variation has followed from new genomic technologies, new genome data resources, and global collaborations that could achieve the scale necessary to find the genes underlying highly polygenic disorders. Here we describe early results from genome-scale studies of large numbers of subjects and the emerging significance of these results for neurobiology.

  7. Transcriptome sequencing and whole genome expression profiling of chrysanthemum under dehydration stress

    Science.gov (United States)

    2013-01-01

    Background Chrysanthemum is one of the most important ornamental crops in the world and drought stress seriously limits its production and distribution. In order to generate a functional genomics resource and obtain a deeper understanding of the molecular mechanisms regarding chrysanthemum responses to dehydration stress, we performed large-scale transcriptome sequencing of chrysanthemum plants under dehydration stress using the Illumina sequencing technology. Results Two cDNA libraries constructed from mRNAs of control and dehydration-treated seedlings were sequenced by Illumina technology. A total of more than 100 million reads were generated and de novo assembled into 98,180 unique transcripts which were further extensively annotated by comparing their sequences to different protein databases. Biochemical pathways were predicted from these transcript sequences. Furthermore, we performed gene expression profiling analysis upon dehydration treatment in chrysanthemum and identified 8,558 dehydration-responsive unique transcripts, including 307 transcription factors and 229 protein kinases and many well-known stress responsive genes. Gene ontology (GO) term enrichment and biochemical pathway analyses showed that dehydration stress caused changes in hormone response, secondary and amino acid metabolism, and light and photoperiod response. These findings suggest that drought tolerance of chrysanthemum plants may be related to the regulation of hormone biosynthesis and signaling, reduction of oxidative damage, stabilization of cell proteins and structures, and maintenance of energy and carbon supply. Conclusions Our transcriptome sequences can provide a valuable resource for chrysanthemum breeding and research and novel insights into chrysanthemum responses to dehydration stress and offer candidate genes or markers that can be used to guide future studies attempting to breed drought tolerant chrysanthemum cultivars. PMID:24074255

  8. Genomics, transcriptomics and proteomics to elucidate the pathogenesis of rheumatoid arthritis.

    Science.gov (United States)

    Song, Xinqiang; Lin, Qingsong

    2017-08-01

    Rheumatoid arthritis is an autoimmune disease that affects several organs and tissues, predominantly the synovial joints. The pathogenesis of this disease is not completely understood, and it may involve genomic variations, gene expression, protein translation and post-translational modifications. These system variations in genomics, transcriptomics and proteomics are dynamic in nature and their crosstalk is overwhelmingly complex, thus analyzing them separately may not be very informative. However, various '-omics' techniques developed in recent years have opened up new possibilities for clarifying disease pathways and thereby facilitating early diagnosis and specific therapies. This review examines how recent advances in the fields of genomics, transcriptomics and proteomics have contributed to our understanding of rheumatoid arthritis.

  9. CoryneCenter – An online resource for the integrated analysis of corynebacterial genome and transcriptome data

    Directory of Open Access Journals (Sweden)

    Hüser Andrea T

    2007-11-01

    Full Text Available Abstract Background The introduction of high-throughput genome sequencing and post-genome analysis technologies, e.g. DNA microarray approaches, has created the potential to unravel and scrutinize complex gene-regulatory networks on a large scale. The discovery of transcriptional regulatory interactions has become a major topic in modern functional genomics. Results To facilitate the analysis of gene-regulatory networks, we have developed CoryneCenter, a web-based resource for the systematic integration and analysis of genome, transcriptome, and gene regulatory information for prokaryotes, especially corynebacteria. For this purpose, we extended and combined the following systems into a common platform: (1) GenDB, an open source genome annotation system, (2) EMMA, a MAGE compliant application for high-throughput transcriptome data storage and analysis, and (3) CoryneRegNet, an ontology-based data warehouse designed to facilitate the reconstruction and analysis of gene regulatory interactions. We demonstrate the potential of CoryneCenter by means of an application example. Using microarray hybridization data, we compare the gene expression of Corynebacterium glutamicum under acetate and glucose feeding conditions: Known regulatory networks are confirmed, but moreover CoryneCenter points out additional regulatory interactions. Conclusion CoryneCenter provides more than the sum of its parts. Its novel analysis and visualization features significantly simplify the process of obtaining new biological insights into complex regulatory systems. Although the platform currently focusses on corynebacteria, the integrated tools are by no means restricted to these species, and the presented approach offers a general strategy for the analysis and verification of gene regulatory networks. CoryneCenter provides freely accessible projects with the underlying genome annotation, gene expression, and gene regulation data. The system is publicly available at http://www.CoryneCenter.de.

  10. Babelomics: advanced functional profiling of transcriptomics, proteomics and genomics experiments

    Science.gov (United States)

    Al-Shahrour, Fátima; Carbonell, José; Minguez, Pablo; Goetz, Stefan; Conesa, Ana; Tárraga, Joaquín; Medina, Ignacio; Alloza, Eva; Montaner, David; Dopazo, Joaquín

    2008-01-01

    We present a new version of Babelomics, a complete suite of web tools for the functional profiling of genome scale experiments, with new and improved methods as well as more types of functional definitions. Babelomics includes different flavours of conventional functional enrichment methods as well as more advanced gene set analysis methods that make it a unique tool among the similar resources available. In addition to the well-known functional definitions (GO, KEGG), Babelomics includes new ones such as Biocarta pathways or text mining-derived functional terms. Regulatory modules implemented include transcriptional control (Transfac, CisRed) and other levels of regulation such as miRNA-mediated interference. Moreover, Babelomics allows for sub-selection of terms in order to test more focused hypotheses. Also gene annotation correspondence tables can be imported, which allows testing with user-defined functional modules. Finally, a tool for the ‘de novo’ functional annotation of sequences has been included in the system. This allows using yet unannotated organisms in the program. Babelomics has been extensively re-engineered and now it includes the use of web services and Web 2.0 technology features, a new user interface with persistent sessions and a new extended database of gene identifiers. Babelomics is available at http://www.babelomics.org PMID:18515841
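
    For readers unfamiliar with conventional functional enrichment, the sketch below shows one widely used formulation, a one-sided hypergeometric test of whether a functional term is over-represented in a selected gene list. It is an illustration only: the abstract does not say which statistics Babelomics implements, and the function name and toy numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """One-sided hypergeometric p-value for term over-representation.

    population: number of genes in the background set
    annotated:  background genes carrying the functional term
    selected:   genes in the list of interest
    overlap:    selected genes that carry the term
    Returns P(X >= overlap) when `selected` genes are drawn at random.
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy numbers: 10 background genes, 5 annotated with the term,
# 5 genes selected, all 5 of them annotated.
toy_p = enrichment_p_value(population=10, annotated=5, selected=5, overlap=5)
```

    In practice such a test is run once per term, so the resulting p-values would normally be corrected for multiple testing before interpretation.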

  11. A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica

    Directory of Open Access Journals (Sweden)

    Krishnan Neeraja M

    2012-09-01

    Full Text Available Abstract Background The Azadirachta indica (neem) tree is a source of a wide number of natural products, including the potent biopesticide azadirachtin. In spite of its widespread applications in agriculture and medicine, the molecular aspects of the biosynthesis of neem terpenoids remain largely unexplored. The current report describes the draft genome and four transcriptomes of A. indica and attempts to contextualise the sequence information in terms of its molecular phylogeny, transcript expression and terpenoid biosynthesis pathways. A. indica is the first member of the family Meliaceae to be sequenced using next generation sequencing approach. Results The genome and transcriptomes of A. indica were sequenced using multiple sequencing platforms and libraries. The A. indica genome is AT-rich, bears few repetitive DNA elements and comprises about 20,000 genes. The molecular phylogenetic analyses grouped A. indica together with Citrus sinensis from the Rutaceae family validating its conventional taxonomic classification. Comparative transcript expression analysis showed either exclusive or enhanced expression of known genes involved in neem terpenoid biosynthesis pathways compared to other sequenced angiosperms. Genome and transcriptome analyses in A. indica led to the identification of repeat elements, nucleotide composition and expression profiles of genes in various organs. Conclusions This study on A. indica genome and transcriptomes will provide a model for characterization of metabolic pathways involved in synthesis of bioactive compounds, comparative evolutionary studies among various Meliaceae family members and help annotate their genomes. A better understanding of molecular pathways involved in the azadirachtin synthesis in A. indica will pave ways for bulk production of environment friendly biopesticides.

  12. A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica

    Science.gov (United States)

    2012-01-01

    Background The Azadirachta indica (neem) tree is a source of a wide number of natural products, including the potent biopesticide azadirachtin. In spite of its widespread applications in agriculture and medicine, the molecular aspects of the biosynthesis of neem terpenoids remain largely unexplored. The current report describes the draft genome and four transcriptomes of A. indica and attempts to contextualise the sequence information in terms of its molecular phylogeny, transcript expression and terpenoid biosynthesis pathways. A. indica is the first member of the family Meliaceae to be sequenced using next generation sequencing approach. Results The genome and transcriptomes of A. indica were sequenced using multiple sequencing platforms and libraries. The A. indica genome is AT-rich, bears few repetitive DNA elements and comprises about 20,000 genes. The molecular phylogenetic analyses grouped A. indica together with Citrus sinensis from the Rutaceae family validating its conventional taxonomic classification. Comparative transcript expression analysis showed either exclusive or enhanced expression of known genes involved in neem terpenoid biosynthesis pathways compared to other sequenced angiosperms. Genome and transcriptome analyses in A. indica led to the identification of repeat elements, nucleotide composition and expression profiles of genes in various organs. Conclusions This study on A. indica genome and transcriptomes will provide a model for characterization of metabolic pathways involved in synthesis of bioactive compounds, comparative evolutionary studies among various Meliaceae family members and help annotate their genomes. A better understanding of molecular pathways involved in the azadirachtin synthesis in A. indica will pave ways for bulk production of environment friendly biopesticides. PMID:22958331

  13. Genome and Transcriptome Sequencing of the Ostreid herpesvirus 1 From Tomales Bay, California

    Science.gov (United States)

    Burge, C. A.; Langevin, S.; Closek, C. J.; Roberts, S. B.; Friedman, C. S.

    2016-02-01

    Mass mortalities of larval and seed bivalve molluscs attributed to the Ostreid herpesvirus 1 (OsHV-1) occur globally. OsHV-1 was fully sequenced and characterized as a member of the Family Malacoherpesviridae. Multiple strains of OsHV-1 exist and may vary in virulence, i.e. OsHV-1 µvar. For most global variants of OsHV-1, sequence data is limited to PCR-based sequencing of segments, including two recent genomes. In the United States, OsHV-1 is limited to detection in adjacent embayments in California, Tomales and Drakes bays. Limited DNA sequence data of OsHV-1 infecting oysters in Tomales Bay indicates the virus detected in Tomales Bay is similar but not identical to any one global variant of OsHV-1. In order to better understand both strain variation and virulence of OsHV-1 infecting oysters in Tomales Bay, we used genomic and transcriptomic sequencing. Meta-genomic sequencing (Illumina MiSeq) was conducted from infected oysters (n=4 per year) collected in 2003, 2007, and 2014, where full OsHV-1 genome sequences and low overall microbial diversity were achieved from highly infected oysters. Increased microbial diversity was detected in three of four samples sequenced from 2003, where qPCR based genome copy numbers of OsHV-1 were lower. Expression analysis (SOLiD RNA sequencing) of OsHV-1 genes expressed in oyster larvae at 24 hours post exposure revealed a nearly complete transcriptome, with several highly expressed genes, which are similar to recent transcriptomic analyses of other OsHV-1 variants. Taken together, our results indicate that genome and transcriptome sequencing may be powerful tools in understanding both strain variation and virulence of non-culturable marine viruses.

  14. Analysis Of Transcriptomes In A Porcine Tissue Collection Using RNA-Seq And Genome Assembly 10

    DEFF Research Database (Denmark)

    Hornshøj, Henrik; Thomsen, Bo; Hedegaard, Jakob

    2011-01-01

    The release of Sus scrofa genome assembly 10 supports improvement of the pig genome annotation and in depth transcriptome analyses using next-generation sequencing technologies. In this study we analyze RNA-seq reads from a tissue collection, including 10 separate tissues from Duroc boars and 10...... short read alignment software we mapped the reads to the genome assembly 10. We extracted contig sequences of gene transcripts using the Cufflinks software. Based on this information we identified expressed genes that are present in the genome assembly. The portion of these genes being previously known...... was roughly estimated by sequence comparison to known genes. Similarly, we searched for genes that are expressed in the tissues but not present in the genome assembly by aligning the non-genome-mapped reads to known gene transcripts. For the genes predicted to have alternative transcript variants by Cufflinks...

  15. The genome and transcriptome of perennial ryegrass mitochondria

    DEFF Research Database (Denmark)

    Islam, Md. Shofiqul; Studer, Bruno; Byrne, Stephen

    2013-01-01

    Background: Perennial ryegrass (Lolium perenne L.) is one of the most important forage and turf grass species of temperate regions worldwide. Its mitochondrial genome is inherited maternally and contains genes that can influence traits of agricultural importance. Moreover, the DNA sequence...... and annotation of the complete mitochondrial genome from perennial ryegrass. Results: Intact mitochondria from perennial ryegrass leaves were isolated and used for mtDNA extraction. The mitochondrial genome was sequenced to a 167-fold coverage using the Roche 454 GS-FLX Titanium platform, and assembled...... of mitochondrial genomes has been established and compared for a large number of species in order to characterize evolutionary relationships.Therefore, it is crucial to understand the organization of the mitochondrial genome and how it varies between and within species. Here, we report the first de novo assembly...

  16. Soybean (Glycine max) SWEET gene family: insights through comparative genomics, transcriptome profiling and whole genome re-sequence analysis.

    Science.gov (United States)

    Patil, Gunvant; Valliyodan, Babu; Deshmukh, Rupesh; Prince, Silvas; Nicander, Bjorn; Zhao, Mingzhe; Sonah, Humira; Song, Li; Lin, Li; Chaudhary, Juhi; Liu, Yang; Joshi, Trupti; Xu, Dong; Nguyen, Henry T

    2015-07-11

    SWEET (MtN3_saliva) domain proteins, a recently identified group of efflux transporters, play an indispensable role in sugar efflux, phloem loading, plant-pathogen interaction and reproductive tissue development. The SWEET gene family is predominantly studied in Arabidopsis and members of the family are being investigated in rice. To date, no transcriptome or genomics analysis of soybean SWEET genes has been reported. In the present investigation, we explored the evolutionary aspect of the SWEET gene family in diverse plant species including primitive single cell algae to angiosperms with a major emphasis on Glycine max. Evolutionary features showed expansion and duplication of the SWEET gene family in land plants. Homology searches with BLAST tools and Hidden Markov Model-directed sequence alignments identified 52 SWEET genes that were mapped to 15 chromosomes in the soybean genome as tandem duplication events. Soybean SWEET (GmSWEET) genes showed a wide range of expression profiles in different tissues and developmental stages. Analysis of public transcriptome data and expression profiling using quantitative real time PCR (qRT-PCR) showed that a majority of the GmSWEET genes were confined to reproductive tissue development. Several natural genetic variants (non-synonymous SNPs, premature stop codons and haplotype) were identified in the GmSWEET genes using whole genome re-sequencing data analysis of 106 soybean genotypes. A significant association was observed between SNP-haplogroup and seed sucrose content in three gene clusters on chromosome 6. Present investigation utilized comparative genomics, transcriptome profiling and whole genome re-sequencing approaches and provided a systematic description of soybean SWEET genes and identified putative candidates with probable roles in the reproductive tissue development. Gene expression profiling at different developmental stages and genomic variation data will aid as an important resource for the soybean research

  17. Chapter 4 genomics, transcriptomics, and epigenomics in traumatic brain injury research.

    Science.gov (United States)

    Puccio, Ava M; Alexander, Sheila

    2015-01-01

    The long-term effects and significant impact of the full spectrum of traumatic brain injury (TBI) have received increased attention in recent years. Despite increased research efforts, there has been little movement toward improving outcomes for the survivors of TBI. TBI is a heterogeneous condition with a complex biological response, and significant variability in human recovery contributes to the difficulty in identifying therapeutics that improve outcomes. Personalized medicine, identifying the best course of treatment for a given individual based on individual characteristics, has great potential to improve recovery for TBI survivors. The advances in medical genetics and genomics over the past 20 years have increased our understanding of many biological processes. A substantial amount of research has focused on the genomic, transcriptomic, and epigenomic profiles in many health and disease states, including recovery from TBI. The focus of this review chapter is to describe the current state of the science in genomic, transcriptomic, and epigenomic research in the TBI population. There have been some advancements toward understanding the genomic, transcriptomic, and epigenomic processes in humans, but much of this work remains at the preclinical stage. This current evidence does improve our understanding of TBI recovery, but also serves as an excellent platform upon which to build further study toward improved outcomes for this population.

  18. Comparative whole genome transcriptome and metabolome analyses of five Klebsiella pneumonia strains.

    Science.gov (United States)

    Lee, Soojin; Kim, Borim; Yang, Jeongmo; Jeong, Daun; Park, Soohyun; Shin, Sang Heum; Kook, Jun Ho; Yang, Kap-Seok; Lee, Jinwon

    2015-11-01

    The integration of transcriptomics and metabolomics can provide precise information on gene-to-metabolite networks for identifying the function of novel genes. The goal of this study was to identify novel gene functions involved in 2,3-butanediol (2,3-BDO) biosynthesis by a comprehensive analysis of the transcriptome and metabolome of five mutated Klebsiella pneumoniae strains (∆wabG = SGSB100, ∆wabG∆budA = SGSB106, ∆wabG∆budB = SGSB107, ∆wabG∆budC = SGSB108, ∆wabG∆budABC = SGSB109). First, the transcriptomes of all five mutants were analyzed and the genes exhibiting reproducible changes in expression were determined. The transcriptome was well conserved among the five strains, and differences in gene expression occurred mainly in genes coding for 2,3-BDO biosynthesis (budA, budB, and budC) and the genes involved in the degradation of reactive oxygen, biosynthesis and transport of arginine, cysteine biosynthesis, sulfur metabolism, oxidoreductase reaction, and formate dehydrogenase reaction. Second, differences in the metabolome (estimated by carbon distribution, CO2 emission, and redox balance) among the five mutant strains due to gene alteration of the 2,3-BDO operon were detected. The functional genomics approach integrating metabolomics and transcriptomics in K. pneumoniae presented here provides an innovative means of identifying novel gene functions involved in 2,3-BDO biosynthesis metabolism and whole cell metabolism.

  19. KONAGAbase: a genomic and transcriptomic database for the diamondback moth, Plutella xylostella

    OpenAIRE

    Jouraku, Akiya; Yamamoto, Kimiko; Kuwazaki, Seigo; Urio, Masahiro; Suetsugu, Yoshitaka; Narukawa, Junko; Miyamoto, Kazuhisa; Kurita, Kanako; Kanamori, Hiroyuki; Katayose, Yuichi; Matsumoto, Takashi; Noda, Hiroaki

    2013-01-01

    Background The diamondback moth (DBM), Plutella xylostella, is one of the most harmful insect pests for crucifer crops worldwide. DBM has rapidly evolved high resistance to most conventional insecticides such as pyrethroids, organophosphates, fipronil, spinosad, Bacillus thuringiensis, and diamides. Therefore, it is important to develop genomic and transcriptomic DBM resources for analysis of genes related to insecticide resistance, both to clarify the mechanism of resistance of DBM and to fa...

  20. KONAGAbase: a genomic and transcriptomic database for the diamondback moth, Plutella xylostella.

    Science.gov (United States)

    Jouraku, Akiya; Yamamoto, Kimiko; Kuwazaki, Seigo; Urio, Masahiro; Suetsugu, Yoshitaka; Narukawa, Junko; Miyamoto, Kazuhisa; Kurita, Kanako; Kanamori, Hiroyuki; Katayose, Yuichi; Matsumoto, Takashi; Noda, Hiroaki

    2013-07-09

    The diamondback moth (DBM), Plutella xylostella, is one of the most harmful insect pests for crucifer crops worldwide. DBM has rapidly evolved high resistance to most conventional insecticides such as pyrethroids, organophosphates, fipronil, spinosad, Bacillus thuringiensis, and diamides. Therefore, it is important to develop genomic and transcriptomic DBM resources for analysis of genes related to insecticide resistance, both to clarify the mechanism of resistance of DBM and to facilitate the development of insecticides with a novel mode of action for more effective and environmentally less harmful insecticide rotation. To contribute to this goal, we developed KONAGAbase, a genomic and transcriptomic database for DBM (KONAGA is the Japanese word for DBM). KONAGAbase provides (1) transcriptomic sequences of 37,340 ESTs/mRNAs and 147,370 RNA-seq contigs which were clustered and assembled into 84,570 unigenes (30,695 contigs, 50,548 pseudo singletons, and 3,327 singletons); and (2) genomic sequences of 88,530 WGS contigs with 246,244 degenerate contigs and 106,455 singletons from which 6,310 de novo identified repeat sequences and 34,890 predicted gene-coding sequences were extracted. The unigenes and predicted gene-coding sequences were clustered and 32,800 representative sequences were extracted as a comprehensive putative gene set. These sequences were annotated with BLAST descriptions, Gene Ontology (GO) terms, and Pfam descriptions, respectively. KONAGAbase contains rich graphical user interface (GUI)-based web interfaces for easy and efficient searching, browsing, and downloading sequences and annotation data. Five useful search interfaces consisting of BLAST search, keyword search, BLAST result-based search, GO tree-based search, and genome browser are provided. KONAGAbase is publicly available from our website (http://dbm.dna.affrc.go.jp/px/) through standard web browsers. KONAGAbase provides DBM comprehensive transcriptomic and draft genomic sequences with

  1. CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics.

    Directory of Open Access Journals (Sweden)

    Mohit Verma

    Full Text Available Chickpea is an important grain legume used as a rich source of protein in human diet. The narrow genetic diversity and limited availability of genomic resources are the major constraints in implementing breeding strategies and biotechnological interventions for genetic enhancement of chickpea. We developed an integrated Chickpea Transcriptome Database (CTDB), which provides the comprehensive web interface for visualization and easy retrieval of transcriptome data in chickpea. The database features many tools for similarity search, functional annotation (putative function, PFAM domain and gene ontology) search and comparative gene expression analysis. The current release of CTDB (v2.0) hosts transcriptome datasets with high quality functional annotation from cultivated (desi and kabuli types) and wild chickpea. A catalog of transcription factor families and their expression profiles in chickpea are available in the database. The gene expression data have been integrated to study the expression profiles of chickpea transcripts in major tissues/organs and various stages of flower development. The utilities, such as similarity search, ortholog identification and comparative gene expression have also been implemented in the database to facilitate comparative genomic studies among different legumes and Arabidopsis. Furthermore, the CTDB represents a resource for the discovery of functional molecular markers (microsatellites and single nucleotide polymorphisms) between different chickpea types. We anticipate that integrated information content of this database will accelerate the functional and applied genomic research for improvement of chickpea. The CTDB web service is freely available at http://nipgr.res.in/ctdb.html.

  2. The Whole-Genome and Transcriptome of the Manila Clam (Ruditapes philippinarum).

    Science.gov (United States)

    Mun, Seyoung; Kim, Yun-Ji; Markkandan, Kesavan; Shin, Wonseok; Oh, Sumin; Woo, Jiyoung; Yoo, Jongsu; An, Hyesuck; Han, Kyudong

    2017-06-01

    The manila clam, Ruditapes philippinarum, is an important bivalve species in worldwide aquaculture including Korea. The aquaculture production of R. philippinarum is under threat from diverse environmental factors including viruses, microorganisms, parasites, and water conditions with subsequently declining production. In spite of its importance as a marine resource, the reference genome of R. philippinarum for comprehensive genetic studies is largely unexplored. Here, we report the de novo whole-genome and transcriptome assembly of R. philippinarum across three different tissues (foot, gill, and adductor muscle), and provide the basic data for advanced studies in selective breeding and disease control in order to obtain successful aquaculture systems. An approximately 2.56 Gb high quality whole-genome was assembled with various library construction methods. A total of 108,034 protein coding gene models were predicted and repetitive elements including simple sequence repeats and noncoding RNAs were identified to further understanding of the genetic background of R. philippinarum for genomics-assisted breeding. Comparative analysis with the bivalve marine invertebrates uncovered that the gene family related to complement C1q was enriched. Furthermore, we performed transcriptome analysis with three different tissues in order to support genome annotation and then identified 41,275 transcripts which were annotated. The R. philippinarum genome resource will markedly advance a wide range of potential genetic studies, a reference genome for comparative analysis of bivalve species and unraveling mechanisms of biological processes in molluscs. We believe that the R. philippinarum genome will serve as an initial platform for breeding better-quality clams using a genomic approach. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  3. Deep transcriptome sequencing provides new insights into the structural and functional organization of the wheat genome.

    Science.gov (United States)

    Pingault, Lise; Choulet, Frédéric; Alberti, Adriana; Glover, Natasha; Wincker, Patrick; Feuillet, Catherine; Paux, Etienne

    2015-02-10

    Because of its size, allohexaploid nature, and high repeat content, the bread wheat genome is a good model to study the impact of the genome structure on gene organization, function, and regulation. However, because of the lack of a reference genome sequence, such studies have long been hampered and our knowledge of the wheat gene space is still limited. The access to the reference sequence of the wheat chromosome 3B provided us with an opportunity to study the wheat transcriptome and its relationships to genome and gene structure at a level that has never been reached before. By combining this sequence with RNA-seq data, we construct a fine transcriptome map of the chromosome 3B. More than 8,800 transcription sites are identified, that are distributed throughout the entire chromosome. Expression level, expression breadth, alternative splicing as well as several structural features of genes, including transcript length, number of exons, and cumulative intron length are investigated. Our analysis reveals a non-monotonic relationship between gene expression and structure and leads to the hypothesis that gene structure is determined by its function, whereas gene expression is subject to energetic cost. Moreover, we observe a recombination-based partitioning at the gene structure and function level. Our analysis provides new insights into the relationships between gene and genome structure and function. It reveals mechanisms conserved with other plant species as well as superimposed evolutionary forces that shaped the wheat gene space, likely participating in wheat adaptation.

  4. Swine transcriptome characterization by combined Iso-Seq and RNA-seq for annotating the emerging long read-based reference genome

    Science.gov (United States)

    PacBio long-read sequencing technology is increasingly popular in genome sequence assembly and transcriptome cataloguing. Recently, a new-generation pig reference genome was assembled based on long reads from this technology. To finely annotate this genome assembly, transcriptomes of nine tissues fr...

  5. Extreme-Scale De Novo Genome Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Georganas, Evangelos [Intel Corporation, Santa Clara, CA (United States); Hofmeyr, Steven [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Egan, Rob [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Buluc, Aydin [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Oliker, Leonid [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Rokhsar, Daniel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Yelick, Katherine [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.

    2017-09-26

    De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.

  6. Analyzing AbrB-Knockout Effects through Genome and Transcriptome Sequencing of Bacillus licheniformis DW2

    Science.gov (United States)

    Shu, Cheng-Cheng; Wang, Dong; Guo, Jing; Song, Jia-Ming; Chen, Shou-Wen; Chen, Ling-Ling; Gao, Jun-Xiang

    2018-01-01

    As an industrial bacterium, Bacillus licheniformis DW2 produces bacitracin, which is an important antibiotic against many pathogenic microorganisms. Our previous study showed that AbrB-knockout could significantly increase the production of bacitracin. Accordingly, it was meaningful to understand its genome features, expression differences between wild and AbrB-knockout (ΔAbrB) strains, and the regulation of bacitracin biosynthesis. Here, we sequenced, de novo assembled and annotated its genome, and also sequenced the transcriptomes in three growth phases. The genome of DW2 contained a DNA molecule of 4,468,952 bp with 45.93% GC content and 4,717 protein coding genes. The transcriptome reads were mapped to the assembled genome, yielding 4,102∼4,536 expressed genes across the different samples. We investigated transcription changes in B. licheniformis DW2 and showed that ΔAbrB caused up-regulation and down-regulation of hundreds of genes in different growth phases. We identified a complete bacitracin synthetase gene cluster, including the location and length of bacABC, bcrABC, and bacT, as well as their arrangement. The bcrABC gene cluster was significantly up-regulated in the ΔAbrB strain, which supported the hypothesis from a previous study that bcrABC transports bacitracin out of the cell to avoid self-intoxication, and was consistent with the previous experimental result that ΔAbrB could yield more bacitracin. This study provided a high quality reference genome for B. licheniformis DW2, and the transcriptome data, which depicted global alterations across two strains and three phases, offered an understanding of AbrB regulation and bacitracin biosynthesis through gene expression. PMID:29599755
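    The GC content quoted in the abstract above is the share of G and C bases among all called bases. A minimal sketch of that calculation follows; the function name gc_percent and the toy sequence are illustrative assumptions, not taken from the study, and ambiguous bases such as N are simply excluded from the denominator.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    # Count only called bases so that N or other ambiguity codes do not dilute the ratio.
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / called


# Toy sequence (hypothetical, not from B. licheniformis DW2): 4 of 8 bases are G or C.
print(gc_percent("ATGCGCAT"))
```

    For a real genome the same function could be applied to the concatenated sequence read from a FASTA file.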

  7. Combined analysis of the chloroplast genome and transcriptome of the Antarctic vascular plant Deschampsia antarctica Desv.

    Science.gov (United States)

    Lee, Jungeun; Kang, Yoonjee; Shin, Seung Chul; Park, Hyun; Lee, Hyoungseok

    2014-01-01

    Antarctic hairgrass (Deschampsia antarctica Desv.) is the only natural grass species in the maritime Antarctic. It has been researched as an important ecological marker and as an extremophile plant for studies on stress tolerance. Despite its importance, little genomic information is available for D. antarctica. Here, we report the complete chloroplast genome, transcriptome profiles of the coding/noncoding genes, and the posttranscriptional processing by RNA editing in the chloroplast system. The complete chloroplast genome of D. antarctica is 135,362 bp in length with a typical quadripartite structure, including the large (LSC: 79,881 bp) and small (SSC: 12,519 bp) single-copy regions, separated by a pair of identical inverted repeats (IR: 21,481 bp). It contains 114 unique genes, including 81 unique protein-coding genes, 29 tRNA genes, and 4 rRNA genes. Sequence divergence analysis with other plastomes from the BEP clade of the grass family suggests a sister relationship between D. antarctica, Festuca arundinacea and Lolium perenne of the Poeae tribe, based on the whole plastome. In addition, we conducted high-resolution mapping of the chloroplast-derived transcripts. Thus, we created an expression profile for 81 protein-coding genes and identified ndhC, psbJ, rps19, psaJ, and psbA as the most highly expressed chloroplast genes. Small RNA-seq analysis identified 27 small noncoding RNAs of chloroplast origin that were preferentially located near the 5'- or 3'-ends of genes. We also found >30 RNA-editing sites in the D. antarctica chloroplast genome, with a dominance of C-to-U conversions. We assembled and characterized the complete chloroplast genome sequence of D. antarctica and investigated the features of the plastid transcriptome. These data may contribute to a better understanding of the evolution of D. antarctica within the Poaceae family for use in molecular phylogenetic studies and may also help researchers understand the characteristics of the chloroplast
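    The quadripartite structure described in this record implies a simple consistency check: the total plastome length should equal the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The sketch below uses the region lengths reported in the abstract; the function and variable names are illustrative assumptions.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 79_881             # large single-copy region
SSC = 12_519             # small single-copy region
IR = 21_481              # one inverted repeat; the plastome carries two copies
REPORTED_TOTAL = 135_362


def plastome_length(lsc: int, ssc: int, ir: int) -> int:
    """Length of a quadripartite plastome: LSC + SSC + 2 x IR."""
    return lsc + ssc + 2 * ir


total = plastome_length(LSC, SSC, IR)
print(total, total == REPORTED_TOTAL)  # prints: 135362 True
```

    The reported regions therefore account for the full 135,362 bp with no unassigned bases.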

  8. The Genomic and Transcriptomic Landscape of a HeLa Cell Line

    Science.gov (United States)

    Landry, Jonathan J. M.; Pyl, Paul Theodor; Rausch, Tobias; Zichner, Thomas; Tekkedil, Manu M.; Stütz, Adrian M.; Jauch, Anna; Aiyar, Raeka S.; Pau, Gregoire; Delhomme, Nicolas; Gagneur, Julien; Korbel, Jan O.; Huber, Wolfgang; Steinmetz, Lars M.

    2013-01-01

    HeLa is the most widely used model cell line for studying human cellular and molecular biology. To date, no genomic reference for this cell line has been released, and experiments have relied on the human reference genome. Effective design and interpretation of molecular genetic studies performed using HeLa cells require accurate genomic information. Here we present a detailed genomic and transcriptomic characterization of a HeLa cell line. We performed DNA and RNA sequencing of a HeLa Kyoto cell line and analyzed its mutational portfolio and gene expression profile. Segmentation of the genome according to copy number revealed a remarkably high level of aneuploidy and numerous large structural variants at unprecedented resolution. Some of the extensive genomic rearrangements are indicative of catastrophic chromosome shattering, known as chromothripsis. Our analysis of the HeLa gene expression profile revealed that several pathways, including cell cycle and DNA repair, exhibit significantly different expression patterns from those in normal human tissues. Our results provide the first detailed account of genomic variants in the HeLa genome, yielding insight into their impact on gene expression and cellular function as well as their origins. This study underscores the importance of accounting for the strikingly aberrant characteristics of HeLa cells when designing and interpreting experiments, and has implications for the use of HeLa as a model of human biology. PMID:23550136

  9. The Genome and Development-Dependent Transcriptomes of Pyronema confluens: A Window into Fungal Evolution

    Science.gov (United States)

    Traeger, Stefanie; Altegoer, Florian; Freitag, Michael; Gabaldon, Toni; Kempken, Frank; Kumar, Abhishek; Marcet-Houben, Marina; Pöggeler, Stefanie; Stajich, Jason E.; Nowrousian, Minou

    2013-01-01

    Fungi are a large group of eukaryotes found in nearly all ecosystems. More than 250 fungal genomes have already been sequenced, greatly improving our understanding of fungal evolution, physiology, and development. However, for the Pezizomycetes, an early-diverging lineage of filamentous ascomycetes, there is so far only one genome available, namely that of the black truffle, Tuber melanosporum, a mycorrhizal species with unusual subterranean fruiting bodies. To help close the sequence gap among basal filamentous ascomycetes, and to allow conclusions about the evolution of fungal development, we sequenced the genome and assayed transcriptomes during development of Pyronema confluens, a saprobic Pezizomycete with a typical apothecium as fruiting body. With a size of 50 Mb and ∼13,400 protein-coding genes, the genome is more characteristic of higher filamentous ascomycetes than the large, repeat-rich truffle genome; however, some typical features are different in the P. confluens lineage, e.g. the genomic environment of the mating type genes that is conserved in higher filamentous ascomycetes, but only partly conserved in P. confluens. On the other hand, P. confluens has a full complement of fungal photoreceptors, and expression studies indicate that light perception might be similar to distantly related ascomycetes and, thus, represent a basic feature of filamentous ascomycetes. Analysis of spliced RNA-seq sequence reads allowed the detection of natural antisense transcripts for 281 genes. The P. confluens genome contains an unusually high number of predicted orphan genes, many of which are upregulated during sexual development, consistent with the idea of rapid evolution of sex-associated genes. Comparative transcriptomics identified the transcription factor gene pro44 that is upregulated during development in P. confluens and the Sordariomycete Sordaria macrospora. The P. confluens pro44 gene (PCON_06721) was used to complement the S. macrospora pro44 deletion

  10. The genome and development-dependent transcriptomes of Pyronema confluens: a window into fungal evolution.

    Directory of Open Access Journals (Sweden)

    Stefanie Traeger

    Full Text Available Fungi are a large group of eukaryotes found in nearly all ecosystems. More than 250 fungal genomes have already been sequenced, greatly improving our understanding of fungal evolution, physiology, and development. However, for the Pezizomycetes, an early-diverging lineage of filamentous ascomycetes, there is so far only one genome available, namely that of the black truffle, Tuber melanosporum, a mycorrhizal species with unusual subterranean fruiting bodies. To help close the sequence gap among basal filamentous ascomycetes, and to allow conclusions about the evolution of fungal development, we sequenced the genome and assayed transcriptomes during development of Pyronema confluens, a saprobic Pezizomycete with a typical apothecium as fruiting body. With a size of 50 Mb and ~13,400 protein-coding genes, the genome is more characteristic of higher filamentous ascomycetes than the large, repeat-rich truffle genome; however, some typical features are different in the P. confluens lineage, e.g. the genomic environment of the mating type genes that is conserved in higher filamentous ascomycetes, but only partly conserved in P. confluens. On the other hand, P. confluens has a full complement of fungal photoreceptors, and expression studies indicate that light perception might be similar to distantly related ascomycetes and, thus, represent a basic feature of filamentous ascomycetes. Analysis of spliced RNA-seq sequence reads allowed the detection of natural antisense transcripts for 281 genes. The P. confluens genome contains an unusually high number of predicted orphan genes, many of which are upregulated during sexual development, consistent with the idea of rapid evolution of sex-associated genes. Comparative transcriptomics identified the transcription factor gene pro44 that is upregulated during development in P. confluens and the Sordariomycete Sordaria macrospora. The P. confluens pro44 gene (PCON_06721) was used to complement the S. macrospora

  11. Comparative Genomics and Transcriptomic Analysis of Mycobacterium Kansasii

    KAUST Repository

    Alzahid, Yara

    2014-04-01

    The group of Mycobacteria is one of the most intensively studied bacterial taxa, as they cause two historical and worldwide known diseases: leprosy and tuberculosis. Mycobacteria not belonging to the tuberculosis or leprosy complexes have been referred to as ‘environmental mycobacteria’ or ‘nontuberculous mycobacteria’ (NTM). Mycobacterium kansasii (M. kansasii) is one of the most frequent NTM pathogens, as it causes pulmonary disease in immuno-competent patients and pulmonary and disseminated disease in patients with various immuno-deficiencies. Five subtypes of this bacterium have been documented by different molecular typing methods, showing that type I causes tuberculosis-like disease in healthy individuals, and type II in immune-compromised individuals. The remaining types are said to be environmental and thus not to cause any disease. The aim of this project was to conduct a comparative genomic study of M. kansasii types I-V and to investigate the gene expression levels of those types. The various comparative genomics analyses provided genomic evidence on why M. kansasii type I is considered pathogenic, by focusing on three key elements that are involved in virulence of Mycobacteria: the ESX secretion system, phospholipase C (plcB) and the mammalian cell entry (Mce) operons. The results showed the lack of the espA operon in types II-V, which renders the ESX-1 operon dysfunctional, as espA is one of the key factors that control this secretion system. However, gene expression analysis showed this operon to be deleted in types II, III and IV. Furthermore, plcB was found to be truncated in types III and IV. Analysis of the Mce operons (1-4) showed that the mce-1 operon is duplicated, mce-2 is absent, and mce-3 and mce-4 are each present in one copy in M. kansasii types I-V. Gene expression profiles of types I-IV showed that the secreted proteins of ESX-1 were slightly upregulated in types II-IV when compared to type I and the secreted forms of ESX-5 were highly down

  12. Novel mouse model recapitulates genome and transcriptome alterations in human colorectal carcinomas.

    Science.gov (United States)

    McNeil, Nicole E; Padilla-Nash, Hesed M; Buishand, Floryne O; Hue, Yue; Ried, Thomas

    2017-03-01

    Human colorectal carcinomas are defined by a nonrandom distribution of genomic imbalances that are characteristic for this disease. Often, these imbalances affect entire chromosomes. Understanding the role of these aneuploidies for carcinogenesis is of utmost importance. Currently, established transgenic mice do not recapitulate the pathognomonic genome aberration profile of human colorectal carcinomas. We have developed a novel model based on the spontaneous transformation of murine colon epithelial cells. During this process, cells progress through stages of pre-immortalization, immortalization and, finally, transformation, and result in tumors when injected into immunocompromised mice. We analyzed our model for genome and transcriptome alterations using ArrayCGH, spectral karyotyping (SKY), and array based gene expression profiling. ArrayCGH revealed a recurrent pattern of genomic imbalances. These results were confirmed by SKY. Comparing these imbalances with orthologous maps of human chromosomes revealed a remarkable overlap. We observed focal deletions of the tumor suppressor genes Trp53 and Cdkn2a/p16. High-level focal genomic amplification included the locus harboring the oncogene Mdm2, which was confirmed by FISH in the form of double minute chromosomes. Array-based global gene expression revealed distinct differences between the sequential steps of spontaneous transformation. Gene expression changes showed significant similarities with human colorectal carcinomas. Pathways most prominently affected included genes involved in chromosomal instability and in epithelial to mesenchymal transition. Our novel mouse model therefore recapitulates the most prominent genome and transcriptome alterations in human colorectal cancer, and might serve as a valuable tool for understanding the dynamic process of tumorigenesis, and for preclinical drug testing. © 2016 Wiley Periodicals, Inc.

  13. Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution.

    Science.gov (United States)

    Clarke, Thomas H; Garb, Jessica E; Hayashi, Cheryl Y; Arensburger, Peter; Ayoub, Nadia A

    2015-06-08

    The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  14. Transcriptome and metabolome of synthetic Solanum autotetraploids reveal key genomic stress events following polyploidization.

    Science.gov (United States)

    Fasano, Carlo; Diretto, Gianfranco; Aversano, Riccardo; D'Agostino, Nunzio; Di Matteo, Antonio; Frusciante, Luigi; Giuliano, Giovanni; Carputo, Domenico

    2016-06-01

    Polyploids are generally classified as autopolyploids, derived from a single species, and allopolyploids, arising from interspecific hybridization. The former represent ideal materials with which to study the consequences of genome doubling and ascertain whether there are molecular and functional rules operating following polyploidization events. To investigate whether the effects of autopolyploidization are common to different species, or if species-specific or stochastic events are prevalent, we performed a comprehensive transcriptomic and metabolomic characterization of diploids and autotetraploids of Solanum commersonii and Solanum bulbocastanum. Autopolyploidization remodelled the transcriptome and the metabolome of both species. In S. commersonii, differentially expressed genes (DEGs) were highly enriched in pericentromeric regions. Most changes were stochastic, suggesting a strong genotypic response. However, a set of robustly regulated transcripts and metabolites was also detected, including purine bases and nucleosides, which are likely to underlie a common response to polyploidization. We hypothesize that autopolyploidization results in nucleotide pool imbalance, which in turn triggers a genomic shock responsible for the stochastic events observed. The more extensive genomic stress and the higher number of stochastic events observed in S. commersonii with respect to S. bulbocastanum could be the result of the higher nucleoside depletion observed in this species. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  15. Integration of transcriptome and whole genomic resequencing data to identify key genes affecting swine fat deposition.

    Directory of Open Access Journals (Sweden)

    Kai Xing

    Full Text Available Fat deposition is highly correlated with the growth, meat quality, reproductive performance and immunity of pigs. Fatty acid synthesis takes place mainly in the adipose tissue of pigs; therefore, in this study, a high-throughput massively parallel sequencing approach was used to generate adipose tissue transcriptomes from two groups of Songliao black pigs that had opposite backfat thickness phenotypes. The total number of paired-end reads produced for each sample was in the range of 39.29-49.36 million. Approximately 188 genes were differentially expressed in adipose tissue and were enriched for metabolic processes, such as fatty acid biosynthesis, lipid synthesis, metabolism of fatty acids, retinol, caffeine and arachidonic acid, and immunity. Additionally, many genetic variations were detected between the two groups through pooled whole-genome resequencing. Integration of transcriptome and whole-genome resequencing data revealed important genomic variations among the differentially expressed genes for fat deposition, for example, the lipogenic genes. Further studies are required to investigate the roles of candidate genes in fat deposition to improve pig breeding programs.

  16. Genome-wide investigation and transcriptome analysis of the WRKY gene family in Gossypium.

    Science.gov (United States)

    Ding, Mingquan; Chen, Jiadong; Jiang, Yurong; Lin, Lifeng; Cao, YueFen; Wang, Minhua; Zhang, Yuting; Rong, Junkang; Ye, Wuwei

    2015-02-01

    WRKY transcription factors play important roles in various stress responses in diverse plant species. In cotton, this family has not been well studied, especially in relation to fiber development. Here, the genomes and transcriptomes of Gossypium raimondii and Gossypium arboreum were investigated to identify fiber development related WRKY genes. This represents the first comprehensive comparative study of WRKY transcription factors in both diploid A and D cotton species. In total, 112 G. raimondii and 109 G. arboreum WRKY genes were identified. No significant gene structure or domain alterations were detected between the two species, but many SNPs distributed unequally in exon and intron regions. Physical mapping revealed that the WRKY genes in G. arboreum were not located in the corresponding chromosomes of G. raimondii, suggesting great chromosome rearrangement in the diploid cotton genomes. The cotton WRKY genes, especially subgroups I and II, have expanded through multiple whole genome duplications and tandem duplications compared with other plant species. Sequence comparison showed many functionally divergent sites between WRKY subgroups, while the genes within each group are under strong purifying selection. Transcriptome analysis suggested that many WRKY genes participate in specific fiber development processes such as fiber initiation, elongation and maturation with different expression patterns between species. Complex WRKY gene expression such as differential Dt and At allelic gene expression in G. hirsutum and alternative splicing events were also observed in both diploid and tetraploid cottons during fiber development process. In conclusion, this study provides important information on the evolution and function of WRKY gene family in cotton species.

  17. Genome-Wide Transcriptome Analysis of Cadmium Stress in Rice

    Directory of Open Access Journals (Sweden)

    Youko Oono

    2016-01-01

    Full Text Available Rice growth is severely affected by toxic concentrations of the nonessential heavy metal cadmium (Cd). To elucidate the molecular basis of the response to Cd stress, we performed mRNA sequencing of rice following our previous study on exposure to high concentrations of Cd (Oono et al., 2014). In this study, rice plants were hydroponically treated with low concentrations of Cd and approximately 211 million sequence reads were mapped onto the IRGSP-1.0 reference rice genome sequence. Many genes, including some identified under high Cd concentration exposure in our previous study, were found to be responsive to low Cd exposure, with an average of about 11,000 transcripts from each condition. However, genes expressed constitutively across the developmental course responded only slightly to low Cd concentrations, in contrast to their clear response to high Cd concentration, which causes fatal damage to rice seedlings according to phenotypic changes. The expression of metal ion transporter genes tended to correlate with Cd concentration, suggesting the potential of the RNA-Seq strategy to reveal novel Cd-responsive transporters by analyzing gene expression under different Cd concentrations. This study could help to develop novel strategies for improving tolerance to Cd exposure in rice and other cereal crops.

  18. Genome scale engineering techniques for metabolic engineering.

    Science.gov (United States)

    Liu, Rongming; Bassalo, Marcelo C; Zeitoun, Ramsey I; Gill, Ryan T

    2015-11-01

    Metabolic engineering has expanded from a focus on designs requiring a small number of genetic modifications to increasingly complex designs driven by advances in genome-scale engineering technologies. Metabolic engineering has been generally defined by the use of iterative cycles of rational genome modifications, strain analysis and characterization, and a synthesis step that fuels additional hypothesis generation. This cycle mirrors the Design-Build-Test-Learn cycle followed throughout various engineering fields that has recently become a defining aspect of synthetic biology. This review will attempt to summarize recent genome-scale design, build, test, and learn technologies and relate their use to a range of metabolic engineering applications. Copyright © 2015 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.

  19. De Novo Genome and Transcriptome Assembly of the Canadian Beaver (Castor canadensis)

    Directory of Open Access Journals (Sweden)

    Si Lok

    2017-02-01

    Full Text Available The Canadian beaver (Castor canadensis) is the largest indigenous rodent in North America. We report a draft annotated assembly of the beaver genome, the first for a large rodent and the first mammalian genome assembled directly from uncorrected and moderate coverage (< 30 ×) long reads generated by single-molecule sequencing. The genome size is 2.7 Gb estimated by k-mer analysis. We assembled the beaver genome using the new Canu assembler optimized for noisy reads. The resulting assembly was refined using Pilon supported by short reads (80 ×) and checked for accuracy by congruency against an independent short read assembly. We scaffolded the assembly using the exon–gene models derived from 9805 full-length open reading frames (FL-ORFs) constructed from the beaver leukocyte and muscle transcriptomes. The final assembly comprised 22,515 contigs with an N50 of 278,680 bp and an N50-scaffold of 317,558 bp. Maximum contig and scaffold lengths were 3.3 and 4.2 Mb, respectively, with a combined scaffold length representing 92% of the estimated genome size. The completeness and accuracy of the scaffold assembly was demonstrated by the precise exon placement for 91.1% of the 9805 assembled FL-ORFs and 83.1% of the BUSCO (Benchmarking Universal Single-Copy Orthologs) gene set used to assess the quality of genome assemblies. Well-represented were genes involved in dentition and enamel deposition, defining characteristics of rodents with which the beaver is well-endowed. The study provides insights for genome assembly and an important genomics resource for Castoridae and rodent evolutionary biology.
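    The N50 values quoted in this record summarise assembly contiguity: N50 is the length of the shortest contig (or scaffold) in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal sketch of that standard definition; the function name n50 and the toy lengths are illustrative assumptions and are not data from the beaver assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are taken from longest to shortest until their cumulative
    length reaches half of the total; the length of the last sequence
    added is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must contain at least one sequence")


# Toy contig lengths (hypothetical): total 290, half 145.
# 80 + 70 = 150 reaches the halfway point, so N50 is 70.
print(n50([80, 70, 50, 40, 30, 20]))  # prints: 70
```

    Applied to the contig lengths of an assembly, the same function would give the contig N50; applied to scaffold lengths, the scaffold N50.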

  20. LEMONS - A Tool for the Identification of Splice Junctions in Transcriptomes of Organisms Lacking Reference Genomes.

    Directory of Open Access Journals (Sweden)

    Liron Levin

    Full Text Available RNA-seq is becoming a preferred tool for genomics studies of model and non-model organisms. However, DNA-based analysis of organisms lacking sequenced genomes cannot rely on RNA-seq data alone to isolate most genes of interest, as DNA codes both exons and introns. With this in mind, we designed a novel tool, LEMONS, that exploits the evolutionary conservation of both exon/intron boundary positions and splice junction recognition signals to produce high throughput splice-junction predictions in the absence of a reference genome. When tested on multiple annotated vertebrate mRNA data, LEMONS accurately identified 87% (average) of the splice-junctions. LEMONS was then applied to our updated Mediterranean chameleon transcriptome, which lacks a reference genome, and predicted a total of 90,820 exon-exon junctions. We experimentally verified these splice-junction predictions by amplifying and sequencing twenty randomly selected genes from chameleon DNA templates. Exons and introns were detected in 19 of 20 of the positions predicted by LEMONS. To the best of our knowledge, LEMONS is currently the only experimentally verified tool that can accurately predict splice-junctions in organisms that lack a reference genome.

  1. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni.

    Directory of Open Access Journals (Sweden)

    Anna V Protasio

    2012-01-01

    Full Text Available Schistosomiasis is one of the most prevalent parasitic diseases, affecting millions of people in developing countries. Amongst the human-infective species, Schistosoma mansoni is also the most commonly used in the laboratory and here we present the systematic improvement of its draft genome. We used Sanger capillary and deep-coverage Illumina sequencing from clonal worms to upgrade the highly fragmented draft 380 Mb genome to one with only 885 scaffolds and more than 81% of the bases organised into chromosomes. We have also used transcriptome sequencing (RNA-seq) from four time points in the parasite's life cycle to refine gene predictions and profile their expression. More than 45% of predicted genes have been extensively modified and the total number has been reduced from 11,807 to 10,852. Using the new version of the genome, we identified trans-splicing events occurring in at least 11% of genes and identified clear cases where it is used to resolve polycistronic transcripts. We have produced a high-resolution map of temporal changes in expression for 9,535 genes, covering an unprecedented dynamic range for this organism. All of these data have been consolidated into a searchable format within the GeneDB (www.genedb.org) and SchistoDB (www.schistodb.net) databases. With further transcriptional profiling and genome sequencing increasingly accessible, the upgraded genome will form a fundamental dataset to underpin further advances in schistosome research.

  2. Genome and transcriptome analysis of the food-yeast Candida utilis.

    Directory of Open Access Journals (Sweden)

    Yasuyuki Tomita

    Full Text Available The industrially important food-yeast Candida utilis is a Crabtree effect-negative yeast used to produce valuable chemicals and recombinant proteins. In the present study, we conducted whole genome sequencing and phylogenetic analysis of C. utilis, which showed that this yeast diverged long before the formation of the CUG and Saccharomyces/Kluyveromyces clades. In addition, we performed comparative genome and transcriptome analyses using next-generation sequencing, which resulted in the identification of genes important for characteristic phenotypes of C. utilis such as those involved in nitrate assimilation, in addition to the gene encoding the functional hexose transporter. We also found that an antisense transcript of the alcohol dehydrogenase gene, which in silico analysis did not predict to be a functional gene, was transcribed in the stationary-phase, suggesting a novel system of repression of ethanol production. These findings should facilitate the development of more sophisticated systems for the production of useful reagents using C. utilis.

  3. Genome scale metabolic modeling of cancer

    DEFF Research Database (Denmark)

    Nilsson, Avlant; Nielsen, Jens

    2017-01-01

    Cancer cells reprogram metabolism to support rapid proliferation and survival. Energy metabolism is particularly important for growth and genes encoding enzymes involved in energy metabolism are frequently altered in cancer cells. A genome scale metabolic model (GEM) is a mathematical formalization of metabolism which allows simulation and hypotheses testing of metabolic strategies. It has successfully been applied to many microorganisms and is now used to study cancer metabolism. Generic models of human metabolism have been reconstructed based on the existence of metabolic genes in the human genome...

  4. Genomic and transcriptomic approaches to study immunology in cyprinids: What is next?

    Science.gov (United States)

    Petit, Jules; David, Lior; Dirks, Ron; Wiegertjes, Geert F

    2017-10-01

    Accelerated by the introduction of Next-Generation Sequencing (NGS), a number of genomes of cyprinid fish species have been drafted, leading to a highly valuable collective resource of comparative genome information on cyprinids (Cyprinidae). In addition, NGS-based transcriptome analyses of different developmental stages, organs, or cell types, increasingly contribute to the understanding of complex physiological processes, including immune responses. Cyprinids are a highly interesting family because they comprise one of the most-diversified families of teleosts and because of their variation in ploidy level, with diploid, triploid, tetraploid, hexaploid and sometimes even octoploid species. The wealth of data obtained from NGS technologies provides both challenges and opportunities for immunological research, which will be discussed here. Correct interpretation of ploidy effects on immune responses requires knowledge of the degree of functional divergence between duplicated genes, which can differ even between closely-related cyprinid fish species. We summarize NGS-based progress in analysing immune responses and discuss the importance of respecting the presence of (multiple) duplicated gene sequences when performing transcriptome analyses for detailed understanding of complex physiological processes. Progressively, advances in NGS technology are providing workable methods to further elucidate the implications of gene duplication events and functional divergence of duplicated genes and proteins involved in immune responses in cyprinids. We conclude with discussing how future applications of NGS technologies and analysis methods could enhance immunological research and understanding. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.

  5. Transcriptome analysis in Concholepas concholepas (Gastropoda, Muricidae): mining and characterization of new genomic and molecular markers.

    Science.gov (United States)

    Cárdenas, Leyla; Sánchez, Roland; Gomez, Daniela; Fuenzalida, Gonzalo; Gallardo-Escárate, Cristián; Tanguy, Arnaud

    2011-09-01

    The marine gastropod Concholepas concholepas, locally known as the "loco", is the main target species of the benthonic Chilean fisheries. Genetic and genomic tools are necessary to study the genome of this species in order to understand the molecular basis of its development, growth, and other key traits to improve the management strategies and to identify local adaptation to prevent loss of biodiversity. Here, we use pyrosequencing technologies to generate the first transcriptomic database from adult specimens of the loco. After trimming, a total of 140,756 Expressed Sequence Tag sequences were achieved. Clustering and assembly analysis identified 19,219 contigs and 105,435 singleton sequences. BlastN analysis showed a significant identity with Expressed Sequence Tags of different gastropod species available in public databases. Similarly, BlastX results showed that only 895 out of the total 124,654 had significant hits and may represent novel genes for marine gastropods. From this database, simple sequence repeat motifs were also identified and a total of 38 primer pairs were designed and tested to assess their potential as informative markers and to investigate their cross-species amplification in different related gastropod species. This dataset represents the first publicly available 454 data for a marine gastropod endemic to the southeastern Pacific coast, providing a valuable transcriptomic resource for future efforts of gene discovery and development of functional markers in other marine gastropods. Copyright © 2011 Elsevier B.V. All rights reserved.

  6. Analysing human genomes at different scales

    DEFF Research Database (Denmark)

    Liu, Siyang

    The thriving of the Next-Generation sequencing (NGS) technologies in the past decade has dramatically revolutionized the field of human genetics. We are experiencing a wave of several large-scale whole genome sequencing studies of humans in the world. Those studies vary greatly regarding cohort...... will be reflected by the analysis of real data. This thesis covers studies in two human genome sequencing projects that distinctly differ in terms of studied population, sample size and sequencing depth. In the first project, we sequenced 150 Danish individuals from 50 trio families to 78x coverage....... The sophisticated experimental design enables high-quality de novo assembly of the genomes and provides a good opportunity for mapping the structural variations in the human population. We developed the AsmVar approach to discover, genotype and characterize the structural variations from the assemblies. Our...

  7. Trade-off between Transcriptome Plasticity and Genome Evolution in Cephalopods.

    Science.gov (United States)

    Liscovitch-Brauer, Noa; Alon, Shahar; Porath, Hagit T; Elstein, Boaz; Unger, Ron; Ziv, Tamar; Admon, Arie; Levanon, Erez Y; Rosenthal, Joshua J C; Eisenberg, Eli

    2017-04-06

    RNA editing, a post-transcriptional process, allows the diversification of proteomes beyond the genomic blueprint; however it is infrequently used among animals for this purpose. Recent reports suggesting increased levels of RNA editing in squids thus raise the question of the nature and effects of these events. We here show that RNA editing is particularly common in behaviorally sophisticated coleoid cephalopods, with tens of thousands of evolutionarily conserved sites. Editing is enriched in the nervous system, affecting molecules pertinent for excitability and neuronal morphology. The genomic sequence flanking editing sites is highly conserved, suggesting that the process confers a selective advantage. Due to the large number of sites, the surrounding conservation greatly reduces the number of mutations and genomic polymorphisms in protein-coding regions. This trade-off between genome evolution and transcriptome plasticity highlights the importance of RNA recoding as a strategy for diversifying proteins, particularly those associated with neural function. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. Genome-wide binding and transcriptome analysis of human farnesoid X receptor in primary human hepatocytes.

    Directory of Open Access Journals (Sweden)

    Le Zhan

    Full Text Available Farnesoid X receptor (FXR, NR1H4) is a ligand-activated transcription factor, belonging to the nuclear receptor superfamily. FXR is highly expressed in the liver and is essential in regulating bile acid homeostasis. FXR deficiency is implicated in numerous liver diseases and mice with modulation of FXR have been used as animal models to study liver physiology and pathology. We have reported genome-wide binding of FXR in mice by chromatin immunoprecipitation - deep sequencing (ChIP-seq), with results indicating that FXR may be involved in regulating diverse pathways in liver. However, limited information exists for the functions of human FXR and the suitability of using murine models to study human FXR functions. In the current study, we performed ChIP-seq in primary human hepatocytes (PHHs) treated with a synthetic FXR agonist, GW4064 or DMSO control. In parallel, RNA deep sequencing (RNA-seq) and RNA microarray were performed for GW4064 or control treated PHHs and wild type mouse livers, respectively. ChIP-seq showed similar profiles of genome-wide FXR binding in humans and mice in terms of motif analysis and pathway prediction. However, RNA-seq and microarray showed more different transcriptome profiles between PHHs and mouse livers upon GW4064 treatment. In summary, we have established genome-wide human FXR binding and transcriptome profiles. These results will aid in determining the human FXR functions, as well as judging to what level the mouse models could be used to study human FXR functions.

  9. Genomes

    National Research Council Canada - National Science Library

    Brown, T. A. (Terence A.)

    2002-01-01

    ... of genome expression and replication processes, and transcriptomics and proteomics. This text is richly illustrated with clear, easy-to-follow, full color diagrams, which are downloadable from the book's website...

  10. Differential genomic arrangements in Caryophyllales through deep transcriptome sequencing of A. hypochondriacus.

    Directory of Open Access Journals (Sweden)

    Meeta Sunil

    Full Text Available Genome duplication event in edible dicots under the orders Rosid and Asterid, common during the oligocene period, is missing for species under the order Caryophyllales. Despite this, grain amaranths not only survived this period but display many desirable traits missing in species under rosids and asterids. For example, grain amaranths display traits like C4 photosynthesis, high-lysine seeds, high-yield, drought resistance, tolerance to infection and resilience to stress. It is, therefore, of interest to look for minor genome rearrangements with potential functional implications that are unique to grain amaranths. Here, by deep sequencing and assembly of 16 transcriptomes (86.8 billion bases), we have interrogated differential genome rearrangement unique to Amaranthus hypochondriacus with potential links to these phenotypes. We have predicted 125,581 non-redundant transcripts including 44,529 protein coding transcripts identified based on homology to known proteins and 13,529 predicted as novel/amaranth specific coding transcripts. Of the protein coding de novo assembled transcripts, we have identified 1810 chimeric transcripts. More than 30% and 19% of the gene pairs within the chimeric transcripts are found within the same loci in the genomes of A. hypochondriacus and Beta vulgaris respectively and are considered real positives. Interestingly, one of the chimeric transcripts comprises two important genes, namely DHDPS1, a key enzyme implicated in the biosynthesis of lysine, and alpha-glucosidase, an enzyme involved in sucrose catabolism, in close proximity to each other separated by a distance of 612 bases in the genome of A. hypochondriacus in a convergent configuration. We have experimentally validated that transcripts of these two genes are also overlapping in the 3' UTR with their expression negatively correlated from bud to mature seed, suggesting a potential link between the high seed lysine trait and unique genome organization.

  11. Analysis of the Legionella longbeachae genome and transcriptome uncovers unique strategies to cause Legionnaires' disease.

    Directory of Open Access Journals (Sweden)

    Christel Cazalet

    2010-02-01

    Full Text Available Legionella pneumophila and L. longbeachae are two species of a large genus of bacteria that are ubiquitous in nature. L. pneumophila is mainly found in natural and artificial water circuits while L. longbeachae is mainly present in soil. Under the appropriate conditions both species are human pathogens, capable of causing a severe form of pneumonia termed Legionnaires' disease. Here we report the sequencing and analysis of four L. longbeachae genomes: one complete genome sequence of L. longbeachae strain NSW150 serogroup (Sg) 1, and three draft genome sequences, one belonging to Sg1 and two to Sg2. The genome organization and gene content of the four L. longbeachae genomes are highly conserved, indicating strong pressure for niche adaptation. Analysis and comparison of L. longbeachae strain NSW150 with L. pneumophila revealed common but also unexpected features specific to this pathogen. The interaction with host cells shows distinct features from L. pneumophila, as L. longbeachae possesses a unique repertoire of putative Dot/Icm type IV secretion system substrates, eukaryotic-like and eukaryotic domain proteins, and encodes additional secretion systems. However, analysis of the ability of a dotA mutant of L. longbeachae NSW150 to replicate in Acanthamoeba castellanii and in a mouse lung infection model showed that the Dot/Icm type IV secretion system is also essential for the virulence of L. longbeachae. In contrast to L. pneumophila, L. longbeachae does not encode flagella, thereby providing a possible explanation for differences in mouse susceptibility to infection between the two pathogens. Furthermore, transcriptome analysis revealed that L. longbeachae has a less pronounced biphasic life cycle as compared to L. pneumophila, and genome analysis and electron microscopy suggested that L. longbeachae is encapsulated. These species-specific differences may account for the different environmental niches and disease epidemiology of these

  12. Genome-scale cold stress response regulatory networks in ten Arabidopsis thaliana ecotypes

    DEFF Research Database (Denmark)

    Barah, Pankaj; Jayavelu, Naresh Doni; Rasmussen, Simon

    2013-01-01

    BACKGROUND: Low temperature leads to major crop losses every year. Although several studies have been conducted focusing on diversity of cold tolerance level in multiple phenotypically divergent Arabidopsis thaliana (A. thaliana) ecotypes, genome-scale molecular understanding is still lacking. RESULTS: In this study, we report genome-scale transcript response diversity of 10 A. thaliana ecotypes originating from different geographical locations to non-freezing cold stress (10°C). To analyze the transcriptional response diversity, we initially compared transcriptome changes in all 10 ecotypes... available from Arabidopsis thaliana 1001 genome project, we further investigated sequence polymorphisms in the core cold stress regulon genes. Significant numbers of non-synonymous amino acid changes were observed in the coding region of the CBF regulon genes. Considering the limited knowledge about...

  13. Genome-wide functional genomic and transcriptomic analyses for genes regulating sensitivity to vorinostat.

    Science.gov (United States)

    Falkenberg, Katrina J; Gould, Cathryn M; Johnstone, Ricky W; Simpson, Kaylene J

    2014-01-01

    Identification of mechanisms of resistance to histone deacetylase inhibitors, such as vorinostat, is important in order to utilise these anticancer compounds more efficiently in the clinic. Here, we present a dataset containing multiple tiers of stringent siRNA screening for genes that when knocked down conferred sensitivity to vorinostat-induced cell death. We also present data from a miRNA overexpression screen for miRNAs contributing to vorinostat sensitivity. Furthermore, we provide transcriptomic analysis using massively parallel sequencing upon knockdown of 14 validated vorinostat-resistance genes. These datasets are suitable for analysis of genes and miRNAs involved in cell death in the presence and absence of vorinostat as well as computational biology approaches to identify gene regulatory networks.

  14. Genome-wide RNA-seq analysis of human and mouse platelet transcriptomes

    Science.gov (United States)

    Rowley, Jesse W.; Oler, Andrew J.; Tolley, Neal D.; Hunter, Benjamin N.; Low, Elizabeth N.; Nix, David A.; Yost, Christian C.; Zimmerman, Guy A.

    2011-01-01

    Inbred mice are a useful tool for studying the in vivo functions of platelets. Nonetheless, the mRNA signature of mouse platelets is not known. Here, we use paired-end next-generation RNA sequencing (RNA-seq) to characterize the polyadenylated transcriptomes of human and mouse platelets. We report that RNA-seq provides unprecedented resolution of mRNAs that are expressed across the entire human and mouse genomes. Transcript expression and abundance are often conserved between the 2 species. Several mRNAs, however, are differentially expressed in human and mouse platelets. Moreover, previously described functional disparities between mouse and human platelets are reflected in differences at the transcript level, including protease activated receptor-1, protease activated receptor-3, platelet activating factor receptor, and factor V. This suggests that RNA-seq is a useful tool for predicting differences in platelet function between mice and humans. Our next-generation sequencing analysis provides new insights into the human and murine platelet transcriptomes. The sequencing dataset will be useful in the design of mouse models of hemostasis and a catalyst for discovery of new functions of platelets. Access to the dataset is found in the “Introduction.” PMID:21596849

  15. Talaromyces marneffei Genomic, Transcriptomic, Proteomic and Metabolomic Studies Reveal Mechanisms for Environmental Adaptations and Virulence

    Directory of Open Access Journals (Sweden)

    Susanna K. P. Lau

    2017-06-01

    Full Text Available Talaromyces marneffei is a thermally dimorphic fungus causing systemic infections in patients positive for HIV or other immunocompromised statuses. Analysis of its ~28.9 Mb draft genome and additional transcriptomic, proteomic and metabolomic studies revealed mechanisms for environmental adaptations and virulence. Meiotic genes and genes for pheromone receptors, enzymes which process pheromones, and proteins involved in pheromone response pathway are present, indicating its possibility as a heterothallic fungus. Among the 14 Mp1p homologs, only Mp1p is a virulence factor binding a variety of host proteins, fatty acids and lipids. There are 23 polyketide synthase genes, one for melanin and two for mitorubrinic acid/mitorubrinol biosynthesis, which are virulence factors. Another polyketide synthase is for biogenesis of the diffusible red pigment, which consists of amino acid conjugates of monascorubin and rubropunctatin. Novel microRNA-like RNAs (milRNAs) and processing proteins are present. The dicer protein, dcl-2, is required for biogenesis of two milRNAs, PM-milR-M1 and PM-milR-M2, which are more highly expressed in hyphal cells. Comparative transcriptomics showed that tandem repeat-containing genes were overexpressed in yeast phase, generating protein polymorphism among cells, evading host’s immunity. Comparative proteomics between yeast and hyphal cells revealed that glyceraldehyde-3-phosphate dehydrogenase, up-regulated in hyphal cells, is an adhesion factor for conidial attachment.

  16. Sugar Metabolism of the First Thermophilic Planctomycete Thermogutta terrifontis: Comparative Genomic and Transcriptomic Approaches

    Science.gov (United States)

    Elcheninov, Alexander G.; Menzel, Peter; Gudbergsdottir, Soley R.; Slesarev, Alexei I.; Kadnikov, Vitaly V.; Krogh, Anders; Bonch-Osmolovskaya, Elizaveta A.; Peng, Xu; Kublanov, Ilya V.

    2017-01-01

    Xanthan gum, a complex polysaccharide comprising glucose, mannose and glucuronic acid residues, is involved in numerous biotechnological applications in cosmetics, agriculture, pharmaceuticals, food and petroleum industries. Additionally, its oligosaccharides were shown to possess antimicrobial, antioxidant, and few other properties. Yet, despite its extensive usage, little is known about xanthan gum degradation pathways and mechanisms. Thermogutta terrifontis, isolated from a sample of microbial mat developed in a terrestrial hot spring of Kunashir island (Far-East of Russia), was described as the first thermophilic representative of the Planctomycetes phylum. It grows well on xanthan gum either at aerobic or anaerobic conditions. Genomic analysis unraveled the pathways of oligo- and polysaccharides utilization, as well as the mechanisms of aerobic and anaerobic respiration. The combination of genomic and transcriptomic approaches suggested a novel xanthan gum degradation pathway which involves novel glycosidase(s) of DUF1080 family, hydrolyzing xanthan gum backbone beta-glucosidic linkages and beta-mannosidases instead of xanthan lyases, catalyzing cleavage of terminal beta-mannosidic linkages. Surprisingly, the genes coding DUF1080 proteins were abundant in T. terrifontis and in many other Planctomycetes genomes, which, together with our observation that xanthan gum being a selective substrate for many planctomycetes, suggest crucial role of DUF1080 in xanthan gum degradation. Our findings shed light on the metabolism of the first thermophilic planctomycete, capable to degrade a number of polysaccharides, either aerobically or anaerobically, including the biotechnologically important bacterial polysaccharide xanthan gum. PMID:29163426

  17. Genome and Transcriptome of Clostridium phytofermentans, Catalyst for the Direct Conversion of Plant Feedstocks to Fuels.

    Directory of Open Access Journals (Sweden)

    Elsa Petit

    Full Text Available Clostridium phytofermentans was isolated from forest soil and is distinguished by its capacity to directly ferment plant cell wall polysaccharides into ethanol as the primary product, suggesting that it possesses unusual catabolic pathways. The objective of the present study was to understand the molecular mechanisms of biomass conversion to ethanol in a single organism, Clostridium phytofermentans, by analyzing its complete genome and transcriptome during growth on plant carbohydrates. The saccharolytic versatility of C. phytofermentans is reflected in a diversity of genes encoding ATP-binding cassette sugar transporters and glycoside hydrolases, many of which may have been acquired through horizontal gene transfer. These genes are frequently organized as operons that may be controlled individually by the many transcriptional regulators identified in the genome. Preferential ethanol production may be due to high levels of expression of multiple ethanol dehydrogenases and additional pathways maximizing ethanol yield. The genome also encodes three different proteinaceous bacterial microcompartments with the capacity to compartmentalize pathways that divert fermentation intermediates to various products. These characteristics make C. phytofermentans an attractive resource for improving the efficiency and speed of biomass conversion to biofuels.

  18. Breeding in peach, cherry and plum: from a tissue culture, genetic, transcriptomic and genomic perspective

    Directory of Open Access Journals (Sweden)

    Basilio Carrasco

    2013-01-01

    Full Text Available This review is an overview of traditional and modern breeding methodologies being used to develop new Prunus cultivars (stone fruits), with major emphasis on peach, sweet cherry and Japanese plum. To this end, common breeding tools used to produce seedlings, including in vitro culture tools, are discussed. Additionally, the mechanisms of inheritance of many important agronomical traits are described. Recent advances in stone fruit transcriptomics and genomic resources are providing an understanding of the molecular basis of phenotypic variability as well as the identification of allelic variants and molecular markers. These have potential applications for understanding the genetic diversity of the Prunus species, molecular marker-assisted selection and transgenesis. Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) molecular markers are described as useful tools to describe genetic diversity in peach, sweet cherry and Japanese plum. Additionally, the recently sequenced peach genome and the public release of the sweet cherry genome are discussed in terms of their applicability to breeding programs.

  19. Sugar Metabolism of the First Thermophilic Planctomycete Thermogutta terrifontis: Comparative Genomic and Transcriptomic Approaches

    Directory of Open Access Journals (Sweden)

    Alexander G. Elcheninov

    2017-11-01

    Full Text Available Xanthan gum, a complex polysaccharide comprising glucose, mannose and glucuronic acid residues, is involved in numerous biotechnological applications in cosmetics, agriculture, pharmaceuticals, food and petroleum industries. Additionally, its oligosaccharides were shown to possess antimicrobial, antioxidant, and few other properties. Yet, despite its extensive usage, little is known about xanthan gum degradation pathways and mechanisms. Thermogutta terrifontis, isolated from a sample of microbial mat developed in a terrestrial hot spring of Kunashir island (Far-East of Russia), was described as the first thermophilic representative of the Planctomycetes phylum. It grows well on xanthan gum either at aerobic or anaerobic conditions. Genomic analysis unraveled the pathways of oligo- and polysaccharides utilization, as well as the mechanisms of aerobic and anaerobic respiration. The combination of genomic and transcriptomic approaches suggested a novel xanthan gum degradation pathway which involves novel glycosidase(s) of DUF1080 family, hydrolyzing xanthan gum backbone beta-glucosidic linkages and beta-mannosidases instead of xanthan lyases, catalyzing cleavage of terminal beta-mannosidic linkages. Surprisingly, the genes coding DUF1080 proteins were abundant in T. terrifontis and in many other Planctomycetes genomes, which, together with our observation that xanthan gum being a selective substrate for many planctomycetes, suggest crucial role of DUF1080 in xanthan gum degradation. Our findings shed light on the metabolism of the first thermophilic planctomycete, capable to degrade a number of polysaccharides, either aerobically or anaerobically, including the biotechnologically important bacterial polysaccharide xanthan gum.

  20. Comprehensive transcriptome and improved genome annotation of Bacillus licheniformis WX-02.

    Science.gov (United States)

    Guo, Jing; Cheng, Gang; Gou, Xiang-Yong; Xing, Feng; Li, Sen; Han, Yi-Chao; Wang, Long; Song, Jia-Ming; Shu, Cheng-Cheng; Chen, Shou-Wen; Chen, Ling-Ling

    2015-08-19

    The updated genome of Bacillus licheniformis WX-02 comprises a circular chromosome of 4286821 base-pairs containing 4512 protein-coding genes. We applied strand-specific RNA-sequencing to explore the transcriptome profiles of B. licheniformis WX-02 under normal and high-salt conditions (NaCl 6%). We identified 2381 co-expressed gene pairs constituting 871 operon structures. In addition, 1169 antisense transcripts and 90 small RNAs were detected. Systematic comparison of differentially expressed genes under different conditions revealed that genes involved in multiple functions were significantly repressed in long-term high salt adaptation process. Genes related to promotion of glutamic acid synthesis were activated by 6% NaCl, potentially explaining the high yield of γ-PGA under salt condition. This study will be useful for the optimization of crucial metabolic activities in this bacterium. Copyright © 2015. Published by Elsevier B.V.

  1. An integrated genomic and transcriptomic survey of mucormycosis-causing fungi

    Science.gov (United States)

    Chibucos, Marcus C.; Soliman, Sameh; Gebremariam, Teclegiorgis; Lee, Hongkyu; Daugherty, Sean; Orvis, Joshua; Shetty, Amol C.; Crabtree, Jonathan; Hazen, Tracy H.; Etienne, Kizee A.; Kumari, Priti; O'Connor, Timothy D.; Rasko, David A.; Filler, Scott G.; Fraser, Claire M.; Lockhart, Shawn R.; Skory, Christopher D.; Ibrahim, Ashraf S.; Bruno, Vincent M.

    2016-01-01

    Mucormycosis is a life-threatening infection caused by Mucorales fungi. Here we sequence 30 fungal genomes, and perform transcriptomics with three representative Rhizopus and Mucor strains and with human airway epithelial cells during fungal invasion, to reveal key host and fungal determinants contributing to pathogenesis. Analysis of the host transcriptional response to Mucorales reveals platelet-derived growth factor receptor B (PDGFRB) signaling as part of a core response to divergent pathogenic fungi; inhibition of PDGFRB reduces Mucorales-induced damage to host cells. The unique presence of CotH invasins in all invasive Mucorales, and the correlation between CotH gene copy number and clinical prevalence, are consistent with an important role for these proteins in mucormycosis pathogenesis. Our work provides insight into the evolution of this medically and economically important group of fungi, and identifies several molecular pathways that might be exploited as potential therapeutic targets. PMID:27447865

  2. Comparative genomics and transcriptome analysis of Aspergillus niger and metabolic engineering for citrate production

    Science.gov (United States)

    Yin, Xian; Shin, Hyun-dong; Li, Jianghua; Du, Guocheng; Liu, Long; Chen, Jian

    2017-01-01

    Despite a long and successful history of citrate production in Aspergillus niger, the molecular mechanism of citrate accumulation is only partially understood. In this study, we used comparative genomics and transcriptome analysis of citrate-producing strains—namely, A. niger H915-1 (citrate titer: 157 g L−1), A1 (117 g L−1), and L2 (76 g L−1)—to gain a genome-wide view of the mechanism of citrate accumulation. Compared with A. niger A1 and L2, A. niger H915-1 contained 92 mutated genes, including a succinate-semialdehyde dehydrogenase in the γ-aminobutyric acid shunt pathway and an aconitase family protein involved in citrate synthesis. Furthermore, transcriptome analysis of A. niger H915-1 revealed that the transcription levels of 479 genes changed between the cell growth stage (6 h) and the citrate synthesis stage (12 h, 24 h, 36 h, and 48 h). In the glycolysis pathway, triosephosphate isomerase was up-regulated, whereas pyruvate kinase was down-regulated. Two cytosol ATP-citrate lyases, which take part in the cycle of citrate synthesis, were up-regulated, and may coordinate with the alternative oxidases in the alternative respiratory pathway for energy balance. Finally, deletion of the oxaloacetate acetylhydrolase gene in H915-1 eliminated oxalate formation but neither influence on pH decrease nor difference in citrate production were observed. PMID:28106122

  3. An integrative genomic and transcriptomic analysis reveals potential targets associated with cell proliferation in uterine leiomyomas.

    Directory of Open Access Journals (Sweden)

    Priscila Daniele Ramos Cirilo

    Full Text Available Uterine Leiomyomas (ULs) are the most common benign tumours affecting women of reproductive age. ULs represent a major problem in public health, as they are the main indication for hysterectomy. Approximately 40-50% of ULs have non-random cytogenetic abnormalities, and half of ULs may have copy number alterations (CNAs). Gene expression microarray studies have demonstrated that cell proliferation genes act in response to growth factors and steroids. However, only a few genes mapping to CNA regions were found to be associated with ULs. We applied an integrative analysis using genomic and transcriptomic data to identify the pathways and molecular markers associated with ULs. Fifty-one fresh frozen specimens were evaluated by array CGH (JISTIC) and gene expression microarrays (SAM). The CONEXIC algorithm was applied to integrate the data. The integrated analysis identified the top 30 significant genes (P<0.01), which comprised genes associated with cancer, whereas the protein-protein interaction analysis indicated a strong association between FANCA and BRCA1. Functional in silico analysis revealed target molecules for drugs involved in cell proliferation, including FGFR1 and IGFBP5. Transcriptional and protein analyses showed that FGFR1 (P = 0.006 and P<0.01, respectively) and IGFBP5 (P = 0.0002 and P = 0.006, respectively) were up-regulated in the tumours when compared with the adjacent normal myometrium. The integrative genomic and transcriptomic approach indicated that FGFR1 and IGFBP5 amplification, as well as the consequent up-regulation of the protein products, plays an important role in the aetiology of ULs and thus provides data for the development of potential drug therapies targeting genes associated with cellular proliferation in ULs.

  4. Integration of expression data in genome-scale metabolic network reconstructions

    Directory of Open Access Journals (Sweden)

    Anna S. Blazier

    2012-08-01

    Full Text Available With the advent of high-throughput technologies, the field of systems biology has amassed an abundance of omics data, quantifying thousands of cellular components across a variety of scales, ranging from mRNA transcript levels to metabolite quantities. Methods are needed not only to integrate these omics data but also to use them to heighten the predictive capabilities of computational models. Several recent studies have successfully demonstrated how flux balance analysis (FBA), a constraint-based modeling approach, can be used to integrate transcriptomic data into genome-scale metabolic network reconstructions to generate predictive computational models. In this review, we summarize such FBA-based methods for integrating expression data into genome-scale metabolic network reconstructions, highlighting their advantages as well as their limitations.
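
As a rough illustration of the constraint-based setup the review refers to, FBA is commonly posed as a linear program over a flux vector. The notation here is an assumption made for illustration and does not come from the record: $S$ is the stoichiometric matrix, $v$ the flux vector, $c$ the objective weights, $l_j$ and $u_j$ the flux bounds, $e_j$ an expression value mapped to reaction $j$, and $\tau$ a user-chosen threshold. The last line shows only one simple on/off way of using expression data; the methods surveyed in the review differ in how they apply such constraints.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& l_j \le v_j \le u_j \quad \text{for every reaction } j, \\
& v_j = 0 \quad \text{whenever } e_j < \tau \quad \text{(illustrative expression-based switch)}
\end{aligned}
```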

  5. Genome-wide analysis of miRNA and mRNA transcriptomes during amelogenesis.

    Science.gov (United States)

    Yin, Kaifeng; Hacia, Joseph G; Zhong, Zhe; Paine, Michael L

    2014-11-19

    In the rodent incisor during amelogenesis, as ameloblast cells transition from secretory stage to maturation stage, their morphology and transcriptome profiles change dramatically. Prior whole genome transcriptome analysis has given a broad picture of the molecular activities dominating both stages of amelogenesis, but this type of analysis has not included miRNA transcript profiling. In this study, we set out to document which miRNAs and corresponding target genes change significantly as ameloblasts transition from secretory- to maturation-stage amelogenesis. Total RNA samples from both secretory- and maturation-stage rat enamel organs were subjected to genome-wide miRNA and mRNA transcript profiling. We identified 59 miRNAs that were differentially expressed at the maturation stage relative to the secretory stage of enamel development (False Discovery Rate (FDR)<0.05, fold change (FC)≥1.8). In parallel, transcriptome profiling experiments identified 1,729 mRNA transcripts that were differentially expressed in the maturation stage compared to the secretory stage (FDR<0.05, FC≥1.8). Based on bioinformatics analyses, 5.8% (629 total) of these differentially expressed genes (DEGs) were highlighted as being the potential targets of 59 miRNAs that were differentially expressed in the opposite direction, in the same tissue samples. Although the number of predicted target DEGs was not higher than baseline expectations generated by examination of stably expressed miRNAs, Gene Ontology (GO) analysis showed that these 629 DEGs were enriched for ion transport, pH regulation, calcium handling, endocytotic, and apoptotic activities. Seven differentially expressed miRNAs (miR-21, miR-31, miR-488, miR-153, miR-135b, miR-135a and miR-298) in secretory- and/or maturation-stage enamel organs were confirmed by in situ hybridization. Further, we used luciferase reporter assays to provide evidence that two of these differentially expressed miRNAs, miR-153 and miR-31, are potential
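
The selection thresholds quoted in this record (FDR<0.05, FC≥1.8) can be sketched as a simple filter. This is a minimal illustration and not the authors' pipeline: the record does not say how FDR was computed, so the Benjamini-Hochberg adjustment, the function names, and the toy records below are all assumptions.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: the record does not name its FDR procedure; BH is used here
    only because it is a common choice.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_differential(records, fdr_cutoff=0.05, fc_cutoff=1.8):
    """Keep names whose adjusted p-value and fold change pass both cutoffs.

    records: list of (name, p_value, fold_change), where fold_change is a
    positive ratio (maturation / secretory in this hypothetical layout).
    """
    fdr = benjamini_hochberg([p for _, p, _ in records])
    hits = []
    for (name, _, fc), q in zip(records, fdr):
        magnitude = max(fc, 1.0 / fc)  # a ratio of 0.4 counts as a 2.5-fold decrease
        if q < fdr_cutoff and magnitude >= fc_cutoff:
            hits.append(name)
    return hits


# Hypothetical toy input; names and numbers are invented for illustration.
toy = [
    ("miR-a", 0.001, 2.5),
    ("miR-b", 0.004, 0.4),
    ("miR-c", 0.03, 1.2),
    ("miR-d", 0.6, 3.0),
]
print(select_differential(toy))
```

In the toy input, the third entry passes the FDR cutoff but not the fold-change cutoff, and the fourth has a large fold change but fails the FDR cutoff, which is why applying both criteria together matters.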

  6. Improved evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm

    Energy Technology Data Exchange (ETDEWEB)

    Seaver, Samuel M. D.; Bradbury, Louis M. T.; Frelin, Océane; Zarecki, Raphy; Ruppin, Eytan; Hanson, Andrew D.; Henry, Christopher S.

    2015-03-10

    There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue-specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for analysis of transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.

  7. Genome and transcriptome sequencing characterises the gene space of Macadamia integrifolia (Proteaceae).

    Science.gov (United States)

    Nock, Catherine J; Baten, Abdul; Barkla, Bronwyn J; Furtado, Agnelo; Henry, Robert J; King, Graham J

    2016-11-17

    The large Gondwanan plant family Proteaceae is an early-diverging eudicot lineage renowned for its morphological, taxonomic and ecological diversity. Macadamia is the most economically important Proteaceae crop and represents an ancient rainforest-restricted lineage. The family is a focus for studies of adaptive radiation due to remarkable species diversification in Mediterranean-climate biodiversity hotspots, and numerous evolutionary transitions between biomes. Despite a long history of research, comparative analyses in the Proteaceae and macadamia breeding programs are restricted by a paucity of genetic information. To address this, we sequenced the genome and transcriptome of the widely grown Macadamia integrifolia cultivar 741. Over 95 gigabases of DNA and RNA-seq sequence data were de novo assembled and annotated. The draft assembly has a total length of 518 Mb and spans approximately 79% of the estimated genome size. Following annotation, 35,337 protein-coding genes were predicted, of which over 90% were expressed in at least one of the leaf, shoot or flower tissues examined. Gene family comparisons with five other eudicot species revealed 13,689 clusters containing macadamia genes and 1005 macadamia-specific clusters, and provide evidence for lineage-specific expansion of gene families involved in pathogen recognition, plant defense and monoterpene synthesis. Cyanogenesis is an important defense strategy in the Proteaceae, and a detailed analysis of macadamia gene homologues potentially involved in cyanogenic glycoside biosynthesis revealed several highly expressed candidate genes. The gene space of macadamia provides a foundation for comparative genomics, gene discovery and the acceleration of molecular-assisted breeding. This study presents the first available genomic resources for the large basal eudicot family Proteaceae, access to most macadamia genes and opportunities to uncover the genetic basis of traits of importance for adaptation and crop

  8. Transcriptome sequencing of two phenotypic mosaic Eucalyptus trees reveals large scale transcriptome re-modelling.

    Directory of Open Access Journals (Sweden)

    Amanda Padovan

    Full Text Available Phenotypic mosaic trees offer an ideal system for studying differential gene expression. We have investigated two mosaic eucalypt trees from two closely related species (Eucalyptus melliodora and E. sideroxylon), which each support two types of leaves: one part of the canopy is resistant to insect herbivory and the remaining leaves are susceptible. Driving this ecological distinction are differences in plant secondary metabolites. We used these phenotypic mosaics to investigate genome-wide patterns of foliar gene expression with the aim of identifying patterns of differential gene expression and the somatic mutation(s) that lead to this phenotypic mosaicism. We sequenced the mRNA pool from leaves of the resistant and susceptible ecotypes from both mosaic eucalypts using the Illumina HiSeq 2000 platform. We found large differences in pathway regulation and gene expression between the ecotypes of each mosaic. The expression of the genes in the MVA and MEP pathways is reflected by variation in leaf chemistry; however, this is not the case for the terpene synthases. Apart from the terpene biosynthetic pathway, there are several other metabolic pathways that are differentially regulated between the two ecotypes, suggesting there is much more phenotypic diversity than has been described. Despite the close relationship between the two species, they show large differences in the global patterns of gene and pathway regulation.

  9. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    Energy Technology Data Exchange (ETDEWEB)

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. A coordinated sequencing effort of cultured organisms is an appropriate place to begin

  10. The genome and life-stage specific transcriptomes of Globodera pallida elucidate key aspects of plant parasitism by a cyst nematode

    KAUST Repository

    Cotton, James A; Lilley, Catherine J; Jones, Laura M; Kikuchi, Taisei; Reid, Adam J; Thorpe, Peter; Tsai, Isheng J; Beasley, Helen; Blok, Vivian; Cock, Peter J A; den Akker, Sebastian Eves-van; Holroyd, Nancy; Hunt, Martin; Mantelin, Sophie; Naghra, Hardeep; Pain, Arnab; Palomares-Rius, Juan E; Zarowiecki, Magdalena; Berriman, Matthew; Jones, John T; Urwin, Peter E

    2014-01-01

    Cyst nematodes and root-knot nematodes are the two most important plant parasitic nematode groups and together represent a global threat to food security. Results: We present the complete genome sequence of G. pallida, together with transcriptomic data from most of the nematode life

  11. A Genomic, Transcriptomic and Proteomic Look at the GE2270 Producer Planobispora rosea, an Uncommon Actinomycete.

    Directory of Open Access Journals (Sweden)

    Arianna Tocchetti

    Full Text Available We report the genome sequence of Planobispora rosea ATCC 53733, a mycelium-forming soil-dweller belonging to one of the lesser studied genera of Actinobacteria and producing the thiopeptide GE2270. The P. rosea genome presents considerable convergence in gene organization and function with other members in the family Streptosporangiaceae, with a significant number (44%) of shared orthologs. Patterns of gene expression in P. rosea cultures during exponential and stationary phase have been analyzed using whole transcriptome shotgun sequencing and by proteome analysis. Among the differentially abundant proteins, those involved in protein metabolism are particularly well represented, including the GE2270-insensitive EF-Tu. Two proteins from the pbt cluster, directing GE2270 biosynthesis, slightly increase their abundance values over time. While GE2270 production starts during the exponential phase, most pbt genes, as analyzed by qRT-PCR, are down-regulated. The exception is represented by pbtA, encoding the precursor peptide of the ribosomally synthesized GE2270, whose expression reached the highest level at the entry into stationary phase.

  12. The first Chameleon transcriptome: comparative genomic analysis of the OXPHOS system reveals loss of COX8 in Iguanian lizards.

    Science.gov (United States)

    Bar-Yaacov, Dan; Bouskila, Amos; Mishmar, Dan

    2013-01-01

    Recently, we found dramatic mitochondrial DNA divergence of Israeli Chamaeleo chamaeleon populations into two geographically distinct groups. We aimed to examine whether the same pattern of divergence could be found in nuclear genes. However, no genomic resource is available for any chameleon species. Here we present the first chameleon transcriptome, obtained using deep sequencing (SOLiD). Our analysis identified 164,000 sequence contigs of which 19,000 yielded unique BlastX hits. To test the efficacy of our sequencing effort, we examined whether the chameleon and other available reptilian transcriptomes harbored complete sets of genes comprising known biochemical pathways, focusing on the nDNA-encoded oxidative phosphorylation (OXPHOS) genes as a model. As a reference for the screen, we used the 86 known human structural nDNA-encoded OXPHOS subunits (including isoforms). Analysis of 34 publicly available vertebrate transcriptomes revealed orthologs for most human OXPHOS genes. However, OXPHOS subunit COX8 (Cytochrome C oxidase subunit 8), including all its known isoforms, was consistently absent in transcriptomes of iguanian lizards, implying loss of this subunit during the radiation of this suborder. The lack of COX8 in the suborder Iguania is intriguing, since it is important for cellular respiration and ATP production. Our sequencing effort added a new resource for comparative genomic studies, and shed new light on the evolutionary dynamics of the OXPHOS system.

  13. Transcriptomics and molecular evolutionary rate analysis of the bladderwort (Utricularia), a carnivorous plant with a minimal genome

    Directory of Open Access Journals (Sweden)

    Herrera-Estrella Alfredo

    2011-06-01

    Full Text Available Background: The carnivorous plant Utricularia gibba (bladderwort) is remarkable in having a minute genome, which at ca. 80 megabases is approximately half that of Arabidopsis. Bladderworts show an incredible diversity of forms surrounding a defined theme: tiny, bladder-like suction traps on terrestrial, epiphytic, or aquatic plants with a diversity of unusual vegetative forms. Utricularia plants, which are rootless, are also anomalous in physiological features (respiration and carbon distribution), and highly enhanced molecular evolutionary rates in chloroplast, mitochondrial and nuclear ribosomal sequences. Despite great interest in the genus, no genomic resources exist for Utricularia, and the substitution rate increase has received limited study. Results: Here we describe the sequencing and analysis of the Utricularia gibba transcriptome. Three different organs were surveyed, the traps, the vegetative shoot bodies, and the inflorescence stems. We also examined the bladderwort transcriptome under diverse stress conditions. We detail aspects of functional classification, tissue similarity, nitrogen and phosphorus metabolism, respiration, DNA repair, and detoxification of reactive oxygen species (ROS). Long contigs of plastid and mitochondrial genomes, as well as sequences for 100 individual nuclear genes, were compared with those of other plants to better establish information on molecular evolutionary rates. Conclusion: The Utricularia transcriptome provides a detailed genomic window into processes occurring in a carnivorous plant. It contains a deep representation of the complex metabolic pathways that characterize a putative minimal plant genome, permitting its use as a source of genomic information to explore the structural, functional, and evolutionary diversity of the genus. Vegetative shoots and traps are the most similar organs by functional classification of their transcriptome, the traps expressing hydrolytic enzymes for prey

  14. Genomics of Compositae crops: reference transcriptome assemblies and evidence of hybridization with wild relatives.

    Science.gov (United States)

    Hodgins, Kathryn A; Lai, Zhao; Oliveira, Luiz O; Still, David W; Scascitelli, Moira; Barker, Michael S; Kane, Nolan C; Dempewolf, Hannes; Kozik, Alex; Kesseli, Richard V; Burke, John M; Michelmore, Richard W; Rieseberg, Loren H

    2014-01-01

    Although the Compositae harbours only two major food crops, sunflower and lettuce, many other species in this family are utilized by humans and have experienced various levels of domestication. Here, we have used next-generation sequencing technology to develop 15 reference transcriptome assemblies for Compositae crops or their wild relatives. These data allow us to gain insight into the evolutionary and genomic consequences of plant domestication. Specifically, we performed Illumina sequencing of Cichorium endivia, Cichorium intybus, Echinacea angustifolia, Iva annua, Helianthus tuberosus, Dahlia hybrida, Leontodon taraxacoides and Glebionis segetum, as well as 454 sequencing of Guizotia scabra, Stevia rebaudiana, Parthenium argentatum and Smallanthus sonchifolius. Illumina reads were assembled using Trinity, and 454 reads were assembled using MIRA and CAP3. We evaluated the coverage of the transcriptomes using BLASTX analysis of a set of ultra-conserved orthologs (UCOs) and recovered most of these genes (88-98%). We found a correlation between contig length and read length for the 454 assemblies, and greater contig lengths for the 454 compared with the Illumina assemblies. This suggests that longer reads can aid in the assembly of more complete transcripts. Finally, we compared the divergence of orthologs at synonymous sites (Ks) between Compositae crops and their wild relatives and found greater divergence when the progenitors were self-incompatible. We also found greater divergence between pairs of taxa that had some evidence of postzygotic isolation. For several more distantly related congeners, such as chicory and endive, we identified a signature of introgression in the distribution of Ks values. © 2013 John Wiley & Sons Ltd.

  15. RUMINANT NUTRITION SYMPOSIUM: Use of genomics and transcriptomics to identify strategies to lower ruminal methanogenesis.

    Science.gov (United States)

    McAllister, T A; Meale, S J; Valle, E; Guan, L L; Zhou, M; Kelly, W J; Henderson, G; Attwood, G T; Janssen, P H

    2015-04-01

    Globally, methane (CH4) emissions account for 40% to 45% of greenhouse gas emissions from ruminant livestock, with over 90% of these emissions arising from enteric fermentation. Reduction of carbon dioxide to CH4 is critical for efficient ruminal fermentation because it prevents the accumulation of reducing equivalents in the rumen. Methanogens exist in a symbiotic relationship with rumen protozoa and fungi and within biofilms associated with feed and the rumen wall. Genomics and transcriptomics are playing an increasingly important role in defining the ecology of ruminal methanogenesis and identifying avenues for its mitigation. Metagenomic approaches have provided information on changes in abundances as well as the species composition of the methanogen community among ruminants that vary naturally in their CH4 emissions, their feed efficiency, and their response to CH4 mitigators. Sequencing the genomes of rumen methanogens has provided insight into surface proteins that may prove useful in the development of vaccines and has allowed assembly of biochemical pathways for use in chemogenomic approaches to lowering ruminal CH4 emissions. Metagenomics and metatranscriptomic analysis of entire rumen microbial communities are providing new perspectives on how methanogens interact with other members of this ecosystem and how these relationships may be altered to reduce methanogenesis. Identification of community members that produce antimethanogen agents that either inhibit or kill methanogens could lead to the identification of new mitigation approaches. Discovery of a lytic archaeophage that specifically lyses methanogens is one such example. Efforts in using genomic data to alter methanogenesis have been hampered by a lack of sequence information that is specific to the microbial community of the rumen. Programs such as Hungate1000 and the Global Rumen Census are increasing the breadth and depth of our understanding of global ruminal microbial communities, steps that

  16. From genes to milk: genomic organization and epigenetic regulation of the mammary transcriptome.

    Science.gov (United States)

    Lemay, Danielle G; Pollard, Katherine S; Martin, William F; Freeman Zadrowski, Courtneay; Hernandez, Joseph; Korf, Ian; German, J Bruce; Rijnkels, Monique

    2013-01-01

    Even in genomes lacking operons, a gene's position in the genome influences its potential for expression. The mechanisms by which adjacent genes are co-expressed are still not completely understood. Using lactation and the mammary gland as a model system, we explore the hypothesis that chromatin state contributes to the co-regulation of gene neighborhoods. The mammary gland represents a unique evolutionary model, due to its recent appearance, in the context of vertebrate genomes. An understanding of how the mammary gland is regulated to produce milk is also of biomedical and agricultural importance for human lactation and dairying. Here, we integrate epigenomic and transcriptomic data to develop a comprehensive regulatory model. Neighborhoods of mammary-expressed genes were determined using expression data derived from pregnant and lactating mice and a neighborhood scoring tool, G-NEST. Regions of open and closed chromatin were identified by ChIP-Seq of histone modifications H3K36me3, H3K4me2, and H3K27me3 in the mouse mammary gland and liver tissue during lactation. We found that neighborhoods of genes in regions of uniquely active chromatin in the lactating mammary gland, compared with liver tissue, were extremely rare. Rather, genes in most neighborhoods were suppressed during lactation as reflected in their expression levels and their location in regions of silenced chromatin. Chromatin silencing was largely shared between the liver and mammary gland during lactation, and what distinguished the mammary gland was mainly a small tissue-specific repertoire of isolated, expressed genes. These findings suggest that an advantage of the neighborhood organization is in the collective repression of groups of genes via a shared mechanism of chromatin repression. Genes essential to the mammary gland's uniqueness are isolated from neighbors, and likely have less tolerance for variation in expression, properties they share with genes responsible for an organism's survival.

  17. Complete Genome Sequence of Sporisorium scitamineum and Biotrophic Interaction Transcriptome with Sugarcane.

    Directory of Open Access Journals (Sweden)

    Lucas M Taniguti

    Full Text Available Sporisorium scitamineum is a biotrophic fungus responsible for sugarcane smut, a disease spread worldwide. This study provides the complete sequence of individual chromosomes of S. scitamineum from telomere to telomere achieved by a combination of PacBio long reads and Illumina short reads sequence data, as well as a draft sequence of a second fungal strain. Comparative analysis with previously available sequences of another strain detected few polymorphisms among the three genomes. The novel complete sequence described herein allowed us to identify and annotate extended subtelomeric regions, repetitive elements and the mitochondrial DNA sequence. The genome comprises 19,979,571 bases, 6,677 genes encoding proteins, 111 tRNAs and 3 assembled copies of rDNA, out of an estimated 130 copies. Chromosomal reorganizations were detected when comparing to sequences of S. reilianum, the closest smut relative, potentially influenced by repeats of transposable elements. Repetitive elements may have also directed the linkage of the two mating-type loci. The fungal transcriptome profiling from in vitro and from interaction with sugarcane at two time points (early infection and whip emergence) revealed that 13.5% of the genes were differentially expressed in planta and particular to each developmental stage. Among them are plant cell wall degrading enzymes, proteases, lipases, chitin modification and lignin degradation enzymes, sugar transporters and transcriptional factors. The fungus also modulates transcription of genes related to surviving against reactive oxygen species and other toxic metabolites produced by the plant. Previously described effectors in smut/plant interactions were detected but some new candidates are proposed. Ten genomic islands harboring some of the candidate genes unique to S. scitamineum were expressed only in planta. RNA-seq data were also used to confirm gene predictions.

  18. Detection of G-Quadruplex Structures Formed by G-Rich Sequences from Rice Genome and Transcriptome Using Combined Probes.

    Science.gov (United States)

    Chang, Tianjun; Li, Weiguo; Ding, Zhan; Cheng, Shaofei; Liang, Kun; Liu, Xiangjun; Bing, Tao; Shangguan, Dihua

    2017-08-01

    Putative G-quadruplex (G4) forming sequences (PQS) are highly prevalent in the genome and transcriptome of various organisms and are considered as potential regulation elements in many biological processes by forming G4 structures. The formation of G4 structures highly depends on the sequences and the environment. In most cases, it is difficult to predict G4 formation by PQS, especially PQS containing G2 tracts. Therefore, the experimental identification of G4 formation is essential in the study of G4-related biological functions. Herein, we report a rapid and simple method for the detection of G4 structures by using a pair of complementary reporters, hemin and BMSP. This method was applied to detect G4 structures formed by PQS (DNA and RNA) searched in the genome and transcriptome of Oryza sativa. Unlike most of the reported G4 probes that only recognize part of G4 structures, the proposed method based on combined probes positively responded to almost all G4 conformations, including parallel, antiparallel, and mixed/hybrid G4, but did not respond to non-G4 sequences. This method shows potential for high-throughput identification of G4 structures in genome and transcriptome. Furthermore, BMSP was observed to drive some PQS to form more stable G4 structures or induce the G4 formation of some PQS that cannot form G4 in normal physiological conditions, which may provide a powerful molecular tool for gene regulation.
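
The "searched in the genome and transcriptome" step for PQS can be sketched with a regular expression. This is a hedged illustration only: the record does not state the search rule that was used, so the motif here (four G-tracts separated by loops of 1 to 7 nucleotides), the function name `find_pqs`, and the test sequences are assumptions. Setting `min_tract=2` admits the G2-tract PQS that the abstract singles out as hard to predict.

```python
import re


def find_pqs(sequence, min_tract=3, max_loop=7):
    """Return (start, end, match) for non-overlapping putative G4 motifs.

    Assumed motif: G{n,} followed three times by a 1..max_loop nt loop and
    another G{n,}. RNA input is handled by treating U as T for matching;
    the returned substring is taken from the original sequence.
    """
    normalized = sequence.upper().replace("U", "T")
    tract = "G{%d,}" % min_tract
    loop = "[ACGT]{1,%d}" % max_loop
    pattern = re.compile("%s(?:%s%s){3}" % (tract, loop, tract))
    return [
        (m.start(), m.end(), sequence[m.start():m.end()])
        for m in pattern.finditer(normalized)
    ]


# Hypothetical test sequences (not taken from the rice data in the record).
telomere_like = "TTGGGTTAGGGTTAGGGTTAGGGAA"
g2_only = "ACGGTAGGCTGGATGGTT"

print(find_pqs(telomere_like))            # G3 tracts are found with the default
print(find_pqs(g2_only))                  # no hit when at least 3 Gs per tract are required
print(find_pqs(g2_only, min_tract=2))     # found once G2 tracts are allowed
```

A sequence scan of this kind only nominates candidates; as the abstract stresses, whether a PQS actually folds into a G4 depends on sequence and environment and has to be checked experimentally.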

  19. Assembled genomic and tissue-specific transcriptomic data resources for two genetically distinct lines of Cowpea (Vigna unguiculata (L.) Walp).

    Science.gov (United States)

    Spriggs, Andrew; Henderson, Steven T; Hand, Melanie L; Johnson, Susan D; Taylor, Jennifer M; Koltunow, Anna

    2018-02-09

    Cowpea (Vigna unguiculata (L.) Walp) is an important legume crop for food security in areas of low-input and smallholder farming throughout Africa and Asia. Genetic improvements are required to increase yield and resilience to biotic and abiotic stress and to enhance cowpea crop performance. An integrated cowpea genomic and gene expression data resource has the potential to greatly accelerate breeding and the delivery of novel genetic traits for cowpea. Extensive genomic resources for cowpea have been absent from the public domain; however, a recent early release reference genome for IT97K-499-35 (Vigna unguiculata v1.0, NSF, UCR, USAID, DOE-JGI, http://phytozome.jgi.doe.gov/) has now been established in a collaboration between the Joint Genome Institute (JGI) and University of California (UC) Riverside. Here we release supporting genomic and transcriptomic data for IT97K-499-35 and a second transformable cowpea variety, IT86D-1010. The transcriptome resource includes six tissue-specific datasets for each variety, with particular emphasis on reproductive tissues that extend and support the V. unguiculata v1.0 reference. Annotations have been included in our resource to allow direct mapping to the v1.0 cowpea reference. Access to this resource provided here is supported by raw and assembled data downloads.

  20. Genomics, transcriptomics and proteomics: enabling insights into social evolution and disease challenges for managed and wild bees.

    Science.gov (United States)

    Trapp, Judith; McAfee, Alison; Foster, Leonard J

    2017-02-01

    Globally, there are over 20 000 bee species (Hymenoptera: Apoidea: Anthophila) with a host of biologically fascinating characteristics. Although they have long been studied as models for social evolution, recent challenges to bee health (mainly diseases and pesticides) have attracted the attention of both public and research communities. Genome sequences of twelve bee species are now complete or in progress, facilitating the application of additional 'omic technologies. Here, we review recent developments in honey bee and native bee research in the genomic era. We discuss the progress in genome sequencing and functional annotation, followed by the comparative genomics, proteomics and transcriptomics applications this has enabled regarding social evolution and health. Finally, we end with comments on future challenges in the postgenomic era. © 2016 John Wiley & Sons Ltd.

  1. Genomic and transcriptomic analysis of Laccaria bicolor CAZome reveals insights into polysaccharides remodelling during symbiosis establishment.

    Science.gov (United States)

    Veneault-Fourrey, Claire; Commun, Carine; Kohler, Annegret; Morin, Emmanuelle; Balestrini, Raffaella; Plett, Jonathan; Danchin, Etienne; Coutinho, Pedro; Wiebenga, Ad; de Vries, Ronald P; Henrissat, Bernard; Martin, Francis

    2014-11-01

    Ectomycorrhizal fungi, living in forest soils, are microorganisms required to sustain tree growth and productivity. The establishment of mutualistic interaction with roots to form ectomycorrhiza (ECM) is not well known at the molecular level. In particular, how fungal and plant cell walls are rearranged to establish a fully functional ectomycorrhiza is poorly understood. Nevertheless, it is likely that Carbohydrate Active enZymes (CAZymes) produced by the fungus participate in this process. Genome-wide transcriptome profiling during ECM development was used to examine how the CAZome of Laccaria bicolor is regulated during symbiosis establishment. CAZymes active on fungal cell wall were upregulated during ECM development, in particular after 4 weeks of contact, when the hyphae are surrounding the root cells and start to colonize the apoplast. We demonstrated that one expansin-like protein, whose expression is specific to symbiotic tissues, localizes within the fungal cell wall. Whereas the L. bicolor genome contained a constricted repertoire of CAZymes active on cellulose and hemicellulose, these CAZymes were expressed during the first steps of root cell colonization. L. bicolor retained the ability to use homogalacturonan, a pectin-derived substrate, as a carbon source. CAZymes likely involved in pectin hydrolysis were mainly expressed at the stage of a fully mature ECM. Altogether, our data suggest an active remodelling of the fungal cell wall with a possible involvement of expansin during ECM development. By contrast, a soft remodelling of the plant cell wall likely occurs through the loosening of the cellulose microfibrils by AA9 or GH12 CAZymes and smooth remodelling of the middle lamella through pectin (homogalacturonan) hydrolysis, likely by GH28 and GH12 CAZymes. Copyright © 2014 Elsevier Inc. All rights reserved.

  2. A genome-wide longitudinal transcriptome analysis of the aging model Podospora anserina.

    Science.gov (United States)

    Philipp, Oliver; Hamann, Andrea; Servos, Jörg; Werner, Alexandra; Koch, Ina; Osiewacz, Heinz D

    2013-01-01

    Aging of biological systems is controlled by various processes which have a potential impact on gene expression. Here we report a genome-wide transcriptome analysis of the fungal aging model Podospora anserina. Total RNA of three individuals of defined age was pooled and analyzed by SuperSAGE (serial analysis of gene expression). A bioinformatics analysis identified different molecular pathways to be affected during aging. While the abundance of transcripts linked to ribosomes and to the proteasome quality control system was found to decrease during aging, that of transcripts associated with autophagy increased, suggesting that autophagy may act as a compensatory quality control pathway. Transcript profiles associated with the energy metabolism including mitochondrial functions were identified to fluctuate during aging. Comparison of wild-type transcripts, which are continuously down-regulated during aging, with those down-regulated in the long-lived, copper-uptake mutant grisea, validated the relevance of age-related changes in cellular copper metabolism. Overall, we (i) present a unique age-related data set of a longitudinal study of the experimental aging model P. anserina which represents a reference resource for future investigations in a variety of organisms, (ii) suggest autophagy to be a key quality control pathway that becomes active once other pathways fail, and (iii) present testable predictions for subsequent experimental investigations.

  3. A genome-wide longitudinal transcriptome analysis of the aging model Podospora anserina.

    Directory of Open Access Journals (Sweden)

    Oliver Philipp

    Full Text Available Aging of biological systems is controlled by various processes which have a potential impact on gene expression. Here we report a genome-wide transcriptome analysis of the fungal aging model Podospora anserina. Total RNA of three individuals of defined age was pooled and analyzed by SuperSAGE (serial analysis of gene expression). A bioinformatics analysis identified different molecular pathways to be affected during aging. While the abundance of transcripts linked to ribosomes and to the proteasome quality control system was found to decrease during aging, that of transcripts associated with autophagy increased, suggesting that autophagy may act as a compensatory quality control pathway. Transcript profiles associated with the energy metabolism including mitochondrial functions were identified to fluctuate during aging. Comparison of wild-type transcripts, which are continuously down-regulated during aging, with those down-regulated in the long-lived, copper-uptake mutant grisea, validated the relevance of age-related changes in cellular copper metabolism. Overall, we (i) present a unique age-related data set of a longitudinal study of the experimental aging model P. anserina which represents a reference resource for future investigations in a variety of organisms, (ii) suggest autophagy to be a key quality control pathway that becomes active once other pathways fail, and (iii) present testable predictions for subsequent experimental investigations.

  4. C-RAF function at the genome-wide transcriptome level: A systematic view.

    Science.gov (United States)

    Huang, Ying; Zhang, Xin-Yu; An, Su; Yang, Yang; Liu, Ying; Hao, Qian; Guo, Xiao-Xi; Xu, Tian-Rui

    2018-05-20

    C-RAF was the first member of the RAF kinase family to be discovered. Since its discovery, C-RAF has been found to regulate many fundamental cell processes, such as cell proliferation, cell death, and metabolism. However, the majority of these functions are achieved through interactions with different proteins; the genes regulated by C-RAF in its active or inactive state remain unclear. In this work, we used RNA-seq analysis to study the global transcriptomes of C-RAF-bearing or C-RAF-knockout cells in quiescent or EGF-activated states. We identified 3353 genes that are promoted or suppressed by C-RAF. Gene ontology and Kyoto Encyclopedia of Genes and Genomes analyses revealed that these genes are involved in drug addiction, cardiomyopathy, autoimmunity, and regulation of cell metabolism. Our results provide a panoramic view of C-RAF function, including known and novel functions, and have revealed potential targets for elucidating the role of C-RAF. Copyright © 2018 Elsevier B.V. All rights reserved.

  5. Tools for Genomic and Transcriptomic Analysis of Microbes at Single-Cell Level

    Directory of Open Access Journals (Sweden)

    Zixi Chen

    2017-09-01

    Full Text Available Microbiologists traditionally study populations rather than individual cells, as it is generally assumed that the status of individual cells will be similar to that observed in the population. However, recent studies have shown that the behavior of each single cell could be quite different from that of the whole population, suggesting the importance of extending traditional microbiology studies to the single-cell level. With recent technological advances, such as flow cytometry, next-generation sequencing (NGS), and microspectroscopy, single-cell microbiology has greatly enhanced the understanding of individuality and heterogeneity of microbes in many biological systems. Notably, the application of multiple ‘omics’ in single-cell analysis has shed light on how individual cells perceive, respond, and adapt to the environment, how heterogeneity arises under external stress and finally determines the fate of the whole population, and how microbes survive under natural conditions. As single-cell analysis involves no axenic cultivation of the target microorganism, it has also been demonstrated as a valuable tool for dissecting the microbial ‘dark matter.’ In this review, current state-of-the-art tools and methods for genomic and transcriptomic analysis of microbes at the single-cell level are critically summarized, including single-cell isolation methods and experimental strategies of single-cell analysis with NGS. In addition, perspectives on the future trends of technology development in the field of single-cell analysis are also presented.

  6. Multidimensional scaling for large genomic data sets

    Directory of Open Access Journals (Sweden)

    Lu Henry

    2008-04-01

    Full Text Available Abstract. Background: Multi-dimensional scaling (MDS) aims to represent high-dimensional data in a low-dimensional space while preserving the similarities between data points. This reduction in dimensionality is crucial for analyzing and revealing the genuine structure hidden in the data. For noisy data, dimension reduction can effectively reduce the effect of noise on the embedded structure. For large data sets, dimension reduction can effectively reduce information retrieval complexity. Thus, MDS techniques are used in many applications of data mining and gene network research. However, although there have been a number of studies that applied MDS techniques to genomics research, the number of analyzed data points was restricted by the high computational complexity of MDS. In general, a non-metric MDS method is faster than a metric MDS, but it does not preserve the true relationships. The computational complexity of most metric MDS methods is over O(N^2), so that it is difficult to process a data set with a large number of genes N, such as in the case of whole-genome microarray data. Results: We developed a new rapid metric MDS method with a low computational complexity, making metric MDS applicable to large data sets. Computer simulation showed that the new method of split-and-combine MDS (SC-MDS) is fast, accurate and efficient. Our empirical studies using microarray data on the yeast cell cycle showed that the performance of K-means in the reduced dimensional space is similar to or slightly better than that of K-means in the original space, while the clustering results are obtained about three times faster. Our clustering results using SC-MDS are more stable than those in the original space. Hence, the proposed SC-MDS is useful for analyzing whole-genome data. Conclusion: Our new method reduces the computational complexity from O(N^3) to O(N) when the dimension of the feature space is far less than the number of genes N, and it successfully
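
    The abstract above does not describe the SC-MDS algorithm in enough detail to reproduce it, so the following is only a minimal sketch of classical (Torgerson) metric MDS, the kind of baseline such a method would speed up. It double-centers the squared distance matrix and uses the leading eigenvectors as coordinates. All function names are illustrative, and power iteration with deflation stands in for a full eigendecomposition (the step that typically costs O(N^3) for an N x N matrix) purely so that the sketch needs only the Python standard library.

```python
import math


def double_center(dist):
    """Return B = -1/2 * J * D^2 * J for a square distance matrix D."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(sq[i]) / n for i in range(n)]
    col_mean = [sum(sq[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - col_mean[j] + grand)
             for j in range(n)] for i in range(n)]


def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def top_eigenpairs(matrix, k, iters=500):
    """Leading k eigenpairs of a symmetric positive semidefinite matrix.

    Uses power iteration with deflation; an assumption made for brevity,
    not the approach of any particular MDS package.
    """
    n = len(matrix)
    b = [row[:] for row in matrix]
    pairs = []
    for c in range(k):
        v = [1.0 / (i + 1 + c) for i in range(n)]  # fixed, non-constant start
        for _ in range(iters):
            w = mat_vec(b, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                break
            v = [x / norm for x in w]
        lam = sum(x * y for x, y in zip(v, mat_vec(b, v)))
        pairs.append((lam, v))
        # Deflate: remove the component just found.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return pairs


def classical_mds(dist, k=2):
    """Embed n points in k dimensions from their pairwise distances."""
    pairs = top_eigenpairs(double_center(dist), k)
    n = len(dist)
    return [[math.sqrt(max(lam, 0.0)) * vec[i] for lam, vec in pairs]
            for i in range(n)]
```

    For distances that come from points in a k-dimensional Euclidean space, `classical_mds(dist, k)` recovers coordinates whose pairwise distances match the input up to rotation and reflection; with noisy or non-Euclidean dissimilarities the match is only approximate.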

  7. Transcriptome complexity in cardiac development and diseases--an expanding universe between genome and phenome.

    Science.gov (United States)

    Gao, Chen; Wang, Yibin

    2014-01-01

    With the advancement of transcriptome profiling by micro-arrays and high-throughput RNA-sequencing, transcriptome complexity and its dynamics are revealed at different levels in cardiovascular development and diseases. In this review, we will highlight the recent progress in our knowledge of cardiovascular transcriptome complexity contributed by RNA splicing, RNA editing and noncoding RNAs. The emerging importance of many of these previously under-explored aspects of gene regulation in cardiovascular development and pathology will be discussed.

  8. Large-Scale Transcriptome Analysis in Faba Bean (Vicia faba L.) under Ascochyta fabae Infection.

    Directory of Open Access Journals (Sweden)

    Sara Ocaña

    Full Text Available Faba bean is an important food crop worldwide. However, progress in faba bean genomics lags far behind that of model systems due to limited availability of genetic and genomic information. Using the Illumina platform, the faba bean transcriptome from leaves of two lines (29H and Vf136) subjected to Ascochyta fabae infection has been characterized. De novo transcriptome assembly provided a total of 39,185 different transcripts that were functionally annotated, and among these, 13,266 were assigned to gene ontology against Arabidopsis. Quality of the assembly was validated by RT-qPCR amplification of selected differentially expressed transcripts. Comparison of faba bean transcripts with those of better-characterized plant genomes such as Arabidopsis thaliana, Medicago truncatula and Cicer arietinum revealed a sequence similarity of 68.3%, 72.8% and 81.27%, respectively. Moreover, 39,060 single nucleotide polymorphisms (SNPs) and 3,669 InDels were identified for genotyping applications. Mapping of the sequence reads generated onto the assembled transcripts showed that 393 and 457 transcripts were overexpressed in the resistant (29H) and susceptible (Vf136) genotypes, respectively. Transcripts involved in plant-pathogen interactions such as leucine-rich proteins (LRR) or plant growth regulators involved in plant adaptation to abiotic and biotic stresses were found to be differentially expressed in the resistant line. The results reported here represent the most comprehensive transcript database developed so far in faba bean, providing valuable information that could be used to gain insight into the pathways involved in the resistance mechanism against A. fabae and to identify potential resistance genes to be further used in marker-assisted selection.

  9. Genome-wide transcriptome analysis of soybean primary root under varying water-deficit conditions.

    Science.gov (United States)

    Song, Li; Prince, Silvas; Valliyodan, Babu; Joshi, Trupti; Maldonado dos Santos, Joao V; Wang, Jiaojiao; Lin, Li; Wan, Jinrong; Wang, Yongqin; Xu, Dong; Nguyen, Henry T

    2016-01-15

    Soybean is a major crop that provides an important source of protein and oil to humans and animals, but its production can be dramatically decreased by the occurrence of drought stress. Soybeans can survive drought stress if there is a robust and deep root system at the early vegetative growth stage. However, little is known about the genome-wide molecular mechanisms contributing to soybean root system architecture. This study was performed to gain knowledge on transcriptome changes and related molecular mechanisms contributing to soybean root development under water-limited conditions. The soybean Williams 82 genotype was subjected to very mild stress (VMS), mild stress (MS) and severe stress (SS) conditions, as well as recovery from the severe stress after re-watering (SR). In total, 6,609 genes in the roots showed differential expression patterns in response to different water-deficit stress levels. Genes involved in hormone (Auxin/Ethylene), carbohydrate, and cell wall-related metabolism (XTH/lipid/flavonoids/lignin) pathways were differentially regulated in the soybean root system. Several transcription factors (TFs) regulating root growth and responses under varying water-deficit conditions were identified and the expression patterns of six TFs were found to be common across the stress levels. Further analysis at the whole-plant level revealed regulation of transcription factors that is specific to tissue or to water-deficit level. Analysis of the over-represented motifs of different gene groups revealed several new cis-elements associated with different levels of water deficit. The expression patterns of 18 genes were confirmed by quantitative reverse transcription polymerase chain reaction, demonstrating the accuracy and effectiveness of RNA-Seq. The primary root-specific transcriptome in soybean can enable a better understanding of the root response to water-deficit conditions. The genes detected in root tissues that were associated with

  10. Phylogenetic distribution of large-scale genome patchiness

    Directory of Open Access Journals (Sweden)

    Hackenberg Michael

    2008-04-01

    Full Text Available Abstract. Background: The phylogenetic distribution of large-scale genome structure (i.e. mosaic compositional patchiness) has been explored mainly by analytical ultracentrifugation of bulk DNA. However, with the availability of large, good-quality chromosome sequences, and the recently developed computational methods to directly analyze patchiness on the genome sequence, an evolutionary comparative analysis can be carried out at the sequence level. Results: The local variations in the scaling exponent of the Detrended Fluctuation Analysis are used here to analyze large-scale genome structure and directly uncover the characteristic scales present in genome sequences. Furthermore, through shuffling experiments of selected genome regions, computationally-identified, isochore-like regions were identified as the biological source for the uncovered large-scale genome structure. The phylogenetic distribution of short- and large-scale patchiness was determined in the best-sequenced genome assemblies from eleven eukaryotic genomes: mammals (Homo sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus, and Canis familiaris), birds (Gallus gallus), fishes (Danio rerio), invertebrates (Drosophila melanogaster and Caenorhabditis elegans), plants (Arabidopsis thaliana) and yeasts (Saccharomyces cerevisiae). We found large-scale patchiness of genome structure, associated with in silico determined, isochore-like regions, throughout this wide phylogenetic range. Conclusion: Large-scale genome structure is detected by directly analyzing DNA sequences in a wide range of eukaryotic chromosome sequences, from human to yeast. In all these genomes, large-scale patchiness can be associated with the isochore-like regions, as directly detected in silico at the sequence level.
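
    To make the scaling exponent mentioned in the abstract above concrete, here is a minimal sketch of first-order Detrended Fluctuation Analysis on a DNA string. It is not the authors' pipeline: the mapping of G/C to 1 and A/T to 0, the choice of window sizes, and all function names are assumptions for illustration, and the sketch estimates a single global exponent rather than the local variations the abstract refers to.

```python
import math


def gc_series(seq):
    """Map a DNA string to 1.0 for G/C and 0.0 otherwise (an assumed encoding)."""
    return [1.0 if base in "GCgc" else 0.0 for base in seq]


def detrended_ss(segment):
    """Sum of squared residuals after removing a least-squares line."""
    n = len(segment)
    t_mean = (n - 1) / 2.0
    y_mean = sum(segment) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(segment))
    slope = sxy / sxx
    return sum((y - y_mean - slope * (t - t_mean)) ** 2
               for t, y in enumerate(segment))


def fluctuation(profile, window):
    """Root-mean-square detrended fluctuation F(n) for one window size n."""
    n_windows = len(profile) // window
    total = 0.0
    for w in range(n_windows):
        total += detrended_ss(profile[w * window:(w + 1) * window])
    return math.sqrt(total / (n_windows * window))


def dfa_exponent(values, windows):
    """Slope of log F(n) against log n over the given window sizes.

    Window sizes should be at least 3 and the series must not be constant,
    otherwise F(n) is zero and its logarithm is undefined.
    """
    mean = sum(values) / len(values)
    profile, running = [], 0.0
    for v in values:  # integrate the mean-subtracted series
        running += v - mean
        profile.append(running)
    xs = [math.log(w) for w in windows]
    ys = [math.log(fluctuation(profile, w)) for w in windows]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den
```

    A call such as `dfa_exponent(gc_series(sequence), [8, 16, 32, 64])` gives an exponent near 0.5 for an uncorrelated sequence, while values clearly above 0.5 indicate long-range correlations of the kind associated with compositional patchiness. Applying the same function over sliding stretches of a chromosome would be one way to look at local variation, though the abstract does not say how the authors did this.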

  11. Analysis of Genome-Scale Data

    NARCIS (Netherlands)

    Kemmeren, P.P.C.W.

    2005-01-01

    The genetic material of every cell in an organism is stored inside DNA in the form of genes, which together form the genome. The information stored in the DNA is translated to RNA and subsequently to proteins, which form complex biological systems. The availability of whole genome sequences has

  12. Genome-wide transcriptomics of aging in the rotifer Brachionus manjavacas, an emerging model system.

    Science.gov (United States)

    Gribble, Kristin E; Mark Welch, David B

    2017-03-01

    Understanding gene expression changes over lifespan in diverse animal species will lead to insights into conserved processes in the biology of aging and allow development of interventions to improve health. Rotifers are small aquatic invertebrates that have been used in aging studies for nearly 100 years and are now re-emerging as a modern model system. To provide a baseline to evaluate genetic responses to interventions that change health throughout lifespan and a framework for new hypotheses about the molecular genetic mechanisms of aging, we examined the transcriptome of an asexual female lineage of the rotifer Brachionus manjavacas at five life stages: eggs, neonates, and early-, late-, and post-reproductive adults. There are widespread shifts in gene expression over the lifespan of B. manjavacas; the largest change occurs between neonates and early reproductive adults and is characterized by down-regulation of developmental genes and up-regulation of genes involved in reproduction. The expression profile of post-reproductive adults was distinct from that of other life stages. While few genes were significantly differentially expressed in the late- to post-reproductive transition, gene set enrichment analysis revealed multiple down-regulated pathways in metabolism, maintenance and repair, and proteostasis, united by genes involved in mitochondrial function and oxidative phosphorylation. This study provides the first examination of changes in gene expression over lifespan in rotifers. We detected differential expression of many genes with human orthologs that are absent in Drosophila and C. elegans, highlighting the potential of the rotifer model in aging studies. Our findings suggest that small but coordinated changes in expression of many genes in pathways that integrate diverse functions drive the aging process. The observation of simultaneous declines in expression of genes in multiple pathways may have consequences for health and longevity not detected by

  13. Potential evolution of neurosurgical treatment paradigms for craniopharyngioma based on genomic and transcriptomic characteristics.

    Science.gov (United States)

    Robinson, Leslie C; Santagata, Sandro; Hankinson, Todd C

    2016-12-01

    The recent genomic and transcriptomic characterization of human craniopharyngiomas has provided important insights into the pathogenesis of these tumors and supports the view that these tumor types are distinct entities. Critically, the insights provided by these data offer the potential for the introduction of novel therapies and surgical treatment paradigms for these tumors, which are associated with high morbidity rates and morbid conditions. Mutations in the CTNNB1 gene are primary drivers of adamantinomatous craniopharyngioma (ACP) and lead to the accumulation of β-catenin protein in a subset of the nuclei within the neoplastic epithelium of these tumors. Dysregulation of epidermal growth factor receptor (EGFR) and of sonic hedgehog (SHH) signaling in ACP suggests that paracrine oncogenic mechanisms may underlie ACP growth and implicates these signaling pathways as potential targets for therapeutic intervention using directed therapies. Recent work shows that ACP cells have primary cilia, further supporting the potential importance of SHH signaling in the pathogenesis of these tumors. While further preclinical data are needed, directed therapies could defer, or replace, the need for radiation therapy and/or allow for less aggressive surgical interventions. Furthermore, the prospect for reliable control of cystic disease without the need for surgery now exists. Studies of papillary craniopharyngioma (PCP) are more clinically advanced than those for ACP. The vast majority of PCPs harbor the BRAF V600E mutation. There are now 2 reports of patients with PCP who had dramatic therapeutic responses to targeted agents. Ongoing clinical and research studies promise to not only advance our understanding of these challenging tumors but to offer new approaches for patient management.

  14. Integrated Genomics Reveals Convergent Transcriptomic Networks Underlying Chronic Obstructive Pulmonary Disease and Idiopathic Pulmonary Fibrosis.

    Science.gov (United States)

    Kusko, Rebecca L; Brothers, John F; Tedrow, John; Pandit, Kusum; Huleihel, Luai; Perdomo, Catalina; Liu, Gang; Juan-Guardela, Brenda; Kass, Daniel; Zhang, Sherry; Lenburg, Marc; Martinez, Fernando; Quackenbush, John; Sciurba, Frank; Limper, Andrew; Geraci, Mark; Yang, Ivana; Schwartz, David A; Beane, Jennifer; Spira, Avrum; Kaminski, Naftali

    2016-10-15

    Despite shared environmental exposures, idiopathic pulmonary fibrosis (IPF) and chronic obstructive pulmonary disease are usually studied in isolation, and the presence of shared molecular mechanisms is unknown. We applied an integrative genomic approach to identify convergent transcriptomic pathways in emphysema and IPF. We defined the transcriptional repertoire of chronic obstructive pulmonary disease, IPF, or normal histology lungs using RNA-seq (n = 87). Genes increased in both emphysema and IPF relative to control were enriched for the p53/hypoxia pathway, a finding confirmed in an independent cohort using both gene expression arrays and the nCounter Analysis System (n = 193). Immunohistochemistry confirmed overexpression of HIF1A, MDM2, and NFKBIB, members of this pathway, in tissues from patients with emphysema or IPF. Using reads aligned across splice junctions, we determined that alternative splicing of p53/hypoxia pathway-associated molecules NUMB and PDGFA occurred more frequently in IPF or emphysema compared with control and validated these findings by quantitative polymerase chain reaction and the nCounter Analysis System on an independent sample set (n = 193). Finally, by integrating parallel microRNA and mRNA-Seq data on the same samples, we identified MIR96 as a key novel regulatory hub in the p53/hypoxia gene-expression network and confirmed that modulation of MIR96 in vitro recapitulates the disease-associated gene-expression network. Our results suggest convergent transcriptional regulatory hubs in diseases as varied phenotypically as chronic obstructive pulmonary disease and IPF and suggest that these hubs may represent shared key responses of the lung to environmental stresses.

  15. Genome and Transcriptome sequence of Finger millet (Eleusine coracana (L.) Gaertn.) provides insights into drought tolerance and nutraceutical properties.

    Science.gov (United States)

    Hittalmani, Shailaja; Mahesh, H B; Shirke, Meghana Deepak; Biradar, Hanamareddy; Uday, Govindareddy; Aruna, Y R; Lohithaswa, H C; Mohanrao, A

    2017-06-15

    Finger millet (Eleusine coracana (L.) Gaertn.) is an important staple food crop widely grown in Africa and South Asia. Among the millets, finger millet has high amounts of calcium, methionine, tryptophan, fiber, and sulphur-containing amino acids. In addition, it has a C4 photosynthetic carbon assimilation mechanism, which helps it utilize water and nitrogen efficiently under hot and arid conditions without severely affecting yield. Therefore, development and utilization of genomic resources for genetic improvement of this crop is immensely useful. Whole genome sequencing and assembly of the ML-365 finger millet cultivar yielded 1196 Mb, covering approximately 82% of the total estimated genome size. Genome analysis showed the presence of 85,243 genes, and one half of the genome is repetitive in nature. The finger millet genome was found to have higher colinearity with foxtail millet and rice as compared to other Poaceae species. Mining of simple sequence repeats (SSRs) yielded an abundance of SSRs within the finger millet genome. Functional annotation and mining of transcription factors revealed that the finger millet genome harbors a large number of drought tolerance-related genes. Transcriptome analysis of low moisture stress and non-stress samples identified several drought-induced candidate genes, which could be used in drought tolerance breeding. This genome sequencing effort will support plant breeders in allele discovery, genetic mapping, and identification of candidate genes for agronomically important traits. Availability of genomic resources for finger millet will open up novel breeding possibilities to address potential challenges of finger millet improvement.

  16. Comparative genome and transcriptome analysis reveals distinctive surface characteristics and unique physiological potentials of Pseudomonas aeruginosa ATCC 27853

    KAUST Repository

    Cao, Huiluo

    2017-06-12

    Pseudomonas aeruginosa ATCC 27853 was isolated from a hospital blood specimen in 1971 and has been widely used as a model strain to survey antibiotic susceptibilities, biofilm development, and metabolic activities of Pseudomonas spp. Although four draft genomes of P. aeruginosa ATCC 27853 have been sequenced, the complete genome of this strain is still lacking, hindering a comprehensive understanding of its physiology and functional genome. Here we sequenced and assembled the complete genome of P. aeruginosa ATCC 27853 using the Pacific Biosciences SMRT (PacBio) technology and Illumina sequencing platform. We found that accessory genes of ATCC 27853 including prophages and genomic islands (GIs) mainly contribute to the difference between P. aeruginosa ATCC 27853 and other P. aeruginosa strains. Seven prophages were identified within the genome of P. aeruginosa ATCC 27853. Of the predicted 25 GIs, three contain genes that encode monooxygenases, dioxygenases and hydrolases that could be involved in the metabolism of aromatic compounds. Surveying virulence-related genes revealed that a series of genes that encode the B-band O-antigen of LPS are lacking in ATCC 27853. Distinctive SNPs in genes of cellular adhesion proteins such as type IV pili and flagella biosynthesis were also observed in this strain. Colony morphology analysis confirmed an enhanced biofilm formation capability of ATCC 27853 on solid agar surface compared to Pseudomonas aeruginosa PAO1. We then performed transcriptome analysis of ATCC 27853 and PAO1 using RNA-seq and compared the expression of orthologous genes to understand the functional genome and the genomic details underlying the distinctive colony morphogenesis. These analyses revealed an increased expression of genes involved in cellular adhesion and biofilm maturation such as type IV pili, exopolysaccharide and electron transport chain components in ATCC 27853 compared with PAO1. In addition, distinctive expression profiles of the

  17. A differential genome-wide transcriptome analysis: impact of cellular copper on complex biological processes like aging and development.

    Directory of Open Access Journals (Sweden)

    Jörg Servos

    Full Text Available The regulation of cellular copper homeostasis is crucial in biology. Impairments lead to severe dysfunctions and are known to affect aging and development. Previously, a loss-of-function mutation in the gene encoding the copper-sensing and copper-regulated transcription factor GRISEA of the filamentous fungus Podospora anserina was reported to lead to cellular copper depletion and a pleiotropic phenotype with hypopigmentation of the mycelium and the ascospores, affected fertility and a lifespan increased by approximately 60% when compared to the wild type. This phenotype is linked to a switch from a copper-dependent standard to an alternative respiration leading to a reduced generation of both reactive oxygen species (ROS) and adenosine triphosphate (ATP). We performed a genome-wide comparative transcriptome analysis of a wild-type strain and the copper-depleted grisea mutant. We unambiguously assigned 9,700 sequences of the transcriptome in both strains to the more than 10,600 predicted and annotated open reading frames of the P. anserina genome, indicating 90% coverage of the transcriptome. 4,752 of the transcripts differed significantly in abundance, with 1,156 transcripts differing at least 3-fold. Selected genes were investigated by qRT-PCR analyses. Apart from this general characterization, we analyzed the data with special emphasis on molecular pathways related to the grisea mutation, taking advantage of the available complete genomic sequence of P. anserina. This analysis verified but also corrected conclusions from earlier data obtained by single gene analysis, identified new candidates of factors as part of the cellular copper homeostasis system including target genes of transcription factor GRISEA, and provides a rich reference source of quantitative data for further detailed investigations. Overall, the present study demonstrates the importance of systems biology approaches also in cases where mutations in single genes are analyzed to

  18. Genome Sequence and Transcriptome Analyses of Chrysochromulina tobin: Metabolic Tools for Enhanced Algal Fitness in the Prominent Order Prymnesiales (Haptophyceae).

    Directory of Open Access Journals (Sweden)

    Blake T Hovde

    Full Text Available Haptophytes are recognized as seminal players in aquatic ecosystem function. These algae are important in global carbon sequestration, form destructive harmful blooms, and given their rich fatty acid content, serve as a highly nutritive food source to a broad range of eco-cohorts. Haptophyte dominance in both fresh and marine waters is supported by the mixotrophic nature of many taxa. Despite their importance, the nuclear genome sequence of only one haptophyte, Emiliania huxleyi (Isochrysidales), is available. Here we report the draft genome sequence of Chrysochromulina tobin (Prymnesiales), and transcriptome data collected at seven time points over a 24-hour light/dark cycle. The nuclear genome of C. tobin is small (59 Mb), compact (∼ 40% of the genome is protein coding) and encodes approximately 16,777 genes. Genes important to fatty acid synthesis, modification, and catabolism show distinct patterns of expression when monitored over the circadian photoperiod. The C. tobin genome harbors the first hybrid polyketide synthase/non-ribosomal peptide synthase gene complex reported for an algal species, and encodes potential anti-microbial peptides and proteins involved in multidrug and toxic compound extrusion. A new haptophyte xanthorhodopsin was also identified, together with two "red" RuBisCO activases that are shared across many algal lineages. The Chrysochromulina tobin genome sequence provides new information on the evolutionary history, ecology and economic importance of haptophytes.

  19. Capturing the response of Clostridium acetobutylicum to chemical stressors using a regulated genome-scale metabolic model

    International Nuclear Information System (INIS)

    Dash, Satyakam; Mueller, Thomas J.; Venkataramanan, Keerthi P.; Papoutsakis, Eleftherios T.; Maranas, Costas D.

    2014-01-01

    Clostridia are anaerobic Gram-positive Firmicutes containing broad and flexible systems for substrate utilization, which have been used successfully to produce a range of industrial compounds. Clostridium acetobutylicum has been used to produce butanol on an industrial scale through acetone-butanol-ethanol (ABE) fermentation. A genome-scale metabolic (GSM) model is a powerful tool for understanding the metabolic capacities of an organism and developing metabolic engineering strategies for strain development. The integration of stress related specific transcriptomics information with the GSM model provides opportunities for elucidating the focal points of regulation

  20. Analysis of Genome-Scale Data

    OpenAIRE

    Kemmeren, P.P.C.W.

    2005-01-01

    The genetic material of every cell in an organism is stored inside DNA in the form of genes, which together form the genome. The information stored in the DNA is translated to RNA and subsequently to proteins, which form complex biological systems. The availability of whole genome sequences has given rise to the parallel development of other high-throughput approaches such as determining mRNA expression level changes, gene-deletion phenotypes, chromosomal location of DNA binding proteins, cel...

  1. Genome and Transcriptome Analysis of the Fungal Pathogen Fusarium oxysporum f. sp. cubense Causing Banana Vascular Wilt Disease

    Science.gov (United States)

    Zeng, Huicai; Fan, Dingding; Zhu, Yabin; Feng, Yue; Wang, Guofen; Peng, Chunfang; Jiang, Xuanting; Zhou, Dajie; Ni, Peixiang; Liang, Changcong; Liu, Lei; Wang, Jun; Mao, Chao

    2014-01-01

    Background: The asexual fungus Fusarium oxysporum f. sp. cubense (Foc) causing vascular wilt disease is one of the most devastating pathogens of banana (Musa spp.). To understand the molecular underpinning of pathogenicity in Foc, the genomes and transcriptomes of two Foc isolates were sequenced. Methodology/Principal Findings: Genome analysis revealed that the genome structures of race 1 and race 4 isolates were highly syntenic with those of F. oxysporum f. sp. lycopersici strain Fol4287. A large number of putative virulence associated genes were identified in both Foc genomes, including genes putatively involved in root attachment, cell degradation, detoxification of toxins, transport, secondary metabolite biosynthesis and signal transduction. Importantly, relative to the Foc race 1 isolate (Foc1), the Foc race 4 isolate (Foc4) has evolved with some expanded gene families of transporters and transcription factors for transport of toxins and nutrients that may facilitate its ability to adapt to host environments and contribute to pathogenicity to banana. Transcriptome analysis disclosed a significant difference in transcriptional responses between Foc1 and Foc4 at 48 h post inoculation to the banana ‘Brazil’ in comparison with the vegetative growth stage. Of particular note, more virulence-associated genes were upregulated in Foc4 than in Foc1. Several signaling pathways like the mitogen-activated protein kinase Fmk1 mediated invasion growth pathway, the FGA1-mediated G protein signaling pathway and a pathogenicity associated two-component system were activated in Foc4 rather than in Foc1. Together, these differences in gene content and transcription response between Foc1 and Foc4 might account for variation in their virulence during infection of the banana variety ‘Brazil’. Conclusions/Significance: Foc genome sequences will help identify the pathogenicity mechanisms involved in the development of banana vascular wilt disease. These will thus advance

  2. Using Genome-scale Models to Predict Biological Capabilities

    DEFF Research Database (Denmark)

    O’Brien, Edward J.; Monk, Jonathan M.; Palsson, Bernhard O.

    2015-01-01

    Constraint-based reconstruction and analysis (COBRA) methods at the genome scale have been under development since the first whole-genome sequences appeared in the mid-1990s. A few years ago, this approach began to demonstrate the ability to predict a range of cellular functions, including cellul...

  3. Selecting Superior De Novo Transcriptome Assemblies: Lessons Learned by Leveraging the Best Plant Genome.

    Directory of Open Access Journals (Sweden)

    Loren A Honaas

    Full Text Available Whereas de novo assemblies of RNA-Seq data are being published for a growing number of species across the tree of life, there are currently no broadly accepted methods for evaluating such assemblies. Here we present a detailed comparison of 99 transcriptome assemblies, generated with 6 de novo assemblers including CLC, Trinity, SOAP, Oases, ABySS and NextGENe. Controlled analyses of de novo assemblies for Arabidopsis thaliana and Oryza sativa transcriptomes provide new insights into the strengths and limitations of transcriptome assembly strategies. We find that the leading assemblers generate reassuringly accurate assemblies for the majority of transcripts. At the same time, we find a propensity for assemblers to fail to fully assemble highly expressed genes. Surprisingly, the instance of true chimeric assemblies is very low for all assemblers. Normalized libraries are reduced in highly abundant transcripts, but they also lack 1000s of low abundance transcripts. We conclude that the quality of de novo transcriptome assemblies is best assessed through consideration of a combination of metrics: (1) proportion of reads mapping to an assembly, (2) recovery of conserved, widely expressed genes, (3) N50 length statistics, and (4) the total number of unigenes. We provide benchmark Illumina transcriptome data and introduce SCERNA, a broadly applicable modular protocol for de novo assembly improvement. Finally, our de novo assembly of the Arabidopsis leaf transcriptome revealed ~20 putative Arabidopsis genes lacking in the current annotation.
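
    One of the metrics named in the abstract above, the N50 length statistic, can be illustrated with a minimal Python sketch. It uses the common definition (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length). The function name and the contig lengths are hypothetical and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)   # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # half of the assembly is covered
            return length


# Hypothetical contig lengths, for illustration only.
contigs = [100, 200, 300, 400, 500]
print(n50(contigs))
```

    As the abstract notes, N50 is only one of several metrics and is best read together with read-mapping rate, recovery of conserved genes, and unigene count.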

  4. Large-Scale Transcriptome Analysis of Two Sugarcane Genotypes Contrasting for Lignin Content.

    Directory of Open Access Journals (Sweden)

    Renato Vicentini

    Full Text Available Sugarcane is an important crop worldwide for sugar and first generation ethanol production. Recently, the residue of sugarcane mills, called bagasse, has been considered a promising lignocellulosic biomass to produce second-generation ethanol. Lignin is a major factor limiting the use of bagasse and other plant lignocellulosic materials to produce second-generation ethanol. The lignin biosynthesis pathway is a complex network, and changes in the expression of genes of this pathway have in general led to diverse and undesirable impacts on plant structure and physiology. Despite its economic importance, the sugarcane genome has not yet been sequenced. In this study a high-throughput transcriptome evaluation of two sugarcane genotypes contrasting for lignin content was carried out. We generated a set of 85,151 transcripts of sugarcane using RNA-seq and de novo assembly. More than 2,000 transcripts showed differential expression between the genotypes, including several genes involved in the lignin biosynthetic pathway. This information can give valuable knowledge on lignin biosynthesis and its interactions with other metabolic pathways in the complex sugarcane genome.

  5. Genome-scale biological models for industrial microbial systems.

    Science.gov (United States)

    Xu, Nan; Ye, Chao; Liu, Liming

    2018-04-01

    The primary aims and challenges associated with microbial fermentation include achieving faster cell growth, higher productivity, and more robust production processes. Genome-scale biological models, predicting the formation of an interaction among genetic materials, enzymes, and metabolites, constitute a systematic and comprehensive platform to analyze and optimize the microbial growth and production of biological products. Genome-scale biological models can help optimize microbial growth-associated traits by simulating biomass formation, predicting growth rates, and identifying the requirements for cell growth. With regard to microbial product biosynthesis, genome-scale biological models can be used to design product biosynthetic pathways, accelerate production efficiency, and reduce metabolic side effects, leading to improved production performance. The present review discusses the development of microbial genome-scale biological models since their emergence and emphasizes their pertinent application in improving industrial microbial fermentation of biological products.

  6. Local adaptation at the transcriptome level in brown trout: evidence from early life history temperature genomic reaction norms.

    Directory of Open Access Journals (Sweden)

    Kristian Meier

    Full Text Available Local adaptation and its underlying molecular basis have long been a key focus in evolutionary biology. There has recently been increased interest in the evolutionary role of plasticity and the molecular mechanisms underlying local adaptation. Using transcriptome analysis, we assessed differences in gene expression profiles for three brown trout (Salmo trutta) populations, one resident and two anadromous, experiencing different temperature regimes in the wild. The study was based on an F2 generation raised in a common garden setting. A previous study of the F1 generation revealed different reaction norms and significantly higher QST than FST among populations for two early life-history traits. In the present study we investigated if genomic reaction norm patterns were also present at the transcriptome level. Eggs from the three populations were incubated at two temperatures (5 and 8 °C) representing conditions encountered in the local environments. Global gene expression for fry at the stage of first feeding was analysed using a 32k cDNA microarray. The results revealed differences in gene expression between populations and temperatures and population × temperature interactions, the latter indicating locally adapted reaction norms. Moreover, the reaction norms paralleled those observed previously at early life-history traits. We identified 90 cDNA clones among the genes with an interaction effect that were differently expressed between the ecologically divergent populations. These included genes involved in immune- and stress response. We observed less plasticity in the resident as compared to the anadromous populations, possibly reflecting that the degree of environmental heterogeneity encountered by individuals throughout their life cycle will select for variable level of phenotypic plasticity at the transcriptome level. Our study demonstrates the usefulness of transcriptome approaches to identify genes with different temperature reaction

  7. Merkel Cell Polyomavirus Exhibits Dominant Control of the Tumor Genome and Transcriptome in Virus-Associated Merkel Cell Carcinoma.

    Science.gov (United States)

    Starrett, Gabriel J; Marcelus, Christina; Cantalupo, Paul G; Katz, Joshua P; Cheng, Jingwei; Akagi, Keiko; Thakuria, Manisha; Rabinowits, Guilherme; Wang, Linda C; Symer, David E; Pipas, James M; Harris, Reuben S; DeCaprio, James A

    2017-01-03

    Merkel cell polyomavirus is the primary etiological agent of the aggressive skin cancer Merkel cell carcinoma (MCC). Recent studies have revealed that UV radiation is the primary mechanism for somatic mutagenesis in nonviral forms of MCC. Here, we analyze the whole transcriptomes and genomes of primary MCC tumors. Our study reveals that virus-associated tumors have minimally altered genomes compared to non-virus-associated tumors, which are dominated by UV-mediated mutations. Although virus-associated tumors contain relatively small mutation burdens, they exhibit a distinct mutation signature with observable transcriptionally biased kataegic events. In addition, viral integration sites overlap focal genome amplifications in virus-associated tumors, suggesting a potential mechanism for these events. Collectively, our studies indicate that Merkel cell polyomavirus is capable of hijacking cellular processes and driving tumorigenesis to the same severity as tens of thousands of somatic genome alterations. A variety of mutagenic processes that shape the evolution of tumors are critical determinants of disease outcome. Here, we sequenced the entire genome of virus-positive and virus-negative primary Merkel cell carcinomas (MCCs), revealing distinct mutation spectra and corresponding expression profiles. Our studies highlight the strong effect that Merkel cell polyomavirus has on the divergent development of viral MCC compared to the somatic alterations that typically drive nonviral tumorigenesis. A more comprehensive understanding of the distinct mutagenic processes operative in viral and nonviral MCCs has implications for the effective treatment of these tumors. Copyright © 2017 Starrett et al.

  8. Genomic and transcriptomic insights into how bacteria withstand high concentrations of benzalkonium chloride biocides.

    Science.gov (United States)

    Kim, Minjae; Hatt, Janet K; Weigand, Michael R; Krishnan, Raj; Pavlostathis, Spyros G; Konstantinidis, Konstantinos T

    2018-04-13

    exposure remain poorly elucidated. Elucidating these mechanisms may be important for monitoring and limiting the spreading of disinfectant-resistant pathogens. Using an integrated approach that combined genomics and transcriptomics with physiological characterization of BAC-adapted isolates, this study provided a comprehensive understanding of the BAC-resistance mechanisms in P. aeruginosa. Our findings also revealed potential genetic markers to detect and monitor the abundance of BAC-resistant pathogens across clinical or environmental settings. Copyright © 2018 American Society for Microbiology.

  9. Genomic and transcriptome profiling identified both human and HBV genetic variations and their interactions in Chinese hepatocellular carcinoma

    Directory of Open Access Journals (Sweden)

    Hua Dong

    2015-12-01

    Full Text Available Interaction between HBV and host genome integrations in hepatocellular carcinoma (HCC) development is a complex process and the mechanism is still unclear. Here we describe in detail the quality controls and data mining of aCGH and transcriptome sequencing data on 50 HCC samples from Chinese patients, published by Dong et al. (2015) (GEO#: GSE65486). In addition to the HBV-MLL4 integration discovered, we also investigated the genetic aberrations of HBV and host genes as well as their genetic interactions. We report human genome copy number changes and frequent transcriptome variations (e.g. TP53 and CTNNB1 mutations, and especially MLL family mutations) in this cohort of patients. For HBV genotype C, we identified a novel linkage disequilibrium region covering HBV replication regulatory elements, including basal core promoter, DR1, epsilon and poly-A regions, which is associated with HBV core antigen over-expression and almost exclusive to HBV-MLL4 integration.

  10. Exploring Networks at the genome scale

    NARCIS (Netherlands)

    Lam, M.C.; Puchalka, J.; Diez, M.S.; Martins Dos Santos, V.A.P.

    2010-01-01

    Systems biology is aimed at achieving a holistic understanding of living organisms, while synthetic biology seeks to design and construct new living organisms with targeted functionalities. Genome sequencing and the fields of ‘omics’ technology have proven a goldmine of information for scientists

  11. Use of genome-scale microbial models for metabolic engineering

    DEFF Research Database (Denmark)

    Patil, Kiran Raosaheb; Åkesson, M.; Nielsen, Jens

    2004-01-01

    Metabolic engineering serves as an integrated approach to design new cell factories by providing rational design procedures and valuable mathematical and experimental tools. Mathematical models have an important role for phenotypic analysis, but can also be used for the design of optimal metaboli...... network structures. The major challenge for metabolic engineering in the post-genomic era is to broaden its design methodologies to incorporate genome-scale biological data. Genome-scale stoichiometric models of microorganisms represent a first step in this direction....
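
    The idea of a stoichiometric model mentioned in the abstract above can be sketched on a toy scale. The sketch assumes the usual steady-state condition S·v = 0 for internal metabolites, where S is the stoichiometric matrix and v the flux vector. The three-reaction network, function names, and flux values are hypothetical illustrations and are not drawn from the paper.

```python
# Toy network (hypothetical): uptake -> A, A -> B, B -> export.
# Rows are internal metabolites (A, B); columns are reactions (v1, v2, v3).
S = [
    [1, -1, 0],   # A: produced by v1, consumed by v2
    [0, 1, -1],   # B: produced by v2, consumed by v3
]


def net_production(S, v):
    """Return S·v, the net production rate of each metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True if no internal metabolite accumulates or is depleted."""
    return all(abs(rate) <= tol for rate in net_production(S, v))


print(is_steady_state(S, [2.0, 2.0, 2.0]))  # balanced fluxes
print(is_steady_state(S, [2.0, 1.0, 1.0]))  # metabolite A accumulates
```

    Genome-scale models apply the same constraint to thousands of reactions and typically add flux bounds and an optimization objective, which this sketch omits.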

  12. Genome-wide Annotation, Identification, and Global Transcriptomic Analysis of Regulatory or Small RNA Gene Expression in Staphylococcus aureus.

    Science.gov (United States)

    Carroll, Ronan K; Weiss, Andy; Broach, William H; Wiemels, Richard E; Mogen, Austin B; Rice, Kelly C; Shaw, Lindsey N

    2016-02-09

    In Staphylococcus aureus, hundreds of small regulatory or small RNAs (sRNAs) have been identified, yet this class of molecule remains poorly understood and severely understudied. sRNA genes are typically absent from genome annotation files, and as a consequence, their existence is often overlooked, particularly in global transcriptomic studies. To facilitate improved detection and analysis of sRNAs in S. aureus, we generated updated GenBank files for three commonly used S. aureus strains (MRSA252, NCTC 8325, and USA300), in which we added annotations for >260 previously identified sRNAs. These files, the first to include genome-wide annotation of sRNAs in S. aureus, were then used as a foundation to identify novel sRNAs in the community-associated methicillin-resistant strain USA300. This analysis led to the discovery of 39 previously unidentified sRNAs. Investigating the genomic loci of the newly identified sRNAs revealed a surprising degree of inconsistency in genome annotation in S. aureus, which may be hindering the analysis and functional exploration of these elements. Finally, using our newly created annotation files as a reference, we perform a global analysis of sRNA gene expression in S. aureus and demonstrate that the newly identified tsr25 is the most highly upregulated sRNA in human serum. This study provides an invaluable resource to the S. aureus research community in the form of our newly generated annotation files, while at the same time presenting the first examination of differential sRNA expression in pathophysiologically relevant conditions. Despite a large number of studies identifying regulatory or small RNA (sRNA) genes in Staphylococcus aureus, their annotation is notably lacking in available genome files. In addition to this, there has been a considerable lack of cross-referencing in the wealth of studies identifying these elements, often leading to the same sRNA being identified multiple times and bearing multiple names. In this work

  13. Diversity and Genome Analysis of Australian and Global Oilseed Brassica napus L. Germplasm Using Transcriptomics and Whole Genome Re-sequencing

    Directory of Open Access Journals (Sweden)

    M. Michelle Malmberg

    2018-04-01

    Full Text Available Intensive breeding of Brassica napus has resulted in relatively low diversity, such that B. napus would benefit from germplasm improvement schemes that sustain diversity. As such, samples representative of global germplasm pools need to be assessed for existing population structure, diversity and linkage disequilibrium (LD). Complexity reduction genotyping-by-sequencing (GBS) methods, including GBS-transcriptomics (GBS-t), enable cost-effective screening of a large number of samples, while whole genome re-sequencing (WGR) delivers the ability to generate large numbers of unbiased genomic single nucleotide polymorphisms (SNPs), and identify structural variants (SVs). Furthermore, the development of genomic tools based on whole genomes representative of global oilseed diversity and orientated by the reference genome has substantial industry relevance and will be highly beneficial for canola breeding. As recent studies have focused on European and Chinese varieties, a global diversity panel as well as a substantial number of Australian spring types were included in this study. Focusing on industry relevance, 633 varieties were initially genotyped using GBS-t to examine population structure using 61,037 SNPs. Subsequently, 149 samples representative of global diversity were selected for WGR and both data sets used for a side-by-side evaluation of diversity and LD. The WGR data was further used to develop genomic resources consisting of a list of 4,029,750 high-confidence SNPs annotated using SnpEff, and SVs in the form of 10,976 deletions and 2,556 insertions. These resources form the basis of a reliable and repeatable system allowing greater integration between canola genomics studies, with a strong focus on breeding germplasm and industry applicability.
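
    Linkage disequilibrium between two biallelic SNPs is often summarized by the r² statistic; a minimal Python sketch follows. It assumes phased haplotypes coded as 0/1 alleles, which is a simplification, since the study above worked from GBS-t and WGR genotype calls and its exact LD procedure is not stated in the abstract. The function name and example haplotypes are hypothetical.

```python
def ld_r2(haplotypes):
    """Return r^2 between two biallelic SNPs.

    `haplotypes` is a list of (a, b) pairs, one per phased haplotype,
    where a and b are the alleles (0 or 1) at the first and second SNP.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # freq of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n            # freq of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


# Hypothetical haplotypes: alleles always co-occur, then occur independently.
print(ld_r2([(0, 0), (0, 0), (1, 1), (1, 1)]))
print(ld_r2([(0, 0), (0, 1), (1, 0), (1, 1)]))
```

    Values near 1 indicate that the two SNPs are almost always inherited together, and values near 0 indicate that they segregate independently.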

  14. Tools to covisualize and coanalyze proteomic data with genomes and transcriptomes: validation of genes and alternative mRNA splicing.

    Science.gov (United States)

    Pang, Chi Nam Ignatius; Tay, Aidan P; Aya, Carlos; Twine, Natalie A; Harkness, Linda; Hart-Smith, Gene; Chia, Samantha Z; Chen, Zhiliang; Deshpande, Nandan P; Kaakoush, Nadeem O; Mitchell, Hazel M; Kassem, Moustapha; Wilkins, Marc R

    2014-01-03

    Direct links between proteomic and genomic/transcriptomic data are not frequently made, partly because of lack of appropriate bioinformatics tools. To help address this, we have developed the PG Nexus pipeline. The PG Nexus allows users to covisualize peptides in the context of genomes or genomic contigs, along with RNA-seq reads. This is done in the Integrated Genome Viewer (IGV). A Results Analyzer reports the precise base position where LC-MS/MS-derived peptides cover genes or gene isoforms, on the chromosomes or contigs where this occurs. In prokaryotes, the PG Nexus pipeline facilitates the validation of genes, where annotation or gene prediction is available, or the discovery of genes using a "virtual protein"-based unbiased approach. We illustrate this with a comprehensive proteogenomics analysis of two strains of Campylobacter concisus. For higher eukaryotes, the PG Nexus facilitates gene validation and supports the identification of mRNA splice junction boundaries and splice variants that are protein-coding. This is illustrated with an analysis of splice junctions covered by human phosphopeptides, and other examples of relevance to the Chromosome-Centric Human Proteome Project. The PG Nexus is open-source and available from https://github.com/IntersectAustralia/ap11_Samifier. It has been integrated into Galaxy and made available in the Galaxy tool shed.

  15. Traumatic Brain Injury Induces Genome-Wide Transcriptomic, Methylomic, and Network Perturbations in Brain and Blood Predicting Neurological Disorders

    Directory of Open Access Journals (Sweden)

    Qingying Meng

    2017-02-01

    Full Text Available The complexity of the traumatic brain injury (TBI) pathology, particularly concussive injury, is a serious obstacle for diagnosis, treatment, and long-term prognosis. Here we utilize modern systems biology in a rodent model of concussive injury to gain a thorough view of the impact of TBI on fundamental aspects of gene regulation, which have the potential to drive or alter the course of the TBI pathology. TBI perturbed epigenomic programming, transcriptional activities (expression level and alternative splicing), and the organization of genes in networks centered around genes such as Anax2, Ogn, and Fmod. Transcriptomic signatures in the hippocampus are involved in neuronal signaling, metabolism, inflammation, and blood function, and they overlap with those in leukocytes from peripheral blood. The homology between genomic signatures from blood and brain elicited by TBI provides proof of concept information for development of biomarkers of TBI based on composite genomic patterns. By intersecting with human genome-wide association studies, many TBI signature genes and network regulators identified in our rodent model were causally associated with brain disorders with relevant link to TBI. The overall results show that concussive brain injury reprograms genes which could lead to predisposition to neurological and psychiatric disorders, and that genomic information from peripheral leukocytes has the potential to predict TBI pathogenesis in the brain.

  16. A resource of large-scale molecular markers for monitoring Agropyron cristatum chromatin introgression in wheat background based on transcriptome sequences.

    Science.gov (United States)

    Zhang, Jinpeng; Liu, Weihua; Lu, Yuqing; Liu, Qunxing; Yang, Xinming; Li, Xiuquan; Li, Lihui

    2017-09-20

    Agropyron cristatum is a wild grass of the tribe Triticeae and serves as a gene donor for wheat improvement. However, very few markers can be used to monitor A. cristatum chromatin introgressions in wheat. Here, we reported a resource of large-scale molecular markers for tracking alien introgressions in wheat based on transcriptome sequences. By aligning A. cristatum unigenes with the Chinese Spring reference genome sequences, we designed 9602 A. cristatum expressed sequence tag-sequence-tagged site (EST-STS) markers for PCR amplification and experimental screening. As a result, 6063 polymorphic EST-STS markers were specific for the A. cristatum P genome in the single-receipt wheat background. A total of 4956 randomly selected polymorphic EST-STS markers were further tested in eight wheat variety backgrounds, and 3070 markers displaying stable and polymorphic amplification were validated. These markers covered more than 98% of the A. cristatum genome, and the marker distribution density was approximately 1.28 cM. An application case of all EST-STS markers was validated on the A. cristatum 6 P chromosome. These markers were successfully applied in the tracking of alien A. cristatum chromatin. Altogether, this study provided a universal method of large-scale molecular marker development to monitor wild relative chromatin in wheat.
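
    The marker counts reported in the abstract above imply the following success rates at the two screening stages. This is simple arithmetic on the stated numbers; the variable and function names are illustrative only.

```python
# Counts as stated in the abstract.
designed = 9602              # EST-STS markers designed
polymorphic = 6063           # polymorphic and specific for the P genome
tested_in_varieties = 4956   # randomly selected for testing in eight varieties
validated = 3070             # stable, polymorphic amplification


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


print(f"polymorphic among designed: {percent(polymorphic, designed):.1f}%")
print(f"validated among tested:     {percent(validated, tested_in_varieties):.1f}%")
```

    Roughly six in ten markers passed at each stage, which gives a sense of how many candidates must be designed to obtain a given number of usable markers.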

  17. The OME Framework for genome-scale systems biology

    Energy Technology Data Exchange (ETDEWEB)

    Palsson, Bernhard O. [Univ. of California, San Diego, CA (United States); Ebrahim, Ali [Univ. of California, San Diego, CA (United States); Federowicz, Steve [Univ. of California, San Diego, CA (United States)

    2014-12-19

    The life sciences are undergoing continuous and accelerating integration with computational and engineering sciences. The biology that many in the field have been trained on may be hardly recognizable in ten to twenty years. One of the major drivers for this transformation is the blistering pace of advancements in DNA sequencing and synthesis. These advances have resulted in unprecedented amounts of new data, information, and knowledge. Many software tools have been developed to deal with aspects of this transformation and each is sorely needed [1-3]. However, few of these tools have been forced to deal with the full complexity of genome-scale models along with high throughput genome-scale data. This particular situation represents a unique challenge, as it is simultaneously necessary to deal with the vast breadth of genome-scale models and the dizzying depth of high-throughput datasets. It has been observed time and again that as the pace of data generation continues to accelerate, the pace of analysis significantly lags behind [4]. It is also evident that, given the plethora of databases and software efforts [5-12], it is still a significant challenge to work with genome-scale metabolic models, let alone next-generation whole cell models [13-15]. We work at the forefront of model creation and systems scale data generation [16-18]. The OME Framework was born out of a practical need to enable genome-scale modeling and data analysis under a unified framework to drive the next generation of genome-scale biological models. Here we present the OME Framework. It exists as a set of Python classes. However, we want to emphasize the importance of the underlying design as an addition to the discussions on specifications of a digital cell. A great deal of work and valuable progress has been made by a number of communities [13, 19-24] towards interchange formats and implementations designed to achieve similar goals. While many software tools exist for handling genome-scale

  18. Genome-wide comparative transcriptome analysis of CMS-D2 and its maintainer and restorer lines in upland cotton.

    Science.gov (United States)

    Wu, Jianyong; Zhang, Meng; Zhang, Bingbing; Zhang, Xuexian; Guo, Liping; Qi, Tingxiang; Wang, Hailin; Zhang, Jinfa; Xing, Chaozhu

    2017-06-08

    Cytoplasmic male sterility (CMS) conferred by the cytoplasm from Gossypium harknessii (D2) is an important system for hybrid seed production in Upland cotton (G. hirsutum). The male sterility of CMS-D2 (i.e., A line) can be restored to fertility by a restorer (i.e., R line) carrying the restorer gene Rf1 transferred from the D2 nuclear genome. However, the molecular mechanisms of CMS-D2 and its restoration are poorly understood. In this study, a genome-wide comparative transcriptome analysis was performed to identify differentially expressed genes (DEGs) in flower buds among the isogenic fertile R line and sterile A line derived from a backcross population (BC8F1) and the recurrent parent, i.e., the maintainer (B line). A total of 1464 DEGs were identified among the three isogenic lines, and the Rf1-carrying Chr_D05 and its homeologous Chr_A05 had more DEGs than other chromosomes. The results of GO and KEGG enrichment analysis showed differences in circadian rhythm between the fertile and sterile lines. Eleven DEGs were selected for validation using qRT-PCR, confirming the accuracy of the RNA-seq results. Through genome-wide comparative transcriptome analysis, the differential expression profiles of CMS-D2 and its maintainer and restorer lines in Upland cotton were identified. Our results provide an important foundation for further studies into the molecular mechanisms of the interactions between the restorer gene Rf1 and the CMS-D2 cytoplasm.

  19. Genome-Wide Transcriptome Analysis Reveals Extensive Alternative Splicing Events in the Protoscoleces of Echinococcus granulosus and Echinococcus multilocularis

    Science.gov (United States)

    Liu, Shuai; Zhou, Xiaosu; Hao, Lili; Piao, Xianyu; Hou, Nan; Chen, Qijun

    2017-01-01

    Alternative splicing (AS), as one of the most important topics in the post-genomic era, has been extensively studied in numerous organisms. However, little is known about the prevalence and characteristics of AS in Echinococcus species, which can cause significant health problems to humans and domestic animals. Based on high-throughput RNA-sequencing data, we performed a genome-wide survey of AS in two major pathogens of echinococcosis-Echinococcus granulosus and Echinococcus multilocularis. Our study revealed that the prevalence and characteristics of AS in protoscoleces of the two parasites were generally consistent with each other. A total of 6,826 AS events from 3,774 E. granulosus genes and 6,644 AS events from 3,611 E. multilocularis genes were identified in protoscolex transcriptomes, indicating that 33–36% of genes were subject to AS in the two parasites. Strikingly, intron retention instead of exon skipping was the predominant type of AS in Echinococcus species. Moreover, analysis of the Kyoto Encyclopedia of Genes and Genomes pathway indicated that genes that underwent AS events were significantly enriched in multiple pathways mainly related to metabolism (e.g., purine, fatty acid, galactose, and glycerolipid metabolism), signal transduction (e.g., Jak-STAT, VEGF, Notch, and GnRH signaling pathways), and genetic information processing (e.g., RNA transport and mRNA surveillance pathways). The landscape of AS obtained in this study will not only facilitate future investigations on transcriptome complexity and AS regulation during the life cycle of Echinococcus species, but also provide an invaluable resource for future functional and evolutionary studies of AS in platyhelminth parasites. PMID:28588571

  20. The Carcinogenic Liver Fluke, Clonorchis sinensis: New Assembly, Reannotation and Analysis of the Genome and Characterization of Tissue Transcriptomes

    Science.gov (United States)

    Wang, Xiaoyun; Liu, Hailiang; Chen, Yangyi; Guo, Lei; Luo, Fang; Sun, Jiufeng; Mao, Qiang; Liang, Pei; Xie, Zhizhi; Zhou, Chenhui; Tian, Yanli; Lv, Xiaoli; Huang, Lisi; Zhou, Juanjuan; Hu, Yue; Li, Ran; Zhang, Fan; Lei, Huali; Li, Wenfang; Hu, Xuchu; Liang, Chi; Xu, Jin; Li, Xuerong; Yu, Xinbing

    2013-01-01

    Clonorchis sinensis (C. sinensis), an important food-borne parasite that inhabits the intrahepatic bile duct and causes clonorchiasis, is of interest to both the public health field and the scientific research community. To learn more about the migration, parasitism and pathogenesis of C. sinensis at the molecular level, the present study developed an upgraded genomic assembly and annotation by sequencing paired-end and mate-paired libraries. We also performed transcriptome sequence analyses on multiple C. sinensis tissues (sucker, muscle, ovary and testis). Genes encoding molecules involved in responses to stimuli and muscle-related development were abundantly expressed in the oral sucker. Compared with other species, genes encoding molecules that facilitate the recognition and transport of cholesterol were observed in high copy numbers in the genome and were highly expressed in the oral sucker. Genes encoding transporters for fatty acids, glucose, amino acids and oxygen were also highly expressed, along with other molecules involved in metabolizing these substrates. All genes involved in energy metabolism pathways, including the β-oxidation of fatty acids, the citrate cycle, oxidative phosphorylation, and fumarate reduction, were expressed in the adults. Finally, we also provide valuable insights into the mechanism underlying the process of pathogenesis by characterizing the secretome of C. sinensis. The characterization and elaborate analysis of the upgraded genome and the tissue transcriptomes not only form a detailed and fundamental C. sinensis resource but also provide novel insights into the physiology and pathogenesis of C. sinensis. We anticipate that this work will aid the development of innovative strategies for the prevention and control of clonorchiasis. PMID:23382950

  1. The carcinogenic liver fluke, Clonorchis sinensis: new assembly, reannotation and analysis of the genome and characterization of tissue transcriptomes.

    Directory of Open Access Journals (Sweden)

    Yan Huang

    Full Text Available Clonorchis sinensis (C. sinensis), an important food-borne parasite that inhabits the intrahepatic bile duct and causes clonorchiasis, is of interest to both the public health field and the scientific research community. To learn more about the migration, parasitism and pathogenesis of C. sinensis at the molecular level, the present study developed an upgraded genomic assembly and annotation by sequencing paired-end and mate-paired libraries. We also performed transcriptome sequence analyses on multiple C. sinensis tissues (sucker, muscle, ovary and testis). Genes encoding molecules involved in responses to stimuli and muscle-related development were abundantly expressed in the oral sucker. Compared with other species, genes encoding molecules that facilitate the recognition and transport of cholesterol were observed in high copy numbers in the genome and were highly expressed in the oral sucker. Genes encoding transporters for fatty acids, glucose, amino acids and oxygen were also highly expressed, along with other molecules involved in metabolizing these substrates. All genes involved in energy metabolism pathways, including the β-oxidation of fatty acids, the citrate cycle, oxidative phosphorylation, and fumarate reduction, were expressed in the adults. Finally, we also provide valuable insights into the mechanism underlying the process of pathogenesis by characterizing the secretome of C. sinensis. The characterization and elaborate analysis of the upgraded genome and the tissue transcriptomes not only form a detailed and fundamental C. sinensis resource but also provide novel insights into the physiology and pathogenesis of C. sinensis. We anticipate that this work will aid the development of innovative strategies for the prevention and control of clonorchiasis.

  2. Comparative genomic and transcriptomic analysis revealed genetic characteristics related to solvent formation and xylose utilization in Clostridium acetobutylicum EA 2018

    Directory of Open Access Journals (Sweden)

    Wang Shengyue

    2011-02-01

    Full Text Available Background: Clostridium acetobutylicum, a gram-positive and spore-forming anaerobe, is a major strain for the fermentative production of acetone, butanol and ethanol. However, a previously isolated hyper-butanol-producing strain, C. acetobutylicum EA 2018, does not produce spores and has a greater capability for solvent production, especially of butanol, than the type strain C. acetobutylicum ATCC 824. Results: The complete genome of C. acetobutylicum EA 2018 was sequenced using Roche 454 pyrosequencing. Genomic comparison with ATCC 824 identified many variations that may contribute to the hyper-butanol-producing characteristics of the EA 2018 strain, including a total of 46 deletion sites and 26 insertion sites. In addition, transcriptomic profiling of gene expression in EA 2018 relative to that of ATCC 824 revealed expression-level changes of several key genes related to solvent formation. For example, spo0A and adhEII have higher expression levels, and most of the acid formation-related genes have lower expression levels in EA 2018. Interestingly, the results also showed that the variation in CEA_G2622 (CAC2613 in ATCC 824), a putative transcriptional regulator involved in xylose utilization, might accelerate utilization of the substrate xylose. Conclusions: Comparative analysis of the C. acetobutylicum hyper-butanol-producing strain EA 2018 and the type strain ATCC 824 at both genomic and transcriptomic levels provides, for the first time, a molecular-level understanding of non-sporulation, higher solvent production and enhanced xylose utilization in the mutant EA 2018. The information could be valuable for further genetic modification of C. acetobutylicum for more effective butanol production.

  3. A genomic and transcriptomic approach for a differential diagnosis between primary and secondary ovarian carcinomas in patients with a previous history of breast cancer

    International Nuclear Information System (INIS)

    Meyniel, Jean-Philippe; Alran, Séverine; Rapinat, Audrey; Gentien, David; Roman-Roman, Sergio; Mignot, Laurent; Sastre-Garau, Xavier; Cottu, Paul H; Decraene, Charles; Stern, Marc-Henri; Couturier, Jérôme; Lebigot, Ingrid; Nicolas, André; Weber, Nina; Fourchotte, Virginie

    2010-01-01

    The distinction between primary and secondary ovarian tumors may be challenging for pathologists. The purpose of the present work was to develop genomic and transcriptomic tools to further refine the pathological diagnosis of ovarian tumors after a previous history of breast cancer. Sixteen paired breast-ovary tumors from patients with a former diagnosis of breast cancer were collected. The genomic profiles of paired tumors were analyzed using the Affymetrix GeneChip® Mapping 50 K Xba Array or Genome-Wide Human SNP Array 6.0 (for one pair), and the data were normalized with the ITALICS (ITerative and Alternative normaLIzation and Copy number calling for affymetrix Snp arrays) algorithm or Partek Genomic Suite, respectively. The transcriptome of paired samples was analyzed using Affymetrix GeneChip® Human Genome U133 Plus 2.0 Arrays, and the data were normalized with the gc-Robust Multi-array Average (gcRMA) algorithm. A hierarchical clustering of these samples was performed, combined with a dataset of well-identified primary and secondary ovarian tumors. In 12 of the 16 paired tumors analyzed, the comparison of genomic profiles confirmed the pathological diagnosis of primary ovarian tumor (n = 5) or metastasis of breast cancer (n = 7). Among four cases with uncertain pathological diagnosis, genomic profiles were clearly distinct between the ovarian and breast tumors in two pairs, thus indicating primary ovarian carcinomas, and showed common patterns in the two others, indicating metastases from breast cancer. In all pairs, the result of the transcriptomic analysis was concordant with that of the genomic analysis. In patients with ovarian carcinoma and a previous history of breast cancer, SNP array analysis can be used to distinguish primary and secondary ovarian tumors. Transcriptomic analysis may be used when a primary breast tissue specimen is not available.
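    The case counts in the abstract above are spread over several sentences, so a short tally can help confirm that they add up to the 16 pairs. This is only a bookkeeping sketch: the tuple layout and the variable names are assumptions, and the numbers are copied from the abstract.

```python
from collections import Counter

# (status of the pathological diagnosis, genomic conclusion, number of pairs),
# as reported in the abstract.
pairs = [
    ("confirmed", "primary ovarian tumor", 5),
    ("confirmed", "breast cancer metastasis", 7),
    ("uncertain", "primary ovarian tumor", 2),
    ("uncertain", "breast cancer metastasis", 2),
]

by_conclusion = Counter()
by_status = Counter()
for status, conclusion, n in pairs:
    by_conclusion[conclusion] += n
    by_status[status] += n

total = sum(by_conclusion.values())
print(f"total pairs: {total}")
print(dict(by_conclusion))
print(dict(by_status))
```

    Under this reading, the genomic profiles assign 7 pairs to primary ovarian carcinoma and 9 to metastasis from breast cancer.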

  4. Population genetics, phylogenomics and hybrid speciation of Juglans in China determined from whole chloroplast genomes, transcriptomes, and genotyping-by-sequencing (GBS)

    Science.gov (United States)

    Peng Zhao; Hui-Juan Zhou; Daniel Potter; Yi-Heng Hu; Xiao-Jia Feng; Meng Dang; Li Feng; Saman Zulfiqar; Wen-Zhe Liu; Gui-Fang Zhao; Keith Woeste

    2018-01-01

    Genomic data are a powerful tool for elucidating the processes involved in the evolution and divergence of species. The speciation and phylogenetic relationships among Chinese Juglans remain unclear. Here, we used results from phylogenomic and population genetic analyses, transcriptomics, Genotyping-By-Sequencing (GBS), and whole chloroplast...

  5. Genomic and Transcriptomic Associations Identify a New Insecticide Resistance Phenotype for the Selective Sweep at the Cyp6g1 Locus of Drosophila melanogaster.

    Science.gov (United States)

    Battlay, Paul; Schmidt, Joshua M; Fournier-Level, Alexandre; Robin, Charles

    2016-08-09

    Scans of the Drosophila melanogaster genome have identified organophosphate resistance loci among those with the most pronounced signature of positive selection. In this study, the molecular basis of resistance to the organophosphate insecticide azinphos-methyl was investigated using the Drosophila Genetic Reference Panel, and genome-wide association. Recently released full transcriptome data were used to extend the utility of the Drosophila Genetic Reference Panel resource beyond traditional genome-wide association studies to allow systems genetics analyses of phenotypes. We found that both genomic and transcriptomic associations independently identified Cyp6g1, a gene involved in resistance to DDT and neonicotinoid insecticides, as the top candidate for azinphos-methyl resistance. This was verified by transgenically overexpressing Cyp6g1 using natural regulatory elements from a resistant allele, resulting in a 6.5-fold increase in resistance. We also identified four novel candidate genes associated with azinphos-methyl resistance, all of which are involved in either regulation of fat storage, or nervous system development. In Cyp6g1, we find a demonstrable resistance locus, a verification that transcriptome data can be used to identify variants associated with insecticide resistance, and an overlap between peaks of a genome-wide association study, and a genome-wide selective sweep analysis. Copyright © 2016 Battlay et al.

  6. Discovery of genes related to insecticide resistance in Bactrocera dorsalis by functional genomic analysis of a de novo assembled transcriptome.

    Science.gov (United States)

    Hsu, Ju-Chun; Chien, Ting-Ying; Hu, Chia-Cheng; Chen, Mei-Ju May; Wu, Wen-Jer; Feng, Hai-Tung; Haymer, David S; Chen, Chien-Yu

    2012-01-01

    Insecticide resistance has recently become a critical concern for control of many insect pest species. Genome sequencing and global quantization of gene expression through analysis of the transcriptome can provide useful information relevant to this challenging problem. The oriental fruit fly, Bactrocera dorsalis, is one of the world's most destructive agricultural pests, and recently it has been used as a target for studies of genetic mechanisms related to insecticide resistance. However, prior to this study, the molecular data available for this species was largely limited to genes identified through homology. To provide a broader pool of gene sequences of potential interest with regard to insecticide resistance, this study uses whole transcriptome analysis developed through de novo assembly of short reads generated by next-generation sequencing (NGS). The transcriptome of B. dorsalis was initially constructed using Illumina's Solexa sequencing technology. Qualified reads were assembled into contigs and potential splicing variants (isotigs). A total of 29,067 isotigs have putative homologues in the non-redundant (nr) protein database from NCBI, and 11,073 of these correspond to distinct D. melanogaster proteins in the RefSeq database. Approximately 5,546 isotigs contain coding sequences that are at least 80% complete and appear to represent B. dorsalis genes. We observed a strong correlation between the completeness of the assembled sequences and the expression intensity of the transcripts. The assembled sequences were also used to identify large numbers of genes potentially belonging to families related to insecticide resistance. A total of 90 P450-, 42 GST- and 37 COE-related genes, representing three major enzyme families involved in insecticide metabolism and resistance, were identified. In addition, 36 isotigs were discovered to contain target site sequences related to four classes of resistance genes. Identified sequence motifs were also analyzed to

  7. Improving transcriptome de novo assembly by using a reference genome of a related species: Translational genomics from oil palm to coconut.

    Directory of Open Access Journals (Sweden)

    Alix Armero

    Full Text Available The palms are a family of tropical origin and one of the main constituents of the ecosystems of these regions around the world. The two main species of palm represent different challenges: coconut (Cocos nucifera L.) is a source of multiple goods and services in tropical communities, while oil palm (Elaeis guineensis Jacq) is the main protagonist of the oil market. In this study, we present a workflow that exploits the comparative genomics between a target species (coconut) and a reference species (oil palm) to improve the transcriptomic data, providing a proteome useful to answer functional or evolutionary questions. This workflow reduces redundancy and fragmentation, two inherent problems of transcriptomic data, while preserving the functional representation of the target species. Our approach was validated in Arabidopsis thaliana using Arabidopsis lyrata and Capsella rubella as reference species. This analysis showed the high sensitivity and specificity of our strategy, relatively independent of the reference proteome. The workflow increased the length of protein products in A. thaliana by 13%, often allowing recovery of 100% of the protein sequence length. In addition, redundancy was reduced by a factor greater than 3. In coconut, the approach generated 29,366 proteins, 1,246 of these proteins deriving from new contigs obtained with the BRANCH software. The coconut proteome presented a functional profile similar to that observed in rice and an important number of metabolic pathways related to secondary metabolism. The new sequences found with BRANCH software were enriched in functions related to biotic stress. Our strategy can be used as a complementary step to de novo transcriptome assembly to get a representative proteome of a target species. The results of the current analysis are available on the website PalmComparomics (http://palm-comparomics.southgreen.fr/).

  8. Improving transcriptome de novo assembly by using a reference genome of a related species: Translational genomics from oil palm to coconut.

    Science.gov (United States)

    Armero, Alix; Baudouin, Luc; Bocs, Stéphanie; This, Dominique

    2017-01-01

    The palms are a family of tropical origin and one of the main constituents of the ecosystems of these regions around the world. The two main species of palm represent different challenges: coconut (Cocos nucifera L.) is a source of multiple goods and services in tropical communities, while oil palm (Elaeis guineensis Jacq) is the main protagonist of the oil market. In this study, we present a workflow that exploits the comparative genomics between a target species (coconut) and a reference species (oil palm) to improve the transcriptomic data, providing a proteome useful to answer functional or evolutionary questions. This workflow reduces redundancy and fragmentation, two inherent problems of transcriptomic data, while preserving the functional representation of the target species. Our approach was validated in Arabidopsis thaliana using Arabidopsis lyrata and Capsella rubella as reference species. This analysis showed the high sensitivity and specificity of our strategy, relatively independent of the reference proteome. The workflow increased the length of protein products in A. thaliana by 13%, often allowing recovery of 100% of the protein sequence length. In addition, redundancy was reduced by a factor greater than 3. In coconut, the approach generated 29,366 proteins, 1,246 of these proteins deriving from new contigs obtained with the BRANCH software. The coconut proteome presented a functional profile similar to that observed in rice and an important number of metabolic pathways related to secondary metabolism. The new sequences found with BRANCH software were enriched in functions related to biotic stress. Our strategy can be used as a complementary step to de novo transcriptome assembly to get a representative proteome of a target species. The results of the current analysis are available on the website PalmComparomics (http://palm-comparomics.southgreen.fr/).

  9. Genome and transcriptome analyses of the mountain pine beetle-fungal symbiont Grosmannia clavigera, a lodgepole pine pathogen.

    Science.gov (United States)

    DiGuistini, Scott; Wang, Ye; Liao, Nancy Y; Taylor, Greg; Tanguay, Philippe; Feau, Nicolas; Henrissat, Bernard; Chan, Simon K; Hesse-Orce, Uljana; Alamouti, Sepideh Massoumi; Tsui, Clement K M; Docking, Roderick T; Levasseur, Anthony; Haridas, Sajeet; Robertson, Gordon; Birol, Inanc; Holt, Robert A; Marra, Marco A; Hamelin, Richard C; Hirst, Martin; Jones, Steven J M; Bohlmann, Jörg; Breuil, Colette

    2011-02-08

    In western North America, the current outbreak of the mountain pine beetle (MPB) and its microbial associates has destroyed wide areas of lodgepole pine forest, including more than 16 million hectares in British Columbia. Grosmannia clavigera (Gc), a critical component of the outbreak, is a symbiont of the MPB and a pathogen of pine trees. To better understand the interactions between Gc, MPB, and lodgepole pine hosts, we sequenced the ∼30-Mb Gc genome and assembled it into 18 supercontigs. We predict 8,314 protein-coding genes, and support the gene models with proteome, expressed sequence tag, and RNA-seq data. We establish that Gc is heterothallic, and report evidence for repeat-induced point mutation. We report insights, from genome and transcriptome analyses, into how Gc tolerates conifer-defense chemicals, including oleoresin terpenoids, as they colonize a host tree. RNA-seq data indicate that terpenoids induce a substantial antimicrobial stress in Gc, and suggest that the fungus may detoxify these chemicals by using them as a carbon source. Terpenoid treatment strongly activated a ∼100-kb region of the Gc genome that contains a set of genes that may be important for detoxification of these host-defense chemicals. This work is a major step toward understanding the biological interactions between the tripartite MPB/fungus/forest system.

  10. Genome-Wide Host-Pathogen Interaction Unveiled by Transcriptomic Response of Diamondback Moth to Fungal Infection.

    Directory of Open Access Journals (Sweden)

    Zhen-Jian Chu

    Full Text Available Genome-wide insight into insect pest response to the infection of Beauveria bassiana (fungal insect pathogen) is critical for genetic improvement of fungal insecticides but has been poorly explored. We constructed three pairs of transcriptomes of Plutella xylostella larvae at 24, 36 and 48 hours post treatment of infection (hptI) and of control (hptC) for insight into the host-pathogen interaction at genomic level. There were 2143, 3200 and 2967 host genes differentially expressed at 24, 36 and 48 hptI/hptC respectively. These infection-responsive genes (~15% of the host genome) were enriched in various immune processes, such as complement and coagulation cascades, protein digestion and absorption, and drug metabolism-cytochrome P450. Fungal penetration into cuticle and host defense reaction began at 24 hptI, followed by most intensive host immune response at 36 hptI and attenuated immunity at 48 hptI. Contrastingly, 44% of fungal genes were differentially expressed in the infection course and enriched in several biological processes, such as antioxidant activity, peroxidase activity and proteolysis. There were 1636 fungal genes co-expressed during 24-48 hptI, including 116 encoding putative secretion proteins. Our results provide novel insights into the insect-pathogen interaction and help to probe molecular mechanisms involved in the fungal infection to the global pest.

  11. Genome and transcriptome adaptation accompanying emergence of the definitive type 2 host-restricted Salmonella enterica serovar Typhimurium pathovar.

    Science.gov (United States)

    Kingsley, Robert A; Kay, Sally; Connor, Thomas; Barquist, Lars; Sait, Leanne; Holt, Kathryn E; Sivaraman, Karthi; Wileman, Thomas; Goulding, David; Clare, Simon; Hale, Christine; Seshasayee, Aswin; Harris, Simon; Thomson, Nicholas R; Gardner, Paul; Rabsch, Wolfgang; Wigley, Paul; Humphrey, Tom; Parkhill, Julian; Dougan, Gordon

    2013-08-27

    Salmonella enterica serovar Typhimurium definitive type 2 (DT2) is host restricted to Columba livia (rock or feral pigeon) but is also closely related to S. Typhimurium isolates that circulate in livestock and cause a zoonosis characterized by gastroenteritis in humans. DT2 isolates formed a distinct phylogenetic cluster within S. Typhimurium based on whole-genome-sequence polymorphisms. Comparative genome analysis of DT2 94-213 and S. Typhimurium SL1344, DT104, and D23580 identified few differences in gene content with the exception of variations within prophages. However, DT2 94-213 harbored 22 pseudogenes that were intact in other closely related S. Typhimurium strains. We report a novel in silico approach to identify single amino acid substitutions in proteins that have a high probability of a functional impact. One polymorphism identified using this method, a single-residue deletion in the Tar protein, abrogated chemotaxis to aspartate in vitro. DT2 94-213 also exhibited an altered transcriptional profile in response to culture at 42°C compared to that of SL1344. Such differentially regulated genes included a number involved in flagellum biosynthesis and motility. IMPORTANCE Whereas Salmonella enterica serovar Typhimurium can infect a wide range of animal species, some variants within this serovar exhibit a more limited host range and altered disease potential. Phylogenetic analysis based on whole-genome sequences can identify lineages associated with specific virulence traits, including host adaptation. This study represents one of the first to link pathogen-specific genetic signatures, including coding capacity, genome degradation, and transcriptional responses to host adaptation within a Salmonella serovar. We performed comparative genome analysis of reference and pigeon-adapted definitive type 2 (DT2) S. Typhimurium isolates alongside phenotypic and transcriptome analyses, to identify genetic signatures linked to host adaptation within the DT2 lineage.

  12. Ecological venomics: How genomics, transcriptomics and proteomics can shed new light on the ecology and evolution of venom.

    Science.gov (United States)

    Sunagar, Kartik; Morgenstern, David; Reitzel, Adam M; Moran, Yehu

    2016-03-01

    Animal venom is a complex cocktail of bioactive chemicals that traditionally drew interest mostly from biochemists and pharmacologists. However, in recent years the evolutionary and ecological importance of venom has been recognized, as this trait has a direct and strong influence on interactions between species. Moreover, venom content can be modulated by environmental factors. Like many other fields of biology, venom research has been revolutionized in recent years by the introduction of systems biology approaches, i.e., genomics, transcriptomics and proteomics. The employment of these methods in venom research is known as 'venomics'. In this review we describe the history and recent advancements of venomics and discuss how they are employed in studying venom in general and in particular in the context of evolutionary ecology. We also discuss the pitfalls and challenges of venomics and what the future may hold for this emerging scientific field. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. Identification of candidate genes associated with porcine meat color traits by genome-wide transcriptome analysis

    OpenAIRE

    Bojiang Li; Chao Dong; Pinghua Li; Zhuqing Ren; Han Wang; Fengxiang Yu; Caibo Ning; Kaiqing Liu; Wei Wei; Ruihua Huang; Jie Chen; Wangjun Wu; Honglin Liu

    2016-01-01

    Meat color is considered to be the most important indicator of meat quality; however, the molecular mechanisms underlying traits related to meat color remain mostly unknown. In this study, to elucidate the molecular basis of meat color, we constructed six cDNA libraries from biceps femoris (Bf) and soleus (Sol), which exhibit obvious differences in meat color, and analyzed the whole-transcriptome differences between Bf (white muscle) and Sol (red muscle) using high-throughput sequencing techn...

  14. Systematic Identification and Assessment of Therapeutic Targets for Breast Cancer Based on Genome-Wide RNA Interference Transcriptomes

    Directory of Open Access Journals (Sweden)

    Yang Liu

    2017-02-01

    Full Text Available With accumulating public omics data, great efforts have been made to characterize the genetic heterogeneity of breast cancer. However, identifying novel targets and selecting the best from the sizeable lists of candidate targets is still a key challenge for targeted therapy, largely owing to the lack of economical, efficient and systematic discovery and assessment to prioritize potential therapeutic targets. Here, we describe an approach that combines computational evaluation with objective, multifaceted assessment to systematically identify and prioritize targets for biological validation and therapeutic exploration. We first established the reference gene expression profiles from the breast cancer cell line MCF7 upon genome-wide RNA interference (RNAi) of a total of 3689 genes, and the breast cancer query signatures using RNA-seq data generated from tissue samples of clinical breast cancer patients in the Cancer Genome Atlas (TCGA). Based on gene set enrichment analysis, we identified a set of 510 genes that, when knocked down, could significantly reverse the transcriptome of the breast cancer state. We then performed a multifaceted assessment of the gene set to prioritize potential targets for gene therapy. We also propose drug repurposing opportunities and identify potentially druggable proteins that have been poorly explored with regard to the discovery of small-molecule modulators. Finally, we obtained a small list of candidate therapeutic targets for four major breast cancer subtypes, i.e., luminal A, luminal B, HER2+ and triple negative breast cancer. This RNAi transcriptome-based approach can be a helpful paradigm for related research to identify and prioritize candidate targets for experimental validation.
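    The idea of ranking knockdowns by how strongly they reverse a disease signature can be illustrated with a toy score. The abstract states that the authors used gene set enrichment analysis; the sketch below is not that method but a much simpler sign-agreement stand-in. The function name `reversal_score`, the gene and knockdown labels, and all fold-change values are invented for illustration.

```python
def reversal_score(disease_sig, knockdown_profile):
    """Score in [-1, 1]: fraction of shared genes whose knockdown change opposes
    the disease change, minus the fraction changing in the same direction."""
    shared = [g for g in disease_sig if g in knockdown_profile]
    if not shared:
        return 0.0
    opposed = sum(1 for g in shared if disease_sig[g] * knockdown_profile[g] < 0)
    same = sum(1 for g in shared if disease_sig[g] * knockdown_profile[g] > 0)
    return (opposed - same) / len(shared)


# Toy log fold changes (hypothetical; not data from the study).
disease = {"GENE_A": 2.1, "GENE_B": -1.4, "GENE_C": 0.9, "GENE_D": -0.7}
knockdowns = {
    "KD_X": {"GENE_A": -1.8, "GENE_B": 1.1, "GENE_C": -0.5, "GENE_D": 0.6},
    "KD_Y": {"GENE_A": 1.5, "GENE_B": -0.9, "GENE_C": -0.2, "GENE_D": -0.4},
}

# Rank knockdowns from most to least signature-reversing.
ranked = sorted(
    knockdowns,
    key=lambda kd: reversal_score(disease, knockdowns[kd]),
    reverse=True,
)
print(ranked)
```

    A sign-based score ignores effect sizes and gene-set structure, which is one reason an enrichment-based statistic such as the one named in the abstract is usually preferred in practice.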

  15. Genomic, transcriptomic, and proteomic approaches towards understanding the molecular mechanisms of salt tolerance in Frankia strains isolated from Casuarina trees.

    Science.gov (United States)

    Oshone, Rediet; Ngom, Mariama; Chu, Feixia; Mansour, Samira; Sy, Mame Ourèye; Champion, Antony; Tisa, Louis S

    2017-08-18

    Soil salinization is a worldwide problem that is intensifying because of the effects of climate change. An effective method for the reclamation of salt-affected soils involves initiating plant succession using fast growing, nitrogen fixing actinorhizal trees such as the Casuarina. The salt tolerance of Casuarina is enhanced by the nitrogen-fixing symbiosis that they form with the actinobacterium Frankia. Identification and molecular characterization of salt-tolerant Casuarina species and associated Frankia is imperative for the successful utilization of Casuarina trees in saline soil reclamation efforts. In this study, salt-tolerant and salt-sensitive Casuarina associated Frankia strains were identified and comparative genomics, transcriptome profiling, and proteomics were employed to elucidate the molecular mechanisms of salt and osmotic stress tolerance. Salt-tolerant Frankia strains (CcI6 and Allo2) that could withstand up to 1000 mM NaCl and a salt-sensitive Frankia strain (CcI3) which could withstand only up to 475 mM NaCl were identified. The remaining isolates had intermediate levels of salt tolerance with MIC values ranging from 650 mM to 750 mM. Comparative genomic analysis showed that all of the Frankia isolates from Casuarina belonged to the same species (Frankia casuarinae). Pangenome analysis revealed a high abundance of singletons among all Casuarina isolates. The two salt-tolerant strains contained 153 shared single copy genes (most of which code for hypothetical proteins) that were not found in the salt-sensitive (CcI3) and moderately salt-tolerant (CeD) strains. RNA-seq analysis of one of the two salt-tolerant strains (Frankia sp. strain CcI6) revealed hundreds of genes differentially expressed under salt and/or osmotic stress. Among the 153 genes, 7 and 7 were responsive to salt and osmotic stress, respectively. Proteomic profiling confirmed the transcriptome results and identified 19 and 8 salt and/or osmotic stress-responsive proteins in the

  16. Ensembl Genomes 2013: scaling up access to genome-wide data.

    Science.gov (United States)

    Kersey, Paul Julian; Allen, James E; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Hughes, Daniel Seth Toney; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Langridge, Nicholas; McDowall, Mark D; Maheswari, Uma; Maslen, Gareth; Nuhn, Michael; Ong, Chuang Kee; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Tuli, Mary Ann; Walts, Brandon; Williams, Gareth; Wilson, Derek; Youens-Clark, Ken; Monaco, Marcela K; Stein, Joshua; Wei, Xuehong; Ware, Doreen; Bolser, Daniel M; Howe, Kevin Lee; Kulesha, Eugene; Lawson, Daniel; Staines, Daniel Michael

    2014-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future.

  17. Multi-scale structural community organisation of the human genome.

    Science.gov (United States)

    Boulos, Rasha E; Tremblay, Nicolas; Arneodo, Alain; Borgnat, Pierre; Audit, Benjamin

    2017-04-11

    Structural interaction frequency matrices between all genome loci are now experimentally achievable thanks to high-throughput chromosome conformation capture technologies. This raises a new methodological challenge for computational biology: objectively extracting from these data the structural motifs characteristic of genome organisation. We deployed the fast multi-scale community mining algorithm based on spectral graph wavelets to characterise the networks of intra-chromosomal interactions in human cell lines. We observed that there exist structural domains of all sizes up to chromosome length and demonstrated that the set of structural communities forms a hierarchy of chromosome segments. Hence, at all scales, chromosome folding predominantly involves interactions between neighbouring sites rather than the formation of links between distant loci. Multi-scale structural decomposition of human chromosomes provides an original framework to question structural organisation and its relationship to functional regulation across the scales. By construction the proposed methodology is independent of the precise assembly of the reference genome and is thus directly applicable to genomes whose assembly is not fully determined.

  18. The Genome-Scale Integrated Networks in Microorganisms

    Directory of Open Access Journals (Sweden)

    Tong Hao

    2018-02-01

    Full Text Available The genome-scale cellular network has become a necessary tool in the systematic analysis of microbes. In a cell, there are several layers (i.e., types) of molecular networks, for example, the genome-scale metabolic network (GMN), transcriptional regulatory network (TRN), and signal transduction network (STN). It has been recognized that predictions based on only a single-layer network are limited and inaccurate. Therefore, integrated networks constructed from the three network types have attracted growing interest. The function of a biological process in living cells is usually performed by the interaction of biological components. Therefore, it is necessary to integrate and analyze all the related components at the systems level to comprehensively and correctly understand physiological function in living organisms. In this review, we discussed three representative genome-scale cellular networks: GMN, TRN, and STN, representing different levels (i.e., metabolism, gene regulation, and cellular signaling) of a cell’s activities. Furthermore, we discussed the integration of the networks of the three types. With growing understanding of the complexity of microbial cells, the development of integrated networks has become an inevitable trend in analyzing genome-scale cellular networks of microorganisms.

  19. Symbiodinium transcriptomes: genome insights into the dinoflagellate symbionts of reef-building corals.

    KAUST Repository

    Bayer, Till

    2012-04-18

    Dinoflagellates are unicellular algae that are ubiquitously abundant in aquatic environments. Species of the genus Symbiodinium form symbiotic relationships with reef-building corals and other marine invertebrates. Despite their ecologic importance, little is known about the genetics of dinoflagellates in general and Symbiodinium in particular. Here, we used 454 sequencing to generate transcriptome data from two Symbiodinium species from different clades (clade A and clade B). With more than 56,000 assembled sequences per species, these data represent the largest transcriptomic resource for dinoflagellates to date. Our results corroborate previous observations that dinoflagellates possess the complete nucleosome machinery. We found a complete set of core histones as well as several H3 variants and H2A.Z in one species. Furthermore, transcriptome analysis points toward a low number of transcription factors in Symbiodinium spp. that also differ in the distribution of DNA-binding domains relative to other eukaryotes. In particular, the cold shock domain was predominant among transcription factors. Additionally, we found a high number of antioxidative genes in comparison to non-symbiotic but evolutionarily related organisms. These findings might be of relevance in the context of the role that Symbiodinium spp. play as coral symbionts. Our data represent the most comprehensive dinoflagellate EST data set to date. This study provides a comprehensive resource to further analyze the genetic makeup, metabolic capacities, and gene repertoire of Symbiodinium and dinoflagellates. Overall, our findings indicate that Symbiodinium possesses some unique characteristics; in particular, the transcriptional regulation in Symbiodinium may differ from the currently known mechanisms of eukaryotic gene regulation.

  20. A flexible whole-genome microarray for transcriptomics in three-spine stickleback (Gasterosteus aculeatus)

    Directory of Open Access Journals (Sweden)

    Primmer Craig R

    2009-09-01

    Full Text Available Abstract. Background: The use of microarray technology for describing changes in mRNA expression to address ecological and evolutionary questions is becoming increasingly popular. Since three-spine stickleback are an important ecological and evolutionary model species as well as an emerging model for eco-toxicology, the ability to have a functional and flexible microarray platform for transcriptome studies will greatly enhance the research potential in these areas. Results: We designed 43,392 unique oligonucleotide probes representing 19,274 genes (93% of the estimated total gene number), and tested the hybridization performance of both DNA and RNA from different populations to determine the efficacy of probe design for transcriptome analysis using the Agilent array platform. The majority of probes were functional as evidenced by the DNA hybridization success, and 30,946 probes (14,615 genes) had a signal that was significantly above background for RNA isolated from liver tissue. Genes identified as being expressed in liver tissue were grouped into functional categories for each of the three Gene Ontology groups: biological process, molecular function, and cellular component. As expected, the highest proportions of functional categories belonged to those associated with metabolic functions: metabolic process, binding, catabolism, and organelles. Conclusion: The probe and microarray design presented here provides an important step facilitating transcriptomics research for this important research organism by providing a set of over 43,000 probes whose hybridization success and specificity to liver expression has been demonstrated. Probes can easily be added or removed from the current design to tailor the array to specific experiments, and additional flexibility lies in the ability to perform either one-color or two-color hybridizations.
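    The probe and gene counts quoted in this abstract imply a few derived figures that the text does not state. The following is a minimal sketch of that arithmetic in Python; the variable names are illustrative, and the derived values are rounded back-calculations from the quoted numbers, not results reported by the authors.

```python
# Counts quoted in the abstract above.
probes_total = 43_392      # unique oligonucleotide probes designed
genes_total = 19_274       # genes represented by those probes
gene_coverage = 0.93       # stated share of the estimated total gene number
probes_liver = 30_946      # probes above background for liver RNA
genes_liver = 14_615       # genes above background for liver RNA

# Derived quantities (assumptions: simple ratios, no correction for probe redundancy).
estimated_gene_number = round(genes_total / gene_coverage)
probe_fraction_liver = probes_liver / probes_total
gene_fraction_liver = genes_liver / genes_total
probes_per_gene = probes_total / genes_total

print(f"Implied total gene number: about {estimated_gene_number}")
print(f"Probes above background in liver: {probe_fraction_liver:.1%}")
print(f"Genes above background in liver: {gene_fraction_liver:.1%}")
print(f"Average probes per gene: {probes_per_gene:.2f}")
```

    Under these assumptions, roughly seven in ten probes and three in four represented genes gave a liver signal above background.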

  1. Quantitative RNA-Seq analysis in non-model species: assessing transcriptome assemblies as a scaffold and the utility of evolutionary divergent genomic reference species

    Directory of Open Access Journals (Sweden)

    Hornett Emily A

    2012-08-01

    Full Text Available Abstract. Background: How well does RNA-Seq data perform for quantitative whole gene expression analysis in the absence of a genome? This is one unanswered question facing the rapidly growing number of researchers studying non-model species. Using Homo sapiens data and resources, we compared the direct mapping of sequencing reads to predicted genes from the genome with mapping to de novo transcriptomes assembled from RNA-Seq data. Gene coverage and expression analysis was further investigated in the non-model context by using increasingly divergent genomic reference species to group assembled contigs by unique genes. Results: Eight transcriptome sets, composed of varying amounts of Illumina and 454 data, were assembled and assessed. Hybrid 454/Illumina assemblies had the highest transcriptome and individual gene coverage. Quantitative whole gene expression levels were highly similar between using a de novo hybrid assembly and the predicted genes as a scaffold, although mapping to the de novo transcriptome assembly provided data on fewer genes. Using non-target species as reference scaffolds does result in some loss of sequence and expression data, and bias and error increase with evolutionary distance. However, within a 100 million year window these effect sizes are relatively small. Conclusions: Predicted gene sets from sequenced genomes of related species can provide a powerful method for grouping RNA-Seq reads and annotating contigs. Gene expression results can be produced that are similar to results obtained using gene models derived from a high quality genome, though biased towards conserved genes. Our results demonstrate the power and limitations of conducting RNA-Seq in non-model species.

  2. Comparative genomics and transcriptomics of Escherichia coli isolates carrying virulence factors of both enteropathogenic and enterotoxigenic E. coli.

    Science.gov (United States)

    Hazen, Tracy H; Michalski, Jane; Luo, Qingwei; Shetty, Amol C; Daugherty, Sean C; Fleckenstein, James M; Rasko, David A

    2017-06-14

    Escherichia coli that are capable of causing human disease are often classified into pathogenic variants (pathovars) based on their virulence gene content. However, disease-associated hybrid E. coli, containing unique combinations of multiple canonical virulence factors have also been described. Such was the case of the E. coli O104:H4 outbreak in 2011, which caused significant morbidity and mortality. Among the pathovars of diarrheagenic E. coli that cause significant human disease are the enteropathogenic E. coli (EPEC) and enterotoxigenic E. coli (ETEC). In the current study we use comparative genomics, transcriptomics, and functional studies to characterize isolates that contain virulence factors of both EPEC and ETEC. Based on phylogenomic analysis, these hybrid isolates are more genomically-related to EPEC, but appear to have acquired ETEC virulence genes. Global transcriptional analysis using RNA sequencing, demonstrated that the EPEC and ETEC virulence genes of these hybrid isolates were differentially-expressed under virulence-inducing laboratory conditions, similar to reference isolates. Immunoblot assays further verified that the virulence gene products were produced and that the T3SS effector EspB of EPEC, and heat-labile toxin of ETEC were secreted. These findings document the existence and virulence potential of an E. coli pathovar hybrid that blurs the distinction between E. coli pathovars.

  3. Genomic and transcriptomic insights into the cytochrome P450 monooxygenase gene repertoire in the rice pest brown planthopper, Nilaparvata lugens.

    Science.gov (United States)

    Lao, Shu-Hua; Huang, Xiao-Hui; Huang, Hai-Jian; Liu, Cheng-Wen; Zhang, Chuan-Xi; Bao, Yan-Yuan

    2015-11-01

    The cytochrome P450 monooxygenase (P450) gene family is one of the most abundant eukaryotic gene families that encode detoxification enzymes. In this study, we identified an abundant P450 gene repertoire through genome- and transcriptome-wide analysis in the brown planthopper (Nilaparvata lugens), the most destructive rice pest in Asia. Detailed gene information including the exon-intron organization, size, transcription orientation and distribution in the genome revealed that many P450 loci were closely situated on the same scaffold, indicating frequent occurrence of gene duplications. Insecticide-response expression profiling revealed that imidacloprid significantly increased NlCYP6CS1v2, NlCYP4CE1v2, NlCYP4DE1, NlCYP417A1v2 and NlCYP439A1 expression, while triazophos and deltamethrin notably enhanced NlCYP303A1 expression. Expression analysis at the developmental stage showed the egg-, nymph-, male- and female-specific expression patterns of N. lugens P450 genes. These novel findings will be helpful for clarifying the P450 functions in physiological processes including development, reproduction and insecticide resistance in this insect species. Copyright © 2015 Elsevier Inc. All rights reserved.

  4. Analysis of the Phlebiopsis gigantea Genome, Transcriptome and Secretome Provides Insight into Its Pioneer Colonization Strategies of Wood

    Science.gov (United States)

    Hori, Chiaki; Ishida, Takuya; Igarashi, Kiyohiko; Samejima, Masahiro; Suzuki, Hitoshi; Master, Emma; Ferreira, Patricia; Ruiz-Dueñas, Francisco J.; Held, Benjamin; Canessa, Paulo; Larrondo, Luis F.; Schmoll, Monika; Druzhinina, Irina S.; Kubicek, Christian P.; Gaskell, Jill A.; Kersten, Phil; St. John, Franz; Glasner, Jeremy; Sabat, Grzegorz; Splinter BonDurant, Sandra; Syed, Khajamohiddin; Yadav, Jagjit; Mgbeahuruike, Anthony C.; Kovalchuk, Andriy; Asiegbu, Fred O.; Lackner, Gerald; Hoffmeister, Dirk; Rencoret, Jorge; Gutiérrez, Ana; Sun, Hui; Lindquist, Erika; Barry, Kerrie; Riley, Robert; Grigoriev, Igor V.; Henrissat, Bernard; Kües, Ursula; Berka, Randy M.; Martínez, Angel T.; Covert, Sarah F.; Blanchette, Robert A.; Cullen, Daniel

    2014-01-01

    Collectively classified as white-rot fungi, certain basidiomycetes efficiently degrade the major structural polymers of wood cell walls. A small subset of these Agaricomycetes, exemplified by Phlebiopsis gigantea, is capable of colonizing freshly exposed conifer sapwood despite its high content of extractives, which retards the establishment of other fungal species. The mechanism(s) by which P. gigantea tolerates and metabolizes resinous compounds have not been explored. Here, we report the annotated P. gigantea genome and compare profiles of its transcriptome and secretome when cultured on fresh-cut versus solvent-extracted loblolly pine wood. The P. gigantea genome contains a conventional repertoire of hydrolase genes involved in cellulose/hemicellulose degradation, whose patterns of expression were relatively unperturbed by the absence of extractives. The expression of genes typically ascribed to lignin degradation was also largely unaffected. In contrast, genes likely involved in the transformation and detoxification of wood extractives were highly induced in its presence. Their products included an ABC transporter, lipases, cytochrome P450s, glutathione S-transferase and aldehyde dehydrogenase. Other regulated genes of unknown function and several constitutively expressed genes are also likely involved in P. gigantea's extractives metabolism. These results contribute to our fundamental understanding of pioneer colonization of conifer wood and provide insight into the diverse chemistries employed by fungi in carbon cycling processes. PMID:25474575

  5. The Transcriptome of the Reference Potato Genome Solanum tuberosum Group Phureja Clone DM1-3 516R44

    Science.gov (United States)

    Massa, Alicia N.; Childs, Kevin L.; Lin, Haining; Bryan, Glenn J.; Giuliano, Giovanni; Buell, C. Robin

    2011-01-01

    Advances in molecular breeding in potato have been limited by its complex biological system, which includes vegetative propagation, autotetraploidy, and extreme heterozygosity. The availability of the potato genome and accompanying gene complement with corresponding gene structure, location, and functional annotation are powerful resources for understanding this complex plant and advancing molecular breeding efforts. Here, we report a reference for the potato transcriptome using 32 tissues and growth conditions from the doubled monoploid Solanum tuberosum Group Phureja clone DM1-3 516R44 for which a genome sequence is available. Analysis of greater than 550 million RNA-Seq reads permitted the detection and quantification of expression levels of over 22,000 genes. Hierarchical clustering and principal component analyses captured the biological variability that accounts for gene expression differences among tissues suggesting tissue-specific gene expression, and genes with tissue or condition restricted expression. Using gene co-expression network analysis, we identified 18 gene modules that represent tissue-specific transcriptional networks of major potato organs and developmental stages. This information provides a powerful resource for potato research as well as studies on other members of the Solanaceae family. PMID:22046362

  6. Genome-scale metabolic models applied to human health and disease.

    Science.gov (United States)

    Cook, Daniel J; Nielsen, Jens

    2017-11-01

    Advances in genome sequencing, high throughput measurement of gene and protein expression levels, data accessibility, and computational power have allowed genome-scale metabolic models (GEMs) to become a useful tool for understanding metabolic alterations associated with many different diseases. Despite the proven utility of GEMs, researchers confront multiple challenges in the use of GEMs, their application to human health and disease, and their construction and simulation in an organ-specific and disease-specific manner. Several approaches that researchers are taking to address these challenges include using proteomic and transcriptomic-informed methods to build GEMs for individual organs, diseases, and patients and using constraints on model behavior during simulation to match observed metabolic fluxes. We review the challenges facing researchers in the use of GEMs, review the approaches used to address these challenges, and describe advances that are on the horizon and could lead to a better understanding of human metabolism. WIREs Syst Biol Med 2017, 9:e1393. doi: 10.1002/wsbm.1393 For further resources related to this article, please visit the WIREs website. © 2017 Wiley Periodicals, Inc.

  7. Genome-scale metabolic representation of Amycolatopsis balhimycina

    DEFF Research Database (Denmark)

    Vongsangnak, Wanwipa; Figueiredo, L. F.; Förster, Jochen

    2012-01-01

    Infection caused by methicillin‐resistant Staphylococcus aureus (MRSA) is an increasing societal problem. Typically, glycopeptide antibiotics are used in the treatment of these infections. The most comprehensively studied glycopeptide antibiotic biosynthetic pathway is that of balhimycin...... to reconstruct a genome‐scale metabolic model for the organism. Here we generated an almost complete A. balhimycina genome sequence comprising 10,562,587 base pairs assembled into 2,153 contigs. The high GC‐genome (∼69%) includes 8,585 open reading frames (ORFs). We used our integrative toolbox called SEQTOR...

  8. Lifestyle transitions in plant pathogenic Colletotrichum fungi deciphered by genome and transcriptome analyses

    NARCIS (Netherlands)

    O'Connell, R.J.; Thon, M.R.; Hacquard, S.; Amyotte, S.G.; Kleemann, J.; Torres, M.F.; Damm, U.; Buiate, E.A.; Epstein, L.; Alkan, N.; Altmuller, J.; Alvarado-Balderrama, L.; Bauser, C.A.; Becker, C.; Birren, B.W.; Chen, Z.; Choi, J.; Crouch, J.A.; Duvick, J.P.; Farman, M.A.; Gan, P.; Heiman, D.; Henrissat, B.; Howard, R.J.; Kabbage, M.; Koch, C.; Kracher, B.; Kubo, Y.; Law, A.D.; Lebrun, M.-H.; Lee, Y.-H.; Miyara, I.; Moore, N.; Neumann, U.; Nordstrom, K.; Panaccione, D.G.; Panstruga, R.; Place, M.; Proctor, R.H.; Prusky, D.; Rech, G.; Reinhardt, R.; Rollins, J.A.; Rounsley, S.; Schardl, C.L.; Schwartz, D.C.; Shenoy, N.; Shirasu, K.; Sikhakolli, U.R.; Stuber, K.; Sukno, S.A.; Sweigard, J.A.; Takano, Y.; Takahara, H.; Trail, F.; Does, H.C.; Voll, L.M.; Will, I.; Young, S.; Zeng, Q.; Zhang, Jingze; Zhou, S.; Dickman, M.B.; Schulze-Lefert, P.; Verloren van Themaat, E.; Ma, L.-J.; Vaillancourt, L.J.

    2012-01-01

    Colletotrichum species are fungal pathogens that devastate crop plants worldwide. Host infection involves the differentiation of specialized cell types that are associated with penetration, growth inside living host cells (biotrophy) and tissue destruction (necrotrophy). We report here genome and

  9. Genomic and Transcriptomic Evidence for Carbohydrate Consumption among Microorganisms in a Cold Seep Brine Pool

    KAUST Repository

    Zhang, Weipeng; Ding, Wei; Yang, Bo; Tian, Renmao; Gu, Shuo; Luo, Haiwei; Qian, Pei-Yuan

    2016-01-01

    the Thuwal cold seep brine pool of the Red Sea. The recovered metagenome-assembled genomes (MAGs) belong to six different phyla: Actinobacteria, Proteobacteria, Candidatus Cloacimonetes, Candidatus Marinimicrobia, Bathyarchaeota, and Thaumarchaeota

  10. An in-depth comparison of the porcine, murine and human inflammasomes; lessons from the porcine genome and transcriptome.

    Science.gov (United States)

    Dawson, Harry D; Smith, Allen D; Chen, Celine; Urban, Joseph F

    2017-04-01

    Emerging evidence suggests that swine are a scientifically acceptable intermediate species between rodents and humans to model immune function relevant to humans. The swine genome has recently been sequenced and several preliminary structural and functional analyses of the porcine immunome have been published. Herein we provide an expanded in silico analysis using an improved assembly of the porcine transcriptome that provides an in-depth analysis of genes that are related to inflammasomes, responses to Toll-like receptor ligands, and M1 macrophage polarization and Escherichia coli as a model organism. Comparisons of the expansion or contraction of orthologous gene families indicated more similar rates and classes of genes in humans and pigs than in mice; however, several novel porcine or artiodactyl-specific paralogs or pseudogenes were identified. Conservation of homology and structural motifs of orthologs revealed that the overall similarity to human proteins was significantly higher for pigs compared to mice. Despite these similarities, two out of four canonical inflammasome pathways, Absent in melanoma 2 (AIM2) and NLR family and CARD domain containing 4 (NLRC4), were found to be missing in pigs. Pig M1 Mφ polarization in response to interferon-γ (IFN-γ) and lipopolysaccharide (LPS) was assessed, via the transcriptome, using next generation sequencing. Our analysis revealed predominantly human-like responses; however, some mouse-like responses were observed, as well as induction of numerous pig or artiodactyl-specific genes. This work supports using swine to model both human immunological and inflammatory responses to infection. However, caution must be exercised as pigs differ from humans in several fundamental pathways. Published by Elsevier B.V.

  11. The Dynamic Genome and Transcriptome of the Human Fungal Pathogen Blastomyces and Close Relative Emmonsia

    OpenAIRE

    Muñoz, José F.; Gauthier, Gregory M.; Desjardins, Christopher A.; Gallo, Juan E.; Holder, Jason; Sullivan, Thomas D.; Marty, Amber J.; Carmen, John C.; Chen, Zehua; Ding, Li; Gujja, Sharvari; Magrini, Vincent; Misas, Elizabeth; Mitreva, Makedonka; Priest, Margaret

    2015-01-01

    Three closely related thermally dimorphic pathogens are causal agents of major fungal diseases affecting humans in the Americas: blastomycosis, histoplasmosis and paracoccidioidomycosis. Here we report the genome sequence and analysis of four strains of the etiological agent of blastomycosis, Blastomyces, and two species of the related genus Emmonsia, typically pathogens of small mammals. Compared to related species, Blastomyces genomes are highly expanded, with long, often sharply demarcated...

  12. Large-scale transcriptome analysis reveals arabidopsis metabolic pathways are frequently influenced by different pathogens.

    Science.gov (United States)

    Jiang, Zhenhong; He, Fei; Zhang, Ziding

    2017-07-01

    Through large-scale transcriptional data analyses, we highlighted the importance of plant metabolism in plant immunity and identified 26 metabolic pathways that were frequently influenced by the infection of 14 different pathogens. Reprogramming of plant metabolism is a common phenomenon in plant defense responses. Currently, a large number of transcriptional profiles of infected tissues in Arabidopsis (Arabidopsis thaliana) have been deposited in public databases, which provides a great opportunity to understand the expression patterns of metabolic pathways during plant defense responses at the systems level. Here, we performed a large-scale transcriptome analysis based on 135 previously published expression samples, including 14 different pathogens, to explore the expression pattern of Arabidopsis metabolic pathways. Overall, metabolic genes are significantly changed in expression during plant defense responses. Upregulated metabolic genes are enriched on defense responses, and downregulated genes are enriched on photosynthesis, fatty acid and lipid metabolic processes. Gene set enrichment analysis (GSEA) identifies 26 frequently differentially expressed metabolic pathways (FreDE_Paths) that are differentially expressed in more than 60% of infected samples. These pathways are involved in the generation of energy, fatty acid and lipid metabolism as well as secondary metabolite biosynthesis. Clustering analysis based on the expression levels of these 26 metabolic pathways clearly distinguishes infected and control samples, further suggesting the importance of these metabolic pathways in plant defense responses. By comparing with FreDE_Paths from abiotic stresses, we find that the expression patterns of 26 FreDE_Paths from biotic stresses are more consistent across different infected samples. By investigating the expression correlation between transcriptional factors (TFs) and FreDE_Paths, we identify several notable relationships. Collectively, the current study
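    The selection rule described in this abstract, keeping pathways that are differentially expressed in more than 60% of infected samples, can be sketched as follows. This is a minimal illustration in Python, assuming each pathway already has one true/false enrichment call per sample; the function name, the data layout and the toy pathway names are hypothetical and do not come from the study, which derived its calls with GSEA.

```python
from typing import Dict, List


def frequently_de_pathways(
    calls: Dict[str, List[bool]], min_fraction: float = 0.6
) -> List[str]:
    """Return pathways called differentially expressed in MORE than
    min_fraction of samples (strict inequality, as in 'more than 60%')."""
    selected = []
    for pathway, flags in calls.items():
        if not flags:
            continue  # no samples for this pathway: nothing to evaluate
        fraction = sum(flags) / len(flags)
        if fraction > min_fraction:
            selected.append(pathway)
    return sorted(selected)


# Toy per-sample calls (hypothetical, five samples per pathway).
toy_calls = {
    "pathway_A": [True, True, True, False, True],    # 4 of 5 samples
    "pathway_B": [True, False, False, True, False],  # 2 of 5 samples
    "pathway_C": [True, True, True, False, False],   # 3 of 5: exactly 60%, excluded
}

print(frequently_de_pathways(toy_calls))
```

    The strict comparison means a pathway sitting at exactly 60% is not retained, which matches the "more than 60%" wording; a study using "at least 60%" would use `>=` instead.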

  13. Genome-Guided Analysis and Whole Transcriptome Profiling of the Mesophilic Syntrophic Acetate Oxidising Bacterium Syntrophaceticus schinkii.

    Directory of Open Access Journals (Sweden)

    Shahid Manzoor

    Full Text Available Syntrophaceticus schinkii is a mesophilic, anaerobic bacterium capable of oxidising acetate to CO2 and H2 in intimate association with a methanogenic partner, a syntrophic relationship which operates close to the energetic limits of microbial life. Syntrophaceticus schinkii has been identified as a key organism in engineered methane-producing processes relying on syntrophic acetate oxidation as the main methane-producing pathway. However, due to strict cultivation requirements and difficulties in reconstituting the thermodynamically unfavourable acetate oxidation, the physiology of this functional group is poorly understood. Genome-guided and whole transcriptome analyses performed in the present study provide new insights into habitat adaptation, syntrophic acetate oxidation and energy conservation. The working draft genome of Syntrophaceticus schinkii indicates limited metabolic capacities, with lack of organic nutrient uptake systems, chemotactic machineries, carbon catabolite repression and incomplete biosynthesis pathways. Ech hydrogenase, [FeFe] hydrogenases, [NiFe] hydrogenases, F1F0-ATP synthase and membrane-bound and cytoplasmic formate dehydrogenases were found clearly expressed, whereas Rnf and a predicted oxidoreductase/heterodisulphide reductase complex, both found encoded in the genome, were not expressed under syntrophic growth condition. A transporter sharing similarities to the high-affinity acetate transporters of aceticlastic methanogens was also found expressed, suggesting that Syntrophaceticus schinkii can potentially compete with methanogens for acetate. Acetate oxidation seems to proceed via the Wood-Ljungdahl pathway as all genes involved in this pathway were highly expressed. This study shows that Syntrophaceticus schinkii is a highly specialised, habitat-adapted organism relying on syntrophic acetate oxidation rather than metabolic versatility. By expanding its complement of respiratory complexes, it might overcome

  14. Genome, transcriptome, and secretome analysis of wood decay fungus postia placenta supports unique mechanisms of lignocellulose conversion

    Energy Technology Data Exchange (ETDEWEB)

    Martinez, Diego [Los Alamos National Laboratory; Challacombe, Jean F [Los Alamos National Laboratory; Misra, Monica [Los Alamos National Laboratory; Xie, Gary [Los Alamos National Laboratory; Brettin, Thomas [Los Alamos National Laboratory; Morgenstern, Ingo [CLARK UNIV; Hibbett, David [CLARK UNIV.; Schmoll, Monika [UNIV WIEN; Kubicek, Christian P [UNIV WIEN; Ferreira, Patricia [CIB, CSIC, MADRID; Ruiz - Duenase, Francisco J [CIB, CSIC, MADRID; Martinez, Angel T [CIB, CSIC, MADRID; Kersten, Phil [FOREST PRODUCTS LAB; Hammel, Kenneth E [FOREST PRODUCTS LAB; Vanden Wymelenberg, Amber [U. WISCONSIN; Gaskell, Jill [FOREST PRODUCTS LAB; Lindquist, Erika [DOE JGI; Sabati, Grzegorz [U. WISCONSIN; Bondurant, Sandra S [U. WISCONSIN; Larrondo, Luis F [U. CATHOLICA DE CHILE; Canessa, Paulo [U. CATHOLICA DE CHILE; Vicunna, Rafael [U. CATHOLICA DE CHILE; Yadavk, Jagiit [U. CINCINATTI; Doddapaneni, Harshavardhan [U. CINCINATTI; Subramaniank, Venkataramanan [U. CINCINATTI; Pisabarro, Antonio G [PUBLIC U. NAVARRE; Lavin, Jose L [PUBLIC U. NAVARRE; Oguiza, Jose A [PUBLIC U. NAVARRE; Master, Emma [U. TORONTO; Henrissat, Bernard [CNRS, MARSEILLE; Coutinho, Pedro M [CNRS, MARSEILLE; Harris, Paul [NOVOZYMES, INC.; Magnuson, Jon K [PNNL; Baker, Scott [PNNL; Bruno, Kenneth [PNNL; Kenealy, William [MASCOMA, INC.; Hoegger, Patrik J [GEORG-AUGUST-U.; Kues, Ursula [GEORG-AUGUST-U; Ramaiva, Preethi [NOVOZYMES, INC.; Lucas, Susan [DOE JGI; Salamov, Asaf [DOE JGI; Shapiro, Harris [DOE JGI; Tuh, Hank [DOE JGI; Chee, Christine L [UNM; Teter, Sarah [NOVOZYMES, INC.; Yaver, Debbie [NOVOZYMES, INC.; James, Tim [MCMASTER U.; Mokrejs, Martin [CHARLES U.; Pospisek, Martin [CHARLES U.; Grigoriev, Igor [DOE JGI; Rokhsar, Dan [DOE JGI; Berka, Randy [NOVOZYMES; Cullen, Dan [FOREST PRODUCTS LAB

    2008-01-01

    Brown-rot fungi such as Postia placenta are common inhabitants of forest ecosystems and are also largely responsible for the destructive decay of wooden structures. Rapid depolymerization of cellulose is a distinguishing feature of brown-rot, but the biochemical mechanisms and underlying genetics are poorly understood. Systematic examination of the P. placenta genome, transcriptome and secretome revealed unique extracellular enzyme systems, including an unusual repertoire of extracellular glycoside hydrolases. Genes encoding exocellobiohydrolases and cellulose-binding domains, typical of cellulolytic microbes, are absent in this efficient cellulose-degrading fungus. When P. placenta was grown in medium containing cellulose as sole carbon source, transcripts corresponding to many hemicellulases and to a single putative β-1-4 endoglucanase were expressed at high levels relative to glucose-grown cultures. These transcript profiles were confirmed by direct identification of peptides by liquid chromatography-tandem mass spectrometry (LC·MS/MS). Also upregulated during growth on cellulose medium were putative iron reductases, quinone reductase, and structurally divergent oxidases potentially involved in extracellular generation of Fe(II) and H2O2. These observations are consistent with a biodegradative role for Fenton chemistry in which Fe(II) and H2O2 react to form hydroxyl radicals, highly reactive oxidants capable of depolymerizing cellulose. The P. placenta genome resources provide unparalleled opportunities for investigating such unusual mechanisms of cellulose conversion. More broadly, the genome offers insight into the diversification of lignocellulose degrading mechanisms in fungi. Comparisons to the closely related white-rot fungus Phanerochaete chrysosporium support an evolutionary shift from white-rot to brown-rot during which the capacity for efficient depolymerization of lignin was lost.
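    For reference, the Fenton chemistry mentioned in this abstract is usually written as the textbook reaction below, in which ferrous iron reduces hydrogen peroxide to yield a hydroxyl radical. The equation is the standard general form and is given here as background; it is not a reaction scheme taken from the study itself.

```latex
\[
\mathrm{Fe^{2+}} + \mathrm{H_2O_2} \;\longrightarrow\; \mathrm{Fe^{3+}} + \mathrm{OH^-} + {}^{\bullet}\mathrm{OH}
\]
```

    The iron reductases and oxidases reported as upregulated would, on this reading, supply the two reactants on the left-hand side.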

  15. Autism spectrum disorders: Integration of the genome, transcriptome and the environment.

    Science.gov (United States)

    Vijayakumar, N Thushara; Judy, M V

    2016-05-15

    Autism spectrum disorders denote a series of lifelong neurodevelopmental conditions characterized by an impaired social communication profile and often repetitive, stereotyped behavior. Recent years have seen the complex genetic architecture of the disease being progressively unraveled with advancements in gene finding technology and next generation sequencing methods. However, a complete elucidation of the molecular mechanisms behind autism is necessary for potential diagnostic and therapeutic applications. A multidisciplinary approach should be adopted where the focus is not only on the 'genetics' of autism but also on the combinational roles of epigenetics, transcriptomics, immune system disruption and environmental factors that could all influence the etiopathogenesis of the disease. ASD is a clinically heterogeneous disorder with great genetic complexity; only through an integrated multidimensional effort can modern autism research progress further. Copyright © 2016 Elsevier B.V. All rights reserved.

  16. Next-generation transcriptome assembly

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Jeffrey A.; Wang, Zhong

    2011-09-01

    Transcriptomics studies often rely on partial reference transcriptomes that fail to capture the full catalog of transcripts and their variations. Recent advances in sequencing technologies and assembly algorithms have facilitated the reconstruction of the entire transcriptome by deep RNA sequencing (RNA-seq), even without a reference genome. However, transcriptome assembly from billions of RNA-seq reads, which are often very short, poses a significant informatics challenge. This Review summarizes the recent developments in transcriptome assembly approaches (reference-based, de novo and combined strategies), along with some perspectives on transcriptome assembly in the near future.

  17. Ensembl Genomes: an integrative resource for genome-scale data from non-vertebrate species.

    Science.gov (United States)

    Kersey, Paul J; Staines, Daniel M; Lawson, Daniel; Kulesha, Eugene; Derwent, Paul; Humphrey, Jay C; Hughes, Daniel S T; Keenan, Stephan; Kerhornou, Arnaud; Koscielny, Gautier; Langridge, Nicholas; McDowall, Mark D; Megy, Karine; Maheswari, Uma; Nuhn, Michael; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Wilson, Derek; Yates, Andrew; Birney, Ewan

    2012-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrative resource for genome-scale data from non-vertebrate species. The project exploits and extends technology (for genome annotation, analysis and dissemination) developed in the context of the (vertebrate-focused) Ensembl project and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. Since its launch in 2009, Ensembl Genomes has undergone rapid expansion, with the goal of providing coverage of all major experimental organisms, and additionally including taxonomic reference points to provide the evolutionary context in which genes can be understood. Against the backdrop of a continuing increase in genome sequencing activities in all parts of the tree of life, we seek to work, wherever possible, with the communities actively generating and using data, and are participants in a growing range of collaborations involved in the annotation and analysis of genomes.

  18. Power Laws, Scale-Free Networks and Genome Biology

    CERN Document Server

    Koonin, Eugene V; Karev, Georgy P

    2006-01-01

    Power Laws, Scale-free Networks and Genome Biology deals with crucial aspects of the theoretical foundations of systems biology, namely power law distributions and scale-free networks, which have emerged as the hallmarks of biological organization in the post-genomic era. The chapters in the book not only describe the interesting mathematical properties of biological networks but also move beyond phenomenology, toward models of evolution capable of explaining the emergence of these features. The collection of chapters, contributed by both physicists and biologists, strives to address the problems in this field in a rigorous but not excessively mathematical manner and to represent different viewpoints, which is crucial in this emerging discipline. Each chapter includes, in addition to technical descriptions of properties of biological networks and evolutionary models, a more general and accessible introduction to the respective problems. Most chapters emphasize the potential of theoretical systems biology for disco...

  19. Next-generation genome-scale models for metabolic engineering

    DEFF Research Database (Denmark)

    King, Zachary A.; Lloyd, Colton J.; Feist, Adam M.

    2015-01-01

    Constraint-based reconstruction and analysis (COBRA) methods have become widely used tools for metabolic engineering in both academic and industrial laboratories. By employing a genome-scale in silico representation of the metabolic network of a host organism, COBRA methods can be used to predict...... examples of applying COBRA methods to strain optimization are presented and discussed. Then, an outlook is provided on the next generation of COBRA models and the new types of predictions they will enable for systems metabolic engineering....

  20. Genomic divergences among cattle, dog and human estimated from large-scale alignments of genomic sequences

    Directory of Open Access Journals (Sweden)

    Shade Larry L

    2006-06-01

    Full Text Available Background: Approximately 11 Mb of finished high quality genomic sequences were sampled from cattle, dog and human to estimate genomic divergences and their regional variation among these lineages. Results: Optimal three-way multi-species global sequence alignments for 84 cattle clones or loci (each >50 kb of genomic sequence) were constructed using the human and dog genome assemblies as references. Genomic divergences and substitution rates were examined for each clone and for various sequence classes under different functional constraints. Analysis of these alignments revealed that the overall genomic divergences are relatively constant (0.32–0.37 change/site) for pairwise comparisons among cattle, dog and human; however, substitution rates vary across genomic regions and among different sequence classes. A neutral mutation rate (2.0–2.2 × 10^-9 change/site/year) was derived from ancestral repetitive sequences, whereas the substitution rate in coding sequences (1.1 × 10^-9 change/site/year) was approximately half of the overall rate (1.9–2.0 × 10^-9 change/site/year). Relative rate tests also indicated that cattle have a significantly faster rate of substitution as compared to dog and that this difference is about 6%. Conclusion: This analysis provides a large-scale and unbiased assessment of genomic divergences and regional variation of substitution rates among cattle, dog and human. It is expected that these data will serve as a baseline for future mammalian molecular evolution studies.
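
    The "approximately half" comparison in this abstract can be checked directly from the quoted rates. The following is a minimal arithmetic sketch, not the authors' analysis; the constant and function names are illustrative assumptions, and only the numeric rates come from the abstract.

```python
# Rates quoted in the abstract, in changes per site per year.
CODING_RATE = 1.1e-9
OVERALL_RATE_RANGE = (1.9e-9, 2.0e-9)
NEUTRAL_RATE_RANGE = (2.0e-9, 2.2e-9)


def ratio_range(numerator, denominator_range):
    """Return (min, max) of numerator / d over the given denominator range."""
    low, high = denominator_range
    return numerator / high, numerator / low


# Coding-sequence rate relative to the overall and the neutral rate.
coding_vs_overall = ratio_range(CODING_RATE, OVERALL_RATE_RANGE)
coding_vs_neutral = ratio_range(CODING_RATE, NEUTRAL_RATE_RANGE)

print("coding / overall: %.2f to %.2f" % coding_vs_overall)
print("coding / neutral: %.2f to %.2f" % coding_vs_neutral)
```

    Under these figures the coding rate is a little more than half of the overall rate and between half and slightly more than half of the neutral rate, which matches the abstract's wording.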

  1. Endophytic life strategies decoded by genome and transcriptome analyses of the mutualistic root symbiont Piriformospora indica.

    Directory of Open Access Journals (Sweden)

    Alga Zuccaro

    2011-10-01

    Full Text Available Recent sequencing projects have provided deep insight into fungal lifestyle-associated genomic adaptations. Here we report on the 25 Mb genome of the mutualistic root symbiont Piriformospora indica (Sebacinales, Basidiomycota) and provide a global characterization of fungal transcriptional responses associated with the colonization of living and dead barley roots. Extensive comparative analysis of the P. indica genome with other Basidiomycota and Ascomycota fungi that have diverse lifestyle strategies identified features typically associated with both biotrophism and saprotrophism. The tightly controlled expression of the lifestyle-associated gene sets during the onset of the symbiosis, revealed by microarray analysis, argues for a biphasic root colonization strategy of P. indica. This is supported by a cytological study that shows an early biotrophic growth followed by a cell death-associated phase. About 10% of the fungal genes induced during the biotrophic colonization encoded putative small secreted proteins (SSP), including several lectin-like proteins and members of a P. indica-specific gene family (DELD) with a conserved novel seven-amino acids motif at the C-terminus. Similar to effectors found in other filamentous organisms, the occurrence of the DELDs correlated with the presence of transposable elements in gene-poor repeat-rich regions of the genome. This is the first in-depth genomic study describing a mutualistic symbiont with a biphasic lifestyle. Our findings provide a significant advance in understanding development of biotrophic plant symbionts and suggest a series of incremental shifts along the continuum from saprotrophy towards biotrophy in the evolution of mycorrhizal association from decomposer fungi.

  2. Genomic and transcriptomic differences in community acquired methicillin resistant Staphylococcus aureus USA300 and USA400 strains.

    Science.gov (United States)

    Jones, Marcus B; Montgomery, Christopher P; Boyle-Vavra, Susan; Shatzkes, Kenneth; Maybank, Rosslyn; Frank, Bryan C; Peterson, Scott N; Daum, Robert S

    2014-12-19

    Staphylococcus aureus is a human pathogen responsible for substantial morbidity and mortality through its ability to cause a number of human infections including bacteremia, pneumonia and soft tissue infections. Of great concern is the emergence and dissemination of methicillin-resistant Staphylococcus aureus strains (MRSA) that are resistant to nearly all β-lactams. The emergence of the USA300 MRSA genetic background among community associated S. aureus infections (CA-MRSA) in the USA was followed by the disappearance of USA400 CA-MRSA isolates. To gain a greater understanding of the potential fitness advantages and virulence capacity of S. aureus USA300 clones, we performed whole genome sequencing of 15 USA300 and 4 USA400 clinical isolates. A comparison of representative genomes of the USA300 and USA400 pulsotypes indicates a number of differences in mobile genome elements. We examined the in vitro gene expression profiles by microarray hybridization and the in vivo transcriptomes during lung infection in mice of a USA300 and a USA400 MRSA strain by performing complete genome qRT-PCR analysis. The unique presence and increased expression of 6 exotoxins in USA300 (12- to 600-fold) compared to USA400 may contribute to the increased virulence of USA300 clones. Importantly, we also observed the up-regulation of prophage genes in USA300 (compared with USA400) during mouse lung infection (including genes encoded by both prophages ΦSa2usa and ΦSa3usa), suggesting that these prophages may play an important role in vivo by contributing to the elevated virulence characteristic of the USA300 clone. We observed differences in the genetic content of USA300 and USA400 strains, as well as significant differences of in vitro and in vivo gene expression of mobile elements in a lung pneumonia model. This is the first study to document the global transcription differences between USA300 and USA400 strains during both in vitro and in vivo growth.

  3. Comparative genomic and transcriptomic analysis of selected fatty acid biosynthesis genes and CNL disease resistance genes in oil palm

    Science.gov (United States)

    Rosli, Rozana; Amiruddin, Nadzirah; Ab Halim, Mohd Amin; Chan, Pek-Lan; Chan, Kuang-Lim; Azizi, Norazah; Morris, Priscilla E.; Leslie Low, Eng-Ti; Ong-Abdullah, Meilina; Sambanthamurthi, Ravigadevi; Singh, Rajinder

    2018-01-01

    Comparative genomics and transcriptomic analyses were performed on two agronomically important groups of genes from oil palm versus other major crop species and the model organism, Arabidopsis thaliana. The first analysis was of two gene families with key roles in regulation of oil quality and in particular the accumulation of oleic acid, namely stearoyl ACP desaturases (SAD) and acyl-acyl carrier protein (ACP) thioesterases (FAT). In both cases, these were found to be large gene families with complex expression profiles across a wide range of tissue types and developmental stages. The detailed classification of the oil palm SAD and FAT genes has enabled the updating of the latest version of the oil palm gene model. The second analysis focused on disease resistance (R) genes in order to elucidate possible candidates for breeding of pathogen tolerance/resistance. Ortholog analysis showed that 141 out of the 210 putative oil palm R genes had homologs in banana and rice. These genes formed 37 clusters with 634 orthologous genes. Classification of the 141 oil palm R genes showed that the genes belong to the Kinase (7), CNL (95), MLO-like (8), RLK (3) and Others (28) categories. The CNL R genes formed eight clusters. Expression data for selected R genes also identified potential candidates for breeding of disease resistance traits. Furthermore, these findings can provide information about the species evolution as well as the identification of agronomically important genes in oil palm and other major crops. PMID:29672525
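
    The category counts reported for the oil palm R genes can be tallied to confirm that they account for all 141 genes with homologs. This is a small bookkeeping sketch using only the numbers stated in the abstract; the variable names are illustrative assumptions.

```python
# Category counts for the 141 oil palm R genes, as listed in the abstract.
R_GENE_CATEGORIES = {
    "Kinase": 7,
    "CNL": 95,
    "MLO-like": 8,
    "RLK": 3,
    "Others": 28,
}
TOTAL_PUTATIVE_R_GENES = 210  # putative oil palm R genes
GENES_WITH_HOMOLOGS = 141     # those with homologs in banana and rice

# The categories should add up to the number of genes with homologs.
classified = sum(R_GENE_CATEGORIES.values())

# Fractions derived from the reported counts.
homolog_share = GENES_WITH_HOMOLOGS / TOTAL_PUTATIVE_R_GENES
cnl_share = R_GENE_CATEGORIES["CNL"] / classified

print(f"classified genes: {classified}")
print(f"share of putative R genes with homologs: {homolog_share:.1%}")
print(f"share of classified genes in the CNL category: {cnl_share:.1%}")
```

    The categories sum to 141, so roughly two thirds of the 210 putative R genes had homologs, and roughly two thirds of those fall in the CNL category.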

  4. Comparative genomic and transcriptomic analysis of selected fatty acid biosynthesis genes and CNL disease resistance genes in oil palm.

    Science.gov (United States)

    Rosli, Rozana; Amiruddin, Nadzirah; Ab Halim, Mohd Amin; Chan, Pek-Lan; Chan, Kuang-Lim; Azizi, Norazah; Morris, Priscilla E; Leslie Low, Eng-Ti; Ong-Abdullah, Meilina; Sambanthamurthi, Ravigadevi; Singh, Rajinder; Murphy, Denis J

    2018-01-01

    Comparative genomics and transcriptomic analyses were performed on two agronomically important groups of genes from oil palm versus other major crop species and the model organism, Arabidopsis thaliana. The first analysis was of two gene families with key roles in regulation of oil quality and in particular the accumulation of oleic acid, namely stearoyl ACP desaturases (SAD) and acyl-acyl carrier protein (ACP) thioesterases (FAT). In both cases, these were found to be large gene families with complex expression profiles across a wide range of tissue types and developmental stages. The detailed classification of the oil palm SAD and FAT genes has enabled the updating of the latest version of the oil palm gene model. The second analysis focused on disease resistance (R) genes in order to elucidate possible candidates for breeding of pathogen tolerance/resistance. Ortholog analysis showed that 141 out of the 210 putative oil palm R genes had homologs in banana and rice. These genes formed 37 clusters with 634 orthologous genes. Classification of the 141 oil palm R genes showed that the genes belong to the Kinase (7), CNL (95), MLO-like (8), RLK (3) and Others (28) categories. The CNL R genes formed eight clusters. Expression data for selected R genes also identified potential candidates for breeding of disease resistance traits. Furthermore, these findings can provide information about the species evolution as well as the identification of agronomically important genes in oil palm and other major crops.

  5. Carotenoid metabolic profiling and transcriptome-genome mining reveal functional equivalence among blue-pigmented copepods and appendicularia

    KAUST Repository

    Mojib, Nazia; Amad, Maan H.; Thimma, Manjula; Aldanondo, Naroa; Kumaran, Mande; Irigoien, Xabier

    2014-01-01

    The tropical oligotrophic oceanic areas are characterized by high water transparency and annual solar radiation. Under these conditions, a large number of phylogenetically diverse mesozooplankton species living in the surface waters (neuston) are found to be blue pigmented. In the present study, we focused on understanding the metabolic and genetic basis of the observed blue phenotype functional equivalence between the blue-pigmented organisms from the phylum Arthropoda, subclass Copepoda (Acartia fossae) and the phylum Chordata, class Appendicularia (Oikopleura dioica) in the Red Sea. Previous studies have shown that carotenoid–protein complexes are responsible for blue coloration in crustaceans. Therefore, we performed carotenoid metabolic profiling using both targeted and nontargeted (high-resolution mass spectrometry) approaches in four different blue-pigmented genera of copepods and one blue-pigmented species of appendicularia. Astaxanthin was found to be the principal carotenoid in all the species. The pathway analysis showed that all the species can synthesize astaxanthin from β-carotene, ingested from dietary sources, via 3-hydroxyechinenone, canthaxanthin, zeaxanthin, adonirubin or adonixanthin. Further, using the de novo assembled transcriptome of blue A. fossae (subclass Copepoda), we identified highly expressed homologous β-carotene hydroxylase enzymes and putative carotenoid-binding proteins responsible for astaxanthin formation and the blue phenotype. In blue O. dioica (class Appendicularia), corresponding putative genes were identified from the reference genome. Collectively, our data provide molecular evidence for the bioconversion and accumulation of blue astaxanthin–protein complexes underpinning the observed ecological functional equivalence and adaptive convergence among neustonic mesozooplankton.

  6. Carotenoid metabolic profiling and transcriptome-genome mining reveal functional equivalence among blue-pigmented copepods and appendicularia

    KAUST Repository

    Mojib, Nazia

    2014-06-01

    The tropical oligotrophic oceanic areas are characterized by high water transparency and annual solar radiation. Under these conditions, a large number of phylogenetically diverse mesozooplankton species living in the surface waters (neuston) are found to be blue pigmented. In the present study, we focused on understanding the metabolic and genetic basis of the observed blue phenotype functional equivalence between the blue-pigmented organisms from the phylum Arthropoda, subclass Copepoda (Acartia fossae) and the phylum Chordata, class Appendicularia (Oikopleura dioica) in the Red Sea. Previous studies have shown that carotenoid–protein complexes are responsible for blue coloration in crustaceans. Therefore, we performed carotenoid metabolic profiling using both targeted and nontargeted (high-resolution mass spectrometry) approaches in four different blue-pigmented genera of copepods and one blue-pigmented species of appendicularia. Astaxanthin was found to be the principal carotenoid in all the species. The pathway analysis showed that all the species can synthesize astaxanthin from β-carotene, ingested from dietary sources, via 3-hydroxyechinenone, canthaxanthin, zeaxanthin, adonirubin or adonixanthin. Further, using the de novo assembled transcriptome of blue A. fossae (subclass Copepoda), we identified highly expressed homologous β-carotene hydroxylase enzymes and putative carotenoid-binding proteins responsible for astaxanthin formation and the blue phenotype. In blue O. dioica (class Appendicularia), corresponding putative genes were identified from the reference genome. Collectively, our data provide molecular evidence for the bioconversion and accumulation of blue astaxanthin–protein complexes underpinning the observed ecological functional equivalence and adaptive convergence among neustonic mesozooplankton.

  7. Integration of genomic, transcriptomic and proteomic data identifies two biologically distinct subtypes of invasive lobular breast cancer.

    Science.gov (United States)

    Michaut, Magali; Chin, Suet-Feung; Majewski, Ian; Severson, Tesa M; Bismeijer, Tycho; de Koning, Leanne; Peeters, Justine K; Schouten, Philip C; Rueda, Oscar M; Bosma, Astrid J; Tarrant, Finbarr; Fan, Yue; He, Beilei; Xue, Zheng; Mittempergher, Lorenza; Kluin, Roelof J C; Heijmans, Jeroen; Snel, Mireille; Pereira, Bernard; Schlicker, Andreas; Provenzano, Elena; Ali, Hamid Raza; Gaber, Alexander; O'Hurley, Gillian; Lehn, Sophie; Muris, Jettie J F; Wesseling, Jelle; Kay, Elaine; Sammut, Stephen John; Bardwell, Helen A; Barbet, Aurélie S; Bard, Floriane; Lecerf, Caroline; O'Connor, Darran P; Vis, Daniël J; Benes, Cyril H; McDermott, Ultan; Garnett, Mathew J; Simon, Iris M; Jirström, Karin; Dubois, Thierry; Linn, Sabine C; Gallagher, William M; Wessels, Lodewyk F A; Caldas, Carlos; Bernards, Rene

    2016-01-05

    Invasive lobular carcinoma (ILC) is the second most frequently occurring histological breast cancer subtype after invasive ductal carcinoma (IDC), accounting for around 10% of all breast cancers. The molecular processes that drive the development of ILC are still largely unknown. We have performed a comprehensive genomic, transcriptomic and proteomic analysis of a large ILC patient cohort and present here an integrated molecular portrait of ILC. Mutations in CDH1 and in the PI3K pathway are the most frequent molecular alterations in ILC. We identified two main subtypes of ILCs: (i) an immune related subtype with mRNA up-regulation of PD-L1, PD-1 and CTLA-4 and greater sensitivity to DNA-damaging agents in representative cell line models; (ii) a hormone related subtype, associated with Epithelial to Mesenchymal Transition (EMT), and gain of chromosomes 1q and 8q and loss of chromosome 11q. Using the somatic mutation rate and eIF4B protein level, we identified three groups with different clinical outcomes, including a group with extremely good prognosis. We provide a comprehensive overview of the molecular alterations driving ILC and have explored links with therapy response. This molecular characterization may help to tailor treatment of ILC through the application of specific targeted, chemo- and/or immune-therapies.

  8. Genomic, Transcriptomic, and Proteomic Analysis Provide Insights Into the Cold Adaptation Mechanism of the Obligate Psychrophilic Fungus Mrakia psychrophila

    Directory of Open Access Journals (Sweden)

    Yao Su

    2016-11-01

    Full Text Available Mrakia psychrophila is an obligate psychrophilic fungus. The cold adaptation mechanism of psychrophilic fungi remains unknown. Comparative genomics analysis indicated that M. psychrophila had a specific codon usage preference, especially for codons of Gly and Arg, and that its major facilitator superfamily (MFS) transporter gene family was expanded. Transcriptomic analysis revealed that genes involved in ribosome and energy metabolism were upregulated at 4°, while genes involved in unfolded protein binding, protein processing in the endoplasmic reticulum, proteasome, spliceosome, and mRNA surveillance were upregulated at 20°. In addition, genes related to unfolded protein binding were alternatively spliced. Consistent with other psychrophiles, desaturase and glycerol 3-phosphate dehydrogenase, which are involved in biosynthesis of unsaturated fatty acid and glycerol respectively, were upregulated at 4°. Cold adaptation of M. psychrophila is mediated by synthesizing unsaturated fatty acids to maintain membrane fluidity and accumulating glycerol as a cryoprotectant. The proteomic analysis indicated that the correlations between the dynamic patterns of transcript level changes and protein level changes for some pathways were positive at 4°, but negative at 20°. The death of M. psychrophila above 20° might be caused by an unfolded protein response.

  9. Identification of candidate genes associated with porcine meat color traits by genome-wide transcriptome analysis.

    Science.gov (United States)

    Li, Bojiang; Dong, Chao; Li, Pinghua; Ren, Zhuqing; Wang, Han; Yu, Fengxiang; Ning, Caibo; Liu, Kaiqing; Wei, Wei; Huang, Ruihua; Chen, Jie; Wu, Wangjun; Liu, Honglin

    2016-10-17

    Meat color is considered to be the most important indicator of meat quality; however, the molecular mechanisms underlying traits related to meat color remain mostly unknown. In this study, to elucidate the molecular basis of meat color, we constructed six cDNA libraries from biceps femoris (Bf) and soleus (Sol), which exhibit obvious differences in meat color, and analyzed the whole-transcriptome differences between Bf (white muscle) and Sol (red muscle) using high-throughput sequencing technology. Using the DESeq2 method, we identified 138 differentially expressed genes (DEGs) between Bf and Sol. Using the DEGseq method, we identified 770, 810, and 476 DEGs in comparisons between Bf and Sol in three separate animals. Of these DEGs, 52 were overlapping DEGs. Using these data, we determined the enriched GO terms, metabolic pathways and candidate genes associated with meat color traits. Additionally, we mapped 114 non-redundant DEGs to the meat color QTLs via a comparative analysis with the porcine quantitative trait loci (QTL) database. Overall, our data serve as a valuable resource for identifying genes whose functions are critical for meat color traits and can accelerate studies of the molecular mechanisms of meat color formation.
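
    The "overlapping DEGs" step described above amounts to intersecting the per-animal DEG lists. The following is a minimal sketch of that idea, assuming each comparison yields a list of gene identifiers; the function name and the gene identifiers are hypothetical placeholders, not data from the study.

```python
def overlapping_degs(*deg_lists):
    """Return the set of gene IDs present in every DEG list supplied."""
    gene_sets = [set(genes) for genes in deg_lists]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Hypothetical placeholder identifiers, one list per animal (Bf vs. Sol).
animal_1 = ["geneA", "geneB", "geneC", "geneD"]
animal_2 = ["geneB", "geneC", "geneE"]
animal_3 = ["geneC", "geneB", "geneF", "geneA"]

shared = overlapping_degs(animal_1, animal_2, animal_3)
print(sorted(shared))
```

    A gene counts as overlapping only if it appears in all three lists, so the shared set can never be larger than the smallest input list.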

  10. The transcriptomes of novel marmoset monkey embryonic stem cell lines reflect distinct genomic features.

    Science.gov (United States)

    Debowski, Katharina; Drummer, Charis; Lentes, Jana; Cors, Maren; Dressel, Ralf; Lingner, Thomas; Salinas-Riester, Gabriela; Fuchs, Sigrid; Sasaki, Erika; Behr, Rüdiger

    2016-07-07

    Embryonic stem cells (ESCs) are useful for the study of embryonic development. However, since research on naturally conceived human embryos is limited, non-human primate (NHP) embryos and NHP ESCs represent an excellent alternative to the corresponding human entities. ESC lines derived from naturally conceived NHP embryos are, however, still very rare. Here, we report the generation and characterization of four novel ESC lines derived from natural preimplantation embryos of the common marmoset monkey (Callithrix jacchus). For the first time we document derivation of NHP ESCs from morula stages. We show that quantitative chromosome-wise transcriptome analyses precisely reflect trisomies present in both morula-derived ESC lines. We also demonstrate that the female ESC lines exhibit different states of X-inactivation, which is clearly reflected by the abundance of the lncRNA X inactive-specific transcript (XIST). The novel marmoset ESC lines will promote basic primate embryo and ESC studies as well as preclinical testing of ESC-based regenerative approaches in NHP.

  11. Analysis of codon usage patterns in Morus notabilis based on genome and transcriptome data.

    Science.gov (United States)

    Wen, Yan; Zou, Ziliang; Li, Hongshun; Xiang, Zhonghuai; He, Ningjia

    2017-06-01

    Codons play important roles in regulating gene expression levels and mRNA half-lives. However, codon usage and related studies in multicellular organisms still lag far behind those in unicellular organisms. In this study, we describe for the first time genome-wide patterns of codon bias in Morus notabilis (mulberry tree), and analyze genome-wide codon usage in 12 other species within the order Rosales. The codon usage of M. notabilis was affected by nucleotide composition, mutation pressure, natural selection, and gene expression level. Optimal codons under translational selection were identified, and highly expressed genes of M. notabilis tended to use the optimal codons. Genes with higher expression levels have shorter coding regions and lower amino acid complexity. Housekeeping genes showed stronger translational selection, which, notably, was not caused by the large differences between the expression level of housekeeping genes and other genes.
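
    Codon bias of the kind discussed above is often summarized with relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The abstract does not state which measures the authors used, so the following is only an illustrative sketch under that assumption; the function name and the toy sequence are hypothetical, and the standard genetic code is assumed.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds):
    """Compute RSCU for each sense codon of amino acids seen in a coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    values = {}
    for aa in set(AMINO_ACIDS) - {"*"}:  # stop codons are excluded
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        expected = total / len(family)  # equal usage of synonymous codons
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


# Hypothetical toy sequence: Met, Ala x4 (GCT, GCT, GCC, GCA), Lys, stop.
toy_cds = "ATGGCTGCTGCCGCAAAATAA"
toy_rscu = rscu(toy_cds)
for codon in ("GCT", "GCC", "GCA", "GCG", "AAA", "AAG"):
    print(codon, toy_rscu[codon])
```

    An RSCU above 1 means a codon is used more often than expected under equal usage, and a value below 1 means it is used less often; single-codon amino acids such as Met always score 1.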

  12. A comprehensive survey of the Plasmodium life cycle by genomic, transcriptomic, and proteomic analyses.

    Science.gov (United States)

    Hall, Neil; Karras, Marianna; Raine, J Dale; Carlton, Jane M; Kooij, Taco W A; Berriman, Matthew; Florens, Laurence; Janssen, Christoph S; Pain, Arnab; Christophides, Georges K; James, Keith; Rutherford, Kim; Harris, Barbara; Harris, David; Churcher, Carol; Quail, Michael A; Ormond, Doug; Doggett, Jon; Trueman, Holly E; Mendoza, Jacqui; Bidwell, Shelby L; Rajandream, Marie-Adele; Carucci, Daniel J; Yates, John R; Kafatos, Fotis C; Janse, Chris J; Barrell, Bart; Turner, C Michael R; Waters, Andrew P; Sinden, Robert E

    2005-01-07

    Plasmodium berghei and Plasmodium chabaudi are widely used model malaria species. Comparison of their genomes, integrated with proteomic and microarray data, with the genomes of Plasmodium falciparum and Plasmodium yoelii revealed a conserved core of 4500 Plasmodium genes in the central regions of the 14 chromosomes and highlighted genes evolving rapidly because of stage-specific selective pressures. Four strategies for gene expression are apparent during the parasites' life cycle: (i) housekeeping; (ii) host-related; (iii) strategy-specific related to invasion, asexual replication, and sexual development; and (iv) stage-specific. We observed posttranscriptional gene silencing through translational repression of messenger RNA during sexual development, and a 47-base 3' untranslated region motif is implicated in this process.

  13. Genomic and Transcriptomic Evidence for Carbohydrate Consumption Among Microorganisms in a Cold Seep Brine Pool

    Directory of Open Access Journals (Sweden)

    Weipeng Zhang

    2016-11-01

    Full Text Available The detailed lifestyle of microorganisms in deep-sea brine environments remains largely unexplored. Using a carefully calibrated genome binning approach, we reconstructed partial to nearly-complete genomes of 51 microorganisms in biofilms from the Thuwal cold seep brine pool of the Red Sea. The recovered metagenome-assembled genomes (MAGs) belong to six different phyla: Actinobacteria, Proteobacteria, Candidatus Cloacimonetes, Candidatus Marinimicrobia, Bathyarchaeota and Thaumarchaeota. By comparison with close relatives of these microorganisms, we identified a number of unique genes associated with organic carbon metabolism and energy generation. These genes included various glycoside hydrolases, nitrate and sulfate reductases, putative bacterial microcompartment biosynthetic clusters (BMC), and F420H2 dehydrogenases. Phylogenetic analysis suggested that the acquisition of these genes probably occurred through horizontal gene transfer (HGT). Metatranscriptomics illustrated that glycoside hydrolases are among the most highly expressed genes. Our results suggest that the microbial inhabitants are well adapted to this brine environment, and anaerobic carbohydrate consumption mediated by glycoside hydrolases and electron transport systems (ETSs) is a dominant process performed by microorganisms from various phyla within this ecosystem.

  14. Current approaches on non-invasive prenatal diagnosis: Prenatal genomics, transcriptomics, personalized fetal diagnosis

    Directory of Open Access Journals (Sweden)

    Tuba Günel

    2014-12-01

    Full Text Available Recent developments in molecular genetics have improved our knowledge of the fetal genome and physiology. Novel scientific innovations in prenatal diagnosis have accelerated in the last decade, changing our vision immensely. Data obtained from fetal genomic studies brought new insights to fetal medicine, and with the advances in fetal DNA and RNA sequencing technology novel treatment strategies have evolved. Non-invasive prenatal diagnosis has found its ground in genetics, and the results are widely studied in the scientific arena. When Lo and colleagues proved that fetal genetic material can be extracted from maternal plasma and fetal DNA can be isolated from maternal serum, the gate to many exciting discoveries was opened. Microarray technology and advances in sequencing helped fetal diagnosis as well as other areas of medicine. Today it is a crucial prerequisite for physicians practicing prenatal diagnosis to have a profound knowledge of genetics. The prevailing practical use and application of fetal genomic tests in maternal and fetal medicine requires obstetricians to update their knowledge of genetics. The purpose of this review is to help physicians understand and update their knowledge of fetal genetic testing from maternal blood, individualized prenatal counseling and advancements on the subject by sharing our experiences as the İstanbul University Fetal Nucleic Acid Research Group.

  15. Genomic and Transcriptomic Evidence for Carbohydrate Consumption among Microorganisms in a Cold Seep Brine Pool

    KAUST Repository

    Zhang, Weipeng

    2016-11-15

    The detailed lifestyle of microorganisms in deep-sea brine environments remains largely unexplored. Using a carefully calibrated genome binning approach, we reconstructed partial to nearly-complete genomes of 51 microorganisms in biofilms from the Thuwal cold seep brine pool of the Red Sea. The recovered metagenome-assembled genomes (MAGs) belong to six different phyla: Actinobacteria, Proteobacteria, Candidatus Cloacimonetes, Candidatus Marinimicrobia, Bathyarchaeota, and Thaumarchaeota. By comparison with close relatives of these microorganisms, we identified a number of unique genes associated with organic carbon metabolism and energy generation. These genes included various glycoside hydrolases, nitrate and sulfate reductases, putative bacterial microcompartment biosynthetic clusters (BMC), and F420H2 dehydrogenases. Phylogenetic analysis suggested that the acquisition of these genes probably occurred through horizontal gene transfer (HGT). Metatranscriptomics illustrated that glycoside hydrolases are among the most highly expressed genes. Our results suggest that the microbial inhabitants are well adapted to this brine environment, and anaerobic carbohydrate consumption mediated by glycoside hydrolases and electron transport systems (ETSs) is a dominant process performed by microorganisms from various phyla within this ecosystem.

  16. The genome and transcriptome of Phalaenopsis yield insights into floral organ development and flowering regulation

    Directory of Open Access Journals (Sweden)

    Jian-Zhi Huang

    2016-05-01

    Full Text Available The Phalaenopsis orchid is an important potted flower of high economic value around the world. We report the 3.1 Gb draft genome assembly of an important winter flowering Phalaenopsis ‘KHM190’ cultivar. We generated 89.5 Gb of RNA-seq and 113 million sRNA-seq reads and used these data to identify 41,153 protein-coding genes and 188 miRNA families. We also generated a draft genome for Phalaenopsis pulcherrima ‘B8802,’ a summer flowering species, via resequencing. Comparison of genome data between the two Phalaenopsis cultivars allowed the identification of 691,532 single-nucleotide polymorphisms. In this study, we reveal that the key role of PhAGL6b in the regulation of labellum organ development involves alternative splicing in the big lip mutant. Overexpression of PhAGL6b in petals or sepals leads to their conversion into lip-like structures. We also discovered that the gibberellin pathway that regulates the expression of flowering time genes during the reproductive phase change is induced by cool temperature. Our work thus provides a valuable resource for flowering control, flower architecture development, and breeding of Phalaenopsis orchids.

  17. Genomic and Transcriptomic Analysis of Growth-Supporting Dehalogenation of Chlorinated Methanes in Methylobacterium

    Directory of Open Access Journals (Sweden)

    Pauline Chaignaud

    2017-09-01

    Full Text Available Bacterial adaptation to growth with toxic halogenated chemicals was explored in the context of methylotrophic metabolism of Methylobacterium extorquens, by comparing strains CM4 and DM4, which show robust growth with chloromethane and dichloromethane, respectively. Dehalogenation of chlorinated methanes initiates growth-supporting degradation, with intracellular release of protons and chloride ions in both cases. The core, variable and strain-specific genomes of strains CM4 and DM4 were defined by comparison with genomes of non-dechlorinating strains. In terms of gene content, adaptation toward dehalogenation appears limited, strains CM4 and DM4 sharing between 75 and 85% of their genome with other strains of M. extorquens. Transcript abundance in cultures of strain CM4 grown with chloromethane and of strain DM4 grown with dichloromethane was compared to growth with methanol as a reference C1 growth substrate. Previously identified strain-specific dehalogenase-encoding genes were the most transcribed with chlorinated methanes, alongside other genes encoded by genomic islands (GEIs) and plasmids involved in growth with chlorinated compounds as carbon and energy source. None of the 163 genes shared by strains CM4 and DM4 but not by other strains of M. extorquens showed higher transcript abundance in cells grown with chlorinated methanes. Among the several thousand genes of the M. extorquens core genome, 12 genes were only differentially abundant in either strain CM4 or strain DM4. Of these, 2 genes of known function were detected, for the membrane-bound proton translocating pyrophosphatase HppA and the housekeeping molecular chaperone protein DegP. This indicates that the adaptive response common to chloromethane and dichloromethane is limited at the transcriptional level, and involves aspects of the general stress response as well as of a dehalogenation-specific response to intracellular hydrochloric acid production.
Core genes only differentially

  18. Biological effects of the olive polyphenol, hydroxytyrosol: An extra view from genome-wide transcriptome analysis.

    Science.gov (United States)

    Nan, Jia Nancy; Ververis, Katherine; Bollu, Sameera; Rodd, Annabelle L; Swarup, Oshi; Karagiannis, Tom C

    2014-01-01

    Epidemiological and clinical studies have established the health benefits of the Mediterranean diet, an important component of which are olives and olive oil derived from the olive tree (Olea europaea). It is now well-established that not only the major fatty acid constituents, but also the minor phenolic components, in olives and olive oil have important health benefits. Emerging research over the past decade has highlighted the beneficial effects of a range of phenolic compounds from olives and olive oil, particularly for cardiovascular diseases, metabolic syndrome and inflammatory conditions. Mechanisms of action include potent antioxidant and anti-inflammatory effects. Further, accumulating evidence indicates the potential of the polyphenols and potent antioxidants, hydroxytyrosol and oleuropein, in oncology. Numerous studies, both in vitro and in vivo, have demonstrated the anticancer effects of hydroxytyrosol, which include chemopreventive and cell-specific cytotoxic and apoptotic effects. Indeed, the precise molecular mechanisms accounting for the antioxidant, anti-inflammatory and anticancer properties are now becoming clear and this is, at least in part, due to high-throughput gene transcription profiling. Initially, we constructed phylogenetic trees to visualize the evolutionary relationship of members of the Oleaceae family and secondly, between plants producing hydroxytyrosol to make inferences of potential similarities or differences in their medicinal properties and to identify novel plant candidates for the treatment and prevention of disease. Furthermore, given the recent interest in hydroxytyrosol as a potential anticancer agent and chemopreventative, we utilized transcriptome analysis in the erythroleukemic cell line K562 to investigate the effects of hydroxytyrosol on three gene pathways: the complement system, the Warburg effect and chromatin remodeling, to ascertain relevant gene candidates in the prevention of cancer.

  19. Large-scale transcriptome data reveals transcriptional activity of fission yeast LTR retrotransposons

    DEFF Research Database (Denmark)

    Mourier, Tobias; Willerslev, Eske

    2010-01-01

    of transcriptional activity are observed from both strands of solitary LTR sequences. Transcriptome data collected during meiosis suggests that transcription of solitary LTRs is correlated with the transcription of nearby protein-coding genes. CONCLUSIONS: Presumably, the host organism negatively regulates...

  20. Transcriptome analysis of tube foot and large scale marker discovery in sea cucumber, Apostichopus japonicus.

    Science.gov (United States)

    Zhou, Xiaoxu; Wang, Hongdi; Cui, Jun; Qiu, Xuemei; Chang, Yaqing; Wang, Xiuli

    2016-12-01

    Tube feet, one of the types of ambulacral appendage in aspidochirote holothuroids, are known for their functions in locomotion, feeding, chemoreception, light sensitivity and respiration. In this study, we explored the characteristics of the transcriptome in the tube foot of the sea cucumber (Apostichopus japonicus). Our results showed that among 390 unigenes specifically expressed in the tube foot, 190 were annotated. Based on the assembled transcriptome, we found 219,860 SNPs in 34,749 unigenes; 97,683, 53,624, 27,767 and 40,786 were located in CDSs, 5'-UTRs, 3'-UTRs and non-CDS regions, respectively. Furthermore, 12,114 SSRs were detected in 7394 unigenes. Target genes of four specifically expressed miRNAs (miR-29a, miR-29b, miR-278-3p and miR-2005) in the tube foot were also predicted based on the transcriptome; they include immune-related factors (MBL, VLRA, AjC3, MyD88, CFB), a skin pigmentation factor (MITF), a candidate regeneration factor (TRP) and a holothurian autolysis-related factor (CL). These results provide a relatively large number of molecular markers and transcriptome resources, and will provide a foundation for further analyses of the function and molecular mechanisms underlying the A. japonicus tube foot.

  1. Next-Generation Sequencing of the Chrysanthemum nankingense (Asteraceae) Transcriptome Permits Large-Scale Unigene Assembly and SSR Marker Discovery

    Science.gov (United States)

    Wang, Haibin; Jiang, Jiafu; Chen, Sumei; Qi, Xiangyu; Peng, Hui; Li, Pirui; Song, Aiping; Guan, Zhiyong; Fang, Weimin; Liao, Yuan; Chen, Fadi

    2013-01-01

    Background: Simple sequence repeats (SSRs) are ubiquitous in eukaryotic genomes. Chrysanthemum is one of the largest genera in the Asteraceae family. Only a few Chrysanthemum expressed sequence tag (EST) sequences have been acquired to date, so the number of available EST-SSR markers is very low. Methodology/Principal Findings: Illumina paired-end sequencing technology produced over 53 million sequencing reads from C. nankingense mRNA. The subsequent de novo assembly yielded 70,895 unigenes, of which 45,789 (64.59%) showed similarity to sequences in the NCBI database. Out of these 45,789 sequences, 107 have hits to the Chrysanthemum Nr protein database; 679 and 277 sequences have hits to the databases of Helianthus and Lactuca species, respectively. MISA software identified a large number of putative EST-SSRs, allowing 1,788 primer pairs to be designed from the de novo transcriptome sequence and a further 363 from archival EST sequence. Among 100 randomly chosen primer pairs, 81 markers yielded amplicons and 20 were polymorphic for genotype analysis in Chrysanthemum. The results showed that most (but not all) of the assays were transferable across species and that they exposed a significant amount of allelic diversity. Conclusions/Significance: SSR markers acquired by transcriptome sequencing are potentially useful for marker-assisted breeding and genetic analysis in the genus Chrysanthemum and its related genera. PMID:23626799
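The SSR-detection step mentioned in this abstract can be illustrated with a minimal sketch. It is not MISA and does not reproduce its settings: the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are hypothetical choices made for illustration, and only perfect di- to hexanucleotide repeats are reported.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumed, not from the abstract).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_count in sorted(min_repeats.items()):
        # One motif of motif_len bases followed by at least min_count - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_count - 1))
        for m in pattern.finditer(sequence):
            motif = m.group(1)
            # Skip motifs that are a shorter unit repeated (e.g. "AA" or "ATAT").
            if any(motif == motif[:d] * (motif_len // d)
                   for d in range(1, motif_len) if motif_len % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(hits)


# Toy unigene: seven "AT" repeats and five "AGC" repeats embedded in flanking sequence.
toy = "GGC" + "AT" * 7 + "CCG" + "AGC" * 5 + "TT"
print(find_ssrs(toy))
```

Primer design around each hit, as described in the abstract, would be a separate step that this sketch does not attempt.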

  2. Genome Sequencing and Comparative Transcriptomics of the Model Entomopathogenic Fungi Metarhizium anisopliae and M. acridum

    Science.gov (United States)

    Shang, Yanfang; Duan, Zhibing; Hu, Xiao; Xie, Xue-Qin; Zhou, Gang; Peng, Guoxiong; Luo, Zhibing; Huang, Wei; Wang, Bing; Fang, Weiguo; Wang, Sibao; Zhong, Yi; Ma, Li-Jun; St. Leger, Raymond J.; Zhao, Guo-Ping; Pei, Yan; Feng, Ming-Guang; Xia, Yuxian; Wang, Chengshu

    2011-01-01

    Metarhizium spp. are being used as environmentally friendly alternatives to chemical insecticides, as model systems for studying insect-fungus interactions, and as a resource of genes for biotechnology. We present a comparative analysis of the genome sequences of the broad-spectrum insect pathogen Metarhizium anisopliae and the acridid-specific M. acridum. Whole-genome analyses indicate that the genome structures of these two species are highly syntenic and suggest that the genus Metarhizium evolved from plant endophytes or pathogens. Both M. anisopliae and M. acridum have a strikingly larger proportion of genes encoding secreted proteins than other fungi, while ∼30% of these have no functionally characterized homologs, suggesting hitherto unsuspected interactions between fungal pathogens and insects. The analysis of transposase genes provided evidence of repeat-induced point mutations occurring in M. acridum but not in M. anisopliae. With the help of pathogen-host interaction gene database, ∼16% of Metarhizium genes were identified that are similar to experimentally verified genes involved in pathogenicity in other fungi, particularly plant pathogens. However, relative to M. acridum, M. anisopliae has evolved with many expanded gene families of proteases, chitinases, cytochrome P450s, polyketide synthases, and nonribosomal peptide synthetases for cuticle-degradation, detoxification, and toxin biosynthesis that may facilitate its ability to adapt to heterogenous environments. Transcriptional analysis of both fungi during early infection processes provided further insights into the genes and pathways involved in infectivity and specificity. Of particular note, M. acridum transcribed distinct G-protein coupled receptors on cuticles from locusts (the natural hosts) and cockroaches, whereas M. anisopliae transcribed the same receptor on both hosts. This study will facilitate the identification of virulence genes and the development of improved biocontrol strains

  3. The Dynamic Genome and Transcriptome of the Human Fungal Pathogen Blastomyces and Close Relative Emmonsia.

    Directory of Open Access Journals (Sweden)

    José F Muñoz

    2015-10-01

    Full Text Available Three closely related thermally dimorphic pathogens are causal agents of major fungal diseases affecting humans in the Americas: blastomycosis, histoplasmosis and paracoccidioidomycosis. Here we report the genome sequence and analysis of four strains of the etiological agent of blastomycosis, Blastomyces, and two species of the related genus Emmonsia, typically pathogens of small mammals. Compared to related species, Blastomyces genomes are highly expanded, with long, often sharply demarcated tracts of low GC-content sequence. These GC-poor isochore-like regions are enriched for gypsy elements, are variable in total size between isolates, and are least expanded in the avirulent B. dermatitidis strain ER-3 as compared with the virulent B. gilchristii strain SLH14081. The lack of similar regions in related species suggests these isochore-like regions originated recently in the ancestor of the Blastomyces lineage. While gene content is highly conserved between Blastomyces and related fungi, we identified changes in copy number of genes potentially involved in host interaction, including proteases and characterized antigens. In addition, we studied gene expression changes of B. dermatitidis during the interaction of the infectious yeast form with macrophages and in a mouse model. Both experiments highlight a strong antioxidant defense response in Blastomyces, and upregulation of dioxygenases in vivo suggests that dioxide produced by antioxidants may be further utilized for amino acid metabolism. We identify a number of functional categories upregulated exclusively in vivo, such as secreted proteins, zinc acquisition proteins, and cysteine and tryptophan metabolism, which may include critical virulence factors missed before in in vitro studies. Across the dimorphic fungi, loss of certain zinc acquisition genes and differences in amino acid metabolism suggest unique adaptations of Blastomyces to its host environment. 
These results reveal the dynamics

  4. The Dynamic Genome and Transcriptome of the Human Fungal Pathogen Blastomyces and Close Relative Emmonsia.

    Science.gov (United States)

    Muñoz, José F; Gauthier, Gregory M; Desjardins, Christopher A; Gallo, Juan E; Holder, Jason; Sullivan, Thomas D; Marty, Amber J; Carmen, John C; Chen, Zehua; Ding, Li; Gujja, Sharvari; Magrini, Vincent; Misas, Elizabeth; Mitreva, Makedonka; Priest, Margaret; Saif, Sakina; Whiston, Emily A; Young, Sarah; Zeng, Qiandong; Goldman, William E; Mardis, Elaine R; Taylor, John W; McEwen, Juan G; Clay, Oliver K; Klein, Bruce S; Cuomo, Christina A

    2015-10-01

    Three closely related thermally dimorphic pathogens are causal agents of major fungal diseases affecting humans in the Americas: blastomycosis, histoplasmosis and paracoccidioidomycosis. Here we report the genome sequence and analysis of four strains of the etiological agent of blastomycosis, Blastomyces, and two species of the related genus Emmonsia, typically pathogens of small mammals. Compared to related species, Blastomyces genomes are highly expanded, with long, often sharply demarcated tracts of low GC-content sequence. These GC-poor isochore-like regions are enriched for gypsy elements, are variable in total size between isolates, and are least expanded in the avirulent B. dermatitidis strain ER-3 as compared with the virulent B. gilchristii strain SLH14081. The lack of similar regions in related species suggests these isochore-like regions originated recently in the ancestor of the Blastomyces lineage. While gene content is highly conserved between Blastomyces and related fungi, we identified changes in copy number of genes potentially involved in host interaction, including proteases and characterized antigens. In addition, we studied gene expression changes of B. dermatitidis during the interaction of the infectious yeast form with macrophages and in a mouse model. Both experiments highlight a strong antioxidant defense response in Blastomyces, and upregulation of dioxygenases in vivo suggests that dioxide produced by antioxidants may be further utilized for amino acid metabolism. We identify a number of functional categories upregulated exclusively in vivo, such as secreted proteins, zinc acquisition proteins, and cysteine and tryptophan metabolism, which may include critical virulence factors missed before in in vitro studies. Across the dimorphic fungi, loss of certain zinc acquisition genes and differences in amino acid metabolism suggest unique adaptations of Blastomyces to its host environment. 
These results reveal the dynamics of genome evolution

  5. Genomic and transcriptomic alterations following intergeneric hybridization and polyploidization in the Chrysanthemum nankingense×Tanacetum vulgare hybrid and allopolyploid (Asteraceae).

    Science.gov (United States)

    Qi, Xiangyu; Wang, Haibin; Song, Aiping; Jiang, Jiafu; Chen, Sumei; Chen, Fadi

    2018-01-01

    Allopolyploid formation involves two major events: interspecific hybridization and polyploidization. A number of species in the Asteraceae family are polyploids because of frequent hybridization. The effects of hybridization on genomics and transcriptomics in Chrysanthemum nankingense×Tanacetum vulgare hybrids have been reported. In this study, we obtained allopolyploids by applying a colchicine treatment to a synthesized C. nankingense × T. vulgare hybrid. Sequence-related amplified polymorphism (SRAP), methylation-sensitive amplification polymorphism (MSAP), and high-throughput RNA sequencing (RNA-Seq) technologies were used to investigate the genomic, epigenetic, and transcriptomic alterations in both the hybrid and allopolyploids. The genomic alterations in the hybrid and allopolyploids mainly involved the loss of parental fragments and the gain of novel fragments. The DNA methylation level of the hybrid was reduced by hybridization but was restored somewhat after polyploidization. There were more significant differences in gene expression between the hybrid/allopolyploid and the paternal parent than between the hybrid/allopolyploid and the maternal parent. Most differentially expressed genes (DEGs) showed down-regulation in the hybrid/allopolyploid relative to the parents. Among the non-additive genes, transgressive patterns appeared to be dominant, especially repression patterns. Maternal expression dominance was observed specifically for down-regulated genes. Many methylase and methyltransferase genes showed differential expression between the hybrid and parents and between the allopolyploid and parents. Our data indicate that hybridization may be a major factor affecting genomic and transcriptomic changes in newly formed allopolyploids. The formation of allopolyploids may not simply be the sum of hybridization and polyploidization changes but also may be influenced by the interaction between these processes.

  6. Characterizing the developmental transcriptome of the oriental fruit fly, Bactrocera dorsalis (Diptera: Tephritidae) through comparative genomic analysis with Drosophila melanogaster utilizing modENCODE datasets.

    Science.gov (United States)

    Geib, Scott M; Calla, Bernarda; Hall, Brian; Hou, Shaobin; Manoukis, Nicholas C

    2014-10-28

    The oriental fruit fly, Bactrocera dorsalis, is an important pest of fruit and vegetable crops throughout Asia, and is considered a high risk pest for establishment in the mainland United States. It is a member of the family Tephritidae, which are the most agriculturally important family of flies, and can be considered an out-group to well-studied members of the family Drosophilidae. Despite their importance as pests and their relatedness to Drosophila, little information is present on B. dorsalis transcripts and proteins. The objective of this paper is to comprehensively characterize the transcripts present throughout the life history of B. dorsalis and functionally annotate and analyse these transcripts relative to the presence, expression, and function of orthologous sequences present in Drosophila melanogaster. We present a detailed transcriptome assembly of B. dorsalis from egg through adult stages containing 20,666 transcripts across 10,799 unigene components. Utilizing data available through Flybase and the modENCODE project, we compared expression patterns of these transcripts to putative orthologs in D. melanogaster in terms of timing, abundance, and function. In addition, temporal expression patterns in B. dorsalis were characterized between stages, to establish the constitutive or stage-specific expression patterns of particular transcripts. A fully annotated transcriptome assembly is made available through NCBI, in addition to corresponding expression data. Through characterizing the transcriptome of B. dorsalis through its life history and comparing the transcriptome of B. dorsalis to the model organism D. melanogaster, a database has been developed that can be used as the foundation to functional genomic research in Bactrocera flies and help identify orthologous genes between B. dorsalis and D. melanogaster. This data provides the foundation for future functional genomic research that will focus on improving our understanding of the physiology and

  7. Fish-T1K (Transcriptomes of 1,000 Fishes) Project: large-scale transcriptome data for fish evolution studies.

    Science.gov (United States)

    Sun, Ying; Huang, Yu; Li, Xiaofeng; Baldwin, Carole C; Zhou, Zhuocheng; Yan, Zhixiang; Crandall, Keith A; Zhang, Yong; Zhao, Xiaomeng; Wang, Min; Wong, Alex; Fang, Chao; Zhang, Xinhui; Huang, Hai; Lopez, Jose V; Kilfoyle, Kirk; Zhang, Yong; Ortí, Guillermo; Venkatesh, Byrappa; Shi, Qiong

    2016-01-01

    Ray-finned fishes (Actinopterygii) represent more than 50% of extant vertebrates and are of great evolutionary, ecologic and economic significance, but they are relatively underrepresented in 'omics studies. Increased availability of transcriptome data for these species will allow researchers to better understand changes in gene expression, and to carry out functional analyses. An international project known as the "Transcriptomes of 1,000 Fishes" (Fish-T1K) project has been established to generate RNA-seq transcriptome sequences for 1,000 diverse species of ray-finned fishes. The first phase of this project has produced transcriptomes from more than 180 ray-finned fishes, representing 142 species and covering 51 orders and 109 families. Here we provide an overview of the goals of this project and the work done so far.

  8. BFAST: an alignment tool for large scale genome resequencing.

    Directory of Open Access Journals (Sweden)

    Nils Homer

    2009-11-01

    Full Text Available The new generation of massively parallel DNA sequencers, combined with the challenge of whole human genome resequencing, results in the need for rapid and accurate alignment of billions of short DNA sequence reads to a large reference genome. Speed is obviously of great importance, but equally important is maintaining alignment accuracy of short reads, in the 25-100 base range, in the presence of errors and true biological variation. We introduce a new algorithm specifically optimized for this task, as well as a freely available implementation, BFAST, which can align data produced by any of the current sequencing platforms, allows for user-customizable levels of speed and accuracy, supports paired end data, and provides for efficient parallel and multi-threaded computation on a computer cluster. The new method is based on creating flexible, efficient whole genome indexes to rapidly map reads to candidate alignment locations, with arbitrary multiple independent indexes allowed to achieve robustness against read errors and sequence variants. The final local alignment uses a Smith-Waterman method, with gaps to support the detection of small indels. We compare BFAST to a selection of large-scale alignment tools -- BLAT, MAQ, SHRiMP, and SOAP -- in terms of both speed and accuracy, using simulated and real-world datasets. We show BFAST can achieve substantially greater sensitivity of alignment in the context of errors and true variants, especially insertions and deletions, and minimize false mappings, while maintaining adequate speed compared to other current methods. We show BFAST can align the amount of data needed to fully resequence a human genome, one billion reads, with high sensitivity and accuracy, on a modest computer cluster in less than 24 hours. BFAST is available at http://bfast.sourceforge.net.
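The two-stage idea described in this abstract (an index proposes candidate locations, then a gapped local alignment scores them) can be sketched in a few lines. This is a toy illustration and not BFAST's implementation: the single exact k-mer index, the function names, the window padding, and the scoring values (match 2, mismatch -1, linear gap -2) are all assumptions made here.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def candidate_starts(read, index, k):
    """Use exact k-mer hits to propose reference offsets where the read may start."""
    starts = set()
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], []):
            starts.add(max(0, pos - offset))
    return sorted(starts)


def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for a local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_cell = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_cell = score[i][j], (i, j)
    # Trace back from the best-scoring cell until a zero score is reached.
    i, j = best_cell
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def align_read(read, reference, k=4, pad=2):
    """Return (score, window_start, aligned_read, aligned_ref) for the best candidate."""
    index = build_kmer_index(reference, k)
    best = (0, None, "", "")
    for start in candidate_starts(read, index, k):
        lo = max(0, start - pad)
        window = reference[lo:start + len(read) + pad]
        score, aln_read, aln_ref = smith_waterman(read, window)
        if score > best[0]:
            best = (score, lo, aln_read, aln_ref)
    return best


print(align_read("ACGTACGG", "TTTTACGTACGGTTTT"))
print(smith_waterman("AAAACCCC", "AAAATCCCC"))
```

The second call shows why the gapped step matters for indels: the extra base in the second sequence is absorbed by a single gap rather than forcing the alignment to stop.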

  9. Farewell to GBM-O: Genomic and transcriptomic profiling of glioblastoma with oligodendroglioma component reveals distinct molecular subgroups.

    Science.gov (United States)

    Hinrichs, Benjamin H; Newman, Scott; Appin, Christina L; Dunn, William; Cooper, Lee; Pauly, Rini; Kowalski, Jeanne; Rossi, Michael R; Brat, Daniel J

    2016-01-13

    Glioblastoma with oligodendroglioma component (GBM-O) was recognized as a histologic pattern of glioblastoma (GBM) by the World Health Organization (WHO) in 2007 and is distinguished by the presence of oligodendroglioma-like differentiation. To better understand the genetic underpinnings of this morphologic entity, we performed a genome-wide, integrated copy number, mutational and transcriptomic analysis of eight (seven primary, one secondary) cases. Three GBM-O samples had IDH1 (p.R132H) mutations; two of these also demonstrated 1p/19q co-deletion and had a proneural transcriptional profile, a molecular signature characteristic of oligodendroglioma. The additional IDH1 mutant tumor lacked 1p/19q co-deletion, harbored a TP53 mutation, and overall, demonstrated features most consistent with IDH mutant (secondary) GBM. Finally, five tumors were IDH wild-type (IDHwt) and had chromosome 7 gains, chromosome 10 losses, and homozygous 9p deletions (CDKN2A), alterations typical of IDHwt (primary) GBM. IDHwt GBM-Os also demonstrated EGFR and PDGFRA amplifications, which correlated with classical and proneural expression subtypes, respectively. Our findings demonstrate that GBM-O is composed of three discrete molecular subgroups with characteristic mutations, copy number alterations and gene expression patterns. Despite displaying areas that morphologically resemble oligodendroglioma, the current results indicate that morphologically defined GBM-O does not correspond to a particular genetic signature, but rather represents a collection of genetically dissimilar entities. Ancillary testing, especially for IDH and 1p/19q, should be used for determining these molecular subtypes.

  10. Whole transcriptomic and proteomic analyses of an isogenic M. tuberculosis clinical strain with a naturally occurring 15 Kb genomic deletion.

    Directory of Open Access Journals (Sweden)

    Carla Duncan

    Full Text Available Tuberculosis remains one of the most difficult to control infectious diseases in the world. Many different factors contribute to the complexity of this disease. These include the ability of the host to control the infection which may directly relate to nutritional status, presence of co-morbidities and genetic predisposition. Pathogen factors, in particular the ability of different Mycobacterium tuberculosis strains to respond to the harsh environment of the host granuloma, which includes low oxygen and nutrient availability and the presence of damaging radical oxygen and nitrogen species, also play an important role in the success of different strains to cause disease. In this study we evaluated the impact of a naturally occurring 12 gene 15 Kb genomic deletion on the physiology and virulence of M. tuberculosis. The strains denominated ON-A WT (wild type) and ON-A NM (natural mutant) were isolated from a previously reported TB outbreak in an inner city under-housed population in Toronto, Canada. Here we subjected these isogenic strains to transcriptomic (via RNA-seq) and proteomic analyses and identified several gene clusters with differential expression in the natural mutant, including the DosR regulon and the molybdenum cofactor biosynthesis genes, both of which were found in lower abundance in the natural mutant. We also demonstrated lesser virulence of the natural mutant in the guinea pig animal model. Overall, our findings suggest that the ON-A natural mutant is less fit to cause disease, but nevertheless has the potential to cause extended transmission in at-risk populations.

  11. Modeling Lactococcus lactis using a genome-scale flux model

    Directory of Open Access Journals (Sweden)

    Nielsen Jens

    2005-06-01

    Full Text Available Background: Genome-scale flux models are useful tools to represent and analyze microbial metabolism. In this work we reconstructed the metabolic network of the lactic acid bacterium Lactococcus lactis and developed a genome-scale flux model able to simulate and analyze network capabilities and whole-cell function under aerobic and anaerobic continuous cultures. Flux balance analysis (FBA) and minimization of metabolic adjustment (MOMA) were used as modeling frameworks. Results: The metabolic network was reconstructed using the annotated genome sequence from L. lactis ssp. lactis IL1403 together with physiological and biochemical information. The established network comprised a total of 621 reactions and 509 metabolites, representing the overall metabolism of L. lactis. Experimental data reported in the literature were used to fit the model to phenotypic observations. Regulatory constraints had to be included to simulate certain metabolic features, such as the shift from homo to heterolactic fermentation. A minimal medium for in silico growth was identified, indicating the requirement of four amino acids in addition to a sugar. Remarkably, de novo biosynthesis of four other amino acids was observed even when all amino acids were supplied, which is in good agreement with experimental observations. Additionally, enhanced metabolic engineering strategies for improved diacetyl producing strains were designed. Conclusion: The L. lactis metabolic network can now be used for a better understanding of lactococcal metabolic capabilities and potential, for the design of enhanced metabolic engineering strategies and for integration with other types of 'omic' data, to assist in finding new information on cellular organization and function.
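For readers unfamiliar with the two modeling frameworks named in this abstract, their standard textbook formulations can be written as below. This is a general sketch and not the authors' exact model: the symbols S (stoichiometric matrix), v (flux vector), c (objective weights), the flux bounds, and the reference flux vector v^wt are notation assumed here, and the matrix dimensions assume one row per metabolite and one column per reaction. Only the network size (509 metabolites, 621 reactions) comes from the abstract.

```latex
% Flux balance analysis (FBA): a linear program at steady state
\begin{aligned}
\max_{v \in \mathbb{R}^{621}} \quad & c^{\top} v \\
\text{subject to} \quad & S v = 0, \qquad S \in \mathbb{R}^{509 \times 621}, \\
& v^{\min} \le v \le v^{\max}.
\end{aligned}

% Minimization of metabolic adjustment (MOMA): a quadratic program
% that keeps the perturbed flux vector close to a reference state
\begin{aligned}
\min_{v \in \mathbb{R}^{621}} \quad & \lVert v - v^{\mathrm{wt}} \rVert_2^{2} \\
\text{subject to} \quad & S v = 0, \\
& v^{\min} \le v \le v^{\max}.
\end{aligned}
```

In this notation a perturbation such as a gene deletion is usually expressed by tightening the bounds of the affected reactions, for example setting both bounds to zero.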

  12. XTHs from Fragaria vesca: genomic structure and transcriptomic analysis in ripening fruit and other tissues.

    Science.gov (United States)

    Opazo, María Cecilia; Lizana, Rodrigo; Stappung, Yazmina; Davis, Thomas M; Herrera, Raúl; Moya-León, María Alejandra

    2017-11-07

    Fragaria vesca or 'woodland strawberry' has emerged as an attractive model for the study of ripening of non-climacteric fruit. It has several advantages, such as its small genome and its diploidy. The recent availability of the complete sequence of its genome opens the possibility for further analysis and its use as a reference species. Fruit softening is a physiological event and involves many biochemical changes that take place at the final stages of fruit development; among them, the remodeling of cell walls by the action of a set of enzymes. Xyloglucan endotransglycosylase/hydrolase (XTH) is a cell wall-associated enzyme, which is encoded by a multigene family. Its action modifies the structure of xyloglucans, a diverse group of polysaccharides that crosslink with cellulose microfibrils, thereby affecting the functional structure of the cell wall. The aim of this work is to identify the XTH-encoding genes present in F. vesca and to determine their transcription levels in ripening fruit. The search resulted in the identification of 26 XTH-encoding genes, named FvXTHs. Genetic structure and phylogenetic analyses were performed, allowing the classification of FvXTH genes into three phylogenetic groups: 17 in group I/II, 2 in group IIIA and 4 in group IIIB. Two sequences were included in the ancestral group. Through a comparative analysis, characteristic structural protein domains were found in FvXTH protein sequences. In addition, expression analyses of FvXTHs by qPCR were performed in fruit at different developmental and ripening stages, as well as in other tissues. The results showed a diverse expression pattern of FvXTHs in several tissues, although most of them are highly expressed in roots. Their expression patterns are not related to their respective phylogenetic groups. In addition, most FvXTHs are expressed in ripe fruit, and interestingly, some of them (FvXTH 18 and 20, belonging to phylogenetic group I/II, and FvXTH 25 and 26 to group IIIB) display an

  13. Comparative genomics and transcriptome analysis of Lactobacillus rhamnosus ATCC 11443 and the mutant strain SCT-10-10-60 with enhanced L-lactic acid production capacity.

    Science.gov (United States)

    Sun, Liang; Lu, Zhilong; Li, Jianxiu; Sun, Feifei; Huang, Ribo

    2018-02-01

    Mechanisms for high L-lactic acid production remain unclear in many bacteria. Lactobacillus rhamnosus SCT-10-10-60 was previously obtained from L. rhamnosus ATCC 11443 via mutagenesis and showed improved L-lactic acid production. In this study, the genomes of strains SCT-10-10-60 and ATCC 11443 were sequenced. Each genome is a circular chromosome, 2.99 Mb in length with a GC content of approximately 46.8%. Eight split genes were identified in strain SCT-10-10-60, including two LytR family transcriptional regulators, two Rex redox-sensing transcriptional repressors, and four ABC transporters. In total, 60 significantly up-regulated genes (log2 fold-change ≥ 2) and 39 significantly down-regulated genes (log2 fold-change ≤ -2) were identified by a transcriptome comparison between strains SCT-10-10-60 and ATCC 11443. KEGG pathway enrichment analysis revealed that "pyruvate metabolism" was significantly different (P < 0.05) between the two strains. The split genes and the differentially expressed genes involved in the "pyruvate metabolism" pathway are probably responsible for the increased L-lactic acid production by SCT-10-10-60. The genome and transcriptome sequencing information and comparison of SCT-10-10-60 with ATCC 11443 provide insights into the anabolism of L-lactic acid and a reference for improving L-lactic acid production using genetic engineering.
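
    The fold-change cutoffs quoted in this record (log2 fold-change ≥ 2 or ≤ -2) can be sketched as a simple filter. The gene names and expression values below are hypothetical, and the statistical significance test that the authors also applied is deliberately left out, so this shows only the thresholding step.

```python
import math


def log2_fold_change(mutant, reference):
    """log2 of the mutant-to-reference expression ratio."""
    return math.log2(mutant / reference)


def classify(expression, threshold=2.0):
    """Split genes into up- and down-regulated lists by |log2FC| >= threshold."""
    up, down = [], []
    for gene, (mutant, reference) in expression.items():
        lfc = log2_fold_change(mutant, reference)
        if lfc >= threshold:
            up.append(gene)
        elif lfc <= -threshold:
            down.append(gene)
    return up, down


# Hypothetical expression values: (mutant strain, reference strain)
expression = {
    "geneA": (400.0, 100.0),  # 4-fold higher  -> log2FC = 2
    "geneB": (25.0, 100.0),   # 4-fold lower   -> log2FC = -2
    "geneC": (150.0, 100.0),  # 1.5-fold higher, below the cutoff
}

up, down = classify(expression)
print(up, down)  # ['geneA'] ['geneB']
```

    A log2 cutoff of 2 corresponds to at least a four-fold change in either direction, which is why geneC is reported in neither list.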

  14. BISQUE: locus- and variant-specific conversion of genomic, transcriptomic and proteomic database identifiers.

    Science.gov (United States)

    Meyer, Michael J; Geske, Philip; Yu, Haiyuan

    2016-05-15

    Biological sequence databases are integral to efforts to characterize and understand biological molecules and share biological data. However, when analyzing these data, scientists are often left holding disparate biological currency: molecular identifiers from different databases. For downstream applications that require converting the identifiers themselves, there are many resources available, but analyzing associated loci and variants can be cumbersome if data are not given in a form amenable to particular analyses. Here we present BISQUE, a web server and customizable command-line tool for converting molecular identifiers and their contained loci and variants between different database conventions. BISQUE uses a graph traversal algorithm to generalize the conversion process for residues in the human genome, genes, transcripts and proteins, allowing for conversion across classes of molecules and in all directions through an intuitive web interface and a URL-based web service. BISQUE is freely available via the web using any major web browser (http://bisque.yulab.org/). Source code is available in a public GitHub repository (https://github.com/hyulab/BISQUE). Contact: haiyuan.yu@cornell.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  15. Low level genome mistranslations deregulate the transcriptome and translatome and generate proteotoxic stress in yeast

    Directory of Open Access Journals (Sweden)

    Paredes João A

    2012-06-01

    Full Text Available Abstract Background Organisms use highly accurate molecular processes to transcribe their genes and a variety of mRNA quality control and ribosome proofreading mechanisms to maintain the fidelity of genetic information flow. Despite this, low-level gene translational errors induced by mutations and environmental factors cause neurodegeneration and premature death in mice and mitochondrial disorders in humans. Paradoxically, such errors can generate advantageous phenotypic diversity in fungi and bacteria through poorly understood molecular processes. Results In order to clarify the biological relevance of gene translational errors we have engineered codon misreading in yeast and used profiling of total and polysome-associated mRNAs, together with molecular and biochemical tools, to characterize the recombinant cells. We demonstrate here that gene translational errors, which have negligible impact on yeast growth rate, down-regulate protein synthesis, activate the unfolded protein response and environmental stress response pathways, and down-regulate chaperones linked to ribosomes. Conclusions We provide the first global view of transcriptional and post-transcriptional responses to global gene translational errors and we postulate that they cause gradual cell degeneration through synergistic effects of overloading protein quality control systems and deregulation of protein synthesis, but generate adaptive phenotypes in unicellular organisms through activation of stress cross-protection. We conclude that these genome-wide gene translational infidelities can be degenerative or adaptive depending on cellular context and physiological condition.

  16. Comparative genomics and transcriptomics depict ericoid mycorrhizal fungi as versatile saprotrophs and plant mutualists.

    Science.gov (United States)

    Martino, Elena; Morin, Emmanuelle; Grelet, Gwen-Aëlle; Kuo, Alan; Kohler, Annegret; Daghino, Stefania; Barry, Kerrie W; Cichocki, Nicolas; Clum, Alicia; Dockter, Rhyan B; Hainaut, Matthieu; Kuo, Rita C; LaButti, Kurt; Lindahl, Björn D; Lindquist, Erika A; Lipzen, Anna; Khouja, Hassine-Radhouane; Magnuson, Jon; Murat, Claude; Ohm, Robin A; Singer, Steven W; Spatafora, Joseph W; Wang, Mei; Veneault-Fourrey, Claire; Henrissat, Bernard; Grigoriev, Igor V; Martin, Francis M; Perotto, Silvia

    2018-02-01

    Some soil fungi in the Leotiomycetes form ericoid mycorrhizal (ERM) symbioses with Ericaceae. In the harsh habitats in which they occur, ERM plant survival relies on nutrient mobilization from soil organic matter (SOM) by their fungal partners. The characterization of the fungal genetic machinery underpinning both the symbiotic lifestyle and SOM degradation is needed to understand ERM symbiosis functioning and evolution, and its impact on soil carbon (C) turnover. We sequenced the genomes of the ERM fungi Meliniomyces bicolor, M. variabilis, Oidiodendron maius and Rhizoscyphus ericae, and compared their gene repertoires with those of fungi with different lifestyles (ecto- and orchid mycorrhiza, endophytes, saprotrophs, pathogens). We also identified fungal transcripts induced in symbiosis. The ERM fungal gene contents for polysaccharide-degrading enzymes, lipases, proteases and enzymes involved in secondary metabolism are closer to those of saprotrophs and pathogens than to those of ectomycorrhizal symbionts. The fungal genes most highly upregulated in symbiosis are those coding for fungal and plant cell wall-degrading enzymes (CWDEs), lipases, proteases, transporters and mycorrhiza-induced small secreted proteins (MiSSPs). The ERM fungal gene repertoire reveals a capacity for a dual saprotrophic and biotrophic lifestyle. This may reflect an incomplete transition from saprotrophy to the mycorrhizal habit, or a versatile life strategy similar to fungal endophytes. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.

  17. Characterization of Fusobacterium varium Fv113-g1 isolated from a patient with ulcerative colitis based on complete genome sequence and transcriptome analysis.

    Directory of Open Access Journals (Sweden)

    Tsuyoshi Sekizuka

    Full Text Available Fusobacterium spp. present in the oral and gut flora are carcinogenic and are associated with the risk of pancreatic and colorectal cancers. Fusobacterium spp. are also implicated in a broad spectrum of human pathologies, including Crohn's disease and ulcerative colitis (UC). Here we report the complete genome sequence of Fusobacterium varium Fv113-g1 (genome size, 3.96 Mb) isolated from a patient with UC. Comparative genome analyses suggested that Fv113-g1 can be assigned to F. varium; in particular, it could be reclassified as a notable F. varium subspecies similar to F. ulcerans because of partially shared orthologs. Compared with the genome sequences of F. varium ATCC 27725 (genome size, 3.30 Mb) and other strains of Fusobacterium spp., Fv113-g1 possesses many accessory pan-genome sequences with noteworthy multiple virulence factors, including 44 autotransporters (type V secretion system, T5SS) and 13 Fusobacterium adhesion (FadA) paralogs involved in potential mucosal inflammation. Indeed, transcriptome analysis demonstrated that Fv113-g1-specific accessory genes, such as multiple T5SS and fadA paralogs, showed notably higher expression in D-MEM culture than in brain heart infusion broth. This implies that growth conditions may enhance the expression of such potential virulence factors, leading to remarkable survival against other gut microorganisms and to pathogenicity toward the human intestinal epithelium.

  18. Genome-wide transcriptome analysis of gametophyte development in Physcomitrella patens

    Directory of Open Access Journals (Sweden)

    Xiao Lihong

    2011-12-01

    Full Text Available Abstract Background Regulation of gene expression plays a pivotal role in controlling the development of multicellular plants. To explore the molecular mechanism of plant developmental-stage transition and cell-fate determination, a genome-wide analysis was undertaken of sequential developmental time-points and individual tissue types in the model moss Physcomitrella patens because of the short life cycle and relative structural simplicity of this plant. Results Gene expression was analyzed by digital gene expression tag profiling of samples taken from P. patens protonema at 3, 14 and 24 days, and from leafy shoot tissues at 30 days, after protoplast isolation, and from 14-day-old caulonemal and chloronemal tissues. In total, 4333 genes were identified as differentially displayed. Among these genes, 4129 were developmental-stage specific and 423 were preferentially expressed in either chloronemal or caulonemal tissues. Most of the differentially displayed genes were assigned to functions in organic substance and energy metabolism or macromolecule biosynthetic and catabolic processes based on gene ontology descriptions. In addition, some regulatory genes identified as candidates might be involved in controlling the developmental-stage transition and cell differentiation, namely MYB-like, HB-8, AL3, zinc finger family proteins, bHLH superfamily, GATA superfamily, GATA and bZIP transcription factors, protein kinases, genes related to protein/amino acid methylation, and auxin, ethylene, and cytokinin signaling pathways. Conclusions These genes that show highly dynamic changes in expression during development in P. patens are potential targets for further functional characterization and evolutionary developmental biology studies.

  19. Dissection of the inflammatory bowel disease transcriptome using genome-wide cDNA microarrays.

    Directory of Open Access Journals (Sweden)

    Christine M Costello

    2005-08-01

    Full Text Available BACKGROUND: The differential pathophysiologic mechanisms that trigger and maintain the two forms of inflammatory bowel disease (IBD), Crohn disease (CD) and ulcerative colitis (UC), are only partially understood. cDNA microarrays can be used to decipher gene regulation events at a genome-wide level and to identify novel unknown genes that might be involved in perpetuating inflammatory disease progression. METHODS AND FINDINGS: High-density cDNA microarrays representing 33,792 UniGene clusters were prepared. Biopsies were taken from the sigmoid colon of normal controls (n = 11), CD patients (n = 10) and UC patients (n = 10). 33P-radiolabeled cDNA from purified poly(A)+ RNA extracted from biopsies (unpooled) was hybridized to the arrays. We identified 500 and 272 transcripts differentially regulated in CD and UC, respectively. Interesting hits were independently verified by real-time PCR in a second sample of 100 individuals, and immunohistochemistry was used for exemplary localization. The main findings point to novel molecules important in abnormal immune regulation and the highly disturbed cell biology of colonic epithelial cells in IBD pathogenesis, e.g., CYLD (cylindromatosis, turban tumor syndrome) and CDH11 (cadherin 11, type 2). By the nature of the array setup, many of the genes identified were to our knowledge previously uncharacterized, and prediction of the putative function of a subsection of these genes indicates that some could be involved in early events in disease pathophysiology. CONCLUSION: A comprehensive set of candidate genes not previously associated with IBD was revealed, which underlines the polygenic and complex nature of the disease. It points out substantial differences in pathophysiology between CD and UC. The multiple unknown genes identified may stimulate new research in the fields of barrier mechanisms and cell signalling in the context of IBD, and ultimately new therapeutic approaches.

  20. Incorporating genomic, transcriptomic and clinical data: a prognostic and stem cell-like MYC and PRC imbalance in high-risk neuroblastoma.

    Science.gov (United States)

    Yang, Xinan Holly; Tang, Fangming; Shin, Jisu; Cunningham, John M

    2017-10-03

    Previous studies suggested that cancer cells possess traits reminiscent of the biological mechanisms ascribed to normal embryonic stem cells (ESCs) regulated by MYC and Polycomb repressive complex 2 (PRC2). Several poorly differentiated adult tumors showed preferentially high expression levels in targets of MYC, coincident with low expression levels in targets of PRC2. This paper reveals this ESC-like cancer signature in high-risk neuroblastoma (HR-NB), the most common extracranial solid tumor in children. We systematically assembled genomic variants, gene expression changes, a priori knowledge of gene functions, and clinical outcomes to identify prognostic multigene signatures. First, we assigned a new, individualized prognostic index using the relative expressions between the poor- and good-outcome signature genes. We then characterized HR-NB aggressiveness beyond these prognostic multigene signatures through the imbalanced effects of MYC and PRC2 signaling. We further analyzed retinoic acid (RA)-induced HR-NB cells to model tumor cell differentiation. Finally, we performed in vitro validation on ZFHX3, a cell differentiation marker silenced by PRC2, and compared cell morphology changes before and after blocking PRC2 in HR-NB cells. A significant concurrence existed between exons with verified variants and genes showing MYCN-dependent expression in HR-NB. From these biomarker candidates, we identified two novel prognostic gene-set pairs with multi-scale oncogenic defects. Intriguingly, MYC targets over-represented an unfavorable component of the identified prognostic signatures while PRC2 targets over-represented a favorable component. The cell cycle arrest and neuronal differentiation marker ZFHX3 was identified as one of the PRC2-silenced tumor suppressor candidates. Blocking PRC2 reduced tumor cell growth and increased the mRNA expression levels of ZFHX3 in an early treatment stage. This hypothesis-driven systems bioinformatics work offered novel insights into

  1. Transcriptional Slippage and RNA Editing Increase the Diversity of Transcripts in Chloroplasts: Insight from Deep Sequencing of Vigna radiata Genome and Transcriptome.

    Directory of Open Access Journals (Sweden)

    Ching-Ping Lin

    Full Text Available We performed deep sequencing of the nuclear and organellar genomes of three mungbean genotypes: Vigna radiata ssp. sublobata TC1966, V. radiata var. radiata NM92 and the recombinant inbred line RIL59 derived from a cross between TC1966 and NM92. Moreover, we performed deep sequencing of the RIL59 transcriptome to investigate transcript variability. The mungbean chloroplast genome has a quadripartite structure including a pair of inverted repeats separated by two single copy regions. A total of 213 simple sequence repeats were identified in the chloroplast genomes of NM92 and RIL59; 78 single nucleotide variants and nine indels were discovered in comparing the chloroplast genomes of TC1966 and NM92. Analysis of the mungbean chloroplast transcriptome revealed mRNAs that were affected by transcriptional slippage and RNA editing. Transcriptional slippage frequency was positively correlated with the length of simple sequence repeats of the mungbean chloroplast genome (R² = 0.9911). In total, 41 C-to-U editing sites were found in 23 chloroplast genes and in one intergenic spacer. No editing site that swapped U to C was found. A combination of bioinformatics and experimental methods revealed that the plastid-encoded RNA polymerase-transcribed genes psbF and ndhA are affected by transcriptional slippage in mungbean and in main lineages of land plants, including three dicots (Glycine max, Brassica rapa, and Nicotiana tabacum), two monocots (Oryza sativa and Zea mays), two gymnosperms (Pinus taeda and Ginkgo biloba) and one moss (Physcomitrella patens). Transcript analysis of the rps2 gene showed that transcriptional slippage could affect transcripts at single sequence repeat regions with poly-A runs. It showed that transcriptional slippage together with incomplete RNA editing may cause sequence diversity of transcripts in chloroplasts of land plants.
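
    The correlation reported in this record (R² = 0.9911 between slippage frequency and repeat length) is a coefficient of determination. As a rough illustration of how such a value is computed, the sketch below uses made-up repeat lengths and slippage frequencies; none of the numbers come from the study, and the function name is an assumption.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical data: repeat length (nt) and observed slippage frequency
ssr_length = [8, 9, 10, 11, 12]
slippage_frequency = [0.01, 0.02, 0.04, 0.05, 0.07]

print(round(r_squared(ssr_length, slippage_frequency), 3))
```

    A value close to 1 means that repeat length explains almost all of the variance in slippage frequency under a linear fit; it does not by itself establish the direction or mechanism of the relationship.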

  2. Population genetics, phylogenomics and hybrid speciation of Juglans in China determined from whole chloroplast genomes, transcriptomes, and genotyping-by-sequencing (GBS).

    Science.gov (United States)

    Zhao, Peng; Zhou, Hui-Juan; Potter, Daniel; Hu, Yi-Heng; Feng, Xiao-Jia; Dang, Meng; Feng, Li; Zulfiqar, Saman; Liu, Wen-Zhe; Zhao, Gui-Fang; Woeste, Keith

    2018-04-18

    Genomic data are a powerful tool for elucidating the processes involved in the evolution and divergence of species. The speciation and phylogenetic relationships among Chinese Juglans remain unclear. Here, we used results from phylogenomic and population genetic analyses, transcriptomics, Genotyping-By-Sequencing (GBS), and whole chloroplast genomes (Cp genome) data to infer processes of lineage formation among the five native Chinese species of the walnut genus (Juglans, Juglandaceae), a widespread, economically important group. We found that the processes of isolation generated diversity during glaciations, but that the recent range expansion of J. regia, probably from multiple refugia, led to hybrid formation both within and between sections of the genus. In southern China, human dispersal of J. regia brought it into contact with J. sigillata, which we determined to be an ecotype of J. regia that is now maintained as a landrace. In northern China, walnut hybridized with a distinct lineage of J. mandshurica to form J. hopeiensis, a controversial taxon (considered threatened) that our data indicate is a horticultural variety. Comparisons among whole chloroplast genomes and nuclear transcriptome analyses provided conflicting evidence for the timing of the divergence of Chinese Juglans taxa. J. cathayensis and J. mandshurica are poorly differentiated based on our genomic data. Reconstruction of Juglans evolutionary history indicates that episodes of climatic variation over the past 4.5 to 33.80 million years, associated with glacial advances and retreats and population isolation, have shaped Chinese walnut demography and evolution, even in the presence of gene flow and introgression. Copyright © 2018 Elsevier Inc. All rights reserved.

  3. The genomic and transcriptomic basis of the potential of Lactobacillus plantarum A6 to improve the nutritional quality of a cereal based fermented food.

    Science.gov (United States)

    Turpin, Williams; Weiman, Marion; Guyot, Jean-Pierre; Lajus, Aurélie; Cruveiller, Stéphane; Humblot, Christèle

    2018-02-02

    The objective of this work was to investigate the nutritional potential of Lactobacillus plantarum A6 in a food matrix using next generation sequencing. To this end, we characterized the genome of the A6 strain for a complete overview of its potential. We then compared its transcriptome when grown in a food matrix made from pearl millet to its transcriptome when cultivated in a laboratory medium. Genomic comparison of the strain L. plantarum A6 with the strains WCFS1, ST-III, JDM1 and ATCC14917 led to the identification of five regions of genomic plasticity. More specifically, 362 coding sequences, mostly annotated as coding for proteins of unknown function, were specific to L. plantarum A6. A total of 1201 genes were significantly differentially expressed between the laboratory medium and the food matrix. Among them, 821 genes were up-regulated in the food matrix compared to the laboratory medium, representing 23% of all genomic objects. In the laboratory medium, the expression of 380 genes, representing 11% of all genomic objects, was at least double that in the food matrix. Genes encoding important functions for the nutritional quality of the food were identified. Considering its efficiency as an amylolytic strain, we investigated all genes involved in carbohydrate metabolism, paying particular attention to starch metabolism. An extracellular alpha amylase, a neopullulanase and maltodextrin transporters were identified, all of which were highly expressed in the food matrix. In addition, genes involved in alpha-galactoside metabolism were identified, but only two of them were more strongly induced in the food matrix than in the laboratory medium. This may be because alpha-galactosides were already eliminated during soaking. Different biosynthetic pathways involved in the synthesis of B vitamins (folate, riboflavin, and cobalamin) were identified. They point to a potential for vitamin synthesis, which should be confirmed through biochemical analysis in further work
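
    The counts and percentages in this record can be cross-checked with a little arithmetic. The sketch below assumes both percentages were rounded to the nearest whole number and asks which totals of "genomic objects" would be compatible with each; the record itself does not state that total, so the resulting range is a derived estimate, not a reported figure.

```python
def implied_total_range(count, rounded_percent):
    """Range of totals for which count/total rounds to rounded_percent."""
    low = count / ((rounded_percent + 0.5) / 100)
    high = count / ((rounded_percent - 0.5) / 100)
    return low, high


up_in_food = 821      # genes up-regulated in the food matrix (23%)
higher_in_lab = 380   # genes at least doubled in laboratory medium (11%)

# The two groups should add up to the reported 1201 differential genes.
total_de = up_in_food + higher_in_lab

food_low, food_high = implied_total_range(up_in_food, 23)
lab_low, lab_high = implied_total_range(higher_in_lab, 11)

# Totals compatible with both rounded percentages at once.
overlap = (max(food_low, lab_low), min(food_high, lab_high))
consistent = overlap[0] <= overlap[1]

print(total_de, consistent, [round(x) for x in overlap])
```

    Under the rounding assumption the two percentages are mutually consistent and imply a total of roughly 3,500 to 3,600 genomic objects.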

  4. Genomic and transcriptomic alterations in Leishmania donovani lines experimentally resistant to antileishmanial drugs.

    Science.gov (United States)

    Rastrojo, Alberto; García-Hernández, Raquel; Vargas, Paola; Camacho, Esther; Corvo, Laura; Imamura, Hideo; Dujardin, Jean-Claude; Castanys, Santiago; Aguado, Begoña; Gamarro, Francisco; Requena, Jose M

    2018-04-13

    Leishmaniasis is a serious medical issue in many countries around the world, but it remains largely neglected in terms of research investment for developing new control and treatment measures. No vaccines exist for human use, and the chemotherapeutic agents currently used are scarce. Furthermore, for some drugs, resistance and treatment failure are increasing to alarming levels. The aim of this work was to identify genomic and transcriptomic alterations associated with experimental resistance against the common drugs used against visceral leishmaniasis (VL): trivalent antimony (SbIII, S line), amphotericin B (AmB, A line), miltefosine (MIL, M line) and paromomycin (PMM, P line). A total of 1006 differentially expressed transcripts were identified in the S line, 379 in the A line, 146 in the M line, and 129 in the P line. Also, changes in ploidy of chromosomes and amplification/deletion of particular regions were observed in the resistant lines relative to the parental line. A series of genes were identified as possible drivers of the resistance phenotype and were validated in both promastigotes and amastigotes from Leishmania donovani, Leishmania infantum and Leishmania major species. Remarkably, a deletion of the gene LinJ.36.2510 (coding for 24-sterol methyltransferase, SMT) was found to be associated with AmB resistance in the A line. In the P line, a dramatic overexpression of the transcripts LinJ.27.T1940 and LinJ.27.T1950 that results from a massive amplification of the collinear genes was suggested as one of the mechanisms of PMM resistance. This conclusion was reinforced after transfection experiments in which significant PMM resistance was generated in WT parasites over-expressing either gene LinJ.27.1940 (coding for a D-lactate dehydrogenase-like protein, D-LDH) or gene LinJ.27.1950 (coding for an aminotransferase of branched-chain amino acids, BCAT). This work allowed the identification of new drivers, like SMT, the deletion of which is associated with resistance to AmB, and the tandem D

  5. Revealing less derived nature of cartilaginous fish genomes with their evolutionary time scale inferred with nuclear genes.

    Directory of Open Access Journals (Sweden)

    Adina J Renz

    Full Text Available Cartilaginous fishes, divided into Holocephali (chimaeras) and Elasmobranchii (sharks, rays and skates), occupy a key phylogenetic position among extant vertebrates in reconstructing their evolutionary processes. Their accurate evolutionary time scale is indispensable for better understanding of the relationship between phenotypic and molecular evolution of cartilaginous fishes. However, our current knowledge on the time scale of cartilaginous fish evolution largely relies on estimates using mitochondrial DNA sequences. In this study, making the best use of the still partial, but large-scale sequencing data of cartilaginous fish species, we estimate the divergence times between the major cartilaginous fish lineages employing nuclear genes. By rigorous orthology assessment based on available genomic and transcriptomic sequence resources for cartilaginous fishes, we selected 20 protein-coding genes in the nuclear genome, spanning 2973 amino acid residues. Our analysis based on Bayesian inference resulted in a mean divergence time of 421 Ma, the late Silurian, for the Holocephali-Elasmobranchii split, and 306 Ma, the late Carboniferous, for the split between sharks and rays/skates. By applying these results and other documented divergence times, we measured the relative evolutionary rate of the Hox A cluster sequences in the cartilaginous fish lineages, which resulted in a substitution rate lower by a factor of at least 2.4 in comparison to tetrapod lineages. The obtained time scale enables mapping phenotypic and molecular changes in a quantitative framework. It is of great interest to corroborate the less derived nature of cartilaginous fish at the molecular level as a genome-wide phenomenon.

  6. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    International Nuclear Information System (INIS)

    Arneodo, Alain; Vaillant, Cedric; Audit, Benjamin; Argoul, Francoise; D'Aubenton-Carafa, Yves; Thermes, Claude

    2011-01-01

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  7. Analysis of Aspergillus nidulans metabolism at the genome-scale

    DEFF Research Database (Denmark)

    David, Helga; Ozcelik, İlknur Ş; Hofmann, Gerald

    2008-01-01

    ... a function. Results: In this work, we have manually assigned functions to 472 orphan genes in the metabolism of A. nidulans, by using a pathway-driven approach and by employing comparative genomics tools based on sequence similarity. The central metabolism of A. nidulans, as well as biosynthetic pathways of relevant secondary metabolites, was reconstructed based on detailed metabolic reconstructions available for A. niger and Saccharomyces cerevisiae, and information on the genetics, biochemistry and physiology of A. nidulans. Thereby, it was possible to identify metabolic functions without a gene associated, in an objective and systematic manner. The functional assignments served as a basis to develop a mathematical model, linking 666 genes (both previously and newly annotated) to metabolic roles. The model was used to simulate metabolic behavior and additionally to integrate, analyze and interpret large-scale gene...

  8. Incorporating Protein Biosynthesis into the Saccharomyces cerevisiae Genome-scale Metabolic Model

    DEFF Research Database (Denmark)

    Olivares Hernandez, Roberto

    Based on stoichiometric biochemical equations that occur in the cell, the genome-scale metabolic models can quantify the metabolic fluxes, which are regarded as the final representation of the physiological state of the cell. For Saccharomyces cerevisiae the genome-scale model has been construc...

  9. Genomic, Transcriptomic and Metabolomic Studies of Two Well-Characterized, Laboratory-Derived Vancomycin-Intermediate Staphylococcus aureus Strains Derived from the Same Parent Strain

    Directory of Open Access Journals (Sweden)

    Dipti S. Hattangady

    2015-02-01

    Full Text Available Complete genome comparisons, transcriptomic and metabolomic studies were performed on two laboratory-selected, well-characterized vancomycin-intermediate Staphylococcus aureus (VISA) strains derived from the same parent MRSA that have changes in cell wall composition and decreased autolysis. A variety of mutations were found in the VISA, with more in strain 13136p−m+V20 (vancomycin MIC = 16 µg/mL) than strain 13136p−m+V5 (MIC = 8 µg/mL). Most of the mutations have not previously been associated with the VISA phenotype; some were associated with cell wall metabolism and many with stress responses, notably relating to DNA damage. The genomes and transcriptomes of the two VISA support the importance of gene expression regulation to the VISA phenotype. Similarities in overall transcriptomic and metabolomic data indicated that the VISA physiologic state includes elements of the stringent response, such as downregulation of protein and nucleotide synthesis, the pentose phosphate pathway and nutrient transport systems. Gene expression for secreted virulence determinants was generally downregulated, but was more variable for surface-associated virulence determinants, although capsule formation was clearly inhibited. The importance of activated stress response elements could be seen across all three analyses, as in the accumulation of osmoprotectant metabolites such as proline and glutamate. Concentrations of potential cell wall precursor amino acids and glucosamine were increased in the VISA strains. Polyamines were decreased in the VISA, which may facilitate the accrual of mutations. Overall, the studies confirm the wide variability in mutations and gene expression patterns that can lead to the VISA phenotype.

  10. Optimizing Hybrid de Novo Transcriptome Assembly and Extending Genomic Resources for Giant Freshwater Prawns (Macrobrachium rosenbergii): The Identification of Genes and Markers Associated with Reproduction

    Directory of Open Access Journals (Sweden)

    Hyungtaek Jung

    2016-05-01

    Full Text Available The giant freshwater prawn, Macrobrachium rosenbergii, a sexually dimorphic decapod crustacean, is currently the world’s most economically important cultured freshwater crustacean species. Despite its economic importance, there is currently a lack of genomic resources available for this species, and this has limited exploration of the molecular mechanisms that control the M. rosenbergii sex-differentiation system more widely in freshwater prawns. Here, we present the first hybrid transcriptome from M. rosenbergii applying RNA-Seq technologies directed at identifying genes that have potential functional roles in reproductive-related traits. A total of 13,733,210 combined raw reads (1720 Mbp) were obtained from Ion-Torrent PGM and 454 FLX. Bioinformatic analyses based on three state-of-the-art assemblers, the CLC Genomic Workbench, Trans-ABySS, and Trinity, that use single and multiple k-mer methods respectively, were used to analyse the data. The influence of multiple k-mers on assembly performance was assessed to gain insight into transcriptome assembly from short reads. After optimisation, de novo assembly resulted in 44,407 contigs with a mean length of 437 bp, and the assembled transcripts were further functionally annotated to detect single nucleotide polymorphisms and simple sequence repeat motifs. Gene expression analysis was also used to compare expression patterns from ovary and testis tissue libraries to identify genes with potential roles in reproduction and sex differentiation. The large transcript set assembled here represents the most comprehensive set of transcriptomic resources ever developed for reproduction traits in M. rosenbergii, and the large number of genetic markers predicted should constitute an invaluable resource for future genetic research studies on M. rosenbergii and can be applied more widely on other freshwater prawn species in the genus Macrobrachium.

  11. Large-scale parallel genome assembler over cloud computing environment.

    Science.gov (United States)

    Das, Arghya Kusum; Koppa, Praveen Kumar; Goswami, Sayan; Platania, Richard; Park, Seung-Jong

    2017-06-01

    The size of high throughput DNA sequencing data has already reached the terabyte scale. To manage this huge volume of data, many downstream sequencing applications started using locality-based computing over different cloud infrastructures to take advantage of elastic (pay as you go) resources at a lower cost. However, the locality-based programming model (e.g. MapReduce) is relatively new. Consequently, developing scalable data-intensive bioinformatics applications using this model and understanding the hardware environment that these applications require for good performance, both require further research. In this paper, we present a de Bruijn graph oriented Parallel Giraph-based Genome Assembler (GiGA), as well as the hardware platform required for its optimal performance. GiGA uses the power of Hadoop (MapReduce) and Giraph (large-scale graph analysis) to achieve high scalability over hundreds of compute nodes by collocating the computation and data. GiGA achieves significantly higher scalability with competitive assembly quality compared to contemporary parallel assemblers (e.g. ABySS and Contrail) over traditional HPC cluster. Moreover, we show that the performance of GiGA is significantly improved by using an SSD-based private cloud infrastructure over traditional HPC cluster. We observe that the performance of GiGA on 256 cores of this SSD-based cloud infrastructure closely matches that of 512 cores of traditional HPC cluster.
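
    The de Bruijn graph that this record refers to can be illustrated with a minimal single-machine sketch. It is not GiGA's implementation and uses neither Hadoop nor Giraph; the function name build_de_bruijn, the toy reads and the choice k=3 are assumptions made only for illustration. Each k-mer in a read contributes one edge from its (k-1)-mer prefix to its (k-1)-mer suffix.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the list of (k-1)-mers that follow it in some read.

    Hypothetical helper for illustration; edges are kept with multiplicity,
    so a k-mer seen in two reads yields two identical edges.
    """
    graph = defaultdict(list)
    for read in reads:
        # Slide a window of length k over the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])  # prefix node -> suffix node
    return dict(graph)


# Toy input: two overlapping reads (invented data).
reads = ["ACGTC", "CGTCA"]
graph = build_de_bruijn(reads, k=3)
```

    In a distributed setting the per-read k-mer extraction would be the part that parallelizes naturally, since each read can be processed independently before edges are grouped by node.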

  12. Local adaptation at the transcriptome level in brown trout: Evidence from early life history temperature genomic reaction norms

    DEFF Research Database (Denmark)

    Meier, Kristian; Hansen, Michael Møller; Normandeau, Eric

    2014-01-01

    Local adaptation and its underlying molecular basis has long been a key focus in evolutionary biology. There has recently been increased interest in the evolutionary role of plasticity and the molecular mechanisms underlying local adaptation. Using transcriptome analysis, we assessed differences....... These included genes involved in immune- and stress response. We observed less plasticity in the resident as compared to the anadromous populations, possibly reflecting that the degree of environmental heterogeneity encountered by individuals throughout their life cycle will select for variable level...... of phenotypic plasticity at the transcriptome level. Our study demonstrates the usefulness of transcriptome approaches to identify genes with different temperature reaction norms. The responses observed suggest that populations may vary in their susceptibility to climate change....

  13. Genome scale metabolic network reconstruction of Spirochaeta cellobiosiphila

    Directory of Open Access Journals (Sweden)

    Bharat Manna

    2017-10-01

    Full Text Available Substantial rise in the global energy demand is one of the biggest challenges in this century. Environmental pollution due to rapid depletion of the fossil fuel resources and its alarming impact on the climate change and Global Warming have motivated researchers to look for non-petroleum-based sustainable, eco-friendly, renewable, low-cost energy alternatives, such as biofuel. Lignocellulosic biomass is one of the most promising bio-resources with huge potential to contribute to this worldwide energy demand. However, the complex organization of the Cellulose, Hemicellulose and Lignin in the Lignocellulosic biomass requires extensive pre-treatment and enzymatic hydrolysis followed by fermentation, raising overall production cost of biofuel. This encourages researchers to design cost-effective approaches for the production of second generation biofuels. The products from enzymatic hydrolysis of cellulose are mostly glucose monomer or cellobiose unit that are subjected to fermentation. Spirochaeta genus is a well-known group of obligate or facultative anaerobes, living primarily on carbohydrate metabolism. Spirochaeta cellobiosiphila sp. is a facultative anaerobe under this genus, which uses a variety of monosaccharides and disaccharides as energy sources. However, most rapid growth occurs on cellobiose and fermentation yields significant amount of ethanol, acetate, CO2, H2 and small amounts of formate. It is predicted to be promising microbial machinery for industrial fermentation processes for biofuel production. The metabolic pathways that govern cellobiose metabolism in Spirochaeta cellobiosiphila are yet to be explored. The function annotation of the genome sequence of Spirochaeta cellobiosiphila is in progress. In this work we aim to map all the metabolic activities for reconstruction of genome-scale metabolic model of Spirochaeta cellobiosiphila.

  14. Babelomics: an integrative platform for the analysis of transcriptomics, proteomics and genomic data with advanced functional profiling

    Science.gov (United States)

    Medina, Ignacio; Carbonell, José; Pulido, Luis; Madeira, Sara C.; Goetz, Stefan; Conesa, Ana; Tárraga, Joaquín; Pascual-Montano, Alberto; Nogales-Cadenas, Ruben; Santoyo, Javier; García, Francisco; Marbà, Martina; Montaner, David; Dopazo, Joaquín

    2010-01-01

    Babelomics is a response to the growing necessity of integrating and analyzing different types of genomic data in an environment that allows an easy functional interpretation of the results. Babelomics includes a complete suite of methods for the analysis of gene expression data that include normalization (covering most commercial platforms), pre-processing, differential gene expression (case-controls, multiclass, survival or continuous values), predictors, clustering; large-scale genotyping assays (case controls and TDTs, and allows population stratification analysis and correction). All these genomic data analysis facilities are integrated and connected to multiple options for the functional interpretation of the experiments. Different methods of functional enrichment or gene set enrichment can be used to understand the functional basis of the experiment analyzed. Many sources of biological information, which include functional (GO, KEGG, Biocarta, Reactome, etc.), regulatory (Transfac, Jaspar, ORegAnno, miRNAs, etc.), text-mining or protein–protein interaction modules can be used for this purpose. Finally a tool for the de novo functional annotation of sequences has been included in the system. This provides support for the functional analysis of non-model species. Mirrors of Babelomics or command line execution of their individual components are now possible. Babelomics is available at http://www.babelomics.org. PMID:20478823

  15. Metabolite coupling in genome-scale metabolic networks

    Directory of Open Access Journals (Sweden)

    Palsson Bernhard Ø

    2006-03-01

    Full Text Available Background: Biochemically detailed stoichiometric matrices have now been reconstructed for various bacteria, yeast, and for the human cardiac mitochondrion based on genomic and proteomic data. These networks have been manually curated based on legacy data and elementally and charge balanced. Comparative analysis of these well curated networks is now possible. Pairs of metabolites often appear together in several network reactions, linking them topologically. This co-occurrence of pairs of metabolites in metabolic reactions is termed herein "metabolite coupling." These metabolite pairs can be directly computed from the stoichiometric matrix, S. Metabolite coupling is derived from the matrix Ŝ Ŝ^T (Ŝ multiplied by its transpose), whose off-diagonal elements indicate the number of reactions in which any two metabolites participate together, where Ŝ is the binary form of S. Results: Metabolite coupling in the studied networks was found to be dominated by a relatively small group of highly interacting pairs of metabolites. As would be expected, metabolites with high individual metabolite connectivity also tended to be those with the highest metabolite coupling, as the most connected metabolites couple more often. For metabolite pairs that are not highly coupled, we show that the number of reactions a pair of metabolites shares across a metabolic network closely approximates a line on a log-log scale. We also show that the preferential coupling of two metabolites with each other is spread across the spectrum of metabolites and is not unique to the most connected metabolites. We provide a measure for determining which metabolite pairs couple more often than would be expected based on their individual connectivity in the network and show that these metabolites often derive their principal biological functions from existing in pairs. Thus, analysis of metabolite coupling provides information beyond that which is found from studying the individual connectivity of individual
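
    The coupling matrix described in this abstract can be computed in a few lines. The sketch below is only an illustration: the function name metabolite_coupling and the three-metabolite toy network are invented, and rows of S are assumed to be metabolites and columns reactions. S is first binarized (any non-zero coefficient becomes 1) and then multiplied by its own transpose.

```python
def metabolite_coupling(S):
    """Return the product of the binarized matrix with its transpose.

    S is a list of rows, one row per metabolite and one column per reaction
    (an assumed layout). Entry [i][j] of the result counts the reactions in
    which metabolites i and j both take part.
    """
    # Binary form of S: 1 where the metabolite participates, else 0.
    S_bin = [[1 if coeff != 0 else 0 for coeff in row] for row in S]
    n = len(S_bin)
    return [
        [sum(a * b for a, b in zip(S_bin[i], S_bin[j])) for j in range(n)]
        for i in range(n)
    ]


# Invented toy network with metabolites A, B, C and three reactions:
# R1: A -> B    R2: A + B -> C    R3: B -> C
S = [
    [-1, -1,  0],  # A
    [ 1, -1, -1],  # B
    [ 0,  1,  1],  # C
]
coupling = metabolite_coupling(S)
```

    For this toy network, A and B share two reactions (R1 and R2) while A and C share one (R2). The diagonal entries are the number of reactions each metabolite takes part in, which corresponds to the individual connectivity mentioned in the abstract.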

  16. A genome-wide transcriptome map of pistachio (Pistacia vera L.) provides novel insights into salinity-related genes and marker discovery.

    Science.gov (United States)

    Moazzzam Jazi, Maryam; Seyedi, Seyed Mahdi; Ebrahimie, Esmaeil; Ebrahimi, Mansour; De Moro, Gianluca; Botanga, Christopher

    2017-08-17

    Pistachio (Pistacia vera L.) is one of the most important commercial nut crops worldwide. It is a salt-tolerant and long-lived tree, with the largest cultivation area in Iran. Climate change and subsequent increased soil salt content have adversely affected the pistachio yield in recent years. However, the lack of genomic/global transcriptomic sequences on P. vera impedes comprehensive research at the molecular level. Hence, whole transcriptome sequencing is required to gain insight into functional genes and pathways in response to salt stress. RNA sequencing of a pooled sample representing 24 different tissues of two pistachio cultivars with contrasting salinity tolerance under control and salt treatment by the Illumina HiSeq 2000 platform resulted in 368,953,262 clean 100 bp paired-end reads (90 Gb). After creating several assemblies and assessing their quality from multiple perspectives, we found that using the annotation-based metrics together with the length-based parameters allows an improved assessment of the transcriptome assembly quality, compared to the sole use of the length-based parameters. The assembly generated by Trinity was adopted for functional annotation and subsequent analyses. In total, 29,119 contigs were annotated against all five public databases, including NR, UniProt, TAIR10, KOG and InterProScan. Among 279 KEGG pathways supported by our assembly, we further examined the pathways involved in plant hormone biosynthesis and signaling as well as those contributing to secondary metabolite biosynthesis due to their importance under salinity stress. In total, 11,337 SSRs were also identified, with the most abundant being dinucleotide repeats. In addition, 13,097 transcripts were identified as candidate stress-responsive genes. Expression of some of these genes was experimentally validated through quantitative real-time PCR (qRT-PCR), which further confirmed the accuracy of the assembly. From this analysis, the contrasting expression pattern
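
    The record does not say which tool was used to identify the SSRs. As a rough illustration of what detecting a dinucleotide repeat involves, the sketch below scans a sequence with a regular expression. The function name find_ssrs, the threshold of six repeat units and the test sequence are all assumptions, and the input is assumed to be uppercase A/C/G/T only.

```python
import re


def find_ssrs(seq, motif_len=2, min_repeats=6):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    Hypothetical helper: motif_len=2 targets dinucleotide repeats, and
    min_repeats is an illustrative threshold, not one taken from the study.
    """
    # One captured motif followed by at least (min_repeats - 1) more copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Invented test sequence: seven AT units flanked by non-repetitive bases.
sequence = "GGC" + "AT" * 7 + "CCG"
ssrs = find_ssrs(sequence)
```

    Changing motif_len to 3 or more would look for tri- and longer-unit repeats with the same pattern, although dedicated SSR tools apply separate thresholds per motif length.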

  17. Estimated allele substitution effects underlying genomic evaluation models depend on the scaling of allele counts

    NARCIS (Netherlands)

    Bouwman, Aniek C.; Hayes, Ben J.; Calus, Mario P.L.

    2017-01-01

    Background: Genomic evaluation is used to predict direct genomic values (DGV) for selection candidates in breeding programs, but also to estimate allele substitution effects (ASE) of single nucleotide polymorphisms (SNPs). Scaling of allele counts influences the estimated ASE, because scaling of

  18. Comprehensive reconstruction and in silico analysis of Aspergillus niger genome-scale metabolic network model that accounts for 1210 ORFs.

    Science.gov (United States)

    Lu, Hongzhong; Cao, Weiqiang; Ouyang, Liming; Xia, Jianye; Huang, Mingzhi; Chu, Ju; Zhuang, Yingping; Zhang, Siliang; Noorman, Henk

    2017-03-01

    Aspergillus niger is one of the most important cell factories for industrial enzymes and organic acids production. A comprehensive genome-scale metabolic network model (GSMM) with high quality is crucial for efficient strain improvement and process optimization. The lack of accurate reaction equations and gene-protein-reaction associations (GPRs) in the current best model of A. niger named GSMM iMA871, however, limits its application scope. To overcome these limitations, we updated the A. niger GSMM by combining the latest genome annotation and literature mining technology. Compared with iMA871, the number of reactions in iHL1210 was increased from 1,380 to 1,764, and the number of unique ORFs from 871 to 1,210. With the aid of our transcriptomics analysis, the existence of 63% ORFs and 68% reactions in iHL1210 can be verified when glucose was used as the only carbon source. Physiological data from chemostat cultivations, 13C-labeled and molecular experiments from the published literature were further used to check the performance of iHL1210. The average correlation coefficients between the predicted fluxes and estimated fluxes from 13C-labeling data were sufficiently high (above 0.89) and the prediction of cell growth on most of the reported carbon and nitrogen sources was consistent. Using the updated genome-scale model, we evaluated gene essentiality on synthetic and yeast extract medium, as well as the effects of NADPH supply on glucoamylase production in A. niger. In summary, the new A. niger GSMM iHL1210 contains significant improvements with respect to the metabolic coverage and prediction performance, which paves the way for systematic metabolic engineering of A. niger. Biotechnol. Bioeng. 2017;114: 685-695. © 2016 Wiley Periodicals, Inc.

  19. Large-scale transcriptome analyses reveal new genetic marker candidates of head, neck, and thyroid cancer

    DEFF Research Database (Denmark)

    Reis, Eduardo M; Ojopi, Elida P B; Alberto, Fernando L

    2005-01-01

    A detailed genome mapping analysis of 213,636 expressed sequence tags (EST) derived from nontumor and tumor tissues of the oral cavity, larynx, pharynx, and thyroid was done. Transcripts matching known human genes were identified; potential new splice variants were flagged and subjected to manual...... that can be used for future studies on the molecular basis of these tumors. Similar analysis is warranted for a number of other tumors for which large EST data sets are available....

  20. Transcriptomics resources of human tissues and organs

    DEFF Research Database (Denmark)

    Uhlén, Mathias; Hallström, Björn M.; Lindskog, Cecilia

    2016-01-01

    a framework for defining the molecular constituents of the human body as well as for generating comprehensive lists of proteins expressed across tissues or in a tissue-restricted manner. Here, we review publicly available human transcriptome resources and discuss body-wide data from independent genome......Quantifying the differential expression of genes in various human organs, tissues, and cell types is vital to understand human physiology and disease. Recently, several large-scale transcriptomics studies have analyzed the expression of protein-coding genes across tissues. These datasets provide...

  1. Genome-wide transcriptomic analysis of BR-deficient Micro-Tom reveals correlations between drought stress tolerance and brassinosteroid signaling in tomato.

    Science.gov (United States)

    Lee, Jinsu; Shim, Donghwan; Moon, Suyun; Kim, Hyemin; Bae, Wonsil; Kim, Kyunghwan; Kim, Yang-Hoon; Rhee, Sung-Keun; Hong, Chang Pyo; Hong, Suk-Young; Lee, Ye-Jin; Sung, Jwakyung; Ryu, Hojin

    2018-06-01

    Brassinosteroids (BRs) are plant steroid hormones that play crucial roles in a range of growth and developmental processes. Although BR signal transduction and biosynthetic pathways have been well characterized in model plants, their biological roles in an important crop, tomato (Solanum lycopersicum), remain unknown. Here, cultivated tomato (WT) and a BR synthesis mutant, Micro-Tom (MT), were compared using physiological and transcriptomic approaches. The cultivated tomato showed higher tolerance to drought and osmotic stresses than the MT tomato. However, BR-defective phenotypes of MT, including plant growth and stomatal closure defects, were completely recovered by application of exogenous BR or complementation with a SlDWARF gene. Using genome-wide transcriptome analysis, 619 significantly differentially expressed genes (DEGs) were identified between WT and MT plants. Several DEGs were linked to known signaling networks, including those related to biotic/abiotic stress responses, lignification, cell wall development, and hormone responses. Consistent with the higher susceptibility of MT to drought stress, several gene sets involved in responses to drought and osmotic stress were differentially regulated between the WT and MT tomato plants. Our data suggest that BR signaling pathways are involved in mediating the response to abiotic stress via fine-tuning of abiotic stress-related gene networks in tomato plants. Copyright © 2018. Published by Elsevier Masson SAS.

  2. The genome and life-stage specific transcriptomes of Globodera pallida elucidate key aspects of plant parasitism by a cyst nematode.

    Science.gov (United States)

    Cotton, James A; Lilley, Catherine J; Jones, Laura M; Kikuchi, Taisei; Reid, Adam J; Thorpe, Peter; Tsai, Isheng J; Beasley, Helen; Blok, Vivian; Cock, Peter J A; Eves-van den Akker, Sebastian; Holroyd, Nancy; Hunt, Martin; Mantelin, Sophie; Naghra, Hardeep; Pain, Arnab; Palomares-Rius, Juan E; Zarowiecki, Magdalena; Berriman, Matthew; Jones, John T; Urwin, Peter E

    2014-03-03

    Globodera pallida is a devastating pathogen of potato crops, making it one of the most economically important plant parasitic nematodes. It is also an important model for the biology of cyst nematodes. Cyst nematodes and root-knot nematodes are the two most important plant parasitic nematode groups and together represent a global threat to food security. We present the complete genome sequence of G. pallida, together with transcriptomic data from most of the nematode life cycle, particularly focusing on the life cycle stages involved in root invasion and establishment of the biotrophic feeding site. Despite the relatively close phylogenetic relationship with root-knot nematodes, we describe a very different gene family content between the two groups and in particular extensive differences in the repertoire of effectors, including an enormous expansion of the SPRY domain protein family in G. pallida, which includes the SPRYSEC family of effectors. This highlights the distinct biology of cyst nematodes compared to the root-knot nematodes that were, until now, the only sedentary plant parasitic nematodes for which genome information was available. We also present in-depth descriptions of the repertoires of other genes likely to be important in understanding the unique biology of cyst nematodes and of potential drug targets and other targets for their control. The data and analyses we present will be central in exploiting post-genomic approaches in the development of much-needed novel strategies for the control of G. pallida and related pathogens.

  3. Characterization of the genome of a phylogenetically distinct tospovirus and its interactions with the local lesion-induced host Chenopodium quinoa by whole-transcriptome analyses.

    Science.gov (United States)

    Chou, Wan-Chen; Lin, Shih-Shun; Yeh, Shyi-Dong; Li, Siang-Ling; Peng, Ying-Che; Fan, Ya-Hsu; Chen, Tsung-Chi

    2017-01-01

    Chenopodium quinoa is a natural local lesion host of numerous plant viruses, including tospoviruses (family Bunyaviridae). Groundnut chlorotic fan-spot tospovirus (GCFSV) has been shown to consistently induce local lesions on the leaves of C. quinoa 4 days post-inoculation (dpi). To reveal the whole genome of GCFSV and its interactions with C. quinoa, RNA-seq was performed to determine the transcriptome profiles of C. quinoa leaves. The high-throughput reads from infected C. quinoa leaves were used to identify the whole genome sequence of GCFSV and its single nucleotide polymorphisms. Our results indicated that GCFSV is a phylogenetically distinct tospovirus. Moreover, 27,170 coding and 29,563 non-coding sequences of C. quinoa were identified through de novo assembly, mixing reads from mock and infected samples. Several key genes involved in the modulation of hypersensitive response (HR) were identified. The expression levels of 4,893 deduced complete genes annotated using the Arabidopsis genome indicated that several HR-related orthologues of pathogenesis-related proteins, transcription factors, mitogen-activated protein kinases, and defense proteins were significantly expressed in leaves that formed local lesions. Here, we also provide new insights into the replication progression of a tospovirus and the molecular regulation of the C. quinoa response to virus infection.

  4. Genome, transcriptome and methylome sequencing of a primitively eusocial wasp reveal a greatly reduced DNA methylation system in a social insect.

    Science.gov (United States)

    Standage, Daniel S; Berens, Ali J; Glastad, Karl M; Severin, Andrew J; Brendel, Volker P; Toth, Amy L

    2016-04-01

    Comparative genomics of social insects has been intensely pursued in recent years with the goal of providing insights into the evolution of social behaviour and its underlying genomic and epigenomic basis. However, the comparative approach has been hampered by a paucity of data on some of the most informative social forms (e.g. incipiently and primitively social) and taxa (especially members of the wasp family Vespidae) for studying social evolution. Here, we provide a draft genome of the primitively eusocial model insect Polistes dominula, accompanied by analysis of caste-related transcriptome and methylome sequence data for adult queens and workers. Polistes dominula possesses a fairly typical hymenopteran genome, but shows very low genomewide GC content and some evidence of reduced genome size. We found numerous caste-related differences in gene expression, with evidence that both conserved and novel genes are related to caste differences. Most strikingly, these -omics data reveal a major reduction in one of the major epigenetic mechanisms that has been previously suggested to be important for caste differences in social insects: DNA methylation. Along with a conspicuous loss of a key gene associated with environmentally responsive DNA methylation (the de novo DNA methyltransferase Dnmt3), these wasps have greatly reduced genomewide methylation to almost zero. In addition to providing a valuable resource for comparative analysis of social insect evolution, our integrative -omics data for this important behavioural and evolutionary model system call into question the general importance of DNA methylation in caste differences and evolution in social insects. © 2016 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.

  5. Large-scale genomic 2D visualization reveals extensive CG-AT skew correlation in bird genomes

    Directory of Open Access Journals (Sweden)

    Deng Xuemei

    2007-11-01

    Full Text Available Background: Bird genomes have very different compositional structure compared with other warm-blooded animals. The variation in the base skew rules in the vertebrate genomes remains puzzling, but it must relate somehow to large-scale genome evolution. Current research is inclined to relate base skew with mutations and their fixation. Here we wish to explore base skew correlations in bird genomes, to develop methods for displaying and quantifying such correlations at different scales, and to discuss possible explanations for the peculiarities of the bird genomes in skew correlation. Results: We have developed a method called Base Skew Double Triangle (BSDT) for exhibiting the genome-scale change of AT/CG skew as a two-dimensional square picture, showing base skews at many scales simultaneously in a single image. By this method we found that most chicken chromosomes have high AT/CG skew correlation (symmetry in 2D picture), except for some microchromosomes. No other organisms studied (18 species) show such high skew correlations. This visualized high correlation was validated by three kinds of quantitative calculations with overlapping and non-overlapping windows, all indicating that chicken and birds in general have a special genome structure. Similar features were also found in some of the mammal genomes, but clearly much weaker than in chickens. We presume that the skew correlation feature evolved near the time that birds separated from other vertebrate lineages. When we eliminated the repeat sequences from the genomes, the AT and CG skews correlation increased for some mammal genomes, but were still clearly lower than in chickens. Conclusion: Our results suggest that BSDT is an expressive visualization method for AT and CG skew and enabled the discovery of the very high skew correlation in bird genomes; this peculiarity is worth further study. Computational analysis indicated that this correlation might be a compositional characteristic
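
    The windowed skew correlation mentioned in this abstract can be sketched as follows. This is not the BSDT method itself, only one plausible form of a non-overlapping-window calculation. The skew definitions (A-T)/(A+T) and (C-G)/(C+G) are a common convention assumed here, since the abstract does not give the formula or sign convention; the function names and the toy sequence are likewise invented.

```python
from math import sqrt


def skew(seq, x, y):
    """Skew of base x against base y: (x - y) / (x + y), or 0.0 if neither occurs."""
    nx, ny = seq.count(x), seq.count(y)
    return (nx - ny) / (nx + ny) if nx + ny else 0.0


def window_skews(seq, window):
    """AT and CG skew in consecutive non-overlapping windows (assumed definitions)."""
    at, cg = [], []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        at.append(skew(chunk, "A", "T"))
        cg.append(skew(chunk, "C", "G"))
    return at, cg


def pearson(xs, ys):
    """Pearson correlation coefficient; requires non-constant inputs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


# Invented toy sequence: both skews are positive in the first window
# and negative in the second, so they are perfectly correlated.
sequence = "AAATCCCG" + "ATTTCGGG"
at_skew, cg_skew = window_skews(sequence, window=8)
r = pearson(at_skew, cg_skew)
```

    On real chromosomes the window size would be varied across several scales, which is the aspect the two-dimensional BSDT picture is described as displaying in a single image.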

  6. Using a genome-scale metabolic network model to elucidate the mechanism of chloroquine action in Plasmodium falciparum

    Directory of Open Access Journals (Sweden)

    Shivendra G. Tewari

    2017-08-01

    Full Text Available Chloroquine, long the default first-line treatment against malaria, is now abandoned in large parts of the world because of widespread drug-resistance in Plasmodium falciparum. In spite of its importance as a cost-effective and efficient drug, a coherent understanding of the cellular mechanisms affected by chloroquine and how they influence the fitness and survival of the parasite remains elusive. Here, we used a systems biology approach to integrate genome-scale transcriptomics to map out the effects of chloroquine, identify targeted metabolic pathways, and translate these findings into mechanistic insights. Specifically, we first developed a method that integrates transcriptomic and metabolomic data, which we independently validated against a recently published set of such data for Krebs-cycle mutants of P. falciparum. We then used the method to calculate the effect of chloroquine treatment on the metabolic flux profiles of P. falciparum during the intraerythrocytic developmental cycle. The model predicted dose-dependent inhibition of DNA replication, in agreement with earlier experimental results for both drug-sensitive and drug-resistant P. falciparum strains. Our simulations also corroborated experimental findings that suggest differences in chloroquine sensitivity between ring- and schizont-stage P. falciparum. Our analysis also suggests that metabolic fluxes that govern reduced thioredoxin and phosphoenolpyruvate synthesis are significantly decreased and are pivotal to chloroquine-based inhibition of P. falciparum DNA replication. The consequences of impaired phosphoenolpyruvate synthesis and redox metabolism are reduced carbon fixation and increased oxidative stress, respectively, both of which eventually facilitate killing of the parasite. Our analysis suggests that a combination of chloroquine (or an analogue) and another drug, which inhibits carbon fixation and/or increases oxidative stress, should increase the clearance of P

  7. Transcriptome Analysis of Two Vicia sativa Subspecies: Mining Molecular Markers to Enhance Genomic Resources for Vetch Improvement

    Directory of Open Access Journals (Sweden)

    Tae-Sung Kim

    2015-11-01

    Full Text Available The vetch (Vicia sativa) is one of the most important annual forage legumes globally due to its multiple uses and high nutritional content. Despite these agronomical benefits, many drawbacks, including cyano-alanine toxin, have reduced the agronomic value of vetch varieties. Here, we used 454 technology to sequence the two V. sativa subspecies (ssp. sativa and ssp. nigra) to enrich functional information and genetic marker resources for the vetch research community. A total of 86,532 and 47,103 reads produced 35,202 and 18,808 unigenes with average lengths of 735 and 601 bp for V. sativa sativa and V. sativa nigra, respectively. Gene Ontology annotations and the cluster of orthologous gene classes were used to annotate the function of the Vicia transcriptomes. The Vicia transcriptome sequences were then mined for simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers. About 13% and 3% of the Vicia unigenes contained putative SSR and SNP sequences, respectively. Among those SSRs, 100 were chosen for validation and polymorphism testing using the Vicia germplasm set. Thus, our approach takes advantage of the utility of transcriptomic data to expedite a vetch breeding program.
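
    The SSR-mining step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name find_ssrs and the repeat-count thresholds in MIN_REPEATS are hypothetical choices made for illustration only, and only perfect di-, tri- and tetranucleotide repeats are detected.

```python
import re

# Assumed thresholds (not taken from the study): minimum number of
# tandem copies required for each motif length.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # A motif of motif_len bases followed by at least min_rep - 1 copies of itself.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "AGAG"), so each locus is reported once.
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(1)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    unigene = "TTT" + "AG" * 7 + "CCC" + "GAT" * 5 + "TT"
    for start, motif, copies in find_ssrs(unigene):
        print(start, motif, copies)
```

    Dedicated SSR tools also handle compound and imperfect repeats; this sketch deliberately leaves those cases out.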

  8. The first whole genome and transcriptome of the cinereous vulture reveals adaptation in the gastric and immune defense systems and possible convergent evolution between the Old and New World vultures.

    Science.gov (United States)

    Chung, Oksung; Jin, Seondeok; Cho, Yun Sung; Lim, Jeongheui; Kim, Hyunho; Jho, Sungwoong; Kim, Hak-Min; Jun, JeHoon; Lee, HyeJin; Chon, Alvin; Ko, Junsu; Edwards, Jeremy; Weber, Jessica A; Han, Kyudong; O'Brien, Stephen J; Manica, Andrea; Bhak, Jong; Paek, Woon Kee

    2015-10-21

    The cinereous vulture, Aegypius monachus, is the largest bird of prey and plays a key role in the ecosystem by removing carcasses, thus preventing the spread of diseases. Its feeding habits force it to cope with constant exposure to pathogens, making this species an interesting target for discovering functionally selected genetic variants. Furthermore, the presence of two independently evolved vulture groups, Old World and New World vultures, provides a natural experiment in which to investigate convergent evolution due to obligate scavenging. We sequenced the genome of a cinereous vulture, and mapped it to the bald eagle reference genome, a close relative with a divergence time of 18 million years. By comparing the cinereous vulture to other avian genomes, we find positively selected genetic variations in this species associated with respiration, likely linked to their ability of immune defense responses and gastric acid secretion, consistent with their ability to digest carcasses. Comparisons between the Old World and New World vulture groups suggest convergent gene evolution. We assemble the cinereous vulture blood transcriptome from a second individual, and annotate genes. Finally, we infer the demographic history of the cinereous vulture which shows marked fluctuations in effective population size during the late Pleistocene. We present the first genome and transcriptome analyses of the cinereous vulture compared to other avian genomes and transcriptomes, revealing genetic signatures of dietary and environmental adaptations accompanied by possible convergent evolution between the Old World and New World vultures.

  9. Mapping copy number variation by population-scale genome sequencing

    DEFF Research Database (Denmark)

    Mills, Ryan E.; Walter, Klaudia; Stewart, Chip

    2011-01-01

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications...

  10. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network

    DEFF Research Database (Denmark)

    Förster, Jochen; Famili, I.; Fu, P.

    2003-01-01

    The metabolic network in the yeast Saccharomyces cerevisiae was reconstructed using currently available genomic, biochemical, and physiological information. The metabolic reactions were compartmentalized between the cytosol and the mitochondria, and transport steps between the compartments...

  11. GIGGLE: a search engine for large-scale integrated genome analysis

    Science.gov (United States)

    Layer, Ryan M; Pedersen, Brent S; DiSera, Tonya; Marth, Gabor T; Gertz, Jason; Quinlan, Aaron R

    2018-01-01

    GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation. PMID:29309061

  12. GIGGLE: a search engine for large-scale integrated genome analysis.

    Science.gov (United States)

    Layer, Ryan M; Pedersen, Brent S; DiSera, Tonya; Marth, Gabor T; Gertz, Jason; Quinlan, Aaron R

    2018-02-01

    GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation.

  13. In Silico Genome-Scale Reconstruction and Validation of the Staphylococcus aureus Metabolic Network

    NARCIS (Netherlands)

    Heinemann, Matthias; Kümmel, Anne; Ruinatscha, Reto; Panke, Sven

    2005-01-01

    A genome-scale metabolic model of the Gram-positive, facultative anaerobic opportunistic pathogen Staphylococcus aureus N315 was constructed based on current genomic data, literature, and physiological information. The model comprises 774 metabolic processes representing approximately 23% of all

  14. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system

    NARCIS (Netherlands)

    Speth, D.R.; Zandt, M.H. in 't; Guerrero Cruz, S.; Dutilh, B.E.; Jetten, M.S.M.

    2016-01-01

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is

  15. Environmental versatility promotes modularity in genome-scale metabolic networks.

    Science.gov (United States)

    Samal, Areejit; Wagner, Andreas; Martin, Olivier C

    2011-08-24

    The ubiquity of modules in biological networks may result from an evolutionary benefit of a modular organization. For instance, modularity may increase the rate of adaptive evolution, because modules can be easily combined into new arrangements that may benefit their carrier. Conversely, modularity may emerge as a by-product of some trait. We here ask whether this last scenario may play a role in genome-scale metabolic networks that need to sustain life in one or more chemical environments. For such networks, we define a network module as a maximal set of reactions that are fully coupled, i.e., whose fluxes can only vary in fixed proportions. This definition overcomes limitations of purely graph based analyses of metabolism by exploiting the functional links between reactions. We call a metabolic network viable in a given chemical environment if it can synthesize all of an organism's biomass compounds from nutrients in this environment. An organism's metabolism is highly versatile if it can sustain life in many different chemical environments. We here ask whether versatility affects the modularity of metabolic networks. Using recently developed techniques to randomly sample large numbers of viable metabolic networks from a vast space of metabolic networks, we use flux balance analysis to study in silico metabolic networks that differ in their versatility. We find that highly versatile networks are also highly modular. They contain more modules and more reactions that are organized into modules. Most or all reactions in a module are associated with the same biochemical pathways. Modules that arise in highly versatile networks generally involve reactions that process nutrients or closely related chemicals. We also observe that the metabolism of E. coli is significantly more modular than even our most versatile networks. Our work shows that modularity in metabolic networks can be a by-product of functional constraints, e.g., the need to sustain life in multiple
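
    The module definition used in this abstract (sets of reactions whose fluxes can only vary in fixed proportions) can be illustrated on a toy network. The sketch below is not the authors' method: it uses only the steady-state constraint S v = 0, ignores reversibility constraints and flux bounds, and the function names and the four-reaction network are hypothetical. Under those assumptions, two reactions are fully coupled exactly when their rows in a null-space basis of S are proportional.

```python
from fractions import Fraction


def nullspace(S):
    """Exact basis of {v : S v = 0}, via Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in S]
    n = len(rows[0])
    pivots, r = [], 0
    for c in range(n):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    basis = []
    for fc in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[fc] = Fraction(1)
        for i, pc in enumerate(pivots):
            v[pc] = -rows[i][fc]
        basis.append(v)
    return basis


def coupled_modules(S, names):
    """Group reactions whose null-space rows are proportional."""
    basis = nullspace(S)
    groups = {}
    for j, name in enumerate(names):
        row = [b[j] for b in basis]
        lead = next((x for x in row if x != 0), None)
        if lead is None:
            continue  # reaction carries zero flux in every steady state
        groups.setdefault(tuple(x / lead for x in row), []).append(name)
    return [g for g in groups.values() if len(g) > 1]


if __name__ == "__main__":
    # Toy network (assumed): R1: -> A, R2: A -> B, R3: B ->, R4: A ->
    S = [[1, -1, 0, -1],   # metabolite A
         [0, 1, -1, 0]]    # metabolite B
    print(coupled_modules(S, ["R1", "R2", "R3", "R4"]))
```

    In the toy network, R2 and R3 form a linear chain through B and so always carry the same flux, while R1 and R4 can vary independently of them.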

  16. Environmental versatility promotes modularity in genome-scale metabolic networks

    Directory of Open Access Journals (Sweden)

    Wagner Andreas

    2011-08-01

    Full Text Available Background: The ubiquity of modules in biological networks may result from an evolutionary benefit of a modular organization. For instance, modularity may increase the rate of adaptive evolution, because modules can be easily combined into new arrangements that may benefit their carrier. Conversely, modularity may emerge as a by-product of some trait. We here ask whether this last scenario may play a role in genome-scale metabolic networks that need to sustain life in one or more chemical environments. For such networks, we define a network module as a maximal set of reactions that are fully coupled, i.e., whose fluxes can only vary in fixed proportions. This definition overcomes limitations of purely graph based analyses of metabolism by exploiting the functional links between reactions. We call a metabolic network viable in a given chemical environment if it can synthesize all of an organism's biomass compounds from nutrients in this environment. An organism's metabolism is highly versatile if it can sustain life in many different chemical environments. We here ask whether versatility affects the modularity of metabolic networks. Results: Using recently developed techniques to randomly sample large numbers of viable metabolic networks from a vast space of metabolic networks, we use flux balance analysis to study in silico metabolic networks that differ in their versatility. We find that highly versatile networks are also highly modular. They contain more modules and more reactions that are organized into modules. Most or all reactions in a module are associated with the same biochemical pathways. Modules that arise in highly versatile networks generally involve reactions that process nutrients or closely related chemicals. We also observe that the metabolism of E. coli is significantly more modular than even our most versatile networks. Conclusions: Our work shows that modularity in metabolic networks can be a by-product of functional

  17. Transcriptome of interstitial cells of Cajal reveals unique and selective gene signatures.

    Directory of Open Access Journals (Sweden)

    Moon Young Lee

    Full Text Available Transcriptome-scale data can reveal essential clues into understanding the underlying molecular mechanisms behind specific cellular functions and biological processes. Transcriptomics is a continually growing field of research utilized in biomarker discovery. The transcriptomic profile of interstitial cells of Cajal (ICC), which serve as slow-wave electrical pacemakers for gastrointestinal (GI) smooth muscle, has yet to be uncovered. Using copGFP-labeled ICC mice and flow cytometry, we isolated ICC populations from the murine small intestine and colon and obtained their transcriptomes. In analyzing the transcriptome, we identified a unique set of ICC-restricted markers including transcription factors, epigenetic enzymes/regulators, growth factors, receptors, protein kinases/phosphatases, and ion channels/transporters. This analysis provides new and unique insights into the cellular and biological functions of ICC in GI physiology. Additionally, we constructed an interactive ICC genome browser (http://med.unr.edu/physio/transcriptome) based on the UCSC genome database. To our knowledge, this is the first online resource that provides a comprehensive library of all known genetic transcripts expressed in primary ICC. Our genome browser offers a new perspective into the alternative expression of genes in ICC and provides a valuable reference for future functional studies.

  18. Decoding Synteny Blocks and Large-Scale Duplications in Mammalian and Plant Genomes

    Science.gov (United States)

    Peng, Qian; Alekseyev, Max A.; Tesler, Glenn; Pevzner, Pavel A.

    The existing synteny block reconstruction algorithms use anchors (e.g., orthologous genes) shared over all genomes to construct the synteny blocks for multiple genomes. This approach, while efficient for a few genomes, cannot be scaled to address the need to construct synteny blocks in many mammalian genomes that are currently being sequenced. The problem is that the number of anchors shared among all genomes quickly decreases with the increase in the number of genomes. Another problem is that many genomes (plant genomes in particular) had extensive duplications, which makes decoding of genomic architecture and rearrangement analysis in plants difficult. The existing synteny block generation algorithms in plants do not address the issue of generating non-overlapping synteny blocks suitable for analyzing rearrangements and evolution history of duplications. We present a new algorithm based on the A-Bruijn graph framework that overcomes these difficulties and provides a unified approach to synteny block reconstruction for multiple genomes, and for genomes with large duplications.

  19. Insertion Sequence-Caused Large Scale-Rearrangements in the Genome of Escherichia coli

    Science.gov (United States)

    2016-07-18

    affordable approach to genome-wide characterization of genetic variation in bacterial and eukaryotic genomes (1–3). In addition to small-scale...Paired-End Reads), that uses a graph-based algorithm (27) capable of detecting most large-scale variation involving repetitive regions, including novel...Avila,P., Grinsted,J. and De La Cruz,F. (1988) Analysis of the variable endpoints generated by one-ended transposition of Tn21. J. Bacteriol., 170

  20. Mining whole genomes and transcriptomes of Jatropha (Jatropha curcas) and Castor bean (Ricinus communis) for NBS-LRR genes and defense response associated transcription factors.

    Science.gov (United States)

    Sood, Archit; Jaiswal, Varun; Chanumolu, Sree Krishna; Malhotra, Nikhil; Pal, Tarun; Chauhan, Rajinder Singh

    2014-11-01

    Jatropha (Jatropha curcas L.) and Castor bean (Ricinus communis) are oilseed crops of family Euphorbiaceae with the potential of producing high quality biodiesel and having industrial value. Both the bioenergy plants are becoming susceptible to various biotic stresses directly affecting the oil quality and content. No report exists as of today on analysis of Nucleotide Binding Site-Leucine Rich Repeat (NBS-LRR) gene repertoire and defense response transcription factors in both the plant species. In silico analysis of whole genomes and transcriptomes identified 47 new NBS-LRR genes in both the species and 122 and 318 defense response related transcription factors in Jatropha and Castor bean, respectively. The identified NBS-LRR genes and defense response transcription factors were mapped onto the respective genomes. Common and unique NBS-LRR genes and defense related transcription factors were identified in both the plant species. All NBS-LRR genes in both the species were characterized into Toll/interleukin-1 receptor NBS-LRRs (TNLs) and coiled-coil NBS-LRRs (CNLs), position on contigs, gene clusters and motifs and domains distribution. Transcript abundance or expression values were measured for all NBS-LRR genes and defense response transcription factors, suggesting their functional role. The current study provides a repertoire of NBS-LRR genes and transcription factors which can be used in not only dissecting the molecular basis of disease resistance phenotype but also in developing disease resistant genotypes in Jatropha and Castor bean through transgenic or molecular breeding approaches.

  1. Genome-Wide Identification and Transcriptome-Based Expression Profiling of the Sox Gene Family in the Nile Tilapia (Oreochromis niloticus).

    Science.gov (United States)

    Wei, Ling; Yang, Chao; Tao, Wenjing; Wang, Deshou

    2016-02-23

    The Sox transcription factor family is characterized with the presence of a Sry-related high-mobility group (HMG) box and plays important roles in various biological processes in animals, including sex determination and differentiation, and the development of multiple organs. In this study, 27 Sox genes were identified in the genome of the Nile tilapia (Oreochromis niloticus), and were classified into seven groups. The members of each group of the tilapia Sox genes exhibited a relatively conserved exon-intron structure. Comparative analysis showed that the Sox gene family has undergone an expansion in tilapia and other teleost fishes following their whole genome duplication, and group K only exists in teleosts. Transcriptome-based analysis demonstrated that most of the tilapia Sox genes presented stage-specific and/or sex-dimorphic expressions during gonadal development, and six of the group B Sox genes were specifically expressed in the adult brain. Our results provide a better understanding of gene structure and spatio-temporal expression of the Sox gene family in tilapia, and will be useful for further deciphering the roles of the Sox genes during sex determination and gonadal development in teleosts.

  2. Genome-Wide Identification and Transcriptome-Based Expression Profiling of the Sox Gene Family in the Nile Tilapia (Oreochromis niloticus)

    Directory of Open Access Journals (Sweden)

    Ling Wei

    2016-02-01

    Full Text Available The Sox transcription factor family is characterized with the presence of a Sry-related high-mobility group (HMG) box and plays important roles in various biological processes in animals, including sex determination and differentiation, and the development of multiple organs. In this study, 27 Sox genes were identified in the genome of the Nile tilapia (Oreochromis niloticus), and were classified into seven groups. The members of each group of the tilapia Sox genes exhibited a relatively conserved exon-intron structure. Comparative analysis showed that the Sox gene family has undergone an expansion in tilapia and other teleost fishes following their whole genome duplication, and group K only exists in teleosts. Transcriptome-based analysis demonstrated that most of the tilapia Sox genes presented stage-specific and/or sex-dimorphic expressions during gonadal development, and six of the group B Sox genes were specifically expressed in the adult brain. Our results provide a better understanding of gene structure and spatio-temporal expression of the Sox gene family in tilapia, and will be useful for further deciphering the roles of the Sox genes during sex determination and gonadal development in teleosts.

  3. Unraveling the rat blood genome-wide transcriptome after oral administration of lavender oil by a two-color dye-swap DNA microarray approach

    Directory of Open Access Journals (Sweden)

    Motohide Hori

    2016-06-01

    Full Text Available Lavender oil (LO) is a commonly used essential oil in aromatherapy as non-traditional medicine. With an aim to demonstrate LO effects on the body, we have recently established an animal model investigating the influence of orally administered LO in rat tissues, genome-wide. In this brief, we investigate the effect of LO ingestion in the blood of rat. Rats were administered LO at usual therapeutic dose (5 mg/kg in humans), and following collection of the venous blood from the heart and extraction of total RNA, the differentially expressed genes were screened using a 4 × 44-K whole-genome rat chip (Agilent microarray platform; Agilent Technologies, Palo Alto, CA, USA) in conjunction with a two-color dye-swap approach. A total of 834 differentially expressed genes in the blood were identified: 362 up-regulated and 472 down-regulated. These genes were functionally categorized using bioinformatics tools. The gene expression inventory of rat blood transcriptome under LO, a first report, has been deposited into the Gene Expression Omnibus (GEO): GSE67499. The data will be a valuable resource in examining the effects of natural products, and which could also serve as a human model for further functional analysis and investigation.

  4. Kernel methods for large-scale genomic data analysis

    Science.gov (United States)

    Xing, Eric P.; Schaid, Daniel J.

    2015-01-01

    Machine learning, particularly kernel methods, has been demonstrated as a promising new tool to tackle the challenges imposed by today’s explosive data growth in genomics. They provide a practical and principled approach to learning how a large number of genetic variants are associated with complex phenotypes, to help reveal the complexity in the relationship between the genetic markers and the outcome of interest. In this review, we highlight the potential key role it will have in modern genomic data processing, especially with regard to integration with classical methods for gene prioritizing, prediction and data fusion. PMID:25053743

  5. A reference gene set for sex pheromone biosynthesis and degradation genes from the diamondback moth, Plutella xylostella, based on genome and transcriptome digital gene expression analyses.

    Science.gov (United States)

    He, Peng; Zhang, Yun-Fei; Hong, Duan-Yang; Wang, Jun; Wang, Xing-Liang; Zuo, Ling-Hua; Tang, Xian-Fu; Xu, Wei-Ming; He, Ming

    2017-03-01

    Female moths synthesize species-specific sex pheromone components and release them to attract male moths, which depend on a precise sex pheromone chemosensory system to locate females. Two types of genes involved in the sex pheromone biosynthesis and degradation pathways play essential roles in this important moth behavior. To understand the function of genes in the sex pheromone pathway, this study investigated the genome-wide and digital gene expression of sex pheromone biosynthesis and degradation genes in various adult tissues in the diamondback moth (DBM), Plutella xylostella, which is a notorious vegetable pest worldwide. A massive transcriptome dataset (at least 39.04 Gb) was generated by sequencing 6 adult tissues including male antennae, female antennae, heads, legs, abdomen and female pheromone glands from DBM by using Illumina 4000 next-generation sequencing and mapping to a published DBM genome. Bioinformatics analysis yielded a total of 89,332 unigenes among which 87 transcripts were putatively related to seven gene families in the sex pheromone biosynthesis pathway. Among these, seven [two desaturases (DES), three fatty acyl-CoA reductases (FAR), one acetyltransferase (ACT) and one alcohol dehydrogenase (AD)] were mainly expressed in the pheromone glands with likely function in the three essential sex pheromone biosynthesis steps: desaturation, reduction, and esterification. We also identified 210 odorant-degradation related genes (including sex pheromone-degradation related genes) from seven major enzyme groups. Among these genes, 100 genes are newly identified and two aldehyde oxidases (AOXs), one aldehyde dehydrogenase (ALDH), five carboxyl/cholinesterases (CCEs), five UDP-glycosyltransferases (UGTs), eight cytochrome P450 (CYP) and three glutathione S-transferases (GSTs) displayed more robust expression in the antennae, and thus are proposed to participate in the degradation of sex pheromone components and plant volatiles. To date, this is the most

  6. Resolution of deep eudicot phylogeny and their temporal diversification using nuclear genes from transcriptomic and genomic datasets.

    Science.gov (United States)

    Zeng, Liping; Zhang, Ning; Zhang, Qiang; Endress, Peter K; Huang, Jie; Ma, Hong

    2017-05-01

    Explosive diversification is widespread in eukaryotes, making it difficult to resolve phylogenetic relationships. Eudicots contain c. 75% of extant flowering plants, are important for human livelihood and terrestrial ecosystems, and have probably experienced explosive diversifications. The eudicot phylogenetic relationships, especially among those of the Pentapetalae, remain unresolved. Here, we present a highly supported eudicot phylogeny and diversification rate shifts using 31 newly generated transcriptomes and 88 other datasets covering 70% of eudicot orders. A highly supported eudicot phylogeny divided Pentapetalae into two groups: one with rosids, Saxifragales, Vitales and Santalales; the other containing asterids, Caryophyllales and Dilleniaceae, with uncertainty for Berberidopsidales. Molecular clock analysis estimated that crown eudicots originated c. 146 Ma, considerably earlier than earliest tricolpate pollen fossils and most other molecular clock estimates, and Pentapetalae sequentially diverged into eight major lineages within c. 15 Myr. Two identified increases of diversification rate are located in the stems leading to Pentapetalae and asterids, and lagged behind the gamma hexaploidization. The nuclear genes from newly generated transcriptomes revealed a well-resolved eudicot phylogeny, sequential separation of major core eudicot lineages and temporal mode of diversifications, providing new insights into the evolutionary trend of morphologies and contributions to the diversification of eudicots. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.

  7. Developmental transcriptome of Aplysia californica

    KAUST Repository

    Heyland, Andreas

    2010-12-06

    Genome-wide transcriptional changes in development provide important insight into mechanisms underlying growth, differentiation, and patterning. However, such large-scale developmental studies have been limited to a few representatives of Ecdysozoans and Chordates. Here, we characterize transcriptomes of embryonic, larval, and metamorphic development in the marine mollusc Aplysia californica and reveal novel molecular components associated with life history transitions. Specifically, we identify more than 20 signal peptides, putative hormones, and transcription factors in association with early development and metamorphic stages, many of which seem to be evolutionarily conserved elements of signal transduction pathways. We also characterize genes related to biomineralization, a critical process of molluscan development. In summary, our experiment provides the first large-scale survey of gene expression in mollusc development, and complements previous studies on the regulatory mechanisms underlying body plan patterning and the formation of larval and juvenile structures. This study serves as a resource for further functional annotation of transcripts and genes in Aplysia specifically and molluscs in general. A comparison of the Aplysia developmental transcriptome with similar studies in the zebra fish Danio rerio, the fruit fly Drosophila melanogaster, the nematode Caenorhabditis elegans, and other studies on molluscs suggests an overall highly divergent pattern of gene regulatory mechanisms that are likely a consequence of the different developmental modes of these organisms. © 2010 Wiley-Liss, Inc., A Wiley Company.

  8. Large-scale chromosome folding versus genomic DNA sequences: A discrete double Fourier transform technique.

    Science.gov (United States)

    Chechetkin, V R; Lobzin, V V

    2017-08-07

    The use of state-of-the-art techniques combining imaging methods and high-throughput genomic mapping tools has led to significant progress in detailing the chromosome architecture of various organisms. However, a gap still remains between the rapidly growing structural data on the chromosome folding and the large-scale genome organization. Could a part of information on the chromosome folding be obtained directly from underlying genomic DNA sequences abundantly stored in the databanks? To answer this question, we developed an original discrete double Fourier transform (DDFT). DDFT serves for the detection of large-scale genome regularities associated with domains/units at the different levels of hierarchical chromosome folding. The method is versatile and can be applied to both genomic DNA sequences and corresponding physico-chemical parameters such as base-pairing free energy. The latter characteristic is closely related to the replication and transcription and can also be used for the assessment of temperature or supercoiling effects on the chromosome folding. We tested the method on the genome of E. coli K-12 and found good correspondence with the annotated domains/units established experimentally. As a brief illustration of further abilities of DDFT, the study of large-scale genome organization for bacteriophage PHIX174 and bacterium Caulobacter crescentus was also added. The combined experimental, modeling, and bioinformatic DDFT analysis should yield more complete knowledge on the chromosome architecture and genome organization. Copyright © 2017 Elsevier Ltd. All rights reserved.

  9. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system.

    Science.gov (United States)

    Speth, Daan R; In 't Zandt, Michiel H; Guerrero-Cruz, Simon; Dutilh, Bas E; Jetten, Mike S M

    2016-03-31

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is used to seed reactors in wastewater treatment plants around the world; however, the role of most of its microbial community in ammonium removal remains unknown. Our analysis yielded 23 near-complete draft genomes that together represent the majority of the microbial community. We assign these genomes to distinct anaerobic and aerobic microbial communities. In the aerobic community, nitrifying organisms and heterotrophs predominate. In the anaerobic community, widespread potential for partial denitrification suggests a nitrite loop increases treatment efficiency. Of our genomes, 19 have no previously cultivated or sequenced close relatives and six belong to bacterial phyla without any cultivated members, including the most complete Omnitrophica (formerly OP3) genome to date.

  10. Adaptation and evolution of deep-sea scale worms (Annelida: Polynoidae): insights from transcriptome comparison with a shallow-water species

    Science.gov (United States)

    Zhang, Yanjie; Sun, Jin; Chen, Chong; Watanabe, Hiromi K.; Feng, Dong; Zhang, Yu; Chiu, Jill M.Y.; Qian, Pei-Yuan; Qiu, Jian-Wen

    2017-01-01

    Polynoid scale worms (Polynoidae, Annelida) invaded deep-sea chemosynthesis-based ecosystems approximately 60 million years ago, but little is known about their genetic adaptation to the extreme deep-sea environment. In this study, we reported the first two transcriptomes of deep-sea polynoids (Branchipolynoe pettiboneae, Lepidonotopodium sp.) and compared them with the transcriptome of a shallow-water polynoid (Harmothoe imbricata). We determined codon and amino acid usage, positively selected genes, highly expressed genes and putative duplicated genes. Transcriptome assembly produced 98,806 to 225,709 contigs in the three species. There were more positively charged amino acids (i.e., histidine and arginine) and fewer negatively charged amino acids (i.e., aspartic acid and glutamic acid) in the deep-sea species. There were 120 genes showing clear evidence of positive selection. Among the 10% most highly expressed genes, there were more hemoglobin genes with high expression levels in both deep-sea species. The duplicated genes related to DNA recombination and metabolism, and gene expression were only enriched in deep-sea species. Deep-sea scale worms adopted two strategies of adaptation to hypoxia in the chemosynthesis-based habitats (i.e., rapid evolution of tetra-domain hemoglobin in Branchipolynoe or high expression of single-domain hemoglobin in Lepidonotopodium sp.). PMID:28397791

  11. Genome-wide transcriptomic responses of the seagrasses Zostera marina and Nanozostera noltii under a simulated heatwave confirm functional types

    NARCIS (Netherlands)

    Franssen, Susanne U.; Gu, Jenny; Winters, Gidon; Huylmans, Ann-Kathrin; Wienpahl, Isabell; Sparwel, Maximiliane; Coyer, James; Olsen, Jeanine; Reusch, Thorsten; Bornberg-Bauer, Erich

    Genome-wide transcription analysis between related species occurring in overlapping ranges can provide insights into the molecular basis underlying different ecological niches. The co-occurring seagrass species, Zostera marina and Nanozostera noltii, are found in marine coastal environments

  12. The genome sequence and transcriptome of Potentilla micrantha and their comparison to Fragaria vesca (the woodland strawberry).

    Science.gov (United States)

    Buti, Matteo; Moretto, Marco; Barghini, Elena; Mascagni, Flavia; Natali, Lucia; Brilli, Matteo; Lomsadze, Alexandre; Sonego, Paolo; Giongo, Lara; Alonge, Michael; Velasco, Riccardo; Varotto, Claudio; Šurbanovski, Nada; Borodovsky, Mark; Ward, Judson A; Engelen, Kristof; Cavallini, Andrea; Cestaro, Alessandro; Sargent, Daniel James

    2018-04-01

    The genus Potentilla is closely related to that of Fragaria, the economically important strawberry genus. Potentilla micrantha is a species that does not develop berries but shares numerous morphological and ecological characteristics with Fragaria vesca. These similarities make P. micrantha an attractive choice for comparative genomics studies with F. vesca. In this study, the P. micrantha genome was sequenced and annotated, and RNA-Seq data from the different developmental stages of flowering and fruiting were used to develop a set of gene predictions. A 327 Mbp sequence and annotation of the genome of P. micrantha, spanning 2674 sequence contigs, with an N50 size of 335,712, estimated to cover 80% of the total genome size of the species was developed. The genus Potentilla has a characteristically larger genome size than Fragaria, but the recovered sequence scaffolds were remarkably collinear at the micro-syntenic level with the genome of F. vesca, its closest sequenced relative. A total of 33,602 genes were predicted, and 95.1% of bench-marking universal single-copy orthologous genes were complete within the presented sequence. Thus, we argue that the majority of the gene-rich regions of the genome have been sequenced. Comparisons of RNA-Seq data from the stages of floral and fruit development revealed genes differentially expressed between P. micrantha and F. vesca. The data presented are a valuable resource for future studies of berry development in Fragaria and the Rosaceae and they also shed light on the evolution of genome size and organization in this family.

  13. Genome and Transcriptome Adaptation Accompanying Emergence of the Definitive Type 2 Host-Restricted Salmonella enterica Serovar Typhimurium Pathovar

    OpenAIRE

    Kingsley, Robert A.; Kay, Sally; Connor, Thomas; Barquist, Lars; Sait, Leanne; Holt, Kathryn E.; Sivaraman, Karthi; Wileman, Thomas; Goulding, David; Clare, Simon; Hale, Christine; Seshasayee, Aswin; Harris, Simon; Thomson, Nicholas R.; Gardner, Paul

    2013-01-01

    Salmonella enterica serovar Typhimurium definitive type 2 (DT2) is host restricted to Columba livia (rock or feral pigeon) but is also closely related to S. Typhimurium isolates that circulate in livestock and cause a zoonosis characterized by gastroenteritis in humans. DT2 isolates formed a distinct phylogenetic cluster within S. Typhimurium based on whole-genome-sequence polymorphisms. Comparative genome analysis of DT2 94-213 and S. Typhimurium SL1344, DT104, and D23580 identified few diff...

  14. The genome and life-stage specific transcriptomes of Globodera pallida elucidate key aspects of plant parasitism by a cyst nematode

    KAUST Repository

    Cotton, James A

    2014-03-03

    Background: Globodera pallida is a devastating pathogen of potato crops, making it one of the most economically important plant parasitic nematodes. It is also an important model for the biology of cyst nematodes. Cyst nematodes and root-knot nematodes are the two most important plant parasitic nematode groups and together represent a global threat to food security. Results: We present the complete genome sequence of G. pallida, together with transcriptomic data from most of the nematode life cycle, particularly focusing on the life cycle stages involved in root invasion and establishment of the biotrophic feeding site. Despite the relatively close phylogenetic relationship with root-knot nematodes, we describe a very different gene family content between the two groups and in particular extensive differences in the repertoire of effectors, including an enormous expansion of the SPRY domain protein family in G. pallida, which includes the SPRYSEC family of effectors. This highlights the distinct biology of cyst nematodes compared to the root-knot nematodes that were, until now, the only sedentary plant parasitic nematodes for which genome information was available. We also present in-depth descriptions of the repertoires of other genes likely to be important in understanding the unique biology of cyst nematodes and of potential drug targets and other targets for their control. Conclusions: The data and analyses we present will be central in exploiting post-genomic approaches in the development of much-needed novel strategies for the control of G. pallida and related pathogens. 2014 Cotton et al.; licensee BioMed Central Ltd.

  15. Genome Assembly of the Fungus Cochliobolus miyabeanus, and Transcriptome Analysis during Early Stages of Infection on American Wildrice (Zizania palustris L.).

    Directory of Open Access Journals (Sweden)

    Claudia V Castell-Miller

    The fungus Cochliobolus miyabeanus causes severe leaf spot disease on rice (Oryza sativa) and two North American specialty crops, American wildrice (Zizania palustris) and switchgrass (Panicum virgatum). Despite the importance of C. miyabeanus as a disease-causing agent in wildrice, little is known about either the mechanisms of pathogenicity or host defense responses. To start bridging these gaps, the genome of C. miyabeanus strain TG12bL2 was shotgun sequenced using Illumina technology. The genome assembly consists of 31.79 Mbp in 2,378 scaffolds with an N50 = 74,921. It contains 11,000 predicted genes of which 94.5% were annotated. Approximately 10% of the predicted gene products are expected to be secreted. The C. miyabeanus genome is rich in carbohydrate active enzymes, and harbors 187 small secreted peptides (SSPs) and some fungal effector homologs. Detoxification systems were represented by a variety of enzymes that could offer protection against plant defense compounds. The non-ribosomal peptide synthetases and polyketide synthases (PKS) present were common to other Cochliobolus species. Additionally, the fungal transcriptome was analyzed at 48 hours after inoculation in planta. A total of 10,674 genes were found to be expressed, some of which are known to be involved in pathogenicity or response to host defenses including hydrophobins, cutinase, cell wall degrading enzymes, enzymes related to reactive oxygen species scavenging, PKS, detoxification systems, SSPs, and a known fungal effector. This work will facilitate future research on C. miyabeanus pathogen-associated molecular patterns and effectors, and in the identification of their corresponding wildrice defense mechanisms.

  16. Exploring the genetic architecture and improving genomic prediction accuracy for mastitis and milk production traits in dairy cattle by mapping variants to hepatic transcriptomic regions responsive to intra-mammary infection.

    Science.gov (United States)

    Fang, Lingzhao; Sahana, Goutam; Ma, Peipei; Su, Guosheng; Yu, Ying; Zhang, Shengli; Lund, Mogens Sandø; Sørensen, Peter

    2017-05-12

    A better understanding of the genetic architecture of complex traits can contribute to improving genomic prediction. We hypothesized that genomic variants associated with mastitis and milk production traits in dairy cattle are enriched in hepatic transcriptomic regions that are responsive to intra-mammary infection (IMI). Genomic markers [e.g. single nucleotide polymorphisms (SNPs)] from those regions, if included, may improve the predictive ability of a genomic model. We applied a genomic feature best linear unbiased prediction model (GFBLUP) to implement the above strategy by considering the hepatic transcriptomic regions responsive to IMI as genomic features. GFBLUP, an extension of GBLUP, includes a separate genomic effect of SNPs within a genomic feature, and allows differential weighting of the individual marker relationships in the prediction equation. Since GFBLUP is computationally intensive, we investigated whether a SNP set test could be a computationally fast way to preselect predictive genomic features. The SNP set test assesses the association between a genomic feature and a trait based on single-SNP genome-wide association studies. We applied these two approaches to mastitis and milk production traits (milk, fat and protein yield) in Holstein (HOL, n = 5056) and Jersey (JER, n = 1231) cattle. We observed that a majority of genomic features were enriched in genomic variants that were associated with mastitis and milk production traits. Compared to GBLUP, the accuracy of genomic prediction with GFBLUP was marginally improved (3.2 to 3.9%) in within-breed prediction. The highest increase (164.4%) in prediction accuracy was observed in across-breed prediction. The significance of genomic features based on the SNP set test was correlated with changes in prediction accuracy of GFBLUP (P layers of biological knowledge to provide novel insights into the biological basis of complex traits, and to improve the accuracy of genomic prediction. The SNP set
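
    The idea of giving SNPs inside a genomic feature their own genomic effect can be illustrated by building two relationship matrices, one from the feature SNPs and one from the remaining SNPs. The sketch below is only an illustration under stated assumptions: it uses the widely used VanRaden-style formula G = ZZ' / (2 Σ p(1 - p)) with genotypes coded 0/1/2, which the abstract does not specify, and the function names and toy genotypes are hypothetical. It does not fit a prediction model.

```python
def genomic_relationship(genotypes):
    """VanRaden-style relationship matrix (an assumed choice, not from the abstract).

    genotypes: list of rows (one per animal), each a list of 0/1/2 allele counts.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Frequency of the counted allele at each SNP.
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    # Centre each genotype by twice the allele frequency.
    z = [[row[j] - 2 * freqs[j] for j in range(m)] for row in genotypes]
    # Scaling term 2 * sum(p * (1 - p)); assumes at least one polymorphic SNP.
    scale = 2 * sum(p * (1 - p) for p in freqs)
    return [
        [sum(a * b for a, b in zip(z[i], z[k])) / scale for k in range(n)]
        for i in range(n)
    ]


def split_relationships(genotypes, feature_snps):
    """Return (G_feature, G_rest) for SNP column indices inside/outside a feature."""
    feature = set(feature_snps)
    inside = [[row[j] for j in sorted(feature)] for row in genotypes]
    outside = [
        [row[j] for j in range(len(row)) if j not in feature] for row in genotypes
    ]
    return genomic_relationship(inside), genomic_relationship(outside)


# Hypothetical toy data: 3 animals x 4 SNPs; SNPs 0 and 1 lie in the feature.
toy_genotypes = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 0]]
g_feature, g_rest = split_relationships(toy_genotypes, feature_snps=[0, 1])
```

    In a two-component model of this kind, each matrix would receive its own variance component, which is what allows the feature SNPs to be weighted differently from the rest of the genome.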

  17. Direct-to-consumer genomics on the scales of autonomy

    Science.gov (United States)

    Vayena, Effy

    2015-01-01

    Direct-to-consumer (DTC) genetic services have generated enormous controversy from their first emergence. A dramatic recent manifestation of this is the Food and Drug Administration's (FDA) cease and desist order against 23andMe, the leading provider in the market. Critics have argued for the restrictive regulation of such services, and even their prohibition, on the grounds of the harm they pose to consumers. Their advocates, by contrast, defend them as a means of enhancing the autonomy of those same consumers. Autonomy emerges as a key battle-field in this debate, because many of the ‘harm’ arguments can be interpreted as identifying threats to autonomy. This paper assesses whether DTC genomic services are a threat to, or instead, an enhancement of, personal autonomy. It deploys Joseph Raz's account of personal autonomy, with its emphasis on choice from a range of valuable options. It then seeks to counter claims that DTC genomics threatens autonomy because it involves manipulation in contravention of consumers’ independence or because it does not generate valuable options which can be meaningfully engaged with by consumers. It is stressed that the value of the options generated by DTC genomics should not be judged exclusively from the perspective of medical actionability, but should take into consideration plural utilities. Finally, the paper ends by broaching policy recommendations, suggesting that there is a strong autonomy-based argument for permitting DTC genomic services, and that the key question is the nature of the regulatory conditions under which they should be permitted. The discussion of autonomy in this paper helps illuminate some of these conditions. PMID:24797610

  18. Genome Wide Transcriptome Analysis reveals ABA mediated response in Arabidopsis during Gold (AuCl4-) treatment

    Directory of Open Access Journals (Sweden)

    Devesh Shukla

    2014-11-01

    The unique physico-chemical properties of gold nanoparticles (AuNPs) find manifold applications in diagnostics, medicine and catalysis. Chemical synthesis produces reactive AuNPs and generates hazardous by-products. Alternatively, plants can be utilized to produce AuNPs in an eco-friendly manner. To better control the biosynthesis of AuNPs, we need to first understand the detailed molecular response induced by AuCl4-. In this study, we carried out global transcriptome analysis in root tissue of Arabidopsis grown for 12 hours in the presence of gold solution (HAuCl4) using the novel unbiased Affymetrix exon array. Transcriptomics analysis revealed differential regulation of a total of 704 genes and 4900 exons. Of these, 492 and 212 genes were up- and downregulated, respectively. The validation of the expressed key genes, such as glutathione-S-transferases, auxin responsive genes, cytochrome P450 82C2, methyl transferases, transducin (G protein beta subunit), ERF transcription factor, ABC, and MATE transporters, was carried out through quantitative RT-PCR. These key genes demonstrated specific induction under AuCl4- treatment relative to other heavy metals, suggesting a unique plant-gold interaction. GO enrichment analysis reveals the upregulation of processes like oxidative stress, glutathione binding, metal binding, transport, and plant hormonal responses. Changes predicted in biochemical pathways indicated major modulation in glutathione mediated detoxification, flavones and derivatives, and plant hormone biosynthesis. Motif search analysis identified a highly significant enriched motif, ACGT, which is an abscisic acid responsive core element (ABRE), suggesting the possibility of ABA-mediated signaling. Identification of abscisic acid response element (ABRE) points to the operation of a predominant signaling mechanism in response to AuCl4- exposure. Overall, this study presents a useful picture of plant-gold interaction with an identification of

  19. Genome-wide immunity studies in the rabbit: transcriptome variations in peripheral blood mononuclear cells after in vitro stimulation by LPS or PMA-Ionomycin.

    Science.gov (United States)

    Jacquier, Vincent; Estellé, Jordi; Schmaltz-Panneau, Barbara; Lecardonnel, Jérôme; Moroldo, Marco; Lemonnier, Gaëtan; Turner-Maier, Jason; Duranthon, Véronique; Oswald, Isabelle P; Gidenne, Thierry; Rogel-Gaillard, Claire

    2015-01-23

    Our purpose was to obtain genome-wide expression data for the rabbit species on the responses of peripheral blood mononuclear cells (PBMCs) after in vitro stimulation by lipopolysaccharide (LPS) or phorbol myristate acetate (PMA) and ionomycin. This transcriptome profiling was carried out using microarrays enriched with immunity-related genes, and annotated with the most recent data available for the rabbit genome. The LPS affected 15 to 20 times fewer genes than PMA-Ionomycin after both 4 hours (T4) and 24 hours (T24), of in vitro stimulation, in comparison with mock-stimulated PBMCs. LPS induced an inflammatory response as shown by a significant up-regulation of IL12A and CXCL11 at T4, followed by an increased transcription of IL6, IL1B, IL1A, IL36, IL37, TNF, and CCL4 at T24. Surprisingly, we could not find an up-regulation of IL8 either at T4 or at T24, and detected a down-regulation of DEFB1 and BPI at T24. A concerted up-regulation of SAA1, S100A12 and F3 was found upon stimulation by LPS. PMA-Ionomycin induced a very early expression of Th1, Th2, Treg, and Th17 responses by PBMCs at T4. The Th1 response increased at T24 as shown by the increase of the transcription of IFNG and by contrast to other cytokines which significantly decreased from T4 to T24 (IL2, IL4, IL10, IL13, IL17A, CD69) by comparison to mock-stimulation. The granulocyte-macrophage colony-stimulating factor (CSF2) was by far the most over-expressed gene at both T4 and T24 by comparison to mock-stimulated cells, confirming a major impact of PMA-Ionomycin on cell growth and proliferation. A significant down-regulation of IL16 was observed at T4 and T24, in agreement with a role of IL16 in PBMC apoptosis. We report new data on the responses of PBMCs to LPS and PMA-Ionomycin in the rabbit species, thus enlarging the set of mammalian species for which such reports exist. The availability of the rabbit genome assembly together with high throughput genomic tools should pave the way for more

  20. In silico method for modelling metabolism and gene product expression at genome scale

    Energy Technology Data Exchange (ETDEWEB)

    Lerman, Joshua A.; Hyduke, Daniel R.; Latif, Haythem; Portnoy, Vasiliy A.; Lewis, Nathan E.; Orth, Jeffrey D.; Rutledge, Alexandra C.; Smith, Richard D.; Adkins, Joshua N.; Zengler, Karsten; Palsson, Bernard O.

    2012-07-03

    Transcription and translation use raw materials and energy generated metabolically to create the macromolecular machinery responsible for all cellular functions, including metabolism. A biochemically accurate model of molecular biology and metabolism will facilitate comprehensive and quantitative computations of an organism's molecular constitution as a function of genetic and environmental parameters. Here we formulate a model of metabolism and macromolecular expression. Prototyping it using the simple microorganism Thermotoga maritima, we show our model accurately simulates variations in cellular composition and gene expression. Moreover, through in silico comparative transcriptomics, the model allows the discovery of new regulons and improving the genome and transcription unit annotations. Our method presents a framework for investigating molecular biology and cellular physiology in silico and may allow quantitative interpretation of multi-omics data sets in the context of an integrated biochemical description of an organism.

  1. TIGER: Toolbox for integrating genome-scale metabolic models, expression data, and transcriptional regulatory networks

    Directory of Open Access Journals (Sweden)

    Jensen Paul A

    2011-09-01

    Background: Several methods have been developed for analyzing genome-scale models of metabolism and transcriptional regulation. Many of these methods, such as Flux Balance Analysis, use constrained optimization to predict relationships between metabolic flux and the genes that encode and regulate enzyme activity. Recently, mixed integer programming has been used to encode these gene-protein-reaction (GPR) relationships into a single optimization problem, but these techniques are often of limited generality and lack a tool for automating the conversion of rules to a coupled regulatory/metabolic model. Results: We present TIGER, a Toolbox for Integrating Genome-scale Metabolism, Expression, and Regulation. TIGER converts a series of generalized, Boolean or multilevel rules into a set of mixed integer inequalities. The package also includes implementations of existing algorithms to integrate high-throughput expression data with genome-scale models of metabolism and transcriptional regulation. We demonstrate how TIGER automates the coupling of a genome-scale metabolic model with GPR logic and models of transcriptional regulation, thereby serving as a platform for algorithm development and large-scale metabolic analysis. Additionally, we demonstrate how TIGER's algorithms can be used to identify inconsistencies and improve existing models of transcriptional regulation with examples from the reconstructed transcriptional regulatory network of Saccharomyces cerevisiae. Conclusion: The TIGER package provides a consistent platform for algorithm development and extending existing genome-scale metabolic models with regulatory networks and high-throughput data.
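
    Converting Boolean rules into mixed integer inequalities can be illustrated with the textbook linearization of AND and OR over binary variables. The sketch below is not TIGER's code or API; the function names and the example rule "(A AND B) OR C" are hypothetical. It only checks by enumeration that the inequalities admit exactly the Boolean result.

```python
from itertools import product


def and_constraints(x, y, z):
    """True when binary z satisfies z <= x, z <= y, z >= x + y - 1 (z = x AND y)."""
    return z <= x and z <= y and z >= x + y - 1


def or_constraints(x, y, z):
    """True when binary z satisfies z >= x, z >= y, z <= x + y (z = x OR y)."""
    return z >= x and z >= y and z <= x + y


def feasible_outputs(constraint, x, y):
    """Values of the output binary that the inequalities allow for given inputs."""
    return [z for z in (0, 1) if constraint(x, y, z)]


def gpr_reaction_states(gene_a, gene_b, gene_c):
    """Feasible reaction indicators for the hypothetical rule (A AND B) OR C.

    An auxiliary binary stands for the A-B complex, as one would introduce
    when chaining linearized rules inside a single optimization problem.
    """
    states = set()
    for complex_ab, reaction in product((0, 1), repeat=2):
        if and_constraints(gene_a, gene_b, complex_ab) and or_constraints(
            complex_ab, gene_c, reaction
        ):
            states.add(reaction)
    return states
```

    For every combination of gene states the inequalities leave a single feasible reaction indicator, so the rule can sit alongside the flux constraints of a metabolic model without changing its logical meaning.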

  2. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing.

    Science.gov (United States)

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed a novel software package, the Genome-wide Microsatellite Analyzing Tool Package (GMATA), which integrates SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability, and enables simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allow GMATA to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than the rich G/C content in DNA sequence. We also revealed that SSR count is linearly related to chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar.
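
    As a rough illustration of what SSR mining involves, the following sketch scans a DNA string for perfect tandem repeats with a regular expression. It is not GMATA's algorithm; the function names, the motif length limit of 6 and the minimum of 5 repeats are assumptions chosen for the example.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'GAGA')."""
    length = len(motif)
    for d in range(1, length):
        if length % d == 0 and motif[:d] * (length // d) == motif:
            return False
    return True


def find_ssrs(sequence, max_motif_len=6, min_repeats=5):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    Thresholds are illustrative assumptions; real tools usually let the user
    set them per motif length.
    """
    sequence = sequence.upper()
    hits = []
    for k in range(1, max_motif_len + 1):
        # A k-base motif followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)
```

    Calling `find_ssrs("TT" + "GA" * 6 + "CC")` reports one GA dimer repeat starting at position 2 with six copies. Filtering non-primitive motifs keeps a run of A from also being reported as an AA dimer repeat.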

  3. An Integrative Bioinformatics Framework for Genome-scale Multiple Level Network Reconstruction of Rice

    Directory of Open Access Journals (Sweden)

    Liu Lili

    2013-06-01

    Understanding how metabolic reactions translate the genome of an organism into its phenotype is a grand challenge in biology. Genome-wide association studies (GWAS) statistically connect genotypes to phenotypes, without any recourse to known molecular interactions, whereas a molecular mechanistic description ties gene function to phenotype through gene regulatory networks (GRNs), protein-protein interactions (PPIs) and molecular pathways. Integration of different regulatory information levels of an organism is expected to provide a good way for mapping genotypes to phenotypes. However, the lack of a curated metabolic model of rice is blocking the exploration of genome-scale multi-level network reconstruction. Here, we have merged GRNs, PPIs and genome-scale metabolic networks (GSMNs) approaches into a single framework for rice via omics’ regulatory information reconstruction and integration. Firstly, we reconstructed a genome-scale metabolic model, containing 4,462 function genes, 2,986 metabolites involved in 3,316 reactions, and compartmentalized into ten subcellular locations. Furthermore, 90,358 pairs of protein-protein interactions, 662,936 pairs of gene regulations and 1,763 microRNA-target interactions were integrated into the metabolic model. Eventually, a database was developed for systematically storing and retrieving the genome-scale multi-level network of rice. This provides a reference for understanding genotype-phenotype relationship of rice, and for analysis of its molecular regulatory network.

  4. Savant Genome Browser 2: visualization and analysis for population-scale genomics.

    Science.gov (United States)

    Fiume, Marc; Smith, Eric J M; Brook, Andrew; Strbenac, Dario; Turner, Brian; Mezlini, Aziz M; Robinson, Mark D; Wodak, Shoshana J; Brudno, Michael

    2012-07-01

    High-throughput sequencing (HTS) technologies are providing an unprecedented capacity for data generation, and there is a corresponding need for efficient data exploration and analysis capabilities. Although most existing tools for HTS data analysis are developed for either automated (e.g. genotyping) or visualization (e.g. genome browsing) purposes, such tools are most powerful when combined. For example, integration of visualization and computation allows users to iteratively refine their analyses by updating computational parameters within the visual framework in real-time. Here we introduce the second version of the Savant Genome Browser, a standalone program for visual and computational analysis of HTS data. Savant substantially improves upon its predecessor and existing tools by introducing innovative visualization modes and navigation interfaces for several genomic datatypes, and synergizing visual and automated analyses in a way that is powerful yet easy even for non-expert users. We also present a number of plugins that were developed by the Savant Community, which demonstrate the power of integrating visual and automated analyses using Savant. The Savant Genome Browser is freely available (open source) at www.savantbrowser.com.

  5. Nuclear RNA sequencing of the mouse erythroid cell transcriptome.

    Directory of Open Access Journals (Sweden)

    Jennifer A Mitchell

    In addition to protein coding genes a substantial proportion of mammalian genomes are transcribed. However, most transcriptome studies investigate steady-state mRNA levels, ignoring a considerable fraction of the transcribed genome. In addition, steady-state mRNA levels are influenced by both transcriptional and posttranscriptional mechanisms, and thus do not provide a clear picture of transcriptional output. Here, using deep sequencing of nuclear RNAs (nucRNA-Seq) in parallel with chromatin immunoprecipitation sequencing (ChIP-Seq) of active RNA polymerase II, we compared the nuclear transcriptome of mouse anemic spleen erythroid cells with polymerase occupancy on a genome-wide scale. We demonstrate that unspliced transcripts quantified by nucRNA-seq correlate with primary transcript frequencies measured by RNA FISH, but differ from steady-state mRNA levels measured by poly(A)-enriched RNA-seq. Highly expressed protein coding genes showed good correlation between RNAPII occupancy and transcriptional output; however, genome-wide we observed a poor correlation between transcriptional output and RNAPII association. This poor correlation is due to intergenic regions associated with RNAPII which correspond with transcription factor bound regulatory regions and a group of stable, nuclear-retained long non-coding transcripts. In conclusion, sequencing the nuclear transcriptome provides an opportunity to investigate the transcriptional landscape in a given cell type through quantification of unspliced primary transcripts and the identification of nuclear-retained long non-coding RNAs.

  6. Genomic and transcriptomic analyses reveal differential regulation of diverse terpenoid and polyketides secondary metabolites in Hericium erinaceus.

    Science.gov (United States)

    Chen, Juan; Zeng, Xu; Yang, Yan Long; Xing, Yong Mei; Zhang, Qi; Li, Jia Mei; Ma, Ke; Liu, Hong Wei; Guo, Shun Xing

    2017-08-31

    The lion's mane mushroom Hericium erinaceus is a famous traditional medicinal fungus credited with anti-dementia activity and a producer of cyathane diterpenoid natural products (erinacines) useful against nervous system diseases. To date, few studies have explored the biosynthesis of these compounds, although their chemical synthesis is known. Here, we report the first genome and transcriptome sequence of the medicinal fungus H. erinaceus. The size of the genome is 39.35 Mb, containing 9895 gene models. The genome of H. erinaceus reveals diverse enzymes and a large family of cytochrome P450 (CYP) proteins involved in the biosynthesis of terpenoid backbones, diterpenoids, sesquiterpenes and polyketides. Three gene clusters related to terpene biosynthesis and one gene cluster for polyketides biosynthesis (PKS) were predicted. Genes involved in terpenoid biosynthesis were generally upregulated in mycelia, while the PKS gene was upregulated in the fruiting body. Comparative genome analysis of 42 fungal species of Basidiomycota revealed that most edible and medicinal mushrooms show many more gene clusters involved in terpenoid and polyketide biosynthesis compared to the pathogenic fungi. None of the gene clusters for terpenoid or polyketide biosynthesis were predicted in the poisonous mushroom Amanita muscaria. Our findings may facilitate future discovery and biosynthesis of bioactive secondary metabolites from H. erinaceus and provide fundamental information for exploring the secondary metabolites in other Basidiomycetes.

  7. Estimating phylogenetic trees from genome-scale data.

    Science.gov (United States)

    Liu, Liang; Xi, Zhenxiang; Wu, Shaoyuan; Davis, Charles C; Edwards, Scott V

    2015-12-01

    The heterogeneity of signals in the genomes of diverse organisms poses challenges for traditional phylogenetic analysis. Phylogenetic methods known as "species tree" methods have been proposed to directly address one important source of gene tree heterogeneity, namely the incomplete lineage sorting that occurs when evolving lineages radiate rapidly, resulting in a diversity of gene trees from a single underlying species tree. Here we review theory and empirical examples that help clarify conflicts between species tree and concatenation methods, and misconceptions in the literature about the performance of species tree methods. Considering concatenation as a special case of the multispecies coalescent model helps explain differences in the behavior of the two methods on phylogenomic data sets. Recent work suggests that species tree methods are more robust than concatenation approaches to some of the classic challenges of phylogenetic analysis, including rapidly evolving sites in DNA sequences and long-branch attraction. We show that approaches, such as binning, designed to augment the signal in species tree analyses can distort the distribution of gene trees and are inconsistent. Computationally efficient species tree methods incorporating biological realism are a key to phylogenetic analysis of whole-genome data. © 2015 New York Academy of Sciences.

  8. Comparative transcriptomics in the Triticeae

    Directory of Open Access Journals (Sweden)

    Waugh Robbie

    2009-06-01

    Background: Barley and particularly wheat are two grass species of immense agricultural importance. In spite of polyploidization events within the latter, studies have shown that genotypically and phenotypically these species are very closely related and, indeed, fertile hybrids can be created by interbreeding. The advent of two genome-scale Affymetrix GeneChips now allows studies of the comparison of their transcriptomes. Results: We have used the Wheat GeneChip to create a "gene expression atlas" for the wheat transcriptome (cv. Chinese Spring). For this, we chose mRNA from a range of tissues and developmental stages closely mirroring a comparable study carried out for barley (cv. Morex) using the Barley1 GeneChip. This, together with large-scale clustering of the probesets from the two GeneChips into "homologous groups", has allowed us to perform a genomic-scale comparative study of expression patterns in these two species. We explore the influence of the polyploidy of wheat on the results obtained with the Wheat GeneChip and quantify the correlation between conservation in gene sequence and gene expression in wheat and barley. In addition, we show how the conservation of expression patterns can be used to elucidate, probeset by probeset, the reliability of the Wheat GeneChip. Conclusion: While there are many differences in expression on the level of individual genes and tissues, we demonstrate that the wheat and barley transcriptomes appear highly correlated. This finding is significant not only because it is widely expected given the small evolutionary distance between the two species, but also because it demonstrates that it is possible to use the two GeneChips for comparative studies. This is the case even though their probeset composition reflects rather different design principles as well as, of course, the present incomplete knowledge of the gene content of the two species. We also show that, in general, the Wheat GeneChip is not able

  9. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system

    OpenAIRE

    Speth, D.R.; Zandt, M.H. in 't; Guerrero Cruz, S.; Dutilh, B.E.; Jetten, M.S.M.

    2016-01-01

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is used to seed reactors in wastewater treatment plants around the world; however, the role of most of its microbial community in ammonium removal remains unknown. Our analysis yielded 23 near-complete d...

  10. Profiling of secondary metabolite gene clusters regulated by LaeA in Aspergillus niger FGSC A1279 based on genome sequencing and transcriptome analysis.

    Science.gov (United States)

    Wang, Bin; Lv, Yangyong; Li, Xuejie; Lin, Yiying; Deng, Hai; Pan, Li

    The global regulator LaeA controls the production of many fungal secondary metabolites, possibly via chromatin remodeling. Here we aimed to survey the secondary metabolite profile regulated by LaeA in Aspergillus niger FGSC A1279 by genome sequencing and comparative transcriptomics between the laeA deletion (ΔlaeA) and overexpressing (OE-laeA) mutants. Genome sequencing revealed four putative polyketide synthase genes specific to FGSC A1279, suggesting that the corresponding polyketide compounds might be unique to FGSC A1279. RNA-seq data revealed 281 putative secondary metabolite genes upregulated in the OE-laeA mutants, including 22 secondary metabolite backbone genes. LC-MS chemical profiling illustrated that many secondary metabolites were produced in OE-laeA mutants compared to wild type and ΔlaeA mutants, providing potential resources for drug discovery. KEGG analysis annotated 16 secondary metabolite clusters putatively linked to metabolic pathways. Furthermore, 34 of 61 Zn2Cys6 transcription factors located in secondary metabolite clusters were differentially expressed between ΔlaeA and OE-laeA mutants. Three secondary metabolite clusters (clusters 18, 30 and 33) containing Zn2Cys6 transcription factors that were upregulated in OE-laeA mutants were putatively linked to KEGG pathways, suggesting that Zn2Cys6 transcription factors might play an important role in synthesizing secondary metabolites regulated by LaeA. Taken together, LaeA dramatically influences the secondary metabolite profile in FGSC A1279. Copyright © 2017 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  11. Genome-wide expression of transcriptomes and their co-expression pattern in subtropical maize (Zea mays L.) under waterlogging stress.

    Directory of Open Access Journals (Sweden)

    Nepolean Thirunavukkarasu

    Full Text Available Waterlogging causes extensive damage to maize crops in tropical and subtropical regions. The identification of tolerance genes and their interactions at the molecular level will be helpful to engineer tolerant genotypes. A whole-genome transcriptome assay revealed the specific role of genes in response to waterlogging stress in susceptible and tolerant genotypes. Genes involved in the synthesis of ethylene and auxin, cell wall metabolism, activation of G-proteins and formation of aerenchyma and adventitious roots, were upregulated in the tolerant genotype. Many transcription factors, particularly ERFs, MYB, HSPs, MAPK, and LOB-domain protein were involved in regulation of these traits. Genes responsible for scavenging of ROS generated under stress were expressed along with those involved in carbohydrate metabolism. The physical locations of 21 genes expressed in the tolerant genotype were found to correspond with the marker intervals of known QTLs responsible for development of adaptive traits. Among the candidate genes, most showed synteny with genes of sorghum and foxtail millet. Co-expression analysis of 528 microarray samples including 16 samples from the present study generated seven functional modules each in the two genotypes, with differing characteristics. In the tolerant genotype, stress genes were co-expressed along with peroxidase and fermentation pathway genes.

  12. Genome and Transcriptome sequence of Finger millet (Eleusine coracana (L.) Gaertn.) provides insights into drought tolerance and nutraceutical properties

    OpenAIRE

    Hittalmani, Shailaja; Mahesh, H. B.; Shirke, Meghana Deepak; Biradar, Hanamareddy; Uday, Govindareddy; Aruna, Y. R.; Lohithaswa, H. C.; Mohanrao, A.

    2017-01-01

    Background Finger millet (Eleusine coracana (L.) Gaertn.) is an important staple food crop widely grown in Africa and South Asia. Among the millets, finger millet has high amount of calcium, methionine, tryptophan, fiber, and sulphur containing amino acids. In addition, it has C4 photosynthetic carbon assimilation mechanism, which helps to utilize water and nitrogen efficiently under hot and arid conditions without severely affecting yield. Therefore, development and utilization of genomic re...

  13. Genome-wide mapping of transcription start sites yields novel insights into the primary transcriptome of Pseudomonas putida

    DEFF Research Database (Denmark)

    D'Arrigo, Isotta; Bojanovic, Klara; Yang, Xiaochen

    2016-01-01

    was examined using an in vivo assay with GFP-fusion vectors and shown to function via a translational repression mechanism. Furthermore, 56 novel intergenic small RNAs and 8 putative actuaton transcripts were detected, as well as 8 novel open reading frames (ORFs). This study illustrates how global mapping...... of TSSs can yield novel insights into the transcriptional features and RNA output of bacterial genomes....

  14. Genome-scale metabolic models as platforms for strain design and biological discovery.

    Science.gov (United States)

    Mienda, Bashir Sajo

    2017-07-01

    Genome-scale metabolic models (GEMs) have been developed and used to guide systems metabolic engineering strategies for strain design and development. This strategy has been used in fermentative production of bio-based industrial chemicals and fuels from alternative carbon sources. However, computer-aided hypothesis building using established algorithms and software platforms for biological discovery can be integrated into the strain design pipeline to create superior strains of microorganisms for targeted biosynthetic goals. Here, I describe an integrated workflow strategy using GEMs for strain design and biological discovery. Specific case studies of strain design and biological discovery using the Escherichia coli genome-scale model are presented and discussed. The integrated workflow presented herein, when applied carefully, would help guide future design strategies for high-performance microbial strains that have existing and forthcoming genome-scale metabolic models.

  15. Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

    Science.gov (United States)

    Mackey, Aaron J; Pearson, William R

    2004-10-01

    Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
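
    As a rough illustration of the "sequence library subset" idea described above, the following sketch uses Python's built-in sqlite3 module to select the proteins of one taxon and write them out as a FASTA library. The table name, column names and example rows are hypothetical and are not the actual seqdb_demo schema; the unit itself may use a different database engine.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the real seqdb_demo layout.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE protein (
    acc      TEXT PRIMARY KEY,
    taxon    TEXT NOT NULL,
    sequence TEXT NOT NULL
);
""")
conn.executemany(
    "INSERT INTO protein (acc, taxon, sequence) VALUES (?, ?, ?)",
    [
        ("P1", "Escherichia coli", "MKTAYIAKQR"),
        ("P2", "Homo sapiens", "MEEPQSDPSV"),
        ("P3", "Escherichia coli", "MSKGEELFTG"),
    ],
)


def library_subset_fasta(connection, taxon):
    """Return a FASTA-formatted string of all proteins from one taxon."""
    rows = connection.execute(
        "SELECT acc, sequence FROM protein WHERE taxon = ? ORDER BY acc",
        (taxon,),
    )
    return "".join(f">{acc}\n{seq}\n" for acc, seq in rows)


subset = library_subset_fasta(conn, "Escherichia coli")
print(subset)
```

    Searching against such a subset instead of the full library reduces the number of comparisons, which is the mechanism by which the abstract says statistical significance can improve.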

  16. Genomes and transcriptomes of partners in plant-fungal-interactions between canola (Brassica napus) and two Leptosphaeria species.

    Directory of Open Access Journals (Sweden)

    Rohan G T Lowe

    Full Text Available Leptosphaeria maculans 'brassicae' is a damaging fungal pathogen of canola (Brassica napus), causing lesions on cotyledons and leaves, and cankers on the lower stem. A related species, L. biglobosa 'canadensis', colonises cotyledons but causes few stem cankers. We describe the complement of genes encoding carbohydrate-active enzymes (CAZys) and peptidases of these fungi, as well as of four related plant pathogens. We also report dual-organism RNA-seq transcriptomes of these two Leptosphaeria species and B. napus during disease. During the first seven days of infection L. biglobosa 'canadensis', a necrotroph, expressed more cell wall degrading genes than L. maculans 'brassicae', a hemi-biotroph. L. maculans 'brassicae' expressed many genes in the Carbohydrate Binding Module class of CAZy, particularly CBM50 genes, with potential roles in the evasion of basal innate immunity in the host plant. At this time, three avirulence genes were amongst the top 20 most highly upregulated L. maculans 'brassicae' genes in planta. The two fungi had a similar number of peptidase genes, and trypsin was transcribed at high levels by both fungi early in infection. L. biglobosa 'canadensis' infection activated the jasmonic acid and salicylic acid defence pathways in B. napus, consistent with defence against necrotrophs. L. maculans 'brassicae' triggered a high level of expression of isochorismate synthase 1, a reporter for salicylic acid signalling. L. biglobosa 'canadensis' infection triggered coordinated shutdown of photosynthesis genes, and a concomitant increase in transcription of cell wall remodelling genes of the host plant. Expression of particular classes of CAZy genes and the triggering of host defence and particular metabolic pathways are consistent with the necrotrophic lifestyle of L. biglobosa 'canadensis', and the hemibiotrophic lifestyle of L. maculans 'brassicae'.

  17. Rapid prototyping of microbial cell factories via genome-scale engineering.

    Science.gov (United States)

    Si, Tong; Xiao, Han; Zhao, Huimin

    2015-11-15

    Advances in reading, writing and editing genetic materials have greatly expanded our ability to reprogram biological systems at the resolution of a single nucleotide and on the scale of a whole genome. Such capacity has greatly accelerated the cycles of design, build and test to engineer microbes for efficient synthesis of fuels, chemicals and drugs. In this review, we summarize the emerging technologies that have been applied, or are potentially useful for genome-scale engineering in microbial systems. We will focus on the development of high-throughput methodologies, which may accelerate the prototyping of microbial cell factories. Copyright © 2014 Elsevier Inc. All rights reserved.

  18. Rapid Prototyping of Microbial Cell Factories via Genome-scale Engineering

    Science.gov (United States)

    Si, Tong; Xiao, Han; Zhao, Huimin

    2014-01-01

    Advances in reading, writing and editing genetic materials have greatly expanded our ability to reprogram biological systems at the resolution of a single nucleotide and on the scale of a whole genome. Such capacity has greatly accelerated the cycles of design, build and test to engineer microbes for efficient synthesis of fuels, chemicals and drugs. In this review, we summarize the emerging technologies that have been applied, or are potentially useful for genome-scale engineering in microbial systems. We will focus on the development of high-throughput methodologies, which may accelerate the prototyping of microbial cell factories. PMID:25450192

  19. Exploring massive, genome scale datasets with the genometricorr package

    KAUST Repository

    Favorov, Alexander; Mularoni, Loris; Cope, Leslie M.; Medvedeva, Yulia; Mironov, Andrey A.; Makeev, Vsevolod J.; Wheelan, Sarah J.

    2012-01-01

    We have created a statistically grounded tool for determining the correlation of genomewide data with other datasets or known biological features, intended to guide biological exploration of high-dimensional datasets, rather than providing immediate answers. The software enables several biologically motivated approaches to these data and here we describe the rationale and implementation for each approach. Our models and statistics are implemented in an R package that efficiently calculates the spatial correlation between two sets of genomic intervals (data and/or annotated features), for use as a metric of functional interaction. The software handles any type of pointwise or interval data and instead of running analyses with predefined metrics, it computes the significance and direction of several types of spatial association; this is intended to suggest potentially relevant relationships between the datasets. Availability and implementation: The package, GenometriCorr, can be freely downloaded at http://genometricorr.sourceforge.net/. Installation guidelines and examples are available from the sourceforge repository. The package is pending submission to Bioconductor. © 2012 Favorov et al.
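
    To make the notion of spatial association between two sets of genomic intervals concrete, here is a minimal Python sketch of one simple overlap measure, the Jaccard index (bases shared by both sets divided by bases covered by either). It is an illustrative assumption rather than the package's implementation: GenometriCorr is an R package, its own statistics and significance tests are not reproduced here, and the function names below are invented for this example. Intervals are treated as half-open [start, end) on a single chromosome.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals [start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def covered_length(intervals):
    """Total number of bases covered by a set of intervals."""
    return sum(end - start for start, end in merge_intervals(intervals))


def intersection_length(a, b):
    """Number of bases covered by both interval sets (two-pointer sweep)."""
    a, b = merge_intervals(a), merge_intervals(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def jaccard(a, b):
    """Jaccard index of two interval sets: shared bases / bases in either."""
    inter = intersection_length(a, b)
    union = covered_length(a) + covered_length(b) - inter
    return inter / union if union else 0.0


query = [(0, 10), (20, 30)]
reference = [(5, 15), (25, 35)]
print(jaccard(query, reference))
```

    A raw overlap value like this says nothing about significance on its own; assessing that would need a null model, for example repeated random placement of one interval set, which this sketch does not attempt.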

  20. Exploring massive, genome scale datasets with the genometricorr package

    KAUST Repository

    Favorov, Alexander

    2012-05-31

    We have created a statistically grounded tool for determining the correlation of genomewide data with other datasets or known biological features, intended to guide biological exploration of high-dimensional datasets, rather than providing immediate answers. The software enables several biologically motivated approaches to these data and here we describe the rationale and implementation for each approach. Our models and statistics are implemented in an R package that efficiently calculates the spatial correlation between two sets of genomic intervals (data and/or annotated features), for use as a metric of functional interaction. The software handles any type of pointwise or interval data and instead of running analyses with predefined metrics, it computes the significance and direction of several types of spatial association; this is intended to suggest potentially relevant relationships between the datasets. Availability and implementation: The package, GenometriCorr, can be freely downloaded at http://genometricorr.sourceforge.net/. Installation guidelines and examples are available from the sourceforge repository. The package is pending submission to Bioconductor. © 2012 Favorov et al.

  1. Tools to covisualize and coanalyze proteomic data with genomes and transcriptomes: validation of genes and alternative mRNA splicing

    DEFF Research Database (Denmark)

    Pang, Chi; Tay, Aidan; Aya, Carlos

    2014-01-01

    contigs, along with RNA-seq reads. This is done in the Integrated Genome Viewer (IGV). A Results Analyzer reports the precise base position where LC-MS/MS-derived peptides cover genes or gene isoforms, on the chromosomes or contigs where this occurs. In prokaryotes, the PG Nexus pipeline facilitates...... the validation of genes, where annotation or gene prediction is available, or the discovery of genes using a "virtual protein"-based unbiased approach. We illustrate this with a comprehensive proteogenomics analysis of two strains of Campylobacter concisus. For higher eukaryotes, the PG Nexus facilitates gene...

  2. Improved annotation through genome-scale metabolic modeling of Aspergillus oryzae

    DEFF Research Database (Denmark)

    Vongsangnak, Wanwipa; Olsen, Peter; Hansen, Kim

    2008-01-01

    Background: Since ancient times the filamentous fungus Aspergillus oryzae has been used in the fermentation industry for the production of fermented sauces and the production of industrial enzymes. Recently, the genome sequence of A. oryzae with 12,074 annotated genes was released but the number...... to a genome-scale metabolic model of A. oryzae. Results: Using our assembled EST sequences, we identified 1,046 newly predicted genes in the A. oryzae genome. Furthermore, it was possible to assign putative protein functions to 398 of the newly predicted genes. Noteworthy, our annotation strategy resulted...... model was validated and shown to correctly describe the phenotypic behavior of A. oryzae grown on different carbon sources. Conclusion: A much enhanced annotation of the A. oryzae genome was performed and a genome-scale metabolic model of A. oryzae was reconstructed. The model accurately predicted...

  3. Characterization of the Pathogenicity of Streptococcus intermedius TYG1620 Isolated from a Human Brain Abscess Based on the Complete Genome Sequence with Transcriptome Analysis and Transposon Mutagenesis in a Murine Subcutaneous Abscess Model.

    Science.gov (United States)

    Hasegawa, Noriko; Sekizuka, Tsuyoshi; Sugi, Yutaka; Kawakami, Nobuhiro; Ogasawara, Yumiko; Kato, Kengo; Yamashita, Akifumi; Takeuchi, Fumihiko; Kuroda, Makoto

    2017-02-01

    Streptococcus intermedius is known to cause periodontitis and pyogenic infections in the brain and liver. Here we report the complete genome sequence of strain TYG1620 (genome size, 2,006,877 bp; GC content, 37.6%; 2,020 predicted open reading frames [ORFs]) isolated from a brain abscess in an infant. Comparative analysis of S. intermedius genome sequences suggested that TYG1620 carries a notable type VII secretion system (T7SS), two long repeat regions, and 19 ORFs for cell wall-anchored proteins (CWAPs). To elucidate the genes responsible for the pathogenicity of TYG1620, transcriptome analysis was performed in a murine subcutaneous abscess model. The results suggest that the levels of expression of small hypothetical proteins similar to phenol-soluble modulin β1 (PSMβ1), a staphylococcal virulence factor, significantly increased in the abscess model. In addition, an experiment in a murine subcutaneous abscess model with random transposon (Tn) mutant attenuation suggested that Tn mutants with mutations in 212 ORFs in the Tn mutant library were attenuated in the murine abscess model (629 ORFs were disrupted in total); the 212 ORFs are putatively essential for abscess formation. Transcriptome analysis identified 37 ORFs, including paralogs of the T7SS and a putative glucan-binding CWAP in long repeat regions, to be upregulated and attenuated in vivo. This study provides a comprehensive characterization of S. intermedius pathogenicity based on the complete genome sequence and a murine subcutaneous abscess model with transcriptome and Tn mutagenesis, leading to the identification of pivotal targets for vaccines or antimicrobial agents for the control of S. intermedius infections. Copyright © 2017 American Society for Microbiology.

  4. The floral transcriptome of Eucalyptus grandis

    CSIR Research Space (South Africa)

    Vining, KJ

    2015-10-01

    Full Text Available As a step toward functional annotation of genes required for floral initiation and development within the Eucalyptus genome, we used short read sequencing to analyze transcriptomes of floral buds from early and late developmental stages...

  5. Genome-Wide Fine-Scale Recombination Rate Variation in Drosophila melanogaster

    Science.gov (United States)

    Song, Yun S.

    2012-01-01

    Estimating fine-scale recombination maps of Drosophila from population genomic data is a challenging problem, in particular because of the high background recombination rate. In this paper, a new computational method is developed to address this challenge. Through an extensive simulation study, it is demonstrated that the method allows more accurate inference, and exhibits greater robustness to the effects of natural selection and noise, compared to a well-used previous method developed for studying fine-scale recombination rate variation in the human genome. As an application, a genome-wide analysis of genetic variation data is performed for two Drosophila melanogaster populations, one from North America (Raleigh, USA) and the other from Africa (Gikongoro, Rwanda). It is shown that fine-scale recombination rate variation is widespread throughout the D. melanogaster genome, across all chromosomes and in both populations. At the fine-scale, a conservative, systematic search for evidence of recombination hotspots suggests the existence of a handful of putative hotspots each with at least a tenfold increase in intensity over the background rate. A wavelet analysis is carried out to compare the estimated recombination maps in the two populations and to quantify the extent to which recombination rates are conserved. In general, similarity is observed at very broad scales, but substantial differences are seen at fine scales. The average recombination rate of the X chromosome appears to be higher than that of the autosomes in both populations, and this pattern is much more pronounced in the African population than the North American population. The correlation between various genomic features—including recombination rates, diversity, divergence, GC content, gene content, and sequence quality—is examined using the wavelet analysis, and it is shown that the most notable difference between D. melanogaster and humans is in the correlation between recombination and

  6. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    DEFF Research Database (Denmark)

    King, Zachary A.; Lu, Justin; Dräger, Andreas

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repo...

  7. Experience from large scale use of the EuroGenomics custom SNP chip in cattle

    DEFF Research Database (Denmark)

    Boichard, Didier A; Boussaha, Mekki; Capitan, Aurélien

    2018-01-01

    This article presents the strategy to evaluate candidate mutations underlying QTL or responsible for genetic defects, based upon the design and large-scale use of the Eurogenomics custom SNP chip set up for bovine genomic selection. Some variants under study originated from mapping genetic defect...

  8. Genome-Wide Transcriptome Analysis of Cotton (Gossypium hirsutum L.) Identifies Candidate Gene Signatures in Response to Aflatoxin Producing Fungus Aspergillus flavus.

    Directory of Open Access Journals (Sweden)

    Renesh Bedre

    Full Text Available Aflatoxins are toxic and potent carcinogenic metabolites produced by the fungi Aspergillus flavus and A. parasiticus. Aflatoxins can contaminate cottonseed under conducive preharvest and postharvest conditions. United States federal regulations restrict the use of aflatoxin contaminated cottonseed at >20 ppb for animal feed. Several strategies have been proposed for controlling aflatoxin contamination, and much success has been achieved by the application of an atoxigenic strain of A. flavus in cotton, peanut and maize fields. Development of cultivars resistant to aflatoxin through overexpression of resistance associated genes and/or knocking down aflatoxin biosynthesis of A. flavus will be an effective strategy for controlling aflatoxin contamination in cotton. In this study, genome-wide transcriptome profiling was performed to identify differentially expressed genes in response to infection with both toxigenic and atoxigenic strains of A. flavus on cotton (Gossypium hirsutum L.) pericarp and seed. The genes involved in antifungal response, oxidative burst, transcription factors, defense signaling pathways and stress response were highly differentially expressed in pericarp and seed tissues in response to A. flavus infection. The cell-wall modifying genes and genes involved in the production of antimicrobial substances were more active in pericarp than in seed. The genes involved in auxin and cytokinin signaling were also induced. Most of the genes involved in defense response in cotton were more highly induced in pericarp than in seed. The global gene expression analysis in response to fungal invasion in cotton will serve as a source for identifying biomarkers for breeding and potential candidate genes for transgenic manipulation, and will help in understanding complex plant-fungal interaction for future downstream research.

  9. The neuropeptides and protein hormones of the agricultural pest fruit fly Bactrocera dorsalis: What do we learn from the genome sequencing and tissue-specific transcriptomes?

    Science.gov (United States)

    Gui, Shun-Hua; Jiang, Hong-Bo; Smagghe, Guy; Wang, Jin-Jun

    2017-12-01

    Neuropeptides and protein hormones are very important signaling molecules, and are involved in the regulation and coordination of various physiological processes in invertebrates and vertebrates. Using a bioinformatics approach, we screened the recently sequenced genome and six tissue-specific transcriptome databases (central nervous system, fat body, ovary, testes, male accessory glands, antennae) of the oriental fruit fly (Bactrocera dorsalis) that is economically one of the most important pest insects of tropical and subtropical fruit. Thirty-nine candidate genes were found to encode neuropeptides or protein hormones. These include most of the known insect neuropeptides and protein hormones, with the exception of adipokinetic hormone-corazonin-related peptide, allatropin, diuretic hormone 34, diuretic hormone 45, IMFamide, inotocin, and sex peptide. Our results showed the neuropeptides and protein hormones of Diptera insects appear to have a reduced repertoire compared to some other insects. Moreover, there are also differences between B. dorsalis and the super-model of Drosophila melanogaster. Interesting features of the oriental fruit fly are the absence of genes coding for sex peptide and the presence of neuroparsin and two genes coding neuropeptide F. The majority of the identified neuropeptides and protein hormones is present in the central nervous system, with only a limited number of these in the other tissues. Moreover, we predicted their physiological functions via comparing with data of FlyBase and FlyAtlas. Taken together, owing to the large number of identified peptides, this study can be used as a reference about structure, tissue distribution and physiological functions for comparative studies in other model and important pest insects. Copyright © 2017 Elsevier Inc. All rights reserved.

  10. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    Energy Technology Data Exchange (ETDEWEB)

    Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric; Abernathy, Jason; Waldbieser, Geoff; Lindquist, Erika; Richardson, Paul; Lucas, Susan; Wang, Mei; Li, Ping; Thimmapuram, Jyothi; Liu, Lei; Vullaganti, Deepika; Kucuktas, Huseyin; Murdock, Christopher; Small, Brian C; Wilson, Melanie; Liu, Hong; Jiang, Yanliang; Lee, Yoona; Chen, Fei; Lu, Jianguo; Wang, Wenqi; Xu, Peng; Somridhivej, Benjaporn; Baoprasertkul, Puttharat; Quilang, Jonas; Sha, Zhenxia; Bao, Baolong; Wang, Yaping; Wang, Qun; Takano, Tomokazu; Nandi, Samiran; Liu, Shikai; Wong, Lilian; Kaltenboeck, Ludmilla; Quiniou, Sylvie; Bengten, Eva; Miller, Norman; Trant, John; Rokhsar, Daniel; Liu, Zhanjiang

    2010-03-23

    Background: Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results: A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35% of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele present in at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions: This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from the assembly of all catfish ESTs will greatly benefit the catfish introgression breeding program and whole genome association studies.
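
    The high-quality SNP criterion quoted above (at least four sequences, with the minor allele present in at least two of them) can be sketched as a small Python filter. This is a simplified reading, not the consortium's pipeline: it applies the depth threshold to the sequences covering a single site rather than to the contig as a whole, takes the second most frequent base as the minor allele, and uses a hypothetical function name and parameters.

```python
from collections import Counter


def is_high_quality_snp(bases, min_depth=4, min_minor=2):
    """Decide whether one alignment column looks like a high-quality SNP.

    bases: the base call contributed by each EST covering the site,
    e.g. "AAGG". Thresholds follow the abstract; applying them per site
    is an assumption made for this sketch.
    """
    if len(bases) < min_depth:
        return False
    counts = Counter(bases).most_common()
    if len(counts) < 2:
        return False  # monomorphic site, no SNP
    minor_count = counts[1][1]  # second most frequent allele
    return minor_count >= min_minor


columns = {"site1": "AAGG", "site2": "AAAG", "site3": "AGG", "site4": "AAAA"}
high_quality = [name for name, col in columns.items() if is_high_quality_snp(col)]
print(high_quality)
```

    Requiring the minor allele in at least two sequences guards against single-read sequencing errors being called as polymorphisms, which is presumably why the stricter subset is so much smaller than the full set of putative SNPs.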

  11. Genome-wide transcriptome study in wheat identified candidate genes related to processing quality, majority of them showing interaction (quality x development) and having temporal and spatial distributions.

    Science.gov (United States)

    Singh, Anuradha; Mantri, Shrikant; Sharma, Monica; Chaudhury, Ashok; Tuli, Rakesh; Roy, Joy

    2014-01-16

    The cultivated bread wheat (Triticum aestivum L.) possesses unique flour quality, which can be processed into many end-use food products such as bread, pasta, chapatti (unleavened flat bread), biscuit, etc. The present wheat varieties require improvement in processing quality to meet the increasing demand for better quality food products. However, processing quality is very complex and controlled by many genes, which have not been completely explored. To identify the candidate genes whose expression changed due to variation in processing quality and interaction (quality x development), genome-wide transcriptome studies were performed in two sets of diverse Indian wheat varieties differing in chapatti quality. It is also important to understand the temporal and spatial distributions of their expression for designing tissue and growth specific functional genomics experiments. Gene-specific two-way ANOVA analysis of expression of about 55 K transcripts in two diverse sets of Indian wheat varieties for chapatti quality at three seed developmental stages identified 236 differentially expressed probe sets (10-fold). Out of 236, 110 probe sets were identified for chapatti quality. Many processing quality related key genes such as glutenin and gliadins, puroindolines, grain softness protein, alpha and beta amylases, and proteases were identified, and many other candidate genes related to cellular and molecular functions were also identified. The ANOVA analysis revealed that the expression of 56 of 110 probe sets was involved in interaction (quality x development). The majority of the probe sets showed differential expression at an early stage of seed development, i.e. temporal expression. Meta-analysis revealed that the majority of the genes were expressed in one or a few growth stages, indicating spatial distribution of their expression. The differential expression of a few candidate genes such as pre-alpha/beta-gliadin and gamma gliadin was validated by RT-PCR. Therefore, this study...

  12. Genome-wide transcriptome study in wheat identified candidate genes related to processing quality, majority of them showing interaction (quality x development) and having temporal and spatial distributions

    Science.gov (United States)

    2014-01-01

    Background The cultivated bread wheat (Triticum aestivum L.) possesses unique flour quality, which can be processed into many end-use food products such as bread, pasta, chapatti (unleavened flat bread), biscuit, etc. The present wheat varieties require improvement in processing quality to meet the increasing demand of better quality food products. However, processing quality is very complex and controlled by many genes, which have not been completely explored. To identify the candidate genes whose expressions changed due to variation in processing quality and interaction (quality x development), genome-wide transcriptome studies were performed in two sets of diverse Indian wheat varieties differing for chapatti quality. It is also important to understand the temporal and spatial distributions of their expressions for designing tissue and growth specific functional genomics experiments. Results Gene-specific two-way ANOVA analysis of expression of about 55 K transcripts in two diverse sets of Indian wheat varieties for chapatti quality at three seed developmental stages identified 236 differentially expressed probe sets (10-fold). Out of 236, 110 probe sets were identified for chapatti quality. Many processing quality related key genes such as glutenin and gliadins, puroindolines, grain softness protein, alpha and beta amylases, proteases, were identified, and many other candidate genes related to cellular and molecular functions were also identified. The ANOVA analysis revealed that the expression of 56 of 110 probe sets was involved in interaction (quality x development). The majority of the probe sets showed differential expression at an early stage of seed development, i.e., temporal expression. Meta-analysis revealed that the majority of the genes were expressed in one or a few growth stages, indicating spatial distribution of their expressions. The differential expressions of a few candidate genes such as pre-alpha/beta-gliadin and gamma gliadin were validated by RT

  13. Rare and common regulatory variation in population-scale sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Stephen B Montgomery

    2011-07-01

    Full Text Available Population-scale genome sequencing allows the characterization of functional effects of a broad spectrum of genetic variants underlying human phenotypic variation. Here, we investigate the influence of rare and common genetic variants on gene expression patterns, using variants identified from sequencing data from the 1000 Genomes Project in an African and European population sample and gene expression data from lymphoblastoid cell lines. We detect comparable numbers of expression quantitative trait loci (eQTLs) when compared to genotypes obtained from HapMap 3, but as many as 80% of the top expression quantitative trait variants (eQTVs) discovered from 1000 Genomes data are novel. The properties of the newly discovered variants suggest that mapping common causal regulatory variants is challenging even with full resequencing data; however, we observe significant enrichment of regulatory effects in splice-site and nonsense variants. Using RNA sequencing data, we show that 46.2% of nonsynonymous variants are differentially expressed in at least one individual in our sample, creating widespread potential for interactions between functional protein-coding and regulatory variants. We also use allele-specific expression to identify putative rare causal regulatory variants. Furthermore, we demonstrate that outlier expression values can be due to rare variant effects, and we approximate the number of such effects harboured in an individual by effect size. Our results demonstrate that integration of genomic and RNA sequencing analyses allows for the joint assessment of genome sequence and genome function.

  14. Survey of protein–DNA interactions in Aspergillus oryzae on a genomic scale

    Science.gov (United States)

    Wang, Chao; Lv, Yangyong; Wang, Bin; Yin, Chao; Lin, Ying; Pan, Li

    2015-01-01

    The genome-scale delineation of in vivo protein–DNA interactions is key to understanding genome function. Only ∼5% of transcription factors (TFs) in the Aspergillus genus have been identified using traditional methods. Although the Aspergillus oryzae genome contains >600 TFs, knowledge of the in vivo genome-wide TF-binding sites (TFBSs) in aspergilli remains limited because of the lack of high-quality antibodies. We investigated the landscape of in vivo protein–DNA interactions across the A. oryzae genome through coupling the DNase I digestion of intact nuclei with massively parallel sequencing and the analysis of cleavage patterns in protein–DNA interactions at single-nucleotide resolution. The resulting map identified overrepresented de novo TF-binding motifs from genomic footprints, and provided the detailed chromatin remodeling patterns and the distribution of digital footprints near transcription start sites. The TFBSs of 19 known Aspergillus TFs were also identified based on DNase I digestion data surrounding potential binding sites in conjunction with TF binding specificity information. We observed that the cleavage patterns of TFBSs were dependent on the orientation of TF motifs and independent of strand orientation, consistent with the DNA shape features of binding motifs with flanking sequences. PMID:25883143

  15. Testing the neutral theory of molecular evolution using genomic data: a comparison of the human and bovine transcriptome

    Directory of Open Access Journals (Sweden)

    McCulloch Alan

    2006-04-01

    Full Text Available Abstract Despite growing evidence of rapid evolution in protein coding genes, the contribution of positive selection to intra- and interspecific differences in protein coding regions of the genome is unclear. We attempted to see if genes coding for secreted proteins and genes with narrow expression, specifically those preferentially expressed in the mammary gland, have diverged at a faster rate between domestic cattle (Bos taurus) and humans (Homo sapiens) than other genes and whether positive selection is responsible. Using a large data set, we identified groups of genes based on secretion and expression patterns and compared them for the rate of nonsynonymous (dN) and synonymous (dS) substitutions per site and the number of radical (Dr) and conservative (Dc) amino acid substitutions. We found evidence of rapid evolution in genes with narrow expression, especially for those expressed in the liver and mammary gland and for genes coding for secreted proteins. We compared common human polymorphism data with human-cattle divergence and found that genes with high evolutionary rates in human-cattle divergence also had a large number of common human polymorphisms. This argues against positive selection causing rapid divergence in these groups of genes. In most cases dN/dS ratios were lower in human-cattle divergence than in common human polymorphism, presumably due to differences in the effectiveness of purifying selection between long-term divergence and short-term polymorphism.
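
    The dN/dS comparison described in this record can be illustrated with a minimal sketch. The function names and the numeric rates below are hypothetical and chosen only for illustration; they are not values from the study. The sketch uses the conventional reading of the ratio (below 1 suggests purifying selection, above 1 suggests positive selection), and a real analysis would estimate dN and dS from aligned coding sequences first.

```python
def dn_ds_ratio(dn, ds):
    """Return omega = dN/dS, or None when dS is zero and the ratio is undefined."""
    if ds == 0:
        return None
    return dn / ds


def selection_regime(omega):
    """Conventional interpretation of omega; real data would call for a statistical test."""
    if omega is None:
        return "undefined"
    if omega < 1:
        return "purifying"
    if omega > 1:
        return "positive"
    return "neutral"


# Hypothetical rates for one gene group (illustrative numbers only).
divergence_omega = dn_ds_ratio(dn=0.04, ds=0.20)        # long-term divergence
polymorphism_omega = dn_ds_ratio(dn=0.0009, ds=0.0030)  # short-term polymorphism

print(selection_regime(divergence_omega), selection_regime(polymorphism_omega))
print("lower in divergence:", divergence_omega < polymorphism_omega)
```

    With these made-up inputs both ratios fall below 1 and the divergence ratio is the smaller of the two, which is the pattern the abstract attributes to purifying selection acting over a longer time span.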

  16. Disturbance of gene expression in primary human hepatocytes by hepatotoxic pyrrolizidine alkaloids: A whole genome transcriptome analysis.

    Science.gov (United States)

    Luckert, Claudia; Hessel, Stefanie; Lenze, Dido; Lampen, Alfonso

    2015-10-01

    1,2-unsaturated pyrrolizidine alkaloids (PA) are plant metabolites predominantly occurring in the plant families Asteraceae and Boraginaceae. Acute and chronic PA poisoning causes severe hepatotoxicity. So far, the molecular mechanisms of PA toxicity are not well understood. To analyze its mode of action, primary human hepatocytes were exposed to a non-cytotoxic dose of 100 μM of four structurally different PA: echimidine, heliotrine, senecionine, senkirkine. Changes in mRNA expression were analyzed by a whole genome microarray. Employing cut-off values with a |fold change| of 2 and a q-value of 0.01, data analysis revealed numerous changes in gene expression. In total, 4556, 1806, 3406 and 8623 genes were regulated by echimidine, heliotrine, senecionine and senkirkine, respectively. 1304 genes were identified as commonly regulated. PA affected pathways related to cell cycle regulation, cell death and cancer development. The transcription factors TP53, MYC, NFκB and NUPR1 were predicted to be activated upon PA treatment. Furthermore, gene expression data showed a considerable interference with lipid metabolism and bile acid flow. The associated transcription factors FXR, LXR, SREBF1/2, and PPARα/γ/δ were predicted to be inhibited. In conclusion, though structurally different, all four PA significantly regulated a great number of genes in common. This suggests similar molecular mechanisms, although the extent seems to differ between the analyzed PA as reflected by the potential hepatotoxicity and individual PA structure. Copyright © 2015 Elsevier Ltd. All rights reserved.
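
    The cut-off filtering and the "commonly regulated" count described in this record can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the gene names, compound labels and values are hypothetical, and treating the thresholds as inclusive (|fold change| >= 2, q <= 0.01) is an assumption, since the abstract does not say.

```python
def regulated_genes(records, min_abs_fold=2.0, max_q=0.01):
    """Return the set of genes passing both cut-offs.

    records: iterable of (gene, signed_fold_change, q_value) tuples,
    where +2.5 means 2.5-fold up and -3.0 means 3-fold down.
    """
    selected = set()
    for gene, fold_change, q_value in records:
        if abs(fold_change) >= min_abs_fold and q_value <= max_q:
            selected.add(gene)
    return selected


def commonly_regulated(per_compound):
    """Intersect the regulated gene sets of all compounds (direction is ignored)."""
    gene_sets = [regulated_genes(records) for records in per_compound.values()]
    return set.intersection(*gene_sets) if gene_sets else set()


# Hypothetical measurements for two compounds (illustrative values only).
example = {
    "compound_A": [("GENE1", 2.4, 0.001), ("GENE2", -3.1, 0.004), ("GENE3", 1.5, 0.0001)],
    "compound_B": [("GENE1", -2.0, 0.01), ("GENE2", 2.2, 0.2), ("GENE4", 5.0, 0.002)],
}
common = commonly_regulated(example)
print(sorted(common))
```

    In this toy input only GENE1 passes both cut-offs for both compounds. GENE3 fails on fold change and GENE2 fails on q-value for compound_B.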

  17. Toward the automated generation of genome-scale metabolic networks in the SEED.

    Science.gov (United States)

    DeJongh, Matthew; Formsma, Kevin; Boillot, Paul; Gould, John; Rycenga, Matthew; Best, Aaron

    2007-04-26

    Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative genome annotation and analysis. 
Our method sets the

  18. Toward the automated generation of genome-scale metabolic networks in the SEED

    Directory of Open Access Journals (Sweden)

    Gould John

    2007-04-01

    Full Text Available Abstract Background Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. Results We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative

  19. Reciprocal genomic evolution in the ant-fungus agricultural symbiosis

    DEFF Research Database (Denmark)

    Nygaard, Sanne; Hu, Haofu; Li, Cai

    2016-01-01

    The attine ant-fungus agricultural symbiosis evolved over tens of millions of years, producing complex societies with industrial-scale farming analogous to that of humans. Here we document reciprocal shifts in the genomes and transcriptomes of seven fungus-farming ant species and their fungal...

  20. Identifying anti-growth factors for human cancer cell lines through genome-scale metabolic modeling

    DEFF Research Database (Denmark)

    Ghaffari, Pouyan; Mardinoglu, Adil; Asplund, Anna

    2015-01-01

    Human cancer cell lines are used as important model systems to study molecular mechanisms associated with tumor growth, hereunder how genomic and biological heterogeneity found in primary tumors affect cellular phenotypes. We reconstructed Genome scale metabolic models (GEMs) for eleven cell lines...... based on RNA-Seq data and validated the functionality of these models with data from metabolite profiling. We used cell line-specific GEMs to analyze the differences in the metabolism of cancer cell lines, and to explore the heterogeneous expression of the metabolic subsystems. Furthermore, we predicted...... for inhibition of cell growth may provide leads for the development of efficient cancer treatment strategies....

  1. Genome-wide transcriptomic responses of the seagrasses Zostera marina and Nanozostera noltii under a simulated heatwave confirm functional types.

    Science.gov (United States)

    Franssen, Susanne U; Gu, Jenny; Winters, Gidon; Huylmans, Ann-Kathrin; Wienpahl, Isabell; Sparwel, Maximiliane; Coyer, James A; Olsen, Jeanine L; Reusch, Thorsten B H; Bornberg-Bauer, Erich

    2014-06-01

    Genome-wide transcription analysis between related species occurring in overlapping ranges can provide insights into the molecular basis underlying different ecological niches. The co-occurring seagrass species, Zostera marina and Nanozostera noltii, are found in marine coastal environments throughout the northern hemisphere. Z. marina is often dominant in subtidal environments and subjected to fewer temperature extremes compared to the predominately intertidal and more stress-tolerant N. noltii. We exposed plants of both species to a realistic heat wave scenario in a common-stress-garden experiment. Using RNA-seq (~7 million reads/library), four Z. marina and four N. noltii libraries were compared representing northern (Denmark) and southern (Italy) locations within the co-occurring range of the species' European distribution. A total of 8977 expressed genes were identified, of which 78 were directly related to heat stress. As predicted, both species were negatively affected by the heat wave, but showed markedly different molecular responses. In Z. marina the heat response was similar across locations in response to the heatwave at 26°C, with a complex response in functions related to protein folding, synthesis of ribosomal chloroplast proteins, proteins involved in cell wall modification and heat shock proteins (HSPs). In N. noltii the heat response markedly differed between locations, while HSP genes were not induced in either population. Our results suggest that as coastal seawater temperatures increase, Z. marina will disappear along its southernmost ranges, whereas N. noltii will continue to move north. As a consequence, sub- and intertidal habitat partitioning may weaken in more northern regions because the higher thermal tolerance of N. noltii provides a competitive advantage in both habitats. Although previous studies have focused on HSPs, the present study clearly demonstrates that a broader examination of stress related genes is necessary. Copyright

  2. Transcriptome sequencing and characterization for the sea cucumber Apostichopus japonicus (Selenka, 1867).

    Directory of Open Access Journals (Sweden)

    Huixia Du

    Full Text Available BACKGROUND: Sea cucumbers are a special group of marine invertebrates. They occupy a taxonomic position that is believed to be important for understanding the origin and evolution of deuterostomes. Some of them such as Apostichopus japonicus represent commercially important aquaculture species in Asian countries. Many efforts have been devoted to increasing the number of expressed sequence tags (ESTs) for A. japonicus, but a comprehensive characterization of its transcriptome remains lacking. Here, we performed the large-scale transcriptome profiling and characterization by pyrosequencing diverse cDNA libraries from A. japonicus. RESULTS: In total, 1,061,078 reads were obtained by 454 sequencing of eight cDNA libraries representing different developmental stages and adult tissues in A. japonicus. These reads were assembled into 29,666 isotigs, which were further clustered into 21,071 isogroups. Nearly 40% of the isogroups showed significant matches to known proteins based on sequence similarity. Gene ontology (GO) and KEGG pathway analyses recovered diverse biological functions and processes. Candidate genes that were potentially involved in aestivation were identified. Transcriptome comparison with the sea urchin Strongylocentrotus purpuratus revealed similar patterns of GO term representation. In addition, 4,882 putative orthologous genes were identified, of which 202 were not present in the non-echinoderm organisms. More than 700 simple sequence repeats (SSRs) and 54,000 single nucleotide polymorphisms (SNPs) were detected in the A. japonicus transcriptome. CONCLUSION: Pyrosequencing was proven to be efficient in rapidly identifying a large set of genes for the sea cucumber A. japonicus. Through the large-scale transcriptome sequencing as well as public EST data integration, we performed a comprehensive characterization of the A. japonicus transcriptome and identified candidate aestivation-related genes. A large number of potential genetic

  3. Genome-wide evolutionary dynamics of influenza B viruses on a global scale.

    Directory of Open Access Journals (Sweden)

    Pinky Langat

    2017-12-01

    Full Text Available The global-scale epidemiology and genome-wide evolutionary dynamics of influenza B remain poorly understood compared with influenza A viruses. We compiled a spatio-temporally comprehensive dataset of influenza B viruses, comprising over 2,500 genomes sampled worldwide between 1987 and 2015, including 382 newly-sequenced genomes that fill substantial gaps in previous molecular surveillance studies. Our contributed data increase the number of available influenza B virus genomes in Europe, Africa and Central Asia, improving the global context to study influenza B viruses. We reveal Yamagata-lineage diversity results from co-circulation of two antigenically-distinct groups that also segregate genetically across the entire genome, without evidence of intra-lineage reassortment. In contrast, Victoria-lineage diversity stems from geographic segregation of different genetic clades, with variability in the degree of geographic spread among clades. Differences between the lineages are reflected in their antigenic dynamics, as Yamagata-lineage viruses show alternating dominance between antigenic groups, while Victoria-lineage viruses show antigenic drift of a single lineage. Structural mapping of amino acid substitutions on trunk branches of influenza B gene phylogenies further supports these antigenic differences and highlights two potential mechanisms of adaptation for polymerase activity. Our study provides new insights into the epidemiological and molecular processes shaping influenza B virus evolution globally.

  4. Genome-wide evolutionary dynamics of influenza B viruses on a global scale

    Science.gov (United States)

    Langat, Pinky; Bowden, Thomas A.; Edwards, Stephanie; Gall, Astrid; Rambaut, Andrew; Daniels, Rodney S.; Russell, Colin A.; Pybus, Oliver G.; McCauley, John

    2017-01-01

    The global-scale epidemiology and genome-wide evolutionary dynamics of influenza B remain poorly understood compared with influenza A viruses. We compiled a spatio-temporally comprehensive dataset of influenza B viruses, comprising over 2,500 genomes sampled worldwide between 1987 and 2015, including 382 newly-sequenced genomes that fill substantial gaps in previous molecular surveillance studies. Our contributed data increase the number of available influenza B virus genomes in Europe, Africa and Central Asia, improving the global context to study influenza B viruses. We reveal Yamagata-lineage diversity results from co-circulation of two antigenically-distinct groups that also segregate genetically across the entire genome, without evidence of intra-lineage reassortment. In contrast, Victoria-lineage diversity stems from geographic segregation of different genetic clades, with variability in the degree of geographic spread among clades. Differences between the lineages are reflected in their antigenic dynamics, as Yamagata-lineage viruses show alternating dominance between antigenic groups, while Victoria-lineage viruses show antigenic drift of a single lineage. Structural mapping of amino acid substitutions on trunk branches of influenza B gene phylogenies further supports these antigenic differences and highlights two potential mechanisms of adaptation for polymerase activity. Our study provides new insights into the epidemiological and molecular processes shaping influenza B virus evolution globally. PMID:29284042

  5. Investigating host-pathogen behavior and their interaction using genome-scale metabolic network models.

    Science.gov (United States)

    Sadhukhan, Priyanka P; Raghunathan, Anu

    2014-01-01

    Genome Scale Metabolic Modeling methods represent one way to compute whole cell function starting from the genome sequence of an organism and contribute towards understanding and predicting the genotype-phenotype relationship. About 80 models spanning all the kingdoms of life from archaea to eukaryotes have been built to date and used to interrogate cell phenotype under varying conditions. These models have been used not only to understand the flux distribution in evolutionarily conserved pathways like glycolysis and the Krebs cycle but also in applications ranging from value added product formation in Escherichia coli to predicting inborn errors of Homo sapiens metabolism. This chapter describes a protocol that delineates the process of genome scale metabolic modeling for analysing host-pathogen behavior and interaction using flux balance analysis (FBA). The steps discussed in the process include (1) reconstruction of a metabolic network from the genome sequence, (2) its representation in a precise mathematical framework, (3) its translation to a model, and (4) the analysis using linear algebra and optimization. The methods for biological interpretations of computed cell phenotypes in the context of individual host and pathogen models and their integration are also discussed.
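
    Steps (2) to (4) of the protocol in this record can be sketched on a toy network. Everything here is a hypothetical illustration: the four reactions, their bounds and the objective are invented, and the exhaustive search over integer fluxes stands in for the linear programming solver that a real flux balance analysis would use on continuous fluxes.

```python
from itertools import product

# Hypothetical toy network: uptake -> A, A -> B, B -> biomass, A -> overflow.
REACTIONS = ["uptake", "convert", "biomass", "overflow"]

# Stoichiometric matrix S: rows are metabolites (A, B), columns follow REACTIONS.
S = [
    [1, -1, 0, -1],  # A: produced by uptake, consumed by convert and overflow
    [0, 1, -1, 0],   # B: produced by convert, consumed by biomass
]

# Assumed flux bounds (lower, upper); overflow is forced to carry at least 2 units.
BOUNDS = [(0, 10), (0, 10), (0, 10), (2, 10)]

# Objective: maximize the biomass flux.
OBJECTIVE = [0, 0, 1, 0]


def is_steady_state(fluxes):
    """Check the mass balance S . v = 0 for every metabolite."""
    return all(sum(s * v for s, v in zip(row, fluxes)) == 0 for row in S)


def toy_fba():
    """Brute-force search over integer fluxes; a stand-in for an LP solver."""
    best_value, best_fluxes = None, None
    grids = [range(lower, upper + 1) for lower, upper in BOUNDS]
    for fluxes in product(*grids):
        if not is_steady_state(fluxes):
            continue
        value = sum(c * v for c, v in zip(OBJECTIVE, fluxes))
        if best_value is None or value > best_value:
            best_value, best_fluxes = value, fluxes
    if best_fluxes is None:
        return None, None  # no flux vector satisfies the constraints
    return best_value, dict(zip(REACTIONS, best_fluxes))


optimum, flux_distribution = toy_fba()
print(optimum, flux_distribution)
```

    The grid search only works because the toy problem is tiny. A genome-scale model has thousands of reactions, which is why the optimization step is normally formulated as a linear program.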

  6. A Normalization-Free and Nonparametric Method Sharpens Large-Scale Transcriptome Analysis and Reveals Common Gene Alteration Patterns in Cancers.

    Science.gov (United States)

    Li, Qi-Gang; He, Yong-Han; Wu, Huan; Yang, Cui-Ping; Pu, Shao-Yan; Fan, Song-Qing; Jiang, Li-Ping; Shen, Qiu-Shuo; Wang, Xiao-Xiong; Chen, Xiao-Qiong; Yu, Qin; Li, Ying; Sun, Chang; Wang, Xiangting; Zhou, Jumin; Li, Hai-Peng; Chen, Yong-Bin; Kong, Qing-Peng

    2017-01-01

    Heterogeneity in transcriptional data hampers the identification of differentially expressed genes (DEGs) and understanding of cancer, essentially because current methods rely on cross-sample normalization and/or distribution assumption, both sensitive to heterogeneous values. Here, we developed a new method, Cross-Value Association Analysis (CVAA), which overcomes the limitation and is more robust to heterogeneous data than the other methods. Applying CVAA to a more complex pan-cancer dataset containing 5,540 transcriptomes discovered numerous new DEGs and many previously rarely explored pathways/processes; some of them were validated, both in vitro and in vivo, to be crucial in tumorigenesis, e.g., alcohol metabolism (ADH1B), chromosome remodeling (NCAPH) and complement system (Adipsin). Together, we present a sharper tool to navigate large-scale expression data and gain new mechanistic insights into tumorigenesis.

  7. A protocol for generating a high-quality genome-scale metabolic reconstruction.

    Science.gov (United States)

    Thiele, Ines; Palsson, Bernhard Ø

    2010-01-01

    Network reconstructions are a common denominator in systems biology. Bottom-up metabolic network reconstructions have been developed over the last 10 years. These reconstructions represent structured knowledge bases that abstract pertinent information on the biochemical transformations taking place within specific target organisms. The conversion of a reconstruction into a mathematical format facilitates a myriad of computational biological studies, including evaluation of network content, hypothesis testing and generation, analysis of phenotypic characteristics and metabolic engineering. To date, genome-scale metabolic reconstructions for more than 30 organisms have been published and this number is expected to increase rapidly. However, these reconstructions differ in quality and coverage that may minimize their predictive potential and use as knowledge bases. Here we present a comprehensive protocol describing each step necessary to build a high-quality genome-scale metabolic reconstruction, as well as the common trials and tribulations. Therefore, this protocol provides a helpful manual for all stages of the reconstruction process.

  8. Genome scale models of yeast: towards standardized evaluation and consistent omic integration

    DEFF Research Database (Denmark)

    Sanchez, Benjamin J.; Nielsen, Jens

    2015-01-01

    Genome scale models (GEMs) have enabled remarkable advances in systems biology, acting as functional databases of metabolism, and as scaffolds for the contextualization of high-throughput data. In the case of Saccharomyces cerevisiae (budding yeast), several GEMs have been published and are curre...... in which all levels of omics data (from gene expression to flux) have been integrated in yeast GEMs. Relevant conclusions and current challenges for both GEM evaluation and omic integration are highlighted....

  9. A systems approach to predict oncometabolites via context-specific genome-scale metabolic networks.

    Directory of Open Access Journals (Sweden)

    Hojung Nam

    2014-09-01

    Full Text Available Altered metabolism in cancer cells has been viewed as a passive response required for a malignant transformation. However, this view has changed through the recently described metabolic oncogenic factors: mutated isocitrate dehydrogenases (IDH), succinate dehydrogenase (SDH), and fumarate hydratase (FH) that produce oncometabolites that competitively inhibit epigenetic regulation. In this study, we demonstrate in silico predictions of oncometabolites that have the potential to dysregulate epigenetic controls in nine types of cancer by incorporating massive scale genetic mutation information (collected from more than 1,700 cancer genomes), expression profiling data, and deploying Recon 2 to reconstruct context-specific genome-scale metabolic models. Our analysis predicted 15 compounds and 24 substructures of potential oncometabolites that could result from the loss-of-function and gain-of-function mutations of metabolic enzymes, respectively. These results suggest a substantial potential for discovering unidentified oncometabolites in various forms of cancers.

  10. Genome-scale modeling of yeast: chronology, applications and critical perspectives.

    Science.gov (United States)

    Lopes, Helder; Rocha, Isabel

    2017-08-01

    Over the last 15 years, several genome-scale metabolic models (GSMMs) were developed for different yeast species, aiding both the elucidation of new biological processes and the shift toward a bio-based economy, through the design of in silico inspired cell factories. Here, an historical perspective of the GSMMs built over time for several yeast species is presented and the main inheritance patterns among the metabolic reconstructions are highlighted. We additionally provide a critical perspective on the overall genome-scale modeling procedure, underlining incomplete model validation and evaluation approaches and the quest for the integration of regulatory and kinetic information into yeast GSMMs. A summary of experimentally validated model-based metabolic engineering applications of yeast species is further emphasized, while the main challenges and future perspectives for the field are finally addressed. © FEMS 2017.

  11. Metingear: a development environment for annotating genome-scale metabolic models.

    Science.gov (United States)

    May, John W; James, A Gordon; Steinbeck, Christoph

    2013-09-01

    Genome-scale metabolic models often lack annotations that would allow them to be used for further analysis. Previous efforts have focused on associating metabolites in the model with a cross reference, but this can be problematic if the reference is not freely available, multiple resources are used or the metabolite is added from a literature review. Associating each metabolite with chemical structure provides unambiguous identification of the components and a more detailed view of the metabolism. We have developed an open-source desktop application that simplifies the process of adding database cross references and chemical structures to genome-scale metabolic models. Annotated models can be exported to the Systems Biology Markup Language open interchange format. Source code, binaries, documentation and tutorials are freely available at http://johnmay.github.com/metingear. The application is implemented in Java with bundles available for MS Windows and Macintosh OS X.

  12. Biofilm Formation Mechanisms of Pseudomonas aeruginosa Predicted via Genome-Scale Kinetic Models of Bacterial Metabolism

    Science.gov (United States)

    2016-03-15

    RESEARCH ARTICLE Biofilm Formation Mechanisms of Pseudomonas aeruginosa Predicted via Genome-Scale Kinetic Models of Bacterial Metabolism Francisco G...jaques.reifman.civ@mail.mil Abstract A hallmark of Pseudomonas aeruginosa is its ability to establish biofilm-based infections that are difficult to...eradicate. Biofilms are less susceptible to host inflammatory and immune responses and have higher antibiotic tolerance than free-living planktonic

  13. Enumeration of smallest intervention strategies in genome-scale metabolic networks.

    Directory of Open Access Journals (Sweden)

    Axel von Kamp

    2014-01-01

    Full Text Available One ultimate goal of metabolic network modeling is the rational redesign of biochemical networks to optimize the production of certain compounds by cellular systems. Although several constraint-based optimization techniques have been developed for this purpose, methods for systematic enumeration of intervention strategies in genome-scale metabolic networks are still lacking. In principle, Minimal Cut Sets (MCSs; inclusion-minimal combinations of reaction or gene deletions that lead to the fulfilment of a given intervention goal) provide an exhaustive enumeration approach. However, their disadvantage is the combinatorial explosion in larger networks and the requirement to compute first the elementary modes (EMs), which itself is impractical in genome-scale networks. We present MCSEnumerator, a new method for effective enumeration of the smallest MCSs (with fewest interventions) in genome-scale metabolic network models. For this we combine two approaches, namely (i) the mapping of MCSs to EMs in a dual network, and (ii) a modified algorithm by which shortest EMs can be effectively determined in large networks. In this way, we can identify the smallest MCSs by calculating the shortest EMs in the dual network. Realistic application examples demonstrate that our algorithm is able to list thousands of the most efficient intervention strategies in genome-scale networks for various intervention problems. For instance, for the first time we could enumerate all synthetic lethals in E. coli with combinations of up to 5 reactions. We also applied the new algorithm exemplarily to compute strain designs for growth-coupled synthesis of different products (ethanol, fumarate, serine) by E. coli. We found numerous new engineering strategies partially requiring fewer knockouts and guaranteeing higher product yields (even without the assumption of optimal growth) than reported previously.
The strength of the presented approach is that smallest intervention strategies can be
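The link between cut sets and elementary modes in the abstract above can be illustrated at toy scale. A cut set must disable every elementary mode that supports the targeted function, so the minimal cut sets are the minimal hitting sets of those modes. The sketch below is a brute-force illustration only, not the dual-network MCSEnumerator method; the reaction names and the three modes are hypothetical.

```python
from itertools import combinations

def minimal_cut_sets(modes, max_size):
    """Enumerate minimal hitting sets of `modes` with at most `max_size` reactions."""
    reactions = sorted(set().union(*modes))
    found = []
    for size in range(1, max_size + 1):
        for candidate in combinations(reactions, size):
            cut = set(candidate)
            # A superset of a cut set found earlier is not minimal.
            if any(prev <= cut for prev in found):
                continue
            # A cut set must share at least one reaction with every mode.
            if all(cut & mode for mode in modes):
                found.append(cut)
    return found

# Hypothetical elementary modes of a toy network, given as sets of reaction names.
toy_modes = [{"R1", "R2"}, {"R1", "R3"}, {"R4"}]
cut_sets = minimal_cut_sets(toy_modes, max_size=3)
print(cut_sets)
```

Because candidates are visited in order of increasing size, the smallest cut sets appear first. The exhaustive search grows combinatorially with the number of reactions, which is the limitation the abstract says the dual-network approach is meant to avoid.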

  14. In silico analysis of human metabolism: Reconstruction, contextualization and application of genome-scale models

    DEFF Research Database (Denmark)

    Geng, Jun; Nielsen, Jens

    2017-01-01

    The rising prevalence of metabolic diseases calls for a holistic approach for analysis of the underlying nature of abnormalities in cellular functions. Through mathematical representation and topological analysis of cellular metabolism, GEnome scale metabolic Models (GEMs) provide a promising fram...... that correctly describe interactions between cells or tissues, and we therefore discuss how GEMs can be integrated with blood circulation models. Finally, we end the review with proposing some possible future research directions....

  15. Optimal knockout strategies in genome-scale metabolic networks using particle swarm optimization.

    Science.gov (United States)

    Nair, Govind; Jungreuthmayer, Christian; Zanghellini, Jürgen

    2017-02-01

    Knockout strategies, particularly the concept of constrained minimal cut sets (cMCSs), are an important part of the arsenal of tools used in manipulating metabolic networks. Given a specific design, cMCSs can be calculated even in genome-scale networks. We would however like to find not only the optimal intervention strategy for a given design but the best possible design too. Our solution (PSOMCS) is to use particle swarm optimization (PSO) along with the direct calculation of cMCSs from the stoichiometric matrix to obtain optimal designs satisfying multiple objectives. To illustrate the working of PSOMCS, we apply it to a toy network. Next we show its superiority by comparing its performance against other comparable methods on a medium sized E. coli core metabolic network. PSOMCS not only finds solutions comparable to previously published results but also it is orders of magnitude faster. Finally, we use PSOMCS to predict knockouts satisfying multiple objectives in a genome-scale metabolic model of E. coli and compare it with OptKnock and RobustKnock. PSOMCS finds competitive knockout strategies and designs compared to other current methods and is in some cases significantly faster. It can be used in identifying knockouts which will force optimal desired behaviors in large and genome scale metabolic networks. It will be even more useful as larger metabolic models of industrially relevant organisms become available.
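For readers unfamiliar with particle swarm optimization, the following is a minimal sketch of the generic continuous PSO update rule. It is not PSOMCS, which couples PSO with cMCS calculation and multiple objectives. The objective function, parameter values, and search bounds are all assumptions chosen for illustration.

```python
import random

def pso(objective, dim, n_particles=20, iters=200, seed=0,
        inertia=0.7, c_personal=1.5, c_global=1.5):
    """Minimize `objective` with a basic global-best particle swarm."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal best positions
    best_val = [objective(p) for p in pos]         # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]     # global best so far
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity mixes momentum, pull to personal best, pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

# Hypothetical objective: a shifted sphere with its minimum at (1, -2).
solution, value = pso(lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2, dim=2)
print(solution, value)
```

In a knockout-design setting the particle position would encode a candidate design rather than real-valued coordinates, and the objective would score the resulting intervention strategies; that encoding is outside the scope of this sketch.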

  16. Microarray Data Processing Techniques for Genome-Scale Network Inference from Large Public Repositories.

    Science.gov (United States)

    Chockalingam, Sriram; Aluru, Maneesha; Aluru, Srinivas

    2016-09-19

    Pre-processing of microarray data is a well-studied problem. Furthermore, all popular platforms come with their own recommended best practices for differential analysis of genes. However, for genome-scale network inference using microarray data collected from large public repositories, these methods filter out a considerable number of genes. This is primarily due to the effects of aggregating a diverse array of experiments with different technical and biological scenarios. Here we introduce a pre-processing pipeline suitable for inferring genome-scale gene networks from large microarray datasets. We show that partitioning of the available microarray datasets according to biological relevance into tissue- and process-specific categories significantly extends the limits of downstream network construction. We demonstrate the effectiveness of our pre-processing pipeline by inferring genome-scale networks for the model plant Arabidopsis thaliana using two different construction methods and a collection of 11,760 Affymetrix ATH1 microarray chips. Our pre-processing pipeline and the datasets used in this paper are made available at http://alurulab.cc.gatech.edu/microarray-pp.

  17. Genome-wide transcriptome analysis of hippocampus in rats indicated that TLR/NLR signaling pathway was involved in the pathogenesis of depressive disorder induced by chronic restraint stress.

    Science.gov (United States)

    Wang, Yu; Jiang, Huili; Meng, Hong; Lu, Jun; Li, Jing; Zhang, Xuhui; Yang, Xinjing; Zhao, Bingcong; Sun, Yang; Bao, Tuya

    2017-09-01

    Data from clinical investigations and laboratory findings have provided preliminary evidence for the effectiveness and safety of acupuncture therapy in depression. However, the mechanisms underlying the antidepressant response of acupuncture are not fully elucidated. To elucidate the potential effects of acupuncture for depression on the hippocampal genome-wide transcriptome at the molecular level, we evaluated the transcriptomic profile of depression-model rats treated with acupuncture or fluoxetine. We identified a very significant effect of acupuncture intervention, with 107 genes differentially expressed in the acupuncture vs. model group comparison and 41 genes in the fluoxetine vs. model group comparison. Notably, 54 genes were differentially expressed between acupuncture and fluoxetine, indicating significantly different effects of the two treatments. Through GO (Gene Ontology) functional term and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis, we identified that the upregulated gene sets were related to inflammatory response, innate immunity and immune response. We found that toll-like receptor signalling pathway and NOD like receptor signalling pathway were associated with the function of inflammatory response, innate immunity and immune response. Importantly, acupuncture reversed the upregulation of gene sets that were related to inflammatory response, innate immunity and immune response (including toll-like receptor signalling pathway and NOD like receptor signalling pathway), which might be critical for the pathogenesis of depression and provide evidence for the antidepressive effects of acupuncture by regulating inflammatory response, innate immunity and immune response via toll-like receptor signalling pathway and NOD like receptor signalling pathway. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. Genome-Wide Analysis of Gene and microRNA Expression in Diploid and Autotetraploid Paulownia fortunei (Seem.) Hemsl. under Drought Stress by Transcriptome, microRNA, and Degradome Sequencing

    Directory of Open Access Journals (Sweden)

    Zhenli Zhao

    2018-02-01

    Full Text Available Drought is a common and recurring climatic condition in many parts of the world, and it can have disastrous impacts on plant growth and development. Many genes involved in the drought response of plants have been identified. Transcriptome, microRNA (miRNA), and degradome analyses are rapid ways of identifying drought-responsive genes. The reference genome sequence of Paulownia fortunei (Seem.) Hemsl. is now available, which makes it easier to explore gene expression, transcriptional regulation, and post-transcriptional regulation in this species. In this study, four transcriptome, small RNA, and degradome libraries were sequenced by Illumina sequencing, respectively. A total of 258 genes and 11 miRNAs were identified as drought-responsive in P. fortunei. Degradome sequencing detected 28 miRNA target genes that were cleaved by members of nine conserved miRNA families and 12 novel miRNAs. The results here will contribute toward enriching our understanding of the response of Paulownia fortunei trees to drought stress and may provide new direction for further experimental studies related to the development of molecular markers, genetic map construction, and other genomic research projects in Paulownia.

  19. Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing.

    Science.gov (United States)

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance; Messina, Thomas; Fan, Hongtao; Jaeger, Edward; Stephens, Susan

    2013-06-27

    Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from large-scale WGS projects. The data storage and CPU resources that are required for large-scale whole genome sequencing data analyses are too large for many core facilities and individual laboratories to provide. To help meet these challenges, we have developed Rainbow, a cloud-based software package that can assist in the automation of large-scale WGS data analyses. Here, we evaluated the performance of Rainbow by analyzing 44 different whole-genome-sequenced subjects. Rainbow has the capacity to process genomic data from more than 500 subjects in two weeks using cloud computing provided by the Amazon Web Service. The time includes the import and export of the data using Amazon Import/Export service. The average cost of processing a single sample in the cloud was less than 120 US dollars. Compared with Crossbow, the main improvements incorporated into Rainbow include the ability: (1) to handle BAM as well as FASTQ input files; (2) to split large sequence files for better load balance downstream; (3) to log the running metrics in data processing and monitoring multiple Amazon Elastic Compute Cloud (EC2) instances; and (4) to merge SOAPsnp outputs for multiple individuals into a single file to facilitate downstream genome-wide association studies. Rainbow is a scalable, cost-effective, and open-source tool for large-scale WGS data analysis. For human WGS data sequenced by either the Illumina HiSeq 2000 or HiSeq 2500 platforms, Rainbow can be used straight out of the box. 
Rainbow is available

  20. The Human Blood Metabolome-Transcriptome Interface

    Science.gov (United States)

    Schramm, Katharina; Adamski, Jerzy; Gieger, Christian; Herder, Christian; Carstensen, Maren; Peters, Annette; Rathmann, Wolfgang; Roden, Michael; Strauch, Konstantin; Suhre, Karsten; Kastenmüller, Gabi; Prokisch, Holger; Theis, Fabian J.

    2015-01-01

    Biological systems consist of multiple organizational levels all densely interacting with each other to ensure function and flexibility of the system. Simultaneous analysis of cross-sectional multi-omics data from large population studies is a powerful tool to comprehensively characterize the underlying molecular mechanisms on a physiological scale. In this study, we systematically analyzed the relationship between fasting serum metabolomics and whole blood transcriptomics data from 712 individuals of the German KORA F4 cohort. Correlation-based analysis identified 1,109 significant associations between 522 transcripts and 114 metabolites summarized in an integrated network, the ‘human blood metabolome-transcriptome interface’ (BMTI). Bidirectional causality analysis using Mendelian randomization did not yield any statistically significant causal associations between transcripts and metabolites. A knowledge-based interpretation and integration with a genome-scale human metabolic reconstruction revealed systematic signatures of signaling, transport and metabolic processes, i.e. metabolic reactions mainly belonging to lipid, energy and amino acid metabolism. Moreover, the construction of a network based on functional categories illustrated the cross-talk between the biological layers at a pathway level. Using a transcription factor binding site enrichment analysis, this pathway cross-talk was further confirmed at a regulatory level. Finally, we demonstrated how the constructed networks can be used to gain novel insights into molecular mechanisms associated to intermediate clinical traits. Overall, our results demonstrate the utility of a multi-omics integrative approach to understand the molecular mechanisms underlying both normal physiology and disease. PMID:26086077

  1. The Human Blood Metabolome-Transcriptome Interface.

    Directory of Open Access Journals (Sweden)

    Jörg Bartel

    2015-06-01

    Full Text Available Biological systems consist of multiple organizational levels all densely interacting with each other to ensure function and flexibility of the system. Simultaneous analysis of cross-sectional multi-omics data from large population studies is a powerful tool to comprehensively characterize the underlying molecular mechanisms on a physiological scale. In this study, we systematically analyzed the relationship between fasting serum metabolomics and whole blood transcriptomics data from 712 individuals of the German KORA F4 cohort. Correlation-based analysis identified 1,109 significant associations between 522 transcripts and 114 metabolites summarized in an integrated network, the 'human blood metabolome-transcriptome interface' (BMTI). Bidirectional causality analysis using Mendelian randomization did not yield any statistically significant causal associations between transcripts and metabolites. A knowledge-based interpretation and integration with a genome-scale human metabolic reconstruction revealed systematic signatures of signaling, transport and metabolic processes, i.e. metabolic reactions mainly belonging to lipid, energy and amino acid metabolism. Moreover, the construction of a network based on functional categories illustrated the cross-talk between the biological layers at a pathway level. Using a transcription factor binding site enrichment analysis, this pathway cross-talk was further confirmed at a regulatory level. Finally, we demonstrated how the constructed networks can be used to gain novel insights into molecular mechanisms associated to intermediate clinical traits. Overall, our results demonstrate the utility of a multi-omics integrative approach to understand the molecular mechanisms underlying both normal physiology and disease.

  2. Genome Modeling System: A Knowledge Management Platform for Genomics.

    Directory of Open Access Journals (Sweden)

    Malachi Griffith

    2015-07-01

    Full Text Available In this work, we present the Genome Modeling System (GMS), an analysis information management system capable of executing automated genome analysis pipelines at a massive scale. The GMS framework provides detailed tracking of samples and data coupled with reliable and repeatable analysis pipelines. The GMS also serves as a platform for bioinformatics development, allowing a large team to collaborate on data analysis, or an individual researcher to leverage the work of others effectively within its data management system. Rather than separating ad-hoc analysis from rigorous, reproducible pipelines, the GMS promotes systematic integration between the two. As a demonstration of the GMS, we performed an integrated analysis of whole genome, exome and transcriptome sequencing data from a breast cancer cell line (HCC1395) and matched lymphoblastoid line (HCC1395BL). These data are available for users to test the software, complete tutorials and develop novel GMS pipeline configurations. The GMS is available at https://github.com/genome/gms.

  3. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Héloïse Bastide

    2013-06-01

    Full Text Available Various approaches can be applied to uncover the genetic basis of natural phenotypic variation, each with their specific strengths and limitations. Here, we use a replicated genome-wide association approach (Pool-GWAS) to fine-scale map genomic regions contributing to natural variation in female abdominal pigmentation in Drosophila melanogaster, a trait that is highly variable in natural populations and highly heritable in the laboratory. We examined abdominal pigmentation phenotypes in approximately 8000 female European D. melanogaster, isolating 1000 individuals with extreme phenotypes. We then used whole-genome Illumina sequencing to identify single nucleotide polymorphisms (SNPs) segregating in our sample, and tested these for associations with pigmentation by contrasting allele frequencies between replicate pools of light and dark individuals. We identify two small regions near the pigmentation genes tan and bric-à-brac 1, both corresponding to known cis-regulatory regions, which contain SNPs showing significant associations with pigmentation variation. While the Pool-GWAS approach suffers some limitations, its cost advantage facilitates replication and it can be applied to any non-model system with an available reference genome.

  4. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster.

    Science.gov (United States)

    Bastide, Héloïse; Betancourt, Andrea; Nolte, Viola; Tobler, Raymond; Stöbe, Petra; Futschik, Andreas; Schlötterer, Christian

    2013-06-01

    Various approaches can be applied to uncover the genetic basis of natural phenotypic variation, each with their specific strengths and limitations. Here, we use a replicated genome-wide association approach (Pool-GWAS) to fine-scale map genomic regions contributing to natural variation in female abdominal pigmentation in Drosophila melanogaster, a trait that is highly variable in natural populations and highly heritable in the laboratory. We examined abdominal pigmentation phenotypes in approximately 8000 female European D. melanogaster, isolating 1000 individuals with extreme phenotypes. We then used whole-genome Illumina sequencing to identify single nucleotide polymorphisms (SNPs) segregating in our sample, and tested these for associations with pigmentation by contrasting allele frequencies between replicate pools of light and dark individuals. We identify two small regions near the pigmentation genes tan and bric-à-brac 1, both corresponding to known cis-regulatory regions, which contain SNPs showing significant associations with pigmentation variation. While the Pool-GWAS approach suffers some limitations, its cost advantage facilitates replication and it can be applied to any non-model system with an available reference genome.
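The idea of contrasting allele frequencies between pools can be sketched for a single SNP and a single pair of pools with a 2x2 chi-square test on allele counts. This is a simplification under stated assumptions: the abstract does not specify the test statistic, replicate pools are not combined here, and the counts below are made up for illustration.

```python
import math

def allele_chi2(light_ref, light_alt, dark_ref, dark_alt):
    """Pearson chi-square (1 d.f.) for reference/alternate allele counts in two pools."""
    n = light_ref + light_alt + dark_ref + dark_alt
    numerator = n * (light_ref * dark_alt - light_alt * dark_ref) ** 2
    # Product of the four marginal totals; zero if any margin is empty.
    denominator = ((light_ref + light_alt) * (dark_ref + dark_alt)
                   * (light_ref + dark_ref) * (light_alt + dark_alt))
    chi2 = numerator / denominator
    # Survival function of chi-square with 1 d.f., via the complementary error function.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical read counts: the reference allele is at 80% in the light pool, 50% in the dark pool.
chi2, p_value = allele_chi2(80, 20, 50, 50)
print(round(chi2, 2), p_value)
```

A genome-wide scan would repeat this for every SNP and then correct for multiple testing; sites where a whole row or column of the table is zero would need to be filtered out first, since the denominator vanishes.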

  5. Analysis of growth of Lactobacillus plantarum WCFS1 on a complex medium using a genome-scale metabolic model

    NARCIS (Netherlands)

    Teusink, B.; Wiersma, A.; Molenaar, D.; Francke, C.; Vos, de W.M.; Siezen, R.J.; Smid, E.J.

    2006-01-01

    A genome-scale metabolic model of the lactic acid bacterium Lactobacillus plantarum WCFS1 was constructed based on genomic content and experimental data. The complete model includes 721 genes, 643 reactions, and 531 metabolites. Different stoichiometric modeling techniques were used for

  6. Genome-scale reconstruction of metabolic networks of Lactobacillus casei ATCC 334 and 12A.

    Directory of Open Access Journals (Sweden)

    Elena Vinay-Lara

    Full Text Available Lactobacillus casei strains are widely used in industry and the utility of this organism in these industrial applications is strain dependent. Hence, tools capable of predicting strain specific phenotypes would have utility in the selection of strains for specific industrial processes. Genome-scale metabolic models can be utilized to better understand genotype-phenotype relationships and to compare different organisms. To assist in the selection and development of strains with enhanced industrial utility, genome-scale models for L. casei ATCC 334, a well characterized strain, and strain 12A, a corn silage isolate, were constructed. Draft models were generated from RAST genome annotations using the Model SEED database and refined by evaluating ATP generating cycles, mass-and-charge-balances of reactions, and growth phenotypes. After the validation process was finished, we compared the metabolic networks of these two strains to identify metabolic, genetic and ortholog differences that may lead to different phenotypic behaviors. We conclude that the metabolic capabilities of the two networks are highly similar. The L. casei ATCC 334 model accounts for 1,040 reactions, 959 metabolites and 548 genes, while the L. casei 12A model accounts for 1,076 reactions, 979 metabolites and 640 genes. The developed L. casei ATCC 334 and 12A metabolic models will enable better understanding of the physiology of these organisms and be valuable tools in the development and selection of strains with enhanced utility in a variety of industrial applications.
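One of the refinement steps named above, checking the mass balance of reactions, can be sketched as a sum of element counts weighted by stoichiometric coefficients. The data layout (dictionaries of coefficients and elemental formulas) is an assumption for illustration and does not reflect the Model SEED format; charge could be tracked the same way by adding a "charge" key to each formula.

```python
from collections import Counter

def element_imbalance(stoichiometry, formulas):
    """Return the net element counts of a reaction; an empty dict means balanced.

    `stoichiometry` maps metabolite -> coefficient (negative for substrates).
    `formulas` maps metabolite -> {element: atom count}.
    """
    total = Counter()
    for metabolite, coeff in stoichiometry.items():
        for element, count in formulas[metabolite].items():
            total[element] += coeff * count
    return {element: net for element, net in total.items() if net != 0}

# Neutral elemental formulas for a toy fermentation example.
formulas = {
    "glucose": {"C": 6, "H": 12, "O": 6},
    "ethanol": {"C": 2, "H": 6, "O": 1},
    "co2": {"C": 1, "O": 2},
}

# Deliberately incomplete reaction: the CO2 product is missing.
unbalanced = element_imbalance({"glucose": -1, "ethanol": 2}, formulas)
# Complete reaction: glucose -> 2 ethanol + 2 CO2.
balanced = element_imbalance({"glucose": -1, "ethanol": 2, "co2": 2}, formulas)
print(unbalanced, balanced)
```

The sign of each reported imbalance indicates which side of the reaction is short of that element, which helps pinpoint a missing product or substrate in a draft model.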

  7. Reconstruction and analysis of a genome-scale metabolic model for Scheffersomyces stipitis

    Directory of Open Access Journals (Sweden)

    Balagurunathan Balaji

    2012-02-01

    Full Text Available Abstract Background Fermentation of xylose, the major component in hemicellulose, is essential for economic conversion of lignocellulosic biomass to fuels and chemicals. The yeast Scheffersomyces stipitis (formerly known as Pichia stipitis) has the highest known native capacity for xylose fermentation and possesses several genes for lignocellulose bioconversion in its genome. Understanding the metabolism of this yeast at a global scale, by reconstructing the genome scale metabolic model, is essential for manipulating its metabolic capabilities and for successful transfer of its capabilities to other industrial microbes. Results We present a genome-scale metabolic model for Scheffersomyces stipitis, a native xylose utilizing yeast. The model was reconstructed based on genome sequence annotation, detailed experimental investigation and known yeast physiology. Macromolecular composition of Scheffersomyces stipitis biomass was estimated experimentally and its ability to grow on different carbon, nitrogen, sulphur and phosphorus sources was determined by phenotype microarrays. The compartmentalized model, developed based on an iterative procedure, accounted for 814 genes, 1371 reactions, and 971 metabolites. In silico computed growth rates were compared with high-throughput phenotyping data and the model could predict the qualitative outcomes in 74% of substrates investigated. Model simulations were used to identify the biosynthetic requirements for anaerobic growth of Scheffersomyces stipitis on glucose and the results were validated with published literature. The bottlenecks in Scheffersomyces stipitis metabolic network for xylose uptake and nucleotide cofactor recycling were identified by in silico flux variability analysis. The scope of the model in enhancing the mechanistic understanding of microbial metabolism is demonstrated by identifying a mechanism for mitochondrial respiration and oxidative phosphorylation. Conclusion The genome-scale

  8. TCW: transcriptome computational workbench.

    Science.gov (United States)

    Soderlund, Carol; Nelson, William; Willer, Mark; Gang, David R

    2013-01-01

    The analysis of transcriptome data involves many steps and various programs, along with organization of large amounts of data and results. Without a methodical approach for storage, analysis and query, the resulting ad hoc analysis can lead to human error, loss of data and results, inefficient use of time, and lack of verifiability, repeatability, and extensibility. The Transcriptome Computational Workbench (TCW) provides Java graphical interfaces for methodical analysis for both single and comparative transcriptome data without the use of a reference genome (e.g. for non-model organisms). The singleTCW interface steps the user through importing transcript sequences (e.g. Illumina) or assembling long sequences (e.g. Sanger, 454, transcripts), annotating the sequences, and performing differential expression analysis using published statistical programs in R. The data, metadata, and results are stored in a MySQL database. The multiTCW interface builds a comparison database by importing sequence and annotation from one or more single TCW databases, executes the ESTscan program to translate the sequences into proteins, and then incorporates one or more clusterings, where the clustering options are to execute the orthoMCL program, compute transitive closure, or import clusters. Both singleTCW and multiTCW allow extensive query and display of the results, where singleTCW displays the alignment of annotation hits to transcript sequences, and multiTCW displays multiple transcript alignments with MUSCLE or pairwise alignments. The query programs can be executed on the desktop for fastest analysis, or from the web for sharing the results. It is now affordable to buy a multi-processor machine, and easy to install Java and MySQL. By simply downloading the TCW, the user can interactively analyze, query and view their data. The TCW allows in-depth data mining of the results, which can lead to a better understanding of the transcriptome. 
TCW is freely available from www.agcol.arizona.edu/software/tcw.

  9. Genome-scale reconstruction of the Streptococcus pyogenes M49 metabolic network reveals growth requirements and indicates potential drug targets

    NARCIS (Netherlands)

    Levering, J.; Fiedler, T.; Sieg, A.; van Grinsven, K.W.A.; Hering, S.; Veith, N.; Olivier, B.G.; Klett, L.; Hugenholtz, J.; Teusink, B.; Kreikemeyer, B.; Kummer, U.

    2016-01-01

    Genome-scale metabolic models comprise stoichiometric relations between metabolites, as well as associations between genes and metabolic reactions and facilitate the analysis of metabolism. We computationally reconstructed the metabolic network of the lactic acid bacterium Streptococcus pyogenes

  10. Moniliophthora roreri Genome and Transcriptome

    Science.gov (United States)

    Frosty pod rot disease of cacao is one of the most destructive diseases of cacao and at this time is limited to regions in South America and Central America. Frosty pod rot is caused by a fungal pathogen Moniliophthora roreri, a basidiomycete that is closely related to another cacao pathogen that ca...

  11. TRAM (Transcriptome Mapper): database-driven creation and analysis of transcriptome maps from multiple sources

    Directory of Open Access Journals (Sweden)

    Danieli Gian

    2011-02-01

    Full Text Available Abstract Background Several tools have been developed to perform global gene expression profile data analysis, to search for specific chromosomal regions whose features meet defined criteria as well as to study neighbouring gene expression. However, most of these tools are tailored for a specific use in a particular context (e.g. they are species-specific, or limited to a particular data format) and they typically accept only gene lists as input. Results TRAM (Transcriptome Mapper) is a new general tool that allows the simple generation and analysis of quantitative transcriptome maps, starting from any source listing gene expression values for a given gene set (e.g. expression microarrays), implemented as a relational database. It includes a parser able to assign univocal and updated gene symbols to gene identifiers from different data sources. Moreover, TRAM is able to perform intra-sample and inter-sample data normalization, including an original variant of quantile normalization (scaled quantile), useful to normalize data from platforms with highly different numbers of investigated genes. When in 'Map' mode, the software generates a quantitative representation of the transcriptome of a sample (or of a pool of samples) and identifies if segments of defined lengths are over/under-expressed compared to the desired threshold. When in 'Cluster' mode, the software searches for a set of over/under-expressed consecutive genes. Statistical significance for all results is calculated with respect to genes localized on the same chromosome or to all genome genes. Transcriptome maps, showing differential expression between two sample groups, relative to two different biological conditions, may be easily generated. We present the results of a biological model test, based on a meta-analysis comparison between a sample pool of human CD34+ hematopoietic progenitor cells and a sample pool of megakaryocytic cells.
Biologically relevant chromosomal segments and gene
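The inter-sample normalization mentioned above builds on quantile normalization. The sketch below shows the standard form for samples that measure the same number of genes; it does not reproduce TRAM's "scaled quantile" variant for platforms with different gene counts, whose details are not given in the abstract. The expression values are made up.

```python
def quantile_normalize(samples):
    """Standard quantile normalization for equal-length lists of expression values."""
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: the mean across samples at each rank.
    rank_means = [sum(s[r] for s in sorted_samples) / len(samples) for r in range(n)]
    normalized = []
    for s in samples:
        # Gene indices ordered from lowest to highest value (ties broken by index).
        order = sorted(range(n), key=s.__getitem__)
        out = [0.0] * n
        for rank, gene_index in enumerate(order):
            out[gene_index] = rank_means[rank]
        normalized.append(out)
    return normalized

# Hypothetical expression values for three genes in two samples.
normalized = quantile_normalize([[5, 2, 3], [4, 1, 6]])
print(normalized)
```

After normalization every sample has the same set of values and only the ranking of genes within each sample differs, which makes values comparable across platforms with different intensity scales.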

  12. In Silico Genome-Scale Reconstruction and Validation of the Corynebacterium glutamicum Metabolic Network

    DEFF Research Database (Denmark)

    Kjeldsen, Kjeld Raunkjær; Nielsen, J.

    2009-01-01

    A genome-scale metabolic model of the Gram-positive bacterium Corynebacterium glutamicum ATCC 13032 was constructed comprising 446 reactions and 411 metabolites, based on the annotated genome and available biochemical information. The network was analyzed using constraint based methods. The model...... was extensively validated against published flux data, and flux distribution values were found to correlate well between simulations and experiments. The split pathway of the lysine synthesis pathway of C. glutamicum was investigated, and it was found that the direct dehydrogenase variant gave a higher lysine...... yield than the alternative succinyl pathway at high lysine production rates. The NADPH demand of the network was not found to be critical for lysine production until lysine yields exceeded 55% (mmol lysine (mmol glucose)(-1)). The model was validated during growth on the organic acids acetate...

  13. Genome-scale analysis of positional clustering of mouse testis-specific genes

    Directory of Open Access Journals (Sweden)

    Lee Bernett TK

    2005-01-01

    Full Text Available Abstract Background Genes are not randomly distributed on a chromosome, as was once thought, even after removal of tandem repeats. The positional clustering of co-expressed genes is known in prokaryotes and has recently been reported in several eukaryotic organisms such as Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens. In order to further investigate the mode of tissue-specific gene clustering in higher eukaryotes, we have performed a genome-scale analysis of positional clustering of the mouse testis-specific genes. Results Our computational analysis shows that a large proportion of testis-specific genes are clustered in groups of 2 to 5 genes in the mouse genome. The number of clusters is much higher than expected by chance even after removal of tandem repeats. Conclusion Our result suggests that testis-specific genes tend to cluster on the mouse chromosomes. This provides another piece of evidence for the hypothesis that clusters of tissue-specific genes do exist.
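The comparison of an observed cluster count with chance expectation can be sketched as a permutation test along a chromosome. This is a minimal illustration, not the authors' procedure: the gene order, the flags marking tissue-specific genes, and the choice of shuffling flags within one chromosome are all assumptions.

```python
import random

def count_clusters(flags, min_size=2):
    """Count runs of at least `min_size` consecutive flagged genes."""
    clusters, run = 0, 0
    for flag in list(flags) + [False]:   # sentinel closes a run at the end
        if flag:
            run += 1
        else:
            if run >= min_size:
                clusters += 1
            run = 0
    return clusters

def permutation_p_value(flags, n_perm=1000, seed=0):
    """Fraction of shuffled gene orders with at least as many clusters as observed."""
    rng = random.Random(seed)
    observed = count_clusters(flags)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_clusters(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)

# Hypothetical chromosome: True marks a testis-specific gene, in chromosomal order.
flags = [True, True, False, False, True, True, True, False, True, False]
observed = count_clusters(flags)
p_value = permutation_p_value(flags)
print(observed, p_value)
```

Removing tandem duplicates before counting, as the abstract describes, would amount to collapsing adjacent paralogs into a single entry in `flags` prior to running the test.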

  14. Targeted and genome-scale methylomics reveals gene body signatures in human cell lines

    Science.gov (United States)

    Ball, Madeleine Price; Li, Jin Billy; Gao, Yuan; Lee, Je-Hyuk; LeProust, Emily; Park, In-Hyun; Xie, Bin; Daley, George Q.; Church, George M.

    2012-01-01

    Cytosine methylation, an epigenetic modification of DNA, is a target of growing interest for developing high throughput profiling technologies. Here we introduce two new, complementary techniques for cytosine methylation profiling utilizing next generation sequencing technology: bisulfite padlock probes (BSPPs) and methyl sensitive cut counting (MSCC). In the first method, we designed a set of ~10,000 BSPPs distributed over the ENCODE pilot project regions to take advantage of existing expression and chromatin immunoprecipitation data. We observed a pattern of low promoter methylation coupled with high gene body methylation in highly expressed genes. Using the second method, MSCC, we gathered genome-scale data for 1.4 million HpaII sites and confirmed that gene body methylation in highly expressed genes is a consistent phenomenon over the entire genome. Our observations highlight the usefulness of techniques which are not inherently or intentionally biased in favor of only profiling particular subsets like CpG islands or promoter regions. PMID:19329998

  15. Transcriptome sequencing of lentil based on second-generation technology permits large-scale unigene assembly and SSR marker discovery

    Directory of Open Access Journals (Sweden)

    Materne Michael

    2011-05-01

    Full Text Available Background: Lentil (Lens culinaris Medik.) is a cool-season grain legume which provides a rich source of protein for human consumption. In terms of genomic resources, lentil is relatively underdeveloped in comparison to other Fabaceae species, with limited available data. There is hence a significant need to enhance such resources in order to identify novel genes and alleles for molecular breeding to increase crop productivity and quality. Results: Tissue-specific cDNA samples from six distinct lentil genotypes were sequenced using Roche 454 GS-FLX Titanium technology, generating c. 1.38 × 10^6 expressed sequence tags (ESTs). De novo assembly generated a total of 15,354 contigs and 68,715 singletons. The complete unigene set was sequence-analysed against genome drafts of the model legume species Medicago truncatula and Arabidopsis thaliana to identify 12,639 and 7,476 unique matches, respectively. When compared to the genome of Glycine max, a total of 20,419 unique hits were observed, corresponding to c. 31% of the known gene space. A total of 25,592 lentil unigenes were subsequently annotated from GenBank. Simple sequence repeat (SSR)-containing ESTs were identified from consensus sequences and a total of 2,393 primer pairs were designed. A subset of 192 EST-SSR markers was screened for validation across a panel of 12 cultivated lentil genotypes and one wild relative species. A total of 166 primer pairs obtained successful amplification, of which 47.5% detected genetic polymorphism. Conclusions: A substantial collection of ESTs has been developed from sequence analysis of lentil genotypes using second-generation technology, permitting unigene definition across a broad range of functional categories. As well as providing resources for functional genomics studies, the unigene set has permitted significant enhancement of the number of publicly-available molecular genetic markers as tools for improvement of this species.
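
    Identifying SSR-containing sequences amounts to finding short motifs repeated in tandem. The abstract does not name the software or thresholds used, so the following is only a minimal sketch with assumed parameters (motifs of 2 to 6 bp, at least 4 tandem copies) and a hypothetical function name.

```python
import re


def find_ssrs(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, copies) for each tandem repeat found in seq.

    The lazy quantifier makes the regex try the shortest motif first, so
    'ACACACAC' is reported as motif 'AC' rather than 'ACAC'.
    Caveat: a homopolymer run such as 'AAAAAAAA' is reported with motif 'AA'.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


if __name__ == "__main__":
    # Hypothetical EST fragment containing an (AC)5 repeat.
    print(find_ssrs("TTGACACACACACAGGT"))
```

    Primer design around each hit, which the study performed, is a separate step and is not covered here.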

  16. The transcriptome of Utricularia vulgaris, a rootless plant with minimalist genome, reveals extreme alternative splicing and only moderate sequence similarity with Utricularia gibba

    Czech Academy of Sciences Publication Activity Database

    Bárta, J.; Stone, James D.; Pech, J.; Sirová, D.; Adamec, Lubomír; Campbell, M. A.; Štorchová, H.

    2015-01-01

    Vol. 15, MAR 7 (2015), pp. 1-14, no. 78 ISSN 1471-2229 R&D Projects: GA ČR(CZ) GAP504/11/0783 Institutional support: RVO:67985939 Keywords: transcriptome * root-associated genes * alternative splicing Subject RIV: EF - Botanics Impact factor: 3.631, year: 2015

  17. The transcriptome of Utricularia vulgaris, a rootless plant with minimalist genome, reveals extreme alternative splicing and only moderate sequence similarity with Utricularia gibba

    Czech Academy of Sciences Publication Activity Database

    Bárta, J.; Stone, James D.; Pech, J.; Sirová, D.; Adamec, L.; Campbell, M. A.; Štorchová, Helena

    2015-01-01

    Vol. 15, MAR 7 2015 (2015) ISSN 1471-2229 R&D Projects: GA ČR(CZ) GAP504/11/0783 Institutional support: RVO:61389030 Keywords: Transcriptome * Root-associated genes * Alternative splicing Subject RIV: EF - Botanics Impact factor: 3.631, year: 2015

  18. Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications.

    Science.gov (United States)

    Christen, Matthias; Del Medico, Luca; Christen, Heinz; Christen, Beat

    2017-01-01

    Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner.
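
    The idea of multi-level partitioning (a genome design split into segments, each segment split into subblocks) can be sketched as a small recursive routine. This is not the Genome Partitioner algorithm: it uses fixed-size, non-overlapping blocks and ignores synthesis constraints and assembly overlaps. The function names and the exact lengths (783,000 bp, 20,000 bp, 1,000 bp) are assumptions derived from the rounded figures in the abstract.

```python
def partition(length, block_size):
    """Split [0, length) into consecutive half-open intervals of at most block_size."""
    return [
        (start, min(start + block_size, length))
        for start in range(0, length, block_size)
    ]


def multilevel_partition(length, level_sizes):
    """Build a tree of intervals, one tree level per entry in level_sizes.

    level_sizes=[20_000, 1_000] means 20 kb segments, each cut into 1 kb subblocks.
    """
    def split(start, end, sizes):
        node = {"start": start, "end": end, "children": []}
        if sizes:
            node["children"] = [
                split(start + s, start + e, sizes[1:])
                for s, e in partition(end - start, sizes[0])
            ]
        return node

    return split(0, length, list(level_sizes))


if __name__ == "__main__":
    tree = multilevel_partition(783_000, [20_000, 1_000])
    print(len(tree["children"]), "segments;",
          len(tree["children"][0]["children"]), "subblocks in the first segment")
```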

  19. Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications.

    Directory of Open Access Journals (Sweden)

    Matthias Christen

    Full Text Available Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner.

  20. Annotated Draft Genome Assemblies for the Northern Bobwhite (Colinus virginianus) and the Scaled Quail (Callipepla squamata) Reveal Disparate Estimates of Modern Genome Diversity and Historic Effective Population Size

    Directory of Open Access Journals (Sweden)

    David L. Oldeschulte

    2017-09-01

    Full Text Available Northern bobwhite (Colinus virginianus; hereafter bobwhite) and scaled quail (Callipepla squamata) populations have suffered precipitous declines across most of their US ranges. Illumina-based first- (v1.0) and second- (v2.0) generation draft genome assemblies for the scaled quail and the bobwhite produced N50 scaffold sizes of 1.035 and 2.042 Mb, thereby producing a 45-fold improvement in contiguity over the existing bobwhite assembly, and ≥90% of the assembled genomes were captured within 1313 and 8990 scaffolds, respectively. The scaled quail assembly (v1.0 = 1.045 Gb) was ∼20% smaller than the bobwhite (v2.0 = 1.254 Gb), which was supported by kmer-based estimates of genome size. Nevertheless, estimates of GC content (41.72%; 42.66%), genome-wide repetitive content (10.40%; 10.43%), and MAKER-predicted protein coding genes (17,131; 17,165) were similar for the scaled quail (v1.0) and bobwhite (v2.0) assemblies, respectively. BUSCO analyses utilizing 3023 single-copy orthologs revealed a high level of assembly completeness for the scaled quail (v1.0; 84.8%) and the bobwhite (v2.0; 82.5%), as verified by comparison with well-established avian genomes. We also detected 273 putative segmental duplications in the scaled quail genome (v1.0), and 711 in the bobwhite genome (v2.0), including some that were shared among both species. Autosomal variant prediction revealed ∼2.48 and 4.17 heterozygous variants per kilobase within the scaled quail (v1.0) and bobwhite (v2.0) genomes, respectively, and estimates of historic effective population size were uniformly higher for the bobwhite across all time points in a coalescent model. However, large-scale declines were predicted for both species beginning ∼15–20 KYA.

  1. Annotated Draft Genome Assemblies for the Northern Bobwhite (Colinus virginianus) and the Scaled Quail (Callipepla squamata) Reveal Disparate Estimates of Modern Genome Diversity and Historic Effective Population Size.

    Science.gov (United States)

    Oldeschulte, David L; Halley, Yvette A; Wilson, Miranda L; Bhattarai, Eric K; Brashear, Wesley; Hill, Joshua; Metz, Richard P; Johnson, Charles D; Rollins, Dale; Peterson, Markus J; Bickhart, Derek M; Decker, Jared E; Sewell, John F; Seabury, Christopher M

    2017-09-07

    Northern bobwhite (Colinus virginianus; hereafter bobwhite) and scaled quail (Callipepla squamata) populations have suffered precipitous declines across most of their US ranges. Illumina-based first- (v1.0) and second- (v2.0) generation draft genome assemblies for the scaled quail and the bobwhite produced N50 scaffold sizes of 1.035 and 2.042 Mb, thereby producing a 45-fold improvement in contiguity over the existing bobwhite assembly, and ≥90% of the assembled genomes were captured within 1313 and 8990 scaffolds, respectively. The scaled quail assembly (v1.0 = 1.045 Gb) was ∼20% smaller than the bobwhite (v2.0 = 1.254 Gb), which was supported by kmer-based estimates of genome size. Nevertheless, estimates of GC content (41.72%; 42.66%), genome-wide repetitive content (10.40%; 10.43%), and MAKER-predicted protein coding genes (17,131; 17,165) were similar for the scaled quail (v1.0) and bobwhite (v2.0) assemblies, respectively. BUSCO analyses utilizing 3023 single-copy orthologs revealed a high level of assembly completeness for the scaled quail (v1.0; 84.8%) and the bobwhite (v2.0; 82.5%), as verified by comparison with well-established avian genomes. We also detected 273 putative segmental duplications in the scaled quail genome (v1.0), and 711 in the bobwhite genome (v2.0), including some that were shared among both species. Autosomal variant prediction revealed ∼2.48 and 4.17 heterozygous variants per kilobase within the scaled quail (v1.0) and bobwhite (v2.0) genomes, respectively, and estimates of historic effective population size were uniformly higher for the bobwhite across all time points in a coalescent model. However, large-scale declines were predicted for both species beginning ∼15-20 KYA. Copyright © 2017 Oldeschulte et al.
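
    The two contiguity statistics quoted in this abstract, the N50 scaffold size and the number of scaffolds capturing ≥90% of the assembly, follow from a sorted list of scaffold lengths. The sketch below uses the standard definitions; the function names and the toy lengths are illustrative and are not taken from the assemblies.

```python
def n50(lengths):
    """Length L such that scaffolds of length >= L contain at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


def scaffolds_to_cover(lengths, fraction=0.9):
    """Smallest number of scaffolds (longest first) covering `fraction` of all bases."""
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return count
    raise ValueError("lengths must be non-empty")


if __name__ == "__main__":
    toy = [2, 2, 2, 3, 3, 4, 8, 8]  # hypothetical scaffold lengths
    print(n50(toy), scaffolds_to_cover(toy, 0.9))
```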

  2. A multi-objective constraint-based approach for modeling genome-scale microbial ecosystems.

    Science.gov (United States)

    Budinich, Marko; Bourdon, Jérémie; Larhlimi, Abdelhalim; Eveillard, Damien

    2017-01-01

    Interplay within microbial communities impacts ecosystems on several scales, and elucidation of the consequent effects is a difficult task in ecology. In particular, the integration of genome-scale data within quantitative models of microbial ecosystems remains elusive. This study advocates the use of constraint-based modeling to build predictive models from recent high-resolution -omics datasets. Following recent studies that have demonstrated the accuracy of constraint-based models (CBMs) for simulating single-strain metabolic networks, we sought to study microbial ecosystems as a combination of single-strain metabolic networks that exchange nutrients. This study presents two multi-objective extensions of CBMs for modeling communities: multi-objective flux balance analysis (MO-FBA) and multi-objective flux variability analysis (MO-FVA). Both methods were applied to a hot spring mat model ecosystem. As a result, multiple trade-offs between nutrients and growth rates, as well as thermodynamically favorable relative abundances at community level, were emphasized. We expect this approach to be used for integrating genomic information in microbial ecosystems. Following models will provide insights about behaviors (including diversity) that take place at the ecosystem scale.

  3. A multi-objective constraint-based approach for modeling genome-scale microbial ecosystems.

    Directory of Open Access Journals (Sweden)

    Marko Budinich

    Full Text Available Interplay within microbial communities impacts ecosystems on several scales, and elucidation of the consequent effects is a difficult task in ecology. In particular, the integration of genome-scale data within quantitative models of microbial ecosystems remains elusive. This study advocates the use of constraint-based modeling to build predictive models from recent high-resolution -omics datasets. Following recent studies that have demonstrated the accuracy of constraint-based models (CBMs) for simulating single-strain metabolic networks, we sought to study microbial ecosystems as a combination of single-strain metabolic networks that exchange nutrients. This study presents two multi-objective extensions of CBMs for modeling communities: multi-objective flux balance analysis (MO-FBA) and multi-objective flux variability analysis (MO-FVA). Both methods were applied to a hot spring mat model ecosystem. As a result, multiple trade-offs between nutrients and growth rates, as well as thermodynamically favorable relative abundances at community level, were emphasized. We expect this approach to be used for integrating genomic information in microbial ecosystems. Following models will provide insights about behaviors (including diversity) that take place at the ecosystem scale.

  4. Genome-scale model guided design of Propionibacterium for enhanced propionic acid production

    Directory of Open Access Journals (Sweden)

    Laura Navone

    2018-06-01

    Full Text Available Production of propionic acid by fermentation of propionibacteria has gained increasing attention in the past few years. However, biomanufacturing of propionic acid cannot compete with the current oxo-petrochemical synthesis process due to its well-established infrastructure, low oil prices and the high downstream purification costs of microbial production. Strain improvement to increase propionic acid yield is the best alternative to reduce downstream purification costs. The recent generation of genome-scale models for a number of Propionibacterium species facilitates the rational design of metabolic engineering strategies and provides a new opportunity to explore the metabolic potential of the Wood-Werkman cycle. Previous strategies for strain improvement have individually targeted acid tolerance, rate of propionate production or minimisation of by-products. Here we used the P. freudenreichii subsp. shermanii and the pan-Propionibacterium genome-scale metabolic models (GEMs) to simultaneously target these combined issues. This was achieved by focussing on strategies which yield higher energies and directly suppress acetate formation. Using P. freudenreichii subsp. shermanii, two strategies were assessed. The first tested the ability to manipulate the redox balance to favour propionate production by over-expressing the first two enzymes of the pentose-phosphate pathway (PPP), Zwf (glucose-6-phosphate 1-dehydrogenase) and Pgl (6-phosphogluconolactonase). Results showed a 4-fold increase in propionate to acetate ratio during the exponential growth phase. Secondly, the ability to enhance the energy yield from propionate production by over-expressing an ATP-dependent phosphoenolpyruvate carboxykinase (PEPCK) and sodium-pumping methylmalonyl-CoA decarboxylase (MMD) was tested, which extended the exponential growth phase. Together, these strategies demonstrate that in silico design strategies are predictive and can be used to reduce by-product formation in

  5. Genome sequencing and transcriptome analysis of Trichoderma reesei QM9978 strain reveals a distal chromosome translocation to be responsible for loss of vib1 expression and loss of cellulase induction.

    Science.gov (United States)

    Ivanova, Christa; Ramoni, Jonas; Aouam, Thiziri; Frischmann, Alexa; Seiboth, Bernhard; Baker, Scott E; Le Crom, Stéphane; Lemoine, Sophie; Margeot, Antoine; Bidard, Frédérique

    2017-01-01

    The hydrolysis of biomass to simple sugars used for the production of biofuels in biorefineries requires the action of cellulolytic enzyme mixtures. During the last 50 years, the ascomycete Trichoderma reesei, the main source of industrial cellulase and hemicellulase cocktails, has been subjected to several rounds of classical mutagenesis with the aim to obtain higher production levels. During these random genetic events, strains unable to produce cellulases were generated. Here, whole genome sequencing and transcriptomic analyses of the cellulase-negative strain QM9978 were used for the identification of mutations underlying this cellulase-negative phenotype. Sequence comparison of the cellulase-negative strain QM9978 to the reference strain QM6a identified a total of 43 mutations, of which 33 were located either close to or in coding regions. From those, we identified 23 single-nucleotide variants, nine InDels, and one translocation. The translocation occurred between chromosomes V and VII, is located upstream of the putative transcription factor vib1, and abolishes its expression in QM9978 as detected during the transcriptomic analyses. Ectopic expression of vib1 under the control of its native promoter as well as overexpression of vib1 under the control of a strong constitutive promoter restored cellulase expression in QM9978, thus confirming that the translocation event is the reason for the cellulase-negative phenotype. Gene deletion of vib1 in the moderate producer strain QM9414 and in the high producer strain Rut-C30 reduced cellulase expression in both cases. Overexpression of vib1 in QM9414 and Rut-C30 had no effect on cellulase production, most likely because vib1 is already expressed at an optimal level under normal conditions. We were able to establish a link between a chromosomal translocation in QM9978 and the cellulase-negative phenotype of the strain. We identified the transcription factor vib1 as a key regulator of cellulases in T. reesei whose

  6. Integration of Genome Scale Metabolic Networks and Gene Regulation of Metabolic Enzymes With Physiologically Based Pharmacokinetics.

    Science.gov (United States)

    Maldonado, Elaina M; Leoncikas, Vytautas; Fisher, Ciarán P; Moore, J Bernadette; Plant, Nick J; Kierzek, Andrzej M

    2017-11-01

    The scope of physiologically based pharmacokinetic (PBPK) modeling can be expanded by assimilation of the mechanistic models of intracellular processes from systems biology field. The genome scale metabolic networks (GSMNs) represent a whole set of metabolic enzymes expressed in human tissues. Dynamic models of the gene regulation of key drug metabolism enzymes are available. Here, we introduce GSMNs and review ongoing work on integration of PBPK, GSMNs, and metabolic gene regulation. We demonstrate example models. © 2017 The Authors CPT: Pharmacometrics & Systems Pharmacology published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.

  7. Genome-scale modeling of the protein secretory machinery in yeast

    DEFF Research Database (Denmark)

    Feizi, Amir; Österlund, Tobias; Petranovic, Dina

    2013-01-01

    The protein secretory machinery in Eukarya is involved in post-translational modifications (PTMs) and sorting of the secretory and many transmembrane proteins. While the secretory machinery has been well-studied using classic reductionist approaches, a holistic view of its complex nature is lacking. Here, we present the first genome-scale model for the yeast secretory machinery which captures the knowledge generated through more than 50 years of research. The model is based on the concept of a Protein Specific Information Matrix (PSIM: characterized by seven PTM features). An algorithm...

  8. Impact of Transcriptomics on Our Understanding of Pulmonary Fibrosis

    Science.gov (United States)

    Vukmirovic, Milica; Kaminski, Naftali

    2018-01-01

    Idiopathic pulmonary fibrosis (IPF) is a lethal fibrotic lung disease characterized by aberrant remodeling of the lung parenchyma with extensive changes to the phenotypes of all lung resident cells. The introduction of transcriptomics, genome scale profiling of thousands of RNA transcripts, caused a significant inversion in IPF research. Instead of generating hypotheses based on animal models of disease, or biological plausibility, with limited validation in humans, investigators were able to generate hypotheses based on unbiased molecular analysis of human samples and then use animal models of disease to test their hypotheses. In this review, we describe the insights made from transcriptomic analysis of human IPF samples. We describe how transcriptomic studies led to identification of novel genes and pathways involved in the human IPF lung such as: matrix metalloproteinases, WNT pathway, epithelial genes, role of microRNAs among others, as well as conceptual insights such as the involvement of developmental pathways and deep shifts in epithelial and fibroblast phenotypes. The impact of lung and transcriptomic studies on disease classification, endotype discovery, and reproducible biomarkers is also described in detail. Despite these impressive achievements, the impact of transcriptomic studies has been limited because they analyzed bulk tissue and did not address the cellular and spatial heterogeneity of the IPF lung. We discuss new emerging technologies and applications, such as single-cell RNAseq and microenvironment analysis that may address cellular and spatial heterogeneity. We end by making the point that most current tissue collections and resources are not amenable to analysis using the novel technologies. To take advantage of the new opportunities, we need new efforts of sample collections, this time focused on access to all the microenvironments and cells in the IPF lung. PMID:29670881

  9. Segregation distortion causes large-scale differences between male and female genomes in hybrid ants.

    Science.gov (United States)

    Kulmuni, Jonna; Seifert, Bernhard; Pamilo, Pekka

    2010-04-20

    Hybridization in isolated populations can lead either to hybrid breakdown and extinction or in some cases to speciation. The basis of hybrid breakdown lies in genetic incompatibilities between diverged genomes. In social Hymenoptera, the consequences of hybridization can differ from those in other animals because of haplodiploidy and sociality. Selection pressures differ between sexes because males are haploid and females are diploid. Furthermore, sociality and group living may allow survival of hybrid genotypes. We show that hybridization in Formica ants has resulted in a stable situation in which the males form two highly divergent gene pools whereas all the females are hybrids. This causes an exceptional situation with large-scale differences between male and female genomes. The genotype differences indicate strong transmission ratio distortion depending on offspring sex, whereby the mother transmits some alleles exclusively to her daughters and other alleles exclusively to her sons. The genetic differences between the sexes and the apparent lack of multilocus hybrid genotypes in males can be explained by recessive incompatibilities which cause the elimination of hybrid males because of their haploid genome. Alternatively, differentiation between sexes could be created by prezygotic segregation into male-forming and female-forming gametes in diploid females. Differentiation between sexes is stable and maintained throughout generations. The present study shows a unique outcome of hybridization and demonstrates that hybridization has the potential of generating evolutionary novelties in animals.

  10. Chromosome-scale comparative sequence analysis unravels molecular mechanisms of genome evolution between two wheat cultivars

    KAUST Repository

    Thind, Anupriya Kaur

    2018-02-08

    Background: Recent improvements in DNA sequencing and genome scaffolding have paved the way to generate high-quality de novo assemblies of pseudomolecules representing complete chromosomes of wheat and its wild relatives. These assemblies form the basis to compare the evolutionary dynamics of wheat genomes on a megabase-scale. Results: Here, we provide a comparative sequence analysis of the 700-megabase chromosome 2D between two bread wheat genotypes, the old landrace Chinese Spring and the elite Swiss spring wheat line CH Campala Lr22a. There was a high degree of sequence conservation between the two chromosomes. Analysis of large structural variations revealed four large insertions/deletions (InDels) of >100 kb. Based on the molecular signatures at the breakpoints, unequal crossing over and double-strand break repair were identified as the evolutionary mechanisms that caused these InDels. Three of the large InDels affected copy number of NLRs, a gene family involved in plant immunity. Analysis of single nucleotide polymorphism (SNP) density revealed three haploblocks of 8 Mb, 9 Mb and 48 Mb with a 35-fold increased SNP density compared to the rest of the chromosome. Conclusions: This comparative analysis of two high-quality chromosome assemblies enabled a comprehensive assessment of large structural variations. The insight obtained from this analysis will form the basis of future wheat pan-genome studies.

  11. Genome-scale metabolic network validation of Shewanella oneidensis using transposon insertion frequency analysis.

    Directory of Open Access Journals (Sweden)

    Hong Yang

    2014-09-01

    Full Text Available Transposon mutagenesis, in combination with parallel sequencing, is becoming a powerful tool for en-masse mutant analysis. A probability generating function was used to explain observed miniHimar transposon insertion patterns, and gene essentiality calls were made by transposon insertion frequency analysis (TIFA). TIFA incorporated the observed genome and sequence motif bias of the miniHimar transposon. The gene essentiality calls were compared to: 1) previous genome-wide direct gene-essentiality assignments; and 2) flux balance analysis (FBA) predictions from an existing genome-scale metabolic model of Shewanella oneidensis MR-1. A three-way comparison between FBA, TIFA, and the direct essentiality calls was made to validate the TIFA approach. The refinement in the interpretation of observed transposon insertions demonstrated that genes without insertions are not necessarily essential, and that genes that contain insertions are not always nonessential. The TIFA calls were in reasonable agreement with direct essentiality calls for S. oneidensis, but agreed more closely with E. coli essentiality calls for orthologs. The TIFA gene essentiality calls were in good agreement with the MR-1 FBA essentiality predictions, and the agreement between TIFA and FBA predictions was substantially better than between the FBA and the direct gene essentiality predictions.
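
    The point that genes without insertions are not necessarily essential can be illustrated with a much simpler null model than the one in the paper. The sketch below assumes insertions land uniformly at random across the genome, with no sequence motif bias (which TIFA does account for), and computes the chance that a nonessential gene receives no insertions purely by luck. The function name and all numbers in the example are hypothetical.

```python
import math


def prob_no_insertion(gene_len, genome_len, n_insertions):
    """P(zero insertions in a gene) if each insertion lands uniformly at random.

    Each independent insertion misses the gene with probability
    1 - gene_len / genome_len, so all n miss with that value to the power n.
    """
    if not 0 < gene_len <= genome_len:
        raise ValueError("gene_len must be in (0, genome_len]")
    return (1 - gene_len / genome_len) ** n_insertions


if __name__ == "__main__":
    # Hypothetical values: a 1 kb gene, a 5 Mb genome, 10,000 mapped insertions.
    p = prob_no_insertion(1_000, 5_000_000, 10_000)
    # For small gene_len / genome_len this is close to exp(-n * gene_len / genome_len).
    print(p, math.exp(-10_000 * 1_000 / 5_000_000))
```

    Under these assumed numbers a sizeable fraction of short nonessential genes would carry no insertion, which is why an insertion-free gene alone is weak evidence of essentiality.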

  12. Sexagesimal scale for mapping human genome

    Directory of Open Access Journals (Sweden)

    RICARDO CRUZ-COKE

    2001-03-01

    Full Text Available In a previous work I designed a diagram of the human genome based on a circular ideogram of the haploid set of chromosomes, using a low-resolution scale of megabase units. The purpose of this work is to draft a new scale to measure the physical map of the human genome at the highest resolution level. The entire length of the haploid genome of males is laid out on a circumference marked with a sexagesimal scale of 360 degrees and 1296000 arc seconds. The radius of this circumference displays a semilogarithmic metric scale from 1 m to the nanometer level. The base-pair level of DNA sequences, 10^-9 of this circumference, is measured in milliarcsecond (mas) units, each equivalent to a thousandth of an arcsecond. The "mas" unit corresponds to 1.27 nanometers (nm) or 0.427 base pairs (bp) and is the framework for measuring DNA sequences. Thus the three billion base pairs of the human genome may be identified by 1296000000 "mas" units in continuous correlation from number 1 to number 1296000000. This sexagesimal scale covers all the levels of the nuclear genetic material, from nucleotides to chromosomes. The locations of every codon and every gene may be numbered in the physical map of chromosome regions according to this new scale, instead of the partial kilobase and megabase scales used today. The advantage of the new scale is the unification of the set of chromosomes under a continuous scale of measurement at the DNA level, facilitating the correlation with the phenotypes of man and other species.
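
    The arithmetic behind the scale is that 360 degrees × 3600 arc seconds per degree gives 1,296,000 arc seconds, and 1000 mas per arc second gives 1,296,000,000 mas for the full circle. The sketch below converts between a mas offset and degrees, arc minutes, arc seconds and mas. The function names are illustrative, and treating a coordinate as a zero-based offset from the origin is an assumption made here (the abstract numbers units from 1).

```python
MAS_PER_ARCSEC = 1000
ARCSEC_PER_ARCMIN = 60
ARCMIN_PER_DEGREE = 60
MAS_PER_ARCMIN = MAS_PER_ARCSEC * ARCSEC_PER_ARCMIN
MAS_PER_DEGREE = MAS_PER_ARCMIN * ARCMIN_PER_DEGREE
TOTAL_MAS = 360 * MAS_PER_DEGREE  # 1,296,000,000 mas in the full circle


def mas_to_dms(mas):
    """Convert a zero-based mas offset to (degrees, arcminutes, arcseconds, mas)."""
    if not 0 <= mas < TOTAL_MAS:
        raise ValueError("offset lies outside the circle")
    degrees, rest = divmod(mas, MAS_PER_DEGREE)
    minutes, rest = divmod(rest, MAS_PER_ARCMIN)
    seconds, milli = divmod(rest, MAS_PER_ARCSEC)
    return degrees, minutes, seconds, milli


def dms_to_mas(degrees, minutes, seconds, milli):
    """Inverse of mas_to_dms."""
    return (degrees * MAS_PER_DEGREE + minutes * MAS_PER_ARCMIN
            + seconds * MAS_PER_ARCSEC + milli)


if __name__ == "__main__":
    print(TOTAL_MAS)
    print(mas_to_dms(648_000_000))  # halfway around the circle
```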

  13. pico-PLAZA, a genome database of microbial photosynthetic eukaryotes.

    Science.gov (United States)

    Vandepoele, Klaas; Van Bel, Michiel; Richard, Guilhem; Van Landeghem, Sofie; Verhelst, Bram; Moreau, Hervé; Van de Peer, Yves; Grimsley, Nigel; Piganeau, Gwenael

    2013-08-01

    With the advent of next generation genome sequencing, the number of sequenced algal genomes and transcriptomes is rapidly growing. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species for the analysis of user-defined sequences or gene lists remains a major challenge. pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics that combines different data types with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analysis and study gene functions. Apart from homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology, InterPro and text-mining functional annotations, different interactive viewers are available to study genome organization using gene collinearity and synteny information. Different search functions, documentation pages, export functions and an extensive glossary are available to guide non-expert scientists. To illustrate the versatility of the platform, different case studies are presented demonstrating how pico-PLAZA can be used to functionally characterize large-scale EST/RNA-Seq data sets and to perform environmental genomics. Functional enrichments analysis of 16 Phaeodactylum tricornutum transcriptome libraries offers a molecular view on diatom adaptation to different environments of ecological relevance. Furthermore, we show how complementary genomic data sources can easily be combined to identify marker genes to study the diversity and distribution of algal species, for example in metagenomes, or to quantify intraspecific diversity from environmental strains. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.

  14. Comparison of HapMap and 1000 Genomes Reference Panels in a Large-Scale Genome-Wide Association Study

    DEFF Research Database (Denmark)

    de Vries, Paul S; Sabater-Lleal, Maria; Chasman, Daniel I

    2017-01-01

    An increasing number of genome-wide association (GWA) studies are now using the higher resolution 1000 Genomes Project reference panel (1000G) for imputation, with the expectation that 1000G imputation will lead to the discovery of additional associated loci when compared to HapMap imputation. In...

  15. Techniques for Large-Scale Bacterial Genome Manipulation and Characterization of the Mutants with Respect to In Silico Metabolic Reconstructions.

    Science.gov (United States)

    diCenzo, George C; Finan, Turlough M

    2018-01-01

    The rate at which all genes within a bacterial genome can be identified far exceeds the ability to characterize these genes. To assist in associating genes with cellular functions, a large-scale bacterial genome deletion approach can be employed to rapidly screen tens to thousands of genes for desired phenotypes. Here, we provide a detailed protocol for the generation of deletions of large segments of bacterial genomes that relies on the activity of a site-specific recombinase. In this procedure, two recombinase recognition target sequences are introduced into known positions of a bacterial genome through single cross-over plasmid integration. Subsequent expression of the site-specific recombinase mediates recombination between the two target sequences, resulting in the excision of the intervening region and its loss from the genome. We further illustrate how this deletion system can be readily adapted to function as a large-scale in vivo cloning procedure, in which the region excised from the genome is captured as a replicative plasmid. We next provide a procedure for the metabolic analysis of bacterial large-scale genome deletion mutants using the Biolog Phenotype MicroArray™ system. Finally, a pipeline is described, and a sample Matlab script is provided, for the integration of the obtained data with a draft metabolic reconstruction for the refinement of the reactions and gene-protein-reaction relationships in a metabolic reconstruction.

  16. The Human Transcriptome: An Unfinished Story

    Directory of Open Access Journals (Sweden)

    Mihaela Pertea

    2012-06-01

    Full Text Available Despite recent technological advances, the study of the human transcriptome is still in its early stages. Here we provide an overview of the complex human transcriptomic landscape, present the bioinformatics challenges posed by the vast quantities of transcriptomic data, and discuss some of the studies that have tried to determine how much of the human genome is transcribed. Recent evidence has suggested that more than 90% of the human genome is transcribed into RNA. However, this view has been strongly contested by groups of scientists who argued that many of the observed transcripts are simply the result of transcriptional noise. In this review, we conclude that the full extent of transcription remains an open question that will not be fully addressed until we decipher the complete range and biological diversity of the transcribed genomic sequences.

  17. De novo transcriptome sequencing and sequence analysis of the malaria vector Anopheles sinensis (Diptera: Culicidae)

    Science.gov (United States)

    2014-01-01

    Background: Anopheles sinensis is the major malaria vector in China and Southeast Asia. Vector control is one of the most effective measures to prevent malaria transmission. However, there is little transcriptome information available for the malaria vector. To better understand the biological basis of malaria transmission and to develop novel and effective means of vector control, there is a need to build a transcriptome dataset for functional genomics analysis by large-scale RNA sequencing (RNA-seq). Methods: To provide a more comprehensive and complete transcriptome of An. sinensis, RNA from eggs, larvae, pupae, male adults and female adults was pooled for cDNA preparation, sequenced using the Illumina paired-end sequencing technology and assembled into unigenes. These unigenes were then analyzed for genome mapping, functional annotation, homology, codon usage bias and simple sequence repeats (SSRs). Results: Approximately 51.6 million clean reads were obtained, trimmed, and assembled into 38,504 unigenes with an average length of 571 bp, an N50 of 711 bp, and an average GC content of 51.26%. Among them, 98.4% of unigenes could be mapped onto the reference genome, and 69% of unigenes could be annotated with known biological functions. Homology analysis identified a number of An. sinensis unigenes that showed homology to, or were putative 1:1 orthologues of, genes in other Dipteran species. Codon usage bias was analyzed and 1,904 SSRs were detected, which will provide effective molecular markers for the population genetics of this species. Conclusions: Our data and analysis provide the most comprehensive transcriptomic resource and characteristics currently available for An. sinensis, and will facilitate genetic and genomic studies and further vector control of An. sinensis. PMID:25000941
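
    The N50 statistic quoted above can be computed from a list of unigene lengths. The following is a minimal sketch using the common definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases); the function name `n50` and the toy lengths are illustrative assumptions, not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig/unigene lengths.

    Assumed definition: sort lengths from longest to shortest and return the
    length at which the running total first reaches half of the total bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy lengths in bp (hypothetical, for illustration only).
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

    Because N50 weights long sequences more heavily than the arithmetic mean does, it is usually reported alongside the average length, as in the abstract.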

  18. Chromosome-wise Protein Interaction Patterns and Their Impact on Functional Implications of Large-Scale Genomic Aberrations

    DEFF Research Database (Denmark)

    Kirk, Isa Kristina; Weinhold, Nils; Belling, Kirstine González-Izarzugaza

    2017-01-01

    Gene copy-number changes influence phenotypes through gene-dosage alteration and subsequent changes of protein complex stoichiometry. Human trisomies where gene copy numbers are increased uniformly over entire chromosomes provide generic cases for studying these relationships. In most trisomies......, gene and protein level alterations have fatal consequences. We used genome-wide protein-protein interaction data to identify chromosome-specific patterns of protein interactions. We found that some chromosomes encode proteins that interact infrequently with each other, chromosome 21 in particular. We...... combined the protein interaction data with transcriptome data from human brain tissue to investigate how this pattern of global interactions may affect cellular function. We identified highly connected proteins that also had coordinated gene expression. These proteins were associated with important...

  19. SWAP-Assembler 2: Optimization of De Novo Genome Assembler at Large Scale

    Energy Technology Data Exchange (ETDEWEB)

    Meng, Jintao; Seo, Sangmin; Balaji, Pavan; Wei, Yanjie; Wang, Bingqiang; Feng, Shengzhong

    2016-08-16

    In this paper, we analyze and optimize the most time-consuming steps of the SWAP-Assembler, a parallel genome assembler, so that it can scale to a large number of cores for huge genomes with the size of sequencing data ranging from terabytes to petabytes. According to the performance analysis results, the most time-consuming steps are input parallelization, k-mer graph construction, and graph simplification (edge merging). For the input parallelization, the input data is divided into virtual fragments with nearly equal size, and the start position and end position of each fragment are automatically separated at the beginning of the reads. In k-mer graph construction, in order to improve the communication efficiency, the message size is kept constant between any two processes by proportionally increasing the number of nucleotides to the number of processes in the input parallelization step for each round. The memory usage is also decreased because only a small part of the input data is processed in each round. With graph simplification, the communication protocol reduces the number of communication loops from four to two loops and decreases the idle communication time. The optimized assembler is denoted as SWAP-Assembler 2 (SWAP2). In our experiments using a 1000 Genomes project dataset of 4 terabytes (the largest dataset ever used for assembling) on the supercomputer Mira, the results show that SWAP2 scales to 131,072 cores with an efficiency of 40%. We also compared our work with both the HipMer assembler and the SWAP-Assembler. On the Yanhuang dataset of 300 gigabytes, SWAP2 shows a 3X speedup and 4X better scalability compared with the HipMer assembler and is 45 times faster than the SWAP-Assembler. The SWAP2 software is available at https://sourceforge.net/projects/swapassembler.
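
    The input parallelization idea described above (near-equal virtual fragments whose boundaries are moved to the beginning of a read) can be sketched in a single process as follows. This is not the SWAP2 implementation, which is a parallel program; the function name `fragment_bounds`, the in-memory byte string, and the assumption of FASTA-style records starting with `>` at the beginning of a line are all illustrative.

```python
def fragment_bounds(data: bytes, n_fragments: int, marker: bytes = b">"):
    """Split `data` into n_fragments byte ranges of roughly equal size.

    Each tentative cut point (i * size / n) is pushed forward to the next
    record start, assumed here to be `marker` at the beginning of a line,
    so that no record is split between two fragments.
    """
    if n_fragments < 1:
        raise ValueError("n_fragments must be at least 1")
    size = len(data)
    starts = [0]
    for i in range(1, n_fragments):
        guess = i * size // n_fragments
        # Look for "\n>" starting one byte early so that a guess landing
        # exactly on a record start is kept as-is.
        pos = data.find(b"\n" + marker, max(guess - 1, 0))
        start = size if pos == -1 else pos + 1
        # Keep boundaries non-decreasing; a fragment may end up empty.
        starts.append(max(start, starts[-1]))
    ends = starts[1:] + [size]
    return list(zip(starts, ends))


if __name__ == "__main__":
    reads = b">r1\nACGT\n>r2\nGGCC\n>r3\nTTAA\n>r4\nCCGG\n"
    for start, end in fragment_bounds(reads, 3):
        print(start, end, reads[start:end])
```

    In a distributed setting each process would compute only its own pair of boundaries and read that byte range directly, which is what makes the fragments "virtual".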

  20. Comprehensive Mapping of Pluripotent Stem Cell Metabolism Using Dynamic Genome-Scale Network Modeling

    Directory of Open Access Journals (Sweden)

    Sriram Chandrasekaran

    2017-12-01

    Full Text Available Summary: Metabolism is an emerging stem cell hallmark tied to cell fate, pluripotency, and self-renewal, yet systems-level understanding of stem cell metabolism has been limited by the lack of genome-scale network models. Here, we develop a systems approach to integrate time-course metabolomics data with a computational model of metabolism to analyze the metabolic state of naive and primed murine pluripotent stem cells. Using this approach, we find that one-carbon metabolism involving phosphoglycerate dehydrogenase, folate synthesis, and nucleotide synthesis is a key pathway that differs between the two states, resulting in differential sensitivity to anti-folates. The model also predicts that the pluripotency factor Lin28 regulates this one-carbon metabolic pathway, which we validate using metabolomics data from Lin28-deficient cells. Moreover, we identify and validate metabolic reactions related to S-adenosyl-methionine production that can differentially impact histone methylation in naive and primed cells. Our network-based approach provides a framework for characterizing metabolic changes influencing pluripotency and cell fate. Chandrasekaran et al. use computational modeling, metabolomics, and metabolic inhibitors to discover metabolic differences between various pluripotent stem cell states and infer their impact on stem cell fate decisions. Keywords: systems biology, stem cell biology, metabolism, genome-scale modeling, pluripotency, histone methylation, naive (ground state), primed state, cell fate, metabolic network

  1. Quantitative Assessment of Thermodynamic Constraints on the Solution Space of Genome-Scale Metabolic Models

    Science.gov (United States)

    Hamilton, Joshua J.; Dwivedi, Vivek; Reed, Jennifer L.

    2013-01-01

    Constraint-based methods provide powerful computational techniques to allow understanding and prediction of cellular behavior. These methods rely on physiochemical constraints to eliminate infeasible behaviors from the space of available behaviors. One such constraint is thermodynamic feasibility, the requirement that intracellular flux distributions obey the laws of thermodynamics. The past decade has seen several constraint-based methods that interpret this constraint in different ways, including those that are limited to small networks, rely on predefined reaction directions, and/or neglect the relationship between reaction free energies and metabolite concentrations. In this work, we utilize one such approach, thermodynamics-based metabolic flux analysis (TMFA), to make genome-scale, quantitative predictions about metabolite concentrations and reaction free energies in the absence of prior knowledge of reaction directions, while accounting for uncertainties in thermodynamic estimates. We applied TMFA to a genome-scale network reconstruction of Escherichia coli and examined the effect of thermodynamic constraints on the flux space. We also assessed the predictive performance of TMFA against gene essentiality and quantitative metabolomics data, under both aerobic and anaerobic, and optimal and suboptimal growth conditions. Based on these results, we propose that TMFA is a useful tool for validating phenotypes and generating hypotheses, and that additional types of data and constraints can improve predictions of metabolite concentrations. PMID:23870272
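
    The thermodynamic feasibility constraint mentioned above is commonly written as follows. This is a generic sketch in assumed notation, not the exact TMFA formulation of the paper: $S_{ij}$ is the stoichiometric coefficient of metabolite $i$ in reaction $j$, $v_j$ the flux, $x_i$ the metabolite concentration (activity), $R$ the gas constant and $T$ the temperature.

```latex
\begin{aligned}
& \sum_j S_{ij}\, v_j = 0
  && \text{(steady state for each metabolite } i\text{)} \\
& \Delta_r G'_j = \Delta_r G'^{\circ}_j + RT \sum_i S_{ij} \ln x_i
  && \text{(free energy of reaction } j\text{)} \\
& v_j > 0 \;\Rightarrow\; \Delta_r G'_j < 0
  && \text{(flux only in the thermodynamically favorable direction)}
\end{aligned}
```

    Coupling the sign of each flux to the sign of its reaction free energy is what links the flux space to metabolite concentrations, so that bounds on one restrict the other.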

  2. Network Thermodynamic Curation of Human and Yeast Genome-Scale Metabolic Models

    Science.gov (United States)

    Martínez, Verónica S.; Quek, Lake-Ee; Nielsen, Lars K.

    2014-01-01

    Genome-scale models are used for an ever-widening range of applications. Although there has been much focus on specifying the stoichiometric matrix, the predictive power of genome-scale models equally depends on reaction directions. Two-thirds of reactions in the two eukaryotic reconstructions Homo sapiens Recon 1 and Yeast 5 are specified as irreversible. However, these specifications are mainly based on biochemical textbooks or on their similarity to other organisms and are rarely underpinned by detailed thermodynamic analysis. In this study, a to our knowledge new workflow combining network-embedded thermodynamic and flux variability analysis was used to evaluate existing irreversibility constraints in Recon 1 and Yeast 5 and to identify new ones. A total of 27 and 16 new irreversible reactions were identified in Recon 1 and Yeast 5, respectively, whereas only four reactions were found with directions incorrectly specified against thermodynamics (three in Yeast 5 and one in Recon 1). The workflow further identified for both models several isolated internal loops that require further curation. The framework also highlighted the need for substrate channeling (in human) and ATP hydrolysis (in yeast) for the essential reaction catalyzed by phosphoribosylaminoimidazole carboxylase in purine metabolism. Finally, the framework highlighted differences in proline metabolism between yeast (cytosolic anabolism and mitochondrial catabolism) and humans (exclusively mitochondrial metabolism). We conclude that network-embedded thermodynamics facilitates the specification and validation of irreversibility constraints in compartmentalized metabolic models, at the same time providing further insight into network properties. PMID:25028891

  3. Data partitioning enables the use of standard SOAP Web Services in genome-scale workflows.

    Science.gov (United States)

    Sztromwasser, Pawel; Puntervoll, Pål; Petersen, Kjell

    2011-07-26

    Biological databases and computational biology tools are provided by research groups around the world, and made accessible on the Web. Combining these resources is a common practice in bioinformatics, but integration of heterogeneous and often distributed tools and datasets can be challenging. To date, this challenge has been commonly addressed in a pragmatic way, by tedious and error-prone scripting. Recently, however, a more reliable technique has been identified and proposed as the platform that would tie together bioinformatics resources, namely Web Services. In the last decade, Web Services have spread widely in bioinformatics and earned the title of recommended technology. However, in the era of high-throughput experimentation, a major concern regarding Web Services is their ability to handle large-scale data traffic. We propose a stream-like communication pattern for standard SOAP Web Services that enables efficient flow of large data traffic between a workflow orchestrator and Web Services. We evaluated the data-partitioning strategy by comparing it with typical communication patterns on an example pipeline for genomic sequence annotation. The results show that data-partitioning lowers resource demands of services and increases their throughput, which in turn makes it possible to run in-silico experiments at genome scale using standard SOAP Web Services and workflows. As a proof-of-principle we annotated an RNA-seq dataset using a plain BPEL workflow engine.
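
    The data-partitioning pattern described above can be illustrated with a short sketch: instead of sending a whole dataset in one request, the orchestrator sends fixed-size chunks and collects the partial results. The names `partition` and `run_partitioned` are hypothetical, and the `service` argument is a plain Python callable standing in for a web service call; no SOAP or BPEL machinery from the paper is reproduced here.

```python
from itertools import islice


def partition(records, chunk_size):
    """Yield successive lists of at most `chunk_size` records."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def run_partitioned(records, service, chunk_size):
    """Call `service` once per chunk and concatenate the partial results.

    `service` is a stand-in for one remote request: it takes a list of
    records and returns a list of results.
    """
    results = []
    for chunk in partition(records, chunk_size):
        results.extend(service(chunk))
    return results


if __name__ == "__main__":
    # Toy "annotation" service (hypothetical): report sequence lengths.
    sequences = ["ACGT", "GG", "TTTAAA", "C", "GATTACA"]
    print(run_partitioned(sequences, lambda chunk: [len(s) for s in chunk], 2))
```

    Because only one chunk is in flight at a time, peak memory on both sides depends on the chunk size rather than on the size of the full dataset.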

  4. Genome-scale metabolic model of Pichia pastoris with native and humanized glycosylation of recombinant proteins.

    Science.gov (United States)

    Irani, Zahra Azimzadeh; Kerkhoven, Eduard J; Shojaosadati, Seyed Abbas; Nielsen, Jens

    2016-05-01

    Pichia pastoris is used for commercial production of human therapeutic proteins, and genome-scale models of P. pastoris metabolism have been generated in the past to study the metabolism and associated protein production by this yeast. A major challenge with clinical usage of recombinant proteins produced by P. pastoris is the difference in N-glycosylation of proteins produced by humans and this yeast. However, through metabolic engineering, a P. pastoris strain capable of producing humanized N-glycosylated proteins was constructed. The current genome-scale models of P. pastoris address neither native nor humanized N-glycosylation, and we therefore developed ihGlycopastoris, an extension to the iLC915 model with both native and humanized N-glycosylation for recombinant protein production, as well as an estimation of N-glycosylation of P. pastoris native proteins. This new model gives a better prediction of protein yield, demonstrates the effect of the different types of N-glycosylation on protein yield, and can be used to predict potential targets for strain improvement. The model represents a step towards a more complete description of protein production in P. pastoris, which is required for using these models to understand and optimize protein production processes. © 2015 Wiley Periodicals, Inc.

  5. Data partitioning enables the use of standard SOAP Web Services in genome-scale workflows

    Directory of Open Access Journals (Sweden)

    Sztromwasser Paweł

    2011-06-01

    Full Text Available Biological databases and computational biology tools are provided by research groups around the world, and made accessible on the Web. Combining these resources is a common practice in bioinformatics, but integration of heterogeneous and often distributed tools and datasets can be challenging. To date, this challenge has been commonly addressed in a pragmatic way, by tedious and error-prone scripting. Recently, however, a more reliable technique has been identified and proposed as the platform that would tie together bioinformatics resources, namely Web Services. In the last decade, Web Services have spread widely in bioinformatics and earned the title of recommended technology. However, in the era of high-throughput experimentation, a major concern regarding Web Services is their ability to handle large-scale data traffic. We propose a stream-like communication pattern for standard SOAP Web Services that enables efficient flow of large data traffic between a workflow orchestrator and Web Services. We evaluated the data-partitioning strategy by comparing it with typical communication patterns on an example pipeline for genomic sequence annotation. The results show that data-partitioning lowers resource demands of services and increases their throughput, which in turn makes it possible to run in-silico experiments at genome scale using standard SOAP Web Services and workflows. As a proof-of-principle we annotated an RNA-seq dataset using a plain BPEL workflow engine.

  6. A mixed-integer linear programming approach to the reduction of genome-scale metabolic networks.

    Science.gov (United States)

    Röhl, Annika; Bockmayr, Alexander

    2017-01-03

    Constraint-based analysis has become a widely used method to study metabolic networks. While some of the associated algorithms can be applied to genome-scale network reconstructions with several thousands of reactions, others are limited to small or medium-sized models. In 2015, Erdrich et al. introduced a method called NetworkReducer, which reduces large metabolic networks to smaller subnetworks, while preserving a set of biological requirements that can be specified by the user. As early as 2001, Burgard et al. developed a mixed-integer linear programming (MILP) approach for computing minimal reaction sets under a given growth requirement. Here we present an MILP approach for computing minimum subnetworks with the given properties. The minimality (with respect to the number of active reactions) is not guaranteed by NetworkReducer, while the method by Burgard et al. does not allow specifying the different biological requirements. Our procedure is about 5-10 times faster than NetworkReducer and can enumerate all minimum subnetworks when several exist. This makes it possible to identify common reactions that are present in all subnetworks, and reactions appearing in alternative pathways. Applying complex analysis methods to genome-scale metabolic networks is often not possible in practice. Thus it may become necessary to reduce the size of the network while keeping important functionalities. We propose an MILP solution to this problem. Compared to previous work, our approach is more efficient and allows computing not only one, but all minimum subnetworks satisfying the required properties.
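
    A minimum-subnetwork problem of the kind described above can be written as an MILP along the following lines. This is a generic textbook-style sketch, not the formulation of the paper: $S$ is the stoichiometric matrix, $v_j$ the flux through reaction $j$ with bounds $l_j \le 0 \le u_j$, $a_j$ a binary indicator that reaction $j$ is kept, and the single growth requirement $v_{\mathrm{bio}} \ge v_{\mathrm{bio}}^{\min}$ is an assumed example of a biological requirement.

```latex
\begin{aligned}
\min_{v,\,a} \quad & \sum_j a_j \\
\text{s.t.} \quad & S v = 0, \\
& l_j\, a_j \le v_j \le u_j\, a_j \quad \forall j, \\
& v_{\mathrm{bio}} \ge v_{\mathrm{bio}}^{\min}, \\
& a_j \in \{0, 1\} \quad \forall j.
\end{aligned}
```

    Setting $a_j = 0$ forces $v_j = 0$, so the objective counts active reactions; alternative optima can in principle be enumerated by adding a cut that excludes each solution already found and re-solving.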

  7. Genome-scale metabolic modeling of Mucor circinelloides and comparative analysis with other oleaginous species.

    Science.gov (United States)

    Vongsangnak, Wanwipa; Klanchui, Amornpan; Tawornsamretkit, Iyarest; Tatiyaborwornchai, Witthawin; Laoteng, Kobkul; Meechai, Asawin

    2016-06-01

    We present a novel genome-scale metabolic model iWV1213 of Mucor circinelloides, which is an oleaginous fungus for industrial applications. The model contains 1213 genes, 1413 metabolites and 1326 metabolic reactions across different compartments. We demonstrate that iWV1213 is able to accurately predict the growth rates of M. circinelloides on various nutrient sources and culture conditions using Flux Balance Analysis and Phenotypic Phase Plane analysis. Comparative analysis of three oleaginous genome-scale models, including M. circinelloides (iWV1213), Mortierella alpina (iCY1106) and Yarrowia lipolytica (iYL619_PCP) revealed that iWV1213 possesses a higher number of genes involved in carbohydrate, amino acid, and lipid metabolisms that might contribute to its versatility in nutrient utilization. Moreover, the identification of unique and common active reactions among the Zygomycetes oleaginous models using Flux Variability Analysis unveiled a set of gene/enzyme candidates as metabolic engineering targets for cellular improvement. Thus, iWV1213 offers a powerful metabolic engineering tool for multi-level omics analysis, enabling strain optimization as a cell factory platform of lipid-based production. Copyright © 2016 Elsevier B.V. All rights reserved.
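
    Flux Balance Analysis, used above to predict growth rates, is a linear program over the steady-state flux space. In the standard form below (assumed notation), $S$ is the stoichiometric matrix, $v$ the flux vector, $l$ and $u$ the flux bounds, and $c$ selects the objective, typically the biomass reaction.

```latex
\max_{v} \; c^{\top} v
\quad \text{s.t.} \quad
S v = 0, \qquad l \le v \le u
```

    Flux Variability Analysis reuses the same constraints: with the objective held at (or near) its optimum, each $v_j$ is minimized and maximized in turn to find the range it can take.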

  8. Predicting growth of the healthy infant using a genome scale metabolic model.

    Science.gov (United States)

    Nilsson, Avlant; Mardinoglu, Adil; Nielsen, Jens

    2017-01-01

    An estimated 165 million children globally have stunted growth, and extensive growth data are available. Genome scale metabolic models allow the simulation of molecular flux over each metabolic enzyme, and are well adapted to analyze biological systems. We used a human genome scale metabolic model to simulate the mechanisms of growth and integrate data about breast-milk intake and composition with the infant's biomass and energy expenditure of major organs. The model predicted daily metabolic fluxes from birth to age 6 months, and accurately reproduced standard growth curves and changes in body composition. The model corroborates the finding that essential amino and fatty acids do not limit growth, but that energy is the main growth limiting factor. Disruptions to the supply and demand of energy markedly affected the predicted growth, indicating that elevated energy expenditure may be detrimental. The model was used to simulate the metabolic effect of mineral deficiencies, and showed the greatest growth reduction for deficiencies in copper, iron, and magnesium ions which affect energy production through oxidative phosphorylation. The model and simulation method were integrated to a platform and shared with the research community. The growth model constitutes another step towards the complete representation of human metabolism, and may further help improve the understanding of the mechanisms underlying stunting.

  9. Quantitative assessment of thermodynamic constraints on the solution space of genome-scale metabolic models.

    Science.gov (United States)

    Hamilton, Joshua J; Dwivedi, Vivek; Reed, Jennifer L

    2013-07-16

    Constraint-based methods provide powerful computational techniques to allow understanding and prediction of cellular behavior. These methods rely on physiochemical constraints to eliminate infeasible behaviors from the space of available behaviors. One such constraint is thermodynamic feasibility, the requirement that intracellular flux distributions obey the laws of thermodynamics. The past decade has seen several constraint-based methods that interpret this constraint in different ways, including those that are limited to small networks, rely on predefined reaction directions, and/or neglect the relationship between reaction free energies and metabolite concentrations. In this work, we utilize one such approach, thermodynamics-based metabolic flux analysis (TMFA), to make genome-scale, quantitative predictions about metabolite concentrations and reaction free energies in the absence of prior knowledge of reaction directions, while accounting for uncertainties in thermodynamic estimates. We applied TMFA to a genome-scale network reconstruction of Escherichia coli and examined the effect of thermodynamic constraints on the flux space. We also assessed the predictive performance of TMFA against gene essentiality and quantitative metabolomics data, under both aerobic and anaerobic, and optimal and suboptimal growth conditions. Based on these results, we propose that TMFA is a useful tool for validating phenotypes and generating hypotheses, and that additional types of data and constraints can improve predictions of metabolite concentrations. Copyright © 2013 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  10. De Novo Assembly and Characterization of Sophora japonica Transcriptome Using RNA-seq

    Directory of Open Access Journals (Sweden)

    Liucun Zhu

    2014-01-01

    Full Text Available Sophora japonica Linn (Chinese Scholar Tree) is a shrub species belonging to the subfamily Faboideae of the pea family Fabaceae. In this study, RNA sequencing of S. japonica transcriptome was performed to produce large expression datasets for functional genomic analysis. Approximately 86.1 million high-quality clean reads were generated and assembled de novo into 143010 unique transcripts and 57614 unigenes. The average length of unigenes was 901 bps with an N50 of 545 bps. Four public databases, including the NCBI nonredundant protein (NR), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG), and the Cluster of Orthologous Groups (COG), were used to annotate unigenes through NCBI BLAST procedure. A total of 27541 of 57614 unigenes (47.8%) were annotated for gene descriptions, conserved protein domains, or gene ontology. Moreover, an interaction network of unigenes in S. japonica was predicted based on known protein-protein interactions of putative orthologs of well-studied plant genomes. The transcriptome data of S. japonica reported here represent the first genome-scale investigation of gene expression in Faboideae plants. We expect that our study will provide a useful resource for further studies on gene expression, genomics, functional genomics, and protein-protein interaction in S. japonica.

  11. Genome-Wide Constitutively Expressed Gene Analysis and New Reference Gene Selection Based on Transcriptome Data: A Case Study from Poplar/Canker Disease Interaction

    Directory of Open Access Journals (Sweden)

    Jiaping Zhao

    2017-10-01

    Full Text Available A number of transcriptome datasets for differential expression (DE) genes have been widely used for understanding organismal biology, but these datasets also contain untapped information that can be used to develop more precise analytical tools. With the use of transcriptome data generated from poplar/canker disease interaction system, we describe a methodology to identify candidate reference genes from high-throughput sequencing data. This methodology will improve the accuracy of RT-qPCR and will lead to better standards for the normalization of expression data. Expression stability analysis from xylem and phloem of Populus bejingensis inoculated with the fungal canker pathogen Botryosphaeria dothidea revealed that 729 poplar transcripts (1.11%) were stably expressed, at a threshold level of coefficient of variation (CV) of FPKM < 20% and maximum fold change (MFC) of FPKM < 2.0. Expression stability and bioinformatics analysis suggested that commonly used house-keeping (HK) genes were not the most appropriate internal controls: 70 of the 72 commonly used HK genes were not stably expressed, 45 of the 72 produced multiple isoform transcripts, and some of their reported primers produced unspecific amplicons in PCR amplification. RT-qPCR analysis to compare and evaluate the expression stability of 10 commonly used poplar HK genes and 20 of the 729 newly-identified stably expressed transcripts showed that some of the newly-identified genes (such as SSU_S8e, LSU_L5e, and 20S_PSU) had higher stability ranking than most of the commonly used HK genes. Based on these results, we recommend a pipeline for deriving reference genes from transcriptome data. An appropriate candidate gene should have a unique transcript, constitutive expression, CV value of expression < 20% (or possibly 30%) and MFC value of expression < 2, and an expression level of 50–1,000 units. Lastly, when four of the newly identified HK genes were used in the normalization of expression data for 20
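
    The two stability thresholds named in this record (CV of FPKM below 20% and maximum fold change of FPKM below 2.0) can be applied to a transcript's expression values as in the sketch below. The function name `is_stably_expressed`, the use of the sample standard deviation for CV, and the rule of rejecting transcripts with any non-positive FPKM value are assumptions made for illustration; the study's own pipeline may differ.

```python
from statistics import mean, stdev


def is_stably_expressed(fpkm, max_cv=0.20, max_mfc=2.0):
    """Check one transcript's FPKM values against CV and MFC thresholds.

    CV  = sample standard deviation / mean (an assumed convention).
    MFC = largest FPKM / smallest FPKM across samples.
    Transcripts with fewer than two samples or any FPKM <= 0 are rejected,
    since neither statistic is meaningful for them.
    """
    if len(fpkm) < 2 or min(fpkm) <= 0:
        return False
    cv = stdev(fpkm) / mean(fpkm)
    mfc = max(fpkm) / min(fpkm)
    return cv < max_cv and mfc < max_mfc


if __name__ == "__main__":
    # Hypothetical FPKM values across four samples.
    print(is_stably_expressed([100, 110, 95, 105]))
    print(is_stably_expressed([10, 50, 100, 20]))
```

    The additional criteria listed in the abstract (a unique transcript and an expression level of 50–1,000 units) would be applied as separate filters on top of this check.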

  12. The testes transcriptome derived from the New World Screwworm, Cochliomyia hominivorax TSA

    Science.gov (United States)

    In a collaboration with National Center for Genome Resources researchers, we sequenced and assembled the testes transcriptome derived from the Pacora, Panama, production plant strain of the New World Screwworm, Cochliomyia hominivorax. This transcriptome contains 4,149 unigenes and the Transcriptome...

  13. PGen: large-scale genomic variations analysis workflow and browser in SoyKB.

    Science.gov (United States)

    Liu, Yang; Khan, Saad M; Wang, Juexin; Rynge, Mats; Zhang, Yuanxun; Zeng, Shuai; Chen, Shiyuan; Maldonado Dos Santos, Joao V; Valliyodan, Babu; Calyam, Prasad P; Merchant, Nirav; Nguyen, Henry T; Xu, Dong; Joshi, Trupti

    2016-10-06

    With the advances in next-generation sequencing (NGS) technology and significant reductions in sequencing costs, it is now possible to sequence large collections of germplasm in crops for detecting genome-scale genetic variations and to apply the knowledge towards improvements in traits. To efficiently facilitate large-scale NGS resequencing data analysis of genomic variations, we have developed "PGen", an integrated and optimized workflow using the Extreme Science and Engineering Discovery Environment (XSEDE) high-performance computing (HPC) virtual system, iPlant cloud data storage resources and Pegasus workflow management system (Pegasus-WMS). The workflow allows users to identify single nucleotide polymorphisms (SNPs) and insertion-deletions (indels), perform SNP annotations and conduct copy number variation analyses on multiple resequencing datasets in a user-friendly and seamless way. We have developed both a Linux version in GitHub ( https://github.com/pegasus-isi/PGen-GenomicVariations-Workflow ) and a web-based implementation of the PGen workflow integrated within the Soybean Knowledge Base (SoyKB), ( http://soykb.org/Pegasus/index.php ). Using PGen, we identified 10,218,140 single-nucleotide polymorphisms (SNPs) and 1,398,982 indels from analysis of 106 soybean lines sequenced at 15X coverage. 297,245 non-synonymous SNPs and 3330 copy number variation (CNV) regions were identified from this analysis. SNPs identified using PGen from additional soybean resequencing projects adding to 500+ soybean germplasm lines in total have been integrated. These SNPs are being utilized for trait improvement using genotype to phenotype prediction approaches developed in-house. In order to browse and access NGS data easily, we have also developed an NGS resequencing data browser ( http://soykb.org/NGS_Resequence/NGS_index.php ) within SoyKB to provide easy access to SNP and downstream analysis results for soybean researchers. PGen workflow has been optimized for the most

  14. Probing the genome-scale metabolic landscape of Bordetella pertussis, the causative agent of whooping cough.

    Science.gov (United States)

    Branco Dos Santos, Filipe; Olivier, Brett G; Boele, Joost; Smessaert, Vincent; De Rop, Philippe; Krumpochova, Petra; Klau, Gunnar W; Giera, Martin; Dehottay, Philippe; Teusink, Bas; Goffin, Philippe

    2017-08-25

    Whooping cough is a highly contagious respiratory disease caused by Bordetella pertussis. Despite vaccination, its incidence has been rising alarmingly, and yet, the physiology of B. pertussis remains poorly understood. We combined genome-scale metabolic reconstruction, a novel optimization algorithm and experimental data to probe the full metabolic potential of this pathogen, using strain Tohama I as a reference. Experimental validation showed that B. pertussis secretes a significant proportion of nitrogen as arginine and purine nucleosides, which may contribute to modulation of the host response. We also found that B. pertussis can be unexpectedly versatile, being able to metabolize many compounds while displaying minimal nutrient requirements. It can grow without cysteine - using inorganic sulfur sources such as thiosulfate - and it can grow on organic acids such as citrate or lactate as sole carbon sources, providing in vivo demonstration that its TCA cycle is functional. Although the metabolic reconstruction of eight additional strains indicates that the structural genes underlying this metabolic flexibility are widespread, experimental validation suggests a role of strain-specific regulatory mechanisms in shaping metabolic capabilities. Among five alternative strains tested, three were shown to grow on substrate combinations requiring a functional TCA cycle, but only one could use thiosulfate. Finally, the metabolic model was used to rationally design growth media with over two-fold improvements in pertussis toxin production. This study thus provides novel insights into B. pertussis physiology, and highlights the potential, but also the limitations, of models solely based on metabolic gene content. IMPORTANCE The metabolic capabilities of Bordetella pertussis - the causative agent of whooping cough - were investigated from a systems-level perspective. We constructed a comprehensive genome-scale metabolic model for B. pertussis, and challenged its predictions

  15. Genome-scale reconstruction of the metabolic network in Yersinia pestis, strain 91001

    Energy Technology Data Exchange (ETDEWEB)

    Navid, A; Almaas, E

    2009-01-13

    The gram-negative bacterium Yersinia pestis, the aetiological agent of bubonic plague, is one of the deadliest pathogens known to man. Despite its historical reputation, plague is a modern disease which annually afflicts thousands of people. Public safety considerations greatly limit clinical experimentation on this organism and thus development of theoretical tools to analyze the capabilities of this pathogen is of utmost importance. Here, we report the first genome-scale metabolic model of Yersinia pestis biovar Mediaevalis based both on its recently annotated genome, and physiological and biochemical data from literature. Our model demonstrates excellent agreement with the known metabolic needs and capabilities of Y. pestis. Since Y. pestis is a meiotrophic organism, we have developed CryptFind, a systematic approach to identify all candidate cryptic genes responsible for known and theoretical meiotrophic phenomena. In addition to uncovering every known cryptic gene for Y. pestis, our analysis of the rhamnose fermentation pathway suggests that betB is the responsible cryptic gene. Despite all of our medical advances, we still do not have a vaccine for bubonic plague. Recent discoveries of antibiotic-resistant strains of Yersinia pestis coupled with the threat of plague being used as a bioterrorism weapon compel us to develop new tools for studying the physiology of this deadly pathogen. Using our theoretical model, we can study the cell's phenotypic behavior under different circumstances and identify metabolic weaknesses which may be harnessed for the development of therapeutics. Additionally, the automatic identification of cryptic genes expands the usage of genomic data for pharmaceutical purposes.

  16. Genome-scale metabolic analysis of Clostridium thermocellum for bioethanol production

    Directory of Open Access Journals (Sweden)

    Brooks J Paul

    2010-03-01

    Full Text Available Background: Microorganisms possess diverse metabolic capabilities that can potentially be leveraged for efficient production of biofuels. Clostridium thermocellum (ATCC 27405) is a thermophilic anaerobe that is both cellulolytic and ethanologenic, meaning that it can directly use the plant sugar, cellulose, and biochemically convert it to ethanol. A major challenge in using microorganisms for chemical production is the need to modify the organism to increase production efficiency. The process of properly engineering an organism is typically arduous. Results: Here we present a genome-scale model of C. thermocellum metabolism, iSR432, for the purpose of establishing a computational tool to study the metabolic network of C. thermocellum and facilitate efforts to engineer C. thermocellum for biofuel production. The model consists of 577 reactions involving 525 intracellular metabolites, 432 genes, and a proteomic-based representation of a cellulosome. The process of constructing this metabolic model led to suggested annotation refinements for 27 genes and identification of areas of metabolism requiring further study. The accuracy of the iSR432 model was tested using experimental growth and by-product secretion data for growth on cellobiose and fructose. Analysis using this model captures the relationship between the reduction-oxidation state of the cell and ethanol secretion and allowed for prediction of gene deletions and environmental conditions that would increase ethanol production. Conclusions: By incorporating genomic sequence data, network topology, and experimental measurements of enzyme activities and metabolite fluxes, we have generated a model that is reasonably accurate at predicting the cellular phenotype of C. thermocellum and established a strong foundation for rational strain design. In addition, we are able to draw some important conclusions regarding the underlying metabolic mechanisms for observed behaviors of C. thermocellum.

  17. The Transcriptomic Responses of Pinus massoniana to Drought Stress

    Directory of Open Access Journals (Sweden)

    Mingfeng Du

    2018-06-01

    Full Text Available Masson pine (Pinus massoniana) is a major fast-growing timber species planted in southern China, a region of seasonal drought. Using a drought-tolerant genotype of Masson pine, we conducted large-scale transcriptome sequencing using Illumina technology. This work aimed to evaluate the transcriptomic responses of Masson pine to different levels of drought stress. First, 3397, 1695 and 1550 unigenes with differential expression were identified by comparing plants subjected to light, moderate or severe drought with control plants. Second, several gene ontology (GO) categories (oxidation-reduction and metabolism) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (plant hormone signal transduction and metabolic pathways) were enriched, indicating that the expression levels of some genes in these enriched GO terms and pathways were altered under drought stress. Third, several transcription factors (TFs) associated with circadian rhythms (HY5 and LHY), signal transduction (ERF), and defense responses (WRKY) were identified, and these TFs may play key roles in adapting to drought stress. Drought also caused significant changes in the expression of certain functional genes linked to osmotic adjustment (P5CS), abscisic acid (ABA) responses (NCED, PYL, PP2C and SnRK), and reactive oxygen species (ROS) scavenging (GPX, GST and GSR). These transcriptomic results provide insight into the molecular mechanisms of drought stress adaptation in Masson pine.

  18. Genome-scale modelling of microbial metabolism with temporal and spatial resolution.

    Science.gov (United States)

    Henson, Michael A

    2015-12-01

    Most natural microbial systems have evolved to function in environments with temporal and spatial variations. A major limitation to understanding such complex systems is the lack of mathematical modelling frameworks that connect the genomes of individual species and temporal and spatial variations in the environment to system behaviour. The goal of this review is to introduce the emerging field of spatiotemporal metabolic modelling based on genome-scale reconstructions of microbial metabolism. The extension of flux balance analysis (FBA) to account for both temporal and spatial variations in the environment is termed spatiotemporal FBA (SFBA). Following a brief overview of FBA and its established dynamic extension, the SFBA problem is introduced and recent progress is described. Three case studies are reviewed to illustrate the current state-of-the-art and possible future research directions are outlined. The author posits that SFBA is the next frontier for microbial metabolic modelling and a rapid increase in methods development and system applications is anticipated. © 2015 Authors; published by Portland Press Limited.
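
    For orientation, the standard FBA problem and its dynamic extension that the review builds on can be written as below. This is the generic textbook form, not a formulation taken from the review; the symbols (S for the stoichiometric matrix, v for fluxes, c for the objective weights, X for biomass, C_i for extracellular concentrations, and mu for the growth rate) are assumed here for illustration.

```latex
% Flux balance analysis: a linear program over the flux vector v
\begin{aligned}
\max_{v}\quad & c^{\mathsf{T}} v \\
\text{s.t.}\quad & S\,v = 0, \\
                 & v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}} .
\end{aligned}

% Dynamic extension: the LP is re-solved along the trajectory, with
% uptake bounds that depend on the current extracellular concentrations
\frac{dX}{dt} = \mu\,X, \qquad
\frac{dC_i}{dt} = v_i^{\mathrm{ex}}\,X, \qquad
v^{\mathrm{lb}} = v^{\mathrm{lb}}(C) .
```

    In a spatiotemporal setting the concentrations would additionally depend on position, so the ordinary differential equations above would be replaced by transport equations; the exact form depends on the system being modelled.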

  19. Construction and analysis of a genome-scale metabolic network for Bacillus licheniformis WX-02.

    Science.gov (United States)

    Guo, Jing; Zhang, Hong; Wang, Cheng; Chang, Ji-Wei; Chen, Ling-Ling

    2016-05-01

    We constructed the genome-scale metabolic network of Bacillus licheniformis (B. licheniformis) WX-02 by combining genomic annotation, high-throughput phenotype microarray (PM) experiments and literature-based metabolic information. The accuracy of the metabolic network was assessed by an OmniLog PM experiment. The final metabolic model iWX1009 contains 1009 genes, 1141 metabolites and 1762 reactions, and the predicted metabolic phenotypes showed an agreement rate of 76.8% with experimental PM data. In addition, key metabolic features such as growth yield, utilization of different substrates and essential genes were identified by flux balance analysis. A total of 195 essential genes were predicted from LB medium, among which 149 were verified with the experimental essential gene set of B. subtilis 168. With the removal of 5 reactions from the network, pathways for poly-γ-glutamic acid (γ-PGA) synthesis were optimized and the γ-PGA yield reached 83.8 mmol/h. Furthermore, the important metabolites and pathways related to γ-PGA synthesis and bacterium growth were comprehensively analyzed. The present study provides valuable clues for exploring the metabolisms and metabolic regulation of γ-PGA synthesis in B. licheniformis WX-02. Copyright © 2016 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  20. Deriving metabolic engineering strategies from genome-scale modeling with flux ratio constraints.

    Science.gov (United States)

    Yen, Jiun Y; Nazem-Bokaee, Hadi; Freedman, Benjamin G; Athamneh, Ahmad I M; Senger, Ryan S

    2013-05-01

    Optimized production of bio-based fuels and chemicals from microbial cell factories is a central goal of systems metabolic engineering. To achieve this goal, a new computational method of using flux balance analysis with flux ratios (FBrAtio) was further developed in this research and applied to five case studies to evaluate and design metabolic engineering strategies. The approach was implemented using publicly available genome-scale metabolic flux models. Synthetic pathways were added to these models along with flux ratio constraints by FBrAtio to achieve increased (i) cellulose production from Arabidopsis thaliana; (ii) isobutanol production from Saccharomyces cerevisiae; (iii) acetone production from Synechocystis sp. PCC6803; (iv) H2 production from Escherichia coli MG1655; and (v) isopropanol, butanol, and ethanol (IBE) production from engineered Clostridium acetobutylicum. The FBrAtio approach was applied to each case to simulate a metabolic engineering strategy already implemented experimentally, and flux ratios were continually adjusted to find (i) the end-limit of increased production using the existing strategy, (ii) new potential strategies to increase production, and (iii) the impact of these metabolic engineering strategies on product yield and culture growth. The FBrAtio approach has the potential to design "fine-tuned" metabolic engineering strategies in silico that can be implemented directly with available genomic tools. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
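
    A flux ratio constraint can be expressed as a linear equality, which is what allows it to sit alongside the usual steady-state constraints of FBA. The sketch below shows the general idea only; the indices a and b and the ratio r are placeholder symbols, and the exact way FBrAtio encodes ratios is not described in this abstract.

```latex
% A ratio between two fluxes competing for the same metabolite ...
\frac{v_a}{v_b} = r
% ... is equivalent (for v_b \neq 0) to one extra linear constraint
\quad\Longleftrightarrow\quad
v_a - r\,v_b = 0 ,
% which can be appended as an additional row to S v = 0:
\qquad
\begin{bmatrix} S \\ e_a^{\mathsf{T}} - r\,e_b^{\mathsf{T}} \end{bmatrix} v = 0 .
```

    Here e_a and e_b denote unit vectors selecting the two fluxes. Sweeping r over a range and re-solving the linear program is one way to explore how a ratio affects product yield and growth.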

  1. Reframed Genome-Scale Metabolic Model to Facilitate Genetic Design and Integration with Expression Data.

    Science.gov (United States)

    Gu, Deqing; Jian, Xingxing; Zhang, Cheng; Hua, Qiang

    2017-01-01

    Genome-scale metabolic network models (GEMs) have played important roles in the design of genetically engineered strains and helped biologists to decipher metabolism. However, due to the complex gene-reaction relationships that exist in model systems, most algorithms have limited capabilities with respect to directly predicting accurate genetic design for metabolic engineering. In particular, methods that predict reaction knockout strategies leading to overproduction are often impractical in terms of gene manipulations. Recently, we proposed a method named logical transformation of model (LTM) to simplify the gene-reaction associations by introducing intermediate pseudo reactions, which makes it possible to generate genetic design. Here, we propose an alternative method to relieve researchers from deciphering complex gene-reactions by adding pseudo gene controlling reactions. In comparison to LTM, this new method introduces fewer pseudo reactions and generates a much smaller model system named gModel. We showed that gModel allows two seldom reported applications: identification of minimal genomes and design of minimal cell factories within a modified OptKnock framework. In addition, gModel could be used to integrate expression data directly and improve the performance of the E-Fmin method for predicting fluxes. In conclusion, the model transformation procedure will facilitate genetic research based on GEMs, extending their applications.
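
    To illustrate why reaction knockouts do not translate directly into gene manipulations, the following minimal Python sketch evaluates Boolean gene-reaction rules under gene knockouts. It is not the LTM or gModel procedure; the rule encoding, the gene names (g1 to g4) and the reaction names (R1 to R3) are hypothetical.

```python
# Toy gene-protein-reaction (GPR) rules as nested tuples (hypothetical data):
#   ("and", a, b)  enzyme complex: every subunit is required
#   ("or", a, b)   isozymes: any one alternative is sufficient
#   "gene_id"      a single gene
GPR_RULES = {
    "R1": ("or", ("and", "g1", "g2"), "g3"),
    "R2": "g2",
    "R3": ("and", "g3", "g4"),
}


def is_active(rule, knocked_out):
    """Return True if the rule is still satisfied after the gene knockouts."""
    if isinstance(rule, str):
        return rule not in knocked_out
    op, *operands = rule
    results = [is_active(operand, knocked_out) for operand in operands]
    return all(results) if op == "and" else any(results)


def lost_reactions(knocked_out):
    """Sorted list of reactions left without any functional enzyme."""
    return sorted(
        rxn for rxn, rule in GPR_RULES.items() if not is_active(rule, knocked_out)
    )


print(lost_reactions({"g2"}))        # R1 survives through the isozyme g3
print(lost_reactions({"g3"}))        # R1 survives through the g1+g2 complex
print(lost_reactions({"g2", "g3"}))  # removing R1 needs both knockouts
```

    In this toy example, removing reaction R1 requires two gene knockouts, and those knockouts also remove R2 and R3, which is the kind of side effect that makes reaction-level designs hard to implement.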

  2. A Consensus Genome-scale Reconstruction of Chinese Hamster Ovary Cell Metabolism

    KAUST Repository

    Hefzi, Hooman

    2016-11-23

    Chinese hamster ovary (CHO) cells dominate biotherapeutic protein production and are widely used in mammalian cell line engineering research. To elucidate metabolic bottlenecks in protein production and to guide cell engineering and bioprocess optimization, we reconstructed the metabolic pathways in CHO and associated them with >1,700 genes in the Cricetulus griseus genome. The genome-scale metabolic model based on this reconstruction, iCHO1766, and cell-line-specific models for CHO-K1, CHO-S, and CHO-DG44 cells provide the biochemical basis of growth and recombinant protein production. The models accurately predict growth phenotypes and known auxotrophies in CHO cells. With the models, we quantify the protein synthesis capacity of CHO cells and demonstrate that common bioprocess treatments, such as histone deacetylase inhibitors, inefficiently increase product yield. However, our simulations show that the metabolic resources in CHO are more than three times more efficiently utilized for growth or recombinant protein synthesis following targeted efforts to engineer the CHO secretory pathway. This model will further accelerate CHO cell engineering and help optimize bioprocesses.

  3. iCN718, an Updated and Improved Genome-Scale Metabolic Network Reconstruction of Acinetobacter baumannii AYE.

    Science.gov (United States)

    Norsigian, Charles J; Kavvas, Erol; Seif, Yara; Palsson, Bernhard O; Monk, Jonathan M

    2018-01-01

    Acinetobacter baumannii has become an urgent clinical threat due to the recent emergence of multi-drug resistant strains. There is thus a significant need to discover new therapeutic targets in this organism. One means for doing so is through the use of high-quality genome-scale reconstructions. Well-curated and accurate genome-scale models (GEMs) of A. baumannii would be useful for improving treatment options. We present an updated and improved genome-scale reconstruction of A. baumannii AYE, named iCN718, that improves and standardizes previous A. baumannii AYE reconstructions. iCN718 has 80% accuracy for predicting gene essentiality data and additionally can predict large-scale phenotypic data with as much as 89% accuracy, a new capability for an A. baumannii reconstruction. We further demonstrate that iCN718 can be used to analyze conserved metabolic functions in the A. baumannii core genome and to build strain-specific GEMs of 74 other A. baumannii strains from genome sequence alone. iCN718 will serve as a resource to integrate and synthesize new experimental data being generated for this urgent threat pathogen.

  4. Identifying all moiety conservation laws in genome-scale metabolic networks.

    Science.gov (United States)

    De Martino, Andrea; De Martino, Daniele; Mulet, Roberto; Pagnani, Andrea

    2014-01-01

    The stoichiometry of a metabolic network gives rise to a set of conservation laws for the aggregate level of specific pools of metabolites, which, on one hand, pose dynamical constraints that cross-link the variations of metabolite concentrations and, on the other, provide key insight into a cell's metabolic production capabilities. When the conserved quantity identifies with a chemical moiety, extracting all such conservation laws from the stoichiometry amounts to finding all non-negative integer solutions of a linear system, a programming problem known to be NP-hard. We present an efficient strategy to compute the complete set of integer conservation laws of a genome-scale stoichiometric matrix, also providing a certificate for correctness and maximality of the solution. Our method is deployed for the analysis of moiety conservation relationships in two large-scale reconstructions of the metabolism of the bacterium E. coli, in six tissue-specific human metabolic networks, and, finally, in the human reactome as a whole, revealing that bacterial metabolism could be evolutionarily designed to cover broader production spectra than human metabolism. Convergence to the full set of moiety conservation laws in each case is achieved in extremely reduced computing times. In addition, we uncover a scaling relation that links the size of the independent pool basis to the number of metabolites, for which we present an analytical explanation.
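
    A conservation law in this sense is a non-negative integer weight vector l over metabolites such that l multiplied by every column of the stoichiometric matrix gives zero. The Python sketch below checks that condition on a hypothetical four-metabolite toy network and enumerates small solutions by brute force. The brute-force search only illustrates the definition; it is not the efficient strategy the authors describe and would not scale to genome-scale matrices.

```python
from itertools import product

# Hypothetical toy network. Rows are metabolites, columns are reactions:
#   col 0: Glc + ATP -> G6P + ADP
#   col 1: ADP -> ATP (lumped regeneration)
#   col 2: -> Glc (uptake)
#   col 3: G6P -> (drain)
METABOLITES = ["ATP", "ADP", "Glc", "G6P"]
S = [
    [-1,  1,  0,  0],  # ATP
    [ 1, -1,  0,  0],  # ADP
    [-1,  0,  1,  0],  # Glc
    [ 1,  0,  0, -1],  # G6P
]


def is_conservation_law(weights, stoich):
    """True if the weighted metabolite pool is unchanged by every reaction."""
    n_reactions = len(stoich[0])
    return all(
        sum(w * row[j] for w, row in zip(weights, stoich)) == 0
        for j in range(n_reactions)
    )


def brute_force_laws(stoich, max_coeff=1):
    """Enumerate non-zero integer weight vectors with entries in 0..max_coeff."""
    n_metabolites = len(stoich)
    return [
        w
        for w in product(range(max_coeff + 1), repeat=n_metabolites)
        if any(w) and is_conservation_law(w, stoich)
    ]


for law in brute_force_laws(S):
    pool = " + ".join(f"{c}*{m}" for c, m in zip(law, METABOLITES) if c)
    print("conserved pool:", pool)
```

    For this toy matrix the only conserved pool is ATP + ADP, the adenylate moiety. Glc and G6P are not conserved because the uptake and drain reactions change their totals.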

  5. Identifying all moiety conservation laws in genome-scale metabolic networks.

    Directory of Open Access Journals (Sweden)

    Andrea De Martino

    Full Text Available The stoichiometry of a metabolic network gives rise to a set of conservation laws for the aggregate level of specific pools of metabolites, which, on one hand, pose dynamical constraints that cross-link the variations of metabolite concentrations and, on the other, provide key insight into a cell's metabolic production capabilities. When the conserved quantity identifies with a chemical moiety, extracting all such conservation laws from the stoichiometry amounts to finding all non-negative integer solutions of a linear system, a programming problem known to be NP-hard. We present an efficient strategy to compute the complete set of integer conservation laws of a genome-scale stoichiometric matrix, also providing a certificate for correctness and maximality of the solution. Our method is deployed for the analysis of moiety conservation relationships in two large-scale reconstructions of the metabolism of the bacterium E. coli, in six tissue-specific human metabolic networks, and, finally, in the human reactome as a whole, revealing that bacterial metabolism could be evolutionarily designed to cover broader production spectra than human metabolism. Convergence to the full set of moiety conservation laws in each case is achieved in extremely reduced computing times. In addition, we uncover a scaling relation that links the size of the independent pool basis to the number of metabolites, for which we present an analytical explanation.

  6. GEnomes Management Application (GEM.app): a new software tool for large-scale collaborative genome analysis.

    Science.gov (United States)

    Gonzalez, Michael A; Lebrigio, Rafael F Acosta; Van Booven, Derek; Ulloa, Rick H; Powell, Eric; Speziani, Fiorella; Tekin, Mustafa; Schüle, Rebecca; Züchner, Stephan

    2013-06-01

    Novel genes are now identified at a rapid pace for many Mendelian disorders, and increasingly, for genetically complex phenotypes. However, new challenges have also become evident: (1) effectively managing larger exome and/or genome datasets, especially for smaller labs; (2) direct hands-on analysis and contextual interpretation of variant data in large genomic datasets; and (3) many small and medium-sized clinical and research-based investigative teams around the world are generating data that, if combined and shared, will significantly increase the opportunities for the entire community to identify new genes. To address these challenges, we have developed GEnomes Management Application (GEM.app), a software tool to annotate, manage, visualize, and analyze large genomic datasets (https://genomics.med.miami.edu/). GEM.app currently contains ∼1,600 whole exomes from 50 different phenotypes studied by 40 principal investigators from 15 different countries. The focus of GEM.app is on user-friendly analysis for nonbioinformaticians to make next-generation sequencing data directly accessible. Yet, GEM.app provides powerful and flexible filter options, including single family filtering, across family/phenotype queries, nested filtering, and evaluation of segregation in families. In addition, the system is fast, obtaining results within 4 sec across ∼1,200 exomes. We believe that this system will further enhance identification of genetic causes of human disease. © 2013 Wiley Periodicals, Inc.

  7. Moving image analysis to the cloud: A case study with a genome-scale tomographic study

    Energy Technology Data Exchange (ETDEWEB)

    Mader, Kevin [4Quant Ltd., Switzerland & Institute for Biomedical Engineering at University and ETH Zurich (Switzerland); Stampanoni, Marco [Institute for Biomedical Engineering at University and ETH Zurich, Switzerland & Swiss Light Source at Paul Scherrer Institut, Villigen (Switzerland)

    2016-01-28

    Over the last decade, the time required to measure a terabyte of microscopic imaging data has gone from years to minutes. This shift has moved many of the challenges away from experimental design and measurement to scalable storage, organization, and analysis. As many scientists and scientific institutions lack training and competencies in these areas, major bottlenecks have arisen and led to substantial delays and gaps between measurement, understanding, and dissemination. We present in this paper a framework for analyzing large 3D datasets using cloud-based computational and storage resources. We demonstrate its applicability by showing the setup and costs associated with the analysis of a genome-scale study of bone microstructure. We then evaluate the relative advantages and disadvantages associated with local versus cloud infrastructures.

  8. Moving image analysis to the cloud: A case study with a genome-scale tomographic study

    International Nuclear Information System (INIS)

    Mader, Kevin; Stampanoni, Marco

    2016-01-01

    Over the last decade, the time required to measure a terabyte of microscopic imaging data has gone from years to minutes. This shift has moved many of the challenges away from experimental design and measurement to scalable storage, organization, and analysis. As many scientists and scientific institutions lack training and competencies in these areas, major bottlenecks have arisen and led to substantial delays and gaps between measurement, understanding, and dissemination. We present in this paper a framework for analyzing large 3D datasets using cloud-based computational and storage resources. We demonstrate its applicability by showing the setup and costs associated with the analysis of a genome-scale study of bone microstructure. We then evaluate the relative advantages and disadvantages associated with local versus cloud infrastructures.

  9. Principles of proteome allocation are revealed using proteomic data and genome-scale models

    DEFF Research Database (Denmark)

    Yang, Laurence; Yurkovich, James T.; Lloyd, Colton J.

    2016-01-01

    Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the "generalist" (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions ... of these sectors for the general stress response sigma factor sigma(S). Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally...

  10. Bio-succinic acid production: Escherichia coli strains design from genome-scale perspectives

    Directory of Open Access Journals (Sweden)

    Bashir Sajo Mienda

    2017-10-01

    Full Text Available Escherichia coli (E. coli) has been established to be a native producer of succinic acid (a platform chemical with different applications) via mixed acid fermentation reactions. Genome-scale metabolic models (GEMs) of E. coli have been published with capabilities of predicting strain design strategies for the production of bio-based succinic acid. Proof-of-principle strains are fundamentally constructed as a starting point for systems strategies for industrial strain development. Here, we review, for the first time, the use of E. coli GEMs for the construction of proof-of-principle strains for increasing succinic acid production. Specific case studies, where E. coli proof-of-principle strains were constructed for increasing bio-based succinic acid production from glucose and glycerol carbon sources, have been highlighted. In addition, proposed systems strategies for industrial strain development, guided by GEMs and potentially applicable to future microbial succinic acid production, are presented.

  11. Reconstruction of genome-scale human metabolic models using omics data

    DEFF Research Database (Denmark)

    Ryu, Jae Yong; Kim, Hyun Uk; Lee, Sang Yup

    2015-01-01

    ... used to describe metabolic phenotypes of healthy and diseased human tissues and cells, and to predict therapeutic targets. Here we review recent trends in genome-scale human metabolic modeling, including various generic and tissue/cell type-specific human metabolic models developed to date, and methods, databases and platforms used to construct them. For generic human metabolic models, we pay attention to Recon 2 and HMR 2.0 with emphasis on data sources used to construct them. Draft and high-quality tissue/cell type-specific human metabolic models have been generated using these generic human metabolic ... refined through gap filling, reaction directionality assignment and the subcellular localization of metabolic reactions. We review relevant tools for this model refinement procedure as well. Finally, we suggest the direction of further studies on reconstructing an improved human metabolic model.

  12. Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression

    DEFF Research Database (Denmark)

    Ma, Ding; Yang, Laurence; Fleming, Ronan M. T.

    2017-01-01

    Constraint-Based Reconstruction and Analysis (COBRA) is currently the only methodology that permits integrated modeling of Metabolism and macromolecular Expression (ME) at genome-scale. Linear optimization computes steady-state flux solutions to ME models, but flux values are spread over many orders of magnitude. Data values also have greatly varying magnitudes. Standard double-precision solvers may return inaccurate solutions or report that no solution exists. Exact simplex solvers based on rational arithmetic require a near-optimal warm start to be practical on large problems (current ME models have 70,000 constraints and variables and will grow larger). We have developed a quadruple-precision version of our linear and nonlinear optimizer MINOS, and a solution procedure (DQQ) involving Double and Quad MINOS that achieves reliability and efficiency for ME models and other challenging...
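
    The precision problem described here can be seen in a two-term sum. The minimal Python sketch below compares double-precision floats with exact rational arithmetic when two values differ by sixteen orders of magnitude. Python's `fractions.Fraction` stands in for exact arithmetic as an assumption of this example; it is not the quadruple-precision MINOS solver, and the helper name `residual` is hypothetical.

```python
from fractions import Fraction


def residual(big, small, number_type):
    """Compute (big + small) - big in the given arithmetic.

    With exact arithmetic the result equals `small`; with limited
    precision the small term can be absorbed and lost entirely.
    """
    b, s = number_type(big), number_type(small)
    return (b + s) - b


float_result = residual(10**16, 1, float)     # double precision
exact_result = residual(10**16, 1, Fraction)  # exact rational arithmetic

print("double precision:", float_result)
print("exact rational:  ", exact_result)
```

    In double precision the small term is lost completely, whereas the exact computation recovers it. A linear solver working on coefficients with this spread faces the same loss inside its pivoting and feasibility checks, which is the motivation for higher-precision arithmetic.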

  13. Zea mays iRS1563: A Comprehensive Genome-Scale Metabolic Reconstruction of Maize Metabolism

    Science.gov (United States)

    Saha, Rajib; Suthers, Patrick F.; Maranas, Costas D.

    2011-01-01

    The scope and breadth of genome-scale metabolic reconstructions have continued to expand over the last decade. Herein, we introduce a genome-scale model for a plant with direct applications to food and bioenergy production (i.e., maize). Maize annotation is still underway, which introduces significant challenges in the association of metabolic functions to genes. The developed model is designed to meet rigorous standards on gene-protein-reaction (GPR) associations, elementally and charge-balanced reactions and a biomass reaction abstracting the relative contribution of all biomass constituents. The metabolic network contains 1,563 genes and 1,825 metabolites involved in 1,985 reactions from primary and secondary maize metabolism. For approximately 42% of the reactions direct literature evidence for the participation of the reaction in maize was found. As many as 445 reactions and 369 metabolites are unique to the maize model compared to the AraGEM model for A. thaliana. 674 metabolites and 893 reactions are present in Zea mays iRS1563 that are not accounted for in maize C4GEM. All reactions are elementally and charge balanced and localized into six different compartments (i.e., cytoplasm, mitochondrion, plastid, peroxisome, vacuole and extracellular). GPR associations are also established based on the functional annotation information and homology prediction accounting for monofunctional, multifunctional and multimeric proteins, isozymes and protein complexes. We describe results from performing flux balance analysis under different physiological conditions (i.e., photosynthesis, photorespiration and respiration) of a C4 plant and also explore model predictions against experimental observations for two naturally occurring mutants (i.e., bm1 and bm3). The developed model corresponds to the largest and most complete effort to date at cataloguing metabolism for a plant species. PMID:21755001
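
    Elemental and charge balance can be checked mechanically once each metabolite has a formula and a charge. The Python sketch below does this for a single reaction. The metabolite identifiers, the formulas and charges (written for the species assumed to dominate near neutral pH) and the hexokinase example are illustrative assumptions, not data from iRS1563.

```python
import re
from collections import Counter

# metabolite id -> (chemical formula, charge); toy values assumed for illustration
FORMULAS = {
    "glc": ("C6H12O6", 0),
    "atp": ("C10H12N5O13P3", -4),
    "g6p": ("C6H11O9P", -2),
    "adp": ("C10H12N5O10P2", -3),
    "h": ("H", 1),
}


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6' (no brackets)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def imbalance(reaction):
    """Return non-zero element and charge imbalances for a reaction.

    `reaction` maps metabolite id -> stoichiometric coefficient
    (negative for substrates, positive for products).
    """
    totals = Counter()
    for metabolite, coeff in reaction.items():
        formula, charge = FORMULAS[metabolite]
        for element, n_atoms in parse_formula(formula).items():
            totals[element] += coeff * n_atoms
        totals["charge"] += coeff * charge
    return {key: value for key, value in totals.items() if value != 0}


hexokinase = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1, "h": 1}
print(imbalance(hexokinase))              # an empty dict means balanced
print(imbalance({**hexokinase, "h": 0}))  # omitting the proton breaks the balance
```

    Running such a check over every reaction in a reconstruction is one way to confirm that a balance standard of this kind is met.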

  14. Network thermodynamic curation of human and yeast genome-scale metabolic models.

    Science.gov (United States)

    Martínez, Verónica S; Quek, Lake-Ee; Nielsen, Lars K

    2014-07-15

    Genome-scale models are used for an ever-widening range of applications. Although there has been much focus on specifying the stoichiometric matrix, the predictive power of genome-scale models equally depends on reaction directions. Two-thirds of reactions in the two eukaryotic reconstructions Homo sapiens Recon 1 and Yeast 5 are specified as irreversible. However, these specifications are mainly based on biochemical textbooks or on their similarity to other organisms and are rarely underpinned by detailed thermodynamic analysis. In this study, a to our knowledge new workflow combining network-embedded thermodynamic and flux variability analysis was used to evaluate existing irreversibility constraints in Recon 1 and Yeast 5 and to identify new ones. A total of 27 and 16 new irreversible reactions were identified in Recon 1 and Yeast 5, respectively, whereas only four reactions were found with directions incorrectly specified against thermodynamics (three in Yeast 5 and one in Recon 1). The workflow further identified for both models several isolated internal loops that require further curation. The framework also highlighted the need for substrate channeling (in human) and ATP hydrolysis (in yeast) for the essential reaction catalyzed by phosphoribosylaminoimidazole carboxylase in purine metabolism. Finally, the framework highlighted differences in proline metabolism between yeast (cytosolic anabolism and mitochondrial catabolism) and humans (exclusively mitochondrial metabolism). We conclude that network-embedded thermodynamics facilitates the specification and validation of irreversibility constraints in compartmentalized metabolic models, at the same time providing further insight into network properties. Copyright © 2014 Biophysical Society. Published by Elsevier Inc. All rights reserved.
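
    The thermodynamic reasoning behind directionality constraints rests on the relation between reaction Gibbs energy and metabolite concentrations. The expressions below give the general textbook form, with assumed symbols (s_i for stoichiometric coefficients, c_i for concentrations or activities, v for the net flux); they are a sketch of the principle, not the specific workflow used in the study.

```latex
% Reaction Gibbs energy at given metabolite concentrations
\Delta_r G' = \Delta_r G'^{\circ} + R\,T \ln Q ,
\qquad Q = \prod_i c_i^{\,s_i} .

% A net forward flux is only feasible if the reaction is exergonic
v > 0 \;\Rightarrow\; \Delta_r G' < 0 .

% A reaction may be treated as irreversible (forward) when this holds
% over the whole admissible concentration range
\max_{c^{\min} \le c \le c^{\max}} \Delta_r G'(c) < 0 .
```

    Network-embedded analysis applies such bounds to all reactions at once, so that the concentration ranges must be mutually consistent across the network rather than chosen reaction by reaction.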

  15. Genome-scale model guided design of Propionibacterium for enhanced propionic acid production.

    Science.gov (United States)

    Navone, Laura; McCubbin, Tim; Gonzalez-Garcia, Ricardo A; Nielsen, Lars K; Marcellin, Esteban

    2018-06-01

    Production of propionic acid by fermentation of propionibacteria has gained increasing attention in the past few years. However, biomanufacturing of propionic acid cannot compete with the current oxo-petrochemical synthesis process due to its well-established infrastructure, low oil prices and the high downstream purification costs of microbial production. Strain improvement to increase propionic acid yield is the best alternative to reduce downstream purification costs. The recent generation of genome-scale models for a number of Propionibacterium species facilitates the rational design of metabolic engineering strategies and provides a new opportunity to explore the metabolic potential of the Wood-Werkman cycle. Previous strategies for strain improvement have individually targeted acid tolerance, rate of propionate production or minimisation of by-products. Here we used the P. freudenreichii subsp. shermanii and the pan-Propionibacterium genome-scale metabolic models (GEMs) to simultaneously target these combined issues. This was achieved by focussing on strategies which yield higher energies and directly suppress acetate formation. Using P. freudenreichii subsp. shermanii, two strategies were assessed. The first tested the ability to manipulate the redox balance to favour propionate production by over-expressing the first two enzymes of the pentose-phosphate pathway (PPP), Zwf (glucose-6-phosphate 1-dehydrogenase) and Pgl (6-phosphogluconolactonase). Results showed a 4-fold increase in propionate to acetate ratio during the exponential growth phase. Secondly, the ability to enhance the energy yield from propionate production by over-expressing an ATP-dependent phosphoenolpyruvate carboxykinase (PEPCK) and sodium-pumping methylmalonyl-CoA decarboxylase (MMD) was tested, which extended the exponential growth phase. Together, these strategies demonstrate that in silico design strategies are predictive and can be used to reduce by-product formation in

  16. Genome-scale consequences of cofactor balancing in engineered pentose utilization pathways in Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Amit Ghosh

    Full Text Available Biofuels derived from lignocellulosic biomass offer promising alternative renewable energy sources for transportation fuels. Significant effort has been made to engineer Saccharomyces cerevisiae to efficiently ferment pentose sugars such as D-xylose and L-arabinose into biofuels such as ethanol through heterologous expression of the fungal D-xylose and L-arabinose pathways. However, one of the major bottlenecks in these fungal pathways is that the cofactors are not balanced, which contributes to inefficient utilization of pentose sugars. We utilized a genome-scale model of S. cerevisiae to predict the maximal achievable growth rate for cofactor balanced and imbalanced D-xylose and L-arabinose utilization pathways. Dynamic flux balance analysis (DFBA) was used to simulate batch fermentation of glucose, D-xylose, and L-arabinose. The dynamic models and experimental results are in good agreement for the wild type and for the engineered D-xylose utilization pathway. Cofactor balancing the engineered D-xylose and L-arabinose utilization pathways simulated an increase in ethanol batch production of 24.7% while simultaneously reducing the predicted substrate utilization time by 70%. Furthermore, the effects of cofactor balancing the engineered pentose utilization pathways were evaluated throughout the genome-scale metabolic network. This work not only provides new insights into the global network effects of cofactor balancing but also provides useful guidelines for engineering a recombinant yeast strain with cofactor balanced engineered pathways that efficiently co-utilizes pentose and hexose sugars for biofuels production. Experimental switching of cofactor usage in enzymes has been demonstrated, but is a time-consuming effort. Therefore, systems biology models that can predict the likely outcome of such strain engineering efforts are highly useful for deciding which efforts are likely to be worth the significant time investment.

  17. TCW: transcriptome computational workbench.

    Directory of Open Access Journals (Sweden)

    Carol Soderlund

    Full Text Available BACKGROUND: The analysis of transcriptome data involves many steps and various programs, along with organization of large amounts of data and results. Without a methodical approach for storage, analysis and query, the resulting ad hoc analysis can lead to human error, loss of data and results, inefficient use of time, and lack of verifiability, repeatability, and extensibility. METHODOLOGY: The Transcriptome Computational Workbench (TCW) provides Java graphical interfaces for methodical analysis for both single and comparative transcriptome data without the use of a reference genome (e.g. for non-model organisms). The singleTCW interface steps the user through importing transcript sequences (e.g. Illumina) or assembling long sequences (e.g. Sanger, 454, transcripts), annotating the sequences, and performing differential expression analysis using published statistical programs in R. The data, metadata, and results are stored in a MySQL database. The multiTCW interface builds a comparison database by importing sequence and annotation from one or more single TCW databases, executes the ESTscan program to translate the sequences into proteins, and then incorporates one or more clusterings, where the clustering options are to execute the orthoMCL program, compute transitive closure, or import clusters. Both singleTCW and multiTCW allow extensive query and display of the results, where singleTCW displays the alignment of annotation hits to transcript sequences, and multiTCW displays multiple transcript alignments with MUSCLE or pairwise alignments. The query programs can be executed on the desktop for fastest analysis, or from the web for sharing the results. CONCLUSION: It is now affordable to buy a multi-processor machine, and easy to install Java and MySQL. By simply downloading the TCW, the user can interactively analyze, query and view their data. The TCW allows in-depth data mining of the results, which can lead to a better understanding of the

  18. A Genomic Survey of SCPP Family Genes in Fishes Provides Novel Insights into the Evolution of Fish Scales.

    Science.gov (United States)

    Lv, Yunyun; Kawasaki, Kazuhiko; Li, Jia; Li, Yanping; Bian, Chao; Huang, Yu; You, Xinxin; Shi, Qiong

    2017-11-16

    The family of secretory calcium-binding phosphoproteins (SCPPs) has been considered vital to skeletal tissue mineralization. However, most previous SCPP studies focused on phylogenetically distant animals rather than on closely related species. Here we provide novel insights into the coevolution of SCPP genes and fish scales in 10 species from Otophysi. According to their scale phenotypes, these fishes can be divided into three groups, i.e., scaled, sparsely scaled, and scaleless. We identified homologous SCPP genes in the genomes of these species and revealed an absence of some SCPP members in some genomes, suggesting an uneven evolutionary history of SCPP genes in fishes. In addition, most of these SCPP genes, with the exception of SPP1, individually form one or two gene cluster(s) on each corresponding genome. Furthermore, we constructed phylogenetic trees using the maximum likelihood method to estimate their evolution. The phylogenetic topology mostly supports two subclasses in some species, such as Cyprinus carpio, Sinocyclocheilus anshuiensis, S. grahami, and S. rhinocerous, but not in the other examined fishes. By comparing the gene structures of recently reported candidate genes for determining scale phenotypes, SCPP1 and SCPP5, we found that the hypothesis is suitable for Astyanax mexicanus but is contradicted by S. anshuiensis, even though both are sparsely scaled as a cave adaptation. Thus, we conclude that, although different fish species display similar scale phenotypes, the underlying genetic changes might be diverse. In summary, this paper advances the understanding of the SCPP family in teleosts and its potential role in scale evolution.

  19. The Escherichia coli transcriptome linked to growth fitness

    Directory of Open Access Journals (Sweden)

    Bei-Wen Ying

    2016-03-01

    Full Text Available A series of Escherichia coli strains with varied genomic sequences were subjected to high-density microarray analyses to elucidate the fitness-correlated transcriptomes. Fitness, which is commonly evaluated by the growth rate during the exponential phase, is not only determined by the genome but is also linked to growth conditions, e.g., temperature. We previously reported genetic and environmental contributions to E. coli transcriptomes and evolutionary transcriptome changes in thermal adaptation. Here, we describe experimental details on how to prepare microarray samples that truly represent the growth fitness of the E. coli cells. A step-by-step record of sample preparation procedures that correspond to growing cells and transcriptome data sets that are deposited at the GEO database (GSE33212, GSE52770, GSE61739) are also provided for reference. Keywords: Transcriptome, Growth fitness, Escherichia coli, Microarray

  20. Determining the control circuitry of redox metabolism at the genome-scale.

    Directory of Open Access Journals (Sweden)

    Stephen Federowicz

    2014-04-01

    Full Text Available Determining how facultative anaerobic organisms sense and direct cellular responses to electron acceptor availability has been a subject of intense study. However, even in the model organism Escherichia coli, established mechanisms only explain a small fraction of the hundreds of genes that are regulated during electron acceptor shifts. Here we propose a qualitative model that accounts for the full breadth of regulated genes by detailing how two global transcription factors (TFs), ArcA and Fnr of E. coli, sense key metabolic redox ratios and act on a genome-wide basis to regulate anabolic, catabolic, and energy generation pathways. We first fill gaps in our knowledge of this transcriptional regulatory network by carrying out ChIP-chip and gene expression experiments to identify 463 regulatory events. We then interfaced this reconstructed regulatory network with a highly curated genome-scale metabolic model to show that ArcA and Fnr regulate >80% of total metabolic flux and 96% of differential gene expression across fermentative and nitrate respiratory conditions. Based on the data, we propose a feedforward with feedback trim regulatory scheme, given the extensive repression of catabolic genes by ArcA and extensive activation of chemiosmotic genes by Fnr. We further corroborated this regulatory scheme by showing a correlation (r² = 0.71, p < 1e-6) between changes in metabolic flux and changes in regulatory activity across fermentative and nitrate respiratory conditions. Finally, we are able to relate the proposed model to a wealth of previously generated data by contextualizing the existing transcriptional regulatory network.

  1. The population genomics of begomoviruses: global scale population structure and gene flow

    Directory of Open Access Journals (Sweden)

    Prasanna HC

    2010-09-01

    Full Text Available Abstract Background The rapidly growing availability of diverse full genome sequences from across the world is increasing the feasibility of studying the large-scale population processes that underlie observable patterns of virus diversity. In particular, characterizing the genetic structure of virus populations could potentially reveal much about how factors such as geographical distributions, host ranges and gene flow between populations combine to produce the discontinuous patterns of genetic diversity that we perceive as distinct virus species. Among the richest and most diverse full genome datasets that are available is that for the dicotyledonous plant infecting genus, Begomovirus, in the Family Geminiviridae. The begomoviruses all share the same whitefly vector, are highly recombinogenic and are distributed throughout tropical and subtropical regions where they seriously threaten the food security of the world's poorest people. Results We focus here on using a model-based population genetic approach to identify the genetically distinct sub-populations within the global begomovirus meta-population. We demonstrate the existence of at least seven major sub-populations that can further be sub-divided into as many as thirty-four significantly differentiated and genetically cohesive minor sub-populations. Using the population structure framework revealed in the present study, we further explored the extent of gene flow and recombination between genetic populations. Conclusions Although geographical barriers are apparently the most significant underlying cause of the seven major population sub-divisions, within the framework of these sub-divisions, we explore patterns of gene flow to reveal that both host range differences and genetic barriers to recombination have probably been major contributors to the minor population sub-divisions that we have identified. We believe that the global Begomovirus population structure revealed here could

  2. The architecture of ArgR-DNA complexes at the genome-scale in Escherichia coli

    DEFF Research Database (Denmark)

    Cho, Suhyung; Cho, Yoo-Bok; Kang, Taek Jin

    2015-01-01

    DNA-binding motifs that are recognized by transcription factors (TFs) have been well studied; however, challenges remain in determining the in vivo architecture of TF-DNA complexes on a genome-scale. Here, we determined the in vivo architecture of Escherichia coli arginine repressor (ArgR)-DNA co...

  3. A pipeline for the de novo assembly of the Themira biloba (Sepsidae: Diptera) transcriptome using a multiple k-mer length approach.

    Science.gov (United States)

    Melicher, Dacotah; Torson, Alex S; Dworkin, Ian; Bowsher, Julia H

    2014-03-12

    The Sepsidae family of flies is a model for investigating how sexual selection shapes courtship and sexual dimorphism in a comparative framework. However, like many non-model systems, there are few molecular resources available. Large-scale sequencing and assembly have not been performed in any sepsid, and the lack of a closely related genome makes investigation of gene expression challenging. Our goal was to develop an automated pipeline for de novo transcriptome assembly, and to use that pipeline to assemble and analyze the transcriptome of the sepsid Themira biloba. Our bioinformatics pipeline uses cloud computing services to assemble and analyze the transcriptome with off-site data management, processing, and backup. It uses a multiple k-mer length approach combined with a second meta-assembly to extend transcripts and recover more bases of transcript sequences than standard single k-mer assembly. We used 454 sequencing to generate 1.48 million reads from cDNA generated from embryo, larva, and pupae of T. biloba and assembled a transcriptome consisting of 24,495 contigs. Annotation identified 16,705 transcripts, including those involved in embryogenesis and limb patterning. We assembled transcriptomes from an additional three non-model organisms to demonstrate that our pipeline assembled a higher-quality transcriptome than single k-mer approaches across multiple species. The pipeline we have developed for assembly and analysis increases contig length, recovers unique transcripts, and assembles more base pairs than other methods through the use of a meta-assembly. The T. biloba transcriptome is a critical resource for performing large-scale RNA-Seq investigations of gene expression patterns, and is the first transcriptome sequenced in this Dipteran family.
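
    The multiple k-mer idea can be illustrated with a minimal sketch. It assumes that contigs from several single-k assemblies are already available and shows only a simplified pooling step that drops contigs contained in longer ones. This is a stand-in for illustration, not the authors' meta-assembly: the function names and toy sequences are hypothetical, reverse complements are ignored, and no overlap-based extension of transcripts is attempted.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def pool_contigs(assemblies):
    """Pool contigs from several single-k assemblies and remove redundancy.

    assemblies: dict mapping k -> list of contig strings (hypothetical
    assembler output). A contig is dropped when it is an exact substring
    of a longer contig that has already been kept.
    """
    unique = {c for contigs in assemblies.values() for c in contigs}
    # Visit longest contigs first so shorter ones can be tested against them.
    ordered = sorted(unique, key=lambda c: (-len(c), c))
    kept = []
    for contig in ordered:
        if not any(contig in longer for longer in kept):
            kept.append(contig)
    return kept


# Toy input: the larger k recovered a longer version of the first contig.
toy = {
    21: ["ACGTACGTAA", "TTGCA"],
    31: ["ACGTACGTAAGG", "TTGCA"],
}
print(pool_contigs(toy))
print(sorted(kmers("ACGTA", 3)))
```

    Visiting contigs from longest to shortest means each contig only needs to be compared against those already retained.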

  4. Noise analysis of genome-scale protein synthesis using a discrete computational model of translation

    Energy Technology Data Exchange (ETDEWEB)

    Racle, Julien; Hatzimanikatis, Vassily, E-mail: vassily.hatzimanikatis@epfl.ch [Laboratory of Computational Systems Biotechnology, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne (Switzerland); Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne (Switzerland); Stefaniuk, Adam Jan [Laboratory of Computational Systems Biotechnology, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne (Switzerland)

    2015-07-28

    Noise in genetic networks has been the subject of extensive experimental and computational studies. However, very few of these studies have considered noise properties using mechanistic models that account for the discrete movement of ribosomes and RNA polymerases along their corresponding templates (messenger RNA (mRNA) and DNA). The large size of these systems, which scales with the number of genes, mRNA copies, codons per mRNA, and ribosomes, is responsible for some of the challenges. Additionally, one should be able to describe the dynamics of ribosome exchange between the free ribosome pool and those bound to mRNAs, as well as how mRNA species compete for ribosomes. We developed an efficient algorithm for stochastic simulations that addresses these issues and used it to study the contribution and trade-offs of noise to translation properties (rates, time delays, and rate-limiting steps). The algorithm scales linearly with the number of mRNA copies, which allowed us to study the importance of genome-scale competition between mRNAs for the same ribosomes. We determined that noise is minimized under conditions maximizing the specific synthesis rate. Moreover, sensitivity analysis of the stochastic system revealed the importance of the elongation rate in the resultant noise, whereas the translation initiation rate constant was more closely related to the average protein synthesis rate. We observed significant differences between our results and the noise properties of the most commonly used translation models. Overall, our studies demonstrate that the use of full mechanistic models is essential for the study of noise in translation and transcription.
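
    The kind of discrete, stochastic model described here can be illustrated with a minimal sketch of ribosomes moving along a single mRNA, using the Gillespie direct method. This is not the authors' algorithm: the function name, the one-codon ribosome footprint, the uniform elongation rate, and the parameter values are simplifying assumptions, and the event list is naively rebuilt at every step instead of being updated efficiently.

```python
import random


def simulate_translation(n_codons, k_init, k_elong, t_end, seed=0):
    """Gillespie simulation of ribosomes on one mRNA with site exclusion.

    Returns (proteins_made, occupancy), where occupancy[i] is True if a
    ribosome sits on codon i when the simulation stops.
    """
    rng = random.Random(seed)
    occupied = [False] * n_codons
    t, proteins = 0.0, 0
    while True:
        # Collect the events that are currently possible as (rate, site).
        # Site -1 denotes initiation onto the first codon.
        events = []
        if not occupied[0]:
            events.append((k_init, -1))
        for i in range(n_codons):
            last = i == n_codons - 1
            if occupied[i] and (last or not occupied[i + 1]):
                events.append((k_elong, i))
        total = sum(rate for rate, _ in events)
        if total <= 0:
            break
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose one event with probability proportional to its rate.
        pick = rng.random() * total
        for rate, site in events:
            pick -= rate
            if pick < 0:
                break
        if site == -1:
            occupied[0] = True          # initiation
        elif site == n_codons - 1:
            occupied[site] = False      # termination releases a protein
            proteins += 1
        else:
            occupied[site] = False      # elongation by one codon
            occupied[site + 1] = True
    return proteins, occupied


proteins, occupancy = simulate_translation(20, k_init=1.0, k_elong=10.0, t_end=200.0, seed=1)
print(proteins, sum(occupancy))
```

    Repeating the run with different seeds and comparing the spread of protein counts gives a simple view of how initiation and elongation rates affect noise in this toy model.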

  5. Genome-wide transcriptome profiling of black poplar (Populus nigra L.) under boron toxicity revealed candidate genes responsible in boron uptake, transport and detoxification.

    Science.gov (United States)

    Yıldırım, Kubilay; Uylaş, Senem

    2016-12-01

    Boron (B) is an essential nutrient for normal plant growth. Despite its low abundance in soils, it can be highly toxic to plants, especially in arid and semi-arid environments. Poplars are known to tolerate B toxicity and accumulation. However, the physiological and gene regulation responses of these trees to B toxicity have not yet been investigated. In the current study, B accumulation and tolerance levels of black poplar clones were first tested. Rooted cuttings of these clones were exposed to elevated B levels to select the genotype with the highest B accumulation and tolerance. We then carried out a microarray-based transcriptome experiment on the leaves and roots of this genotype to identify the transcriptional networks, genes and molecular mechanisms behind B toxicity tolerance. The results indicated that black poplar is well suited to phytoremediation of B pollution. It could withstand 15 ppm soil B content and >1500 ppm B accumulation in leaves, concentrations that are highly toxic for almost all agricultural plants. Transcriptomic results revealed a total of 1625 and 1419 altered probe sets under 15 ppm B toxicity in leaf and root tissues, respectively. The highest induction was recorded for the probe sets annotated to tyrosine aminotransferase, ATP-binding cassette transporters, glutathione S-transferases and metallochaperone proteins. Strong upregulation of these genes was attributed to internal excretion of B into the cell vacuole and the existence of B detoxification processes in black poplar. Many other candidate genes functioning in signalling, gene regulation, antioxidation, and B uptake and transport processes were also identified in this B hyperaccumulator plant for the first time in the current study. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  6. RegPrecise 3.0--a resource for genome-scale exploration of transcriptional regulation in bacteria.

    Science.gov (United States)

    Novichkov, Pavel S; Kazakov, Alexey E; Ravcheev, Dmitry A; Leyn, Semen A; Kovaleva, Galina Y; Sutormin, Roman A; Kazanov, Marat D; Riehl, William; Arkin, Adam P; Dubchak, Inna; Rodionov, Dmitry A

    2013-11-01

    Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in prokaryotes is one of the critical tasks of modern genomics. Bacteria from different taxonomic groups, whose lifestyles and natural environments are substantially different, possess highly diverged transcriptional regulatory networks. Comparative genomics approaches are useful for in silico reconstruction of bacterial regulons and networks operated by both transcription factors (TFs) and RNA regulatory elements (riboswitches). RegPrecise (http://regprecise.lbl.gov) is a web resource for collection, visualization and analysis of transcriptional regulons reconstructed by comparative genomics. We significantly expanded a reference collection of manually curated regulons we introduced earlier. RegPrecise 3.0 provides access to inferred regulatory interactions organized by phylogenetic, structural and functional properties. Taxonomy-specific collections include 781 TF regulogs inferred in more than 160 genomes representing 14 taxonomic groups of Bacteria. TF-specific collections include regulogs for a selected subset of 40 TFs reconstructed across more than 30 taxonomic lineages. Novel collections of regulons operated by RNA regulatory elements (riboswitches) include nearly 400 regulogs inferred in 24 bacterial lineages. RegPrecise 3.0 provides four classifications of the reference regulons implemented as controlled vocabularies: 55 TF protein families; 43 RNA motif families; ~150 biological processes or metabolic pathways; and ~200 effectors or environmental signals. Genome-wide visualization of regulatory networks and metabolic pathways covered by the reference regulons is available for all studied genomes. A separate section of RegPrecise 3.0 contains draft regulatory networks in 640 genomes obtained by a conservative propagation of the reference regulons to closely related genomes. RegPrecise 3.0 gives access to the transcriptional regulons reconstructed in